
View Full Version : How to do interleaved row rendering?



pango
11-19-2006, 08:39 PM
My program needs an interleaved row rendering function: it only needs to render rows 0,2,4,... or rows 1,3,5,... The method I currently use is to render the whole frame, copy the data to a texture, set the texture's sampling filter to GL_NEAREST, and draw it into a half-height rectangle. But this method is slow because it still renders the whole frame. Is there any way to do interleaved row rendering with better performance than whole-frame rendering?

Korval
11-19-2006, 11:17 PM
Why not just squash the rendering to half-height, rather than redrawing it as a texture?

k_szczech
11-20-2006, 01:33 AM
If you need to display rows 0,2,4 in one frame and then rows 1,3,5 in the next frame using the same texture, then what you do is correct.
You could possibly gain some speed by rendering directly to texture (using FBO).

If you need to display different images every frame, then you could do what Korval suggests. Just add an offset to the GL_PROJECTION matrix to switch between even/odd rows before you render a frame.

pango
11-20-2006, 05:52 PM
Originally posted by Korval:
Why not just squash the rendering to half-height, rather than redrawing it as a texture?
Because I need a whole-frame image, but the odd rows and even rows are transformed by different projections, so I need interleaved rendering.

pango
11-20-2006, 06:07 PM
I did a "depth test" experiment in my program: before every render, I clear the depth buffer with a 2x2 image whose first row is black and whose second row is white, so every pixel whose depth value is below 0.5 is rejected during rendering. But I found the performance gain is very small; it only speeds things up a little in a complex scene and doesn't help at all in a simple scene. Why?

gybe
11-20-2006, 06:55 PM
An interleaved rendering method is described in ShaderX4, chapter 2.1 "Interlaced Rendering" by Oles V. Shishkovtsov. Maybe it can help you.

pango
11-21-2006, 05:02 PM
Originally posted by gybe:
An interleaved rendering method is described in ShaderX4, chapter 2.1 "Interlaced Rendering" by Oles V. Shishkovtsov. Maybe it can help you.
Thanks for your reply, but I don't have the book at hand. Can you tell me what is described in it?

Humus
11-21-2006, 07:33 PM
I would suggest you use the stencil buffer. Do a first pass setting stencil to 1 in every other row. Then you can just render the scene twice with the stencil test enabled, testing for 0 in the first pass and for 1 in the second.

If you don't need stencil for anything else, the stencil fill pass can be done once at initialization.
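
Something along these lines (an untested sketch of the idea, not actual production code; drawScene() is a placeholder for your own scene rendering with the proper field projection applied):

#include <GL/gl.h>

extern void drawScene(int oddField); /* placeholder: render the scene with that field's projection */

/* Fill the stencil buffer once with a 0/1 row pattern (e.g. at initialization). */
void fillStencilPattern(int width, int height)
{
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glOrtho(0, width, 0, height, -1, 1);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();

    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE); /* touch only the stencil buffer */
    glDepthMask(GL_FALSE);

    glEnable(GL_STENCIL_TEST);
    glStencilFunc(GL_ALWAYS, 1, 1);
    glStencilOp(GL_REPLACE, GL_REPLACE, GL_REPLACE);

    glBegin(GL_LINES); /* mark every odd row with 1 */
    for (int y = 1; y < height; y += 2) {
        glVertex2f(0.0f, y + 0.5f);
        glVertex2f((float)width, y + 0.5f);
    }
    glEnd();

    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
    glDepthMask(GL_TRUE);
}

/* Per frame: draw the scene twice, once per field, never modifying the pattern. */
void renderInterlacedFrame(void)
{
    glEnable(GL_STENCIL_TEST);
    glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP);

    glStencilFunc(GL_EQUAL, 0, 1); /* even rows only */
    drawScene(0);

    glStencilFunc(GL_EQUAL, 1, 1); /* odd rows only */
    drawScene(1);
}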

Komat
11-21-2006, 10:35 PM
Originally posted by pango:

Originally posted by Korval:
Why not just squash the rendering to half-height, rather than redrawing it as a texture?
Because I need a whole-frame image, but the odd rows and even rows are transformed by different projections, so I need interleaved rendering.
You can probably tweak the projection matrix to shift the rendered result so that the correct lines are rendered.

pango
11-22-2006, 12:55 AM
Originally posted by Humus:
I would suggest you use the stencil buffer. Do a first pass setting stencil to 1 in every other row. Then you can just render twice with the stencil test testing for 0, and 1 in the second pass.

If you don't need stencil for anything else, the stencil fill pass can be done once at initialization.
Thanks for your reply, but are you sure the stencil test is faster than the depth test?

pango
11-22-2006, 01:03 AM
Originally posted by Komat:

Originally posted by pango:

Originally posted by Korval:
Why not just squash the rendering to half-height, rather than redrawing it as a texture?
Because I need a whole-frame image, but the odd rows and even rows are transformed by different projections, so I need interleaved rendering.
You can probably tweak the projection matrix to shift the rendered result so that the correct lines are rendered.
Thanks for your reply, but what do you mean by "tweak the projection matrix"? Do you mean I can write a vertex shader and do the tweak there?

k_szczech
11-22-2006, 04:14 AM
Tweaking the projection matrix is just adding the offset I mentioned before:

glMatrixMode(GL_PROJECTION);
glLoadIdentity();
if (evenFrame)
    glTranslatef(0.0f, 0.5f, 0.0f);
glFrustum / gluPerspective

Note that this is for rendering half-height images.

are you sure the stencil test is faster than the depth test
That depends on whether your application needs the depth test for other purposes. If not, then use the depth test: fill the depth buffer (once) with proper values and use the depth test (which is fast). Do not clear the depth buffer every frame; just initialize it and leave it that way using glDepthMask(GL_FALSE).
If you use depth testing to render 3D geometry (and I guess you do), then stencil could prove faster, since you will not have to fill the depth buffer with that 2-pixel image every frame. Fill the stencil buffer once and reuse it when rendering, just as Humus suggested.
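
For the depth variant, something like this could work (a rough, untested sketch of the idea above; only valid if the scene itself does not need depth testing):

#include <stdlib.h>
#include <GL/gl.h>

/* Fill the depth buffer once with a row pattern, then freeze it. */
void fillDepthPattern(int width, int height)
{
    GLfloat *depth = (GLfloat *)malloc(width * height * sizeof(GLfloat));
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            depth[y * width + x] = (y & 1) ? 1.0f : 0.0f; /* keep odd rows only */

    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE); /* don't disturb the color buffer */
    glEnable(GL_DEPTH_TEST);
    glDepthFunc(GL_ALWAYS); /* force the pattern into the depth buffer */
    glDepthMask(GL_TRUE);

    glWindowPos2i(0, 0); /* OpenGL 1.4+ */
    glDrawPixels(width, height, GL_DEPTH_COMPONENT, GL_FLOAT, depth);
    free(depth);

    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
    glDepthFunc(GL_LESS);  /* nothing passes against 0.0; everything in front of 1.0 passes */
    glDepthMask(GL_FALSE); /* freeze the pattern */
}

Swap the 0.0f/1.0f pattern to select the other field.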

Komat
11-22-2006, 01:53 PM
Originally posted by pango:
Thanks for your reply, but what do you mean by "tweak the projection matrix"? Do you mean I can write a vertex shader and do the tweak there?
I was thinking about applying a half-pixel offset to the screen-space coordinates, either by multiplying the projection matrix with a matrix containing an appropriate bias or by adding it in a vertex shader. However, after some drawing with pencil and paper I am not sure this will work without some form of hw support. For example, different mipmaps will be selected than in the unsquashed image, and textures will be sampled at different coordinates.

pango
11-22-2006, 04:08 PM
Originally posted by k_szczech:
Tweaking the projection matrix is just adding the offset I mentioned before:

glMatrixMode(GL_PROJECTION);
glLoadIdentity();
if (evenFrame)
    glTranslatef(0.0f, 0.5f, 0.0f);
glFrustum / gluPerspective

Note that this is for rendering half-height images.

Thanks for your reply, but I don't think it helps me, because I need an interleaved, compressed half-height image, meaning I need rows 0,2,4,..., not rows 0,1,2,...

yooyo
11-24-2006, 01:26 PM
It seems you want to render interlaced video!? Stencil is the way! Fill every odd line with 1 in the stencil buffer. Later... enable the stencil test, set the stencil func to test for 0 or 1, and render.

pango
11-26-2006, 05:42 PM
Originally posted by yooyo:
It seems you want to render interlaced video!? Stencil is the way! Fill every odd line with 1 in the stencil buffer. Later... enable the stencil test, set the stencil func to test for 0 or 1, and render.
Yes, what I want to do is render a scene and play it on a TV, so I must render it interlaced. I wrote a stencil-test version of my engine, but I found no performance increase over the frame-rendering version. I want an interlaced rendering method that is quicker than rendering a whole frame; is that possible?
I use a GF7800GT card; how is its stencil-test performance?

Komat
11-26-2006, 11:07 PM
Originally posted by pango:
I wrote a stencil-test version of my engine, but I found no performance increase over the frame-rendering version. I want an interlaced rendering method that is quicker than rendering a whole frame; is that possible?
The problem might be that the HW does the fast stencil/depth rejection test on rectangular blocks of pixels and can reject a block only if all pixels within it would be rejected. Otherwise it has to process the pixels and discard the individual results, so the pixel shading work is not avoided. Because in your case every second row is present, the fast rejection will probably do nothing.

pango
11-27-2006, 02:02 AM
Originally posted by Komat:

Originally posted by pango:
I wrote a stencil-test version of my engine, but I found no performance increase over the frame-rendering version. I want an interlaced rendering method that is quicker than rendering a whole frame; is that possible?
The problem might be that the HW does the fast stencil/depth rejection test on rectangular blocks of pixels and can reject a block only if all pixels within it would be rejected. Otherwise it has to process the pixels and discard the individual results, so the pixel shading work is not avoided. Because in your case every second row is present, the fast rejection will probably do nothing.
Oh, does what you said mean that I can't find a way to do interlaced rendering faster? All I can do is optimize my engine to reach double the fps?

ZbuffeR
11-27-2006, 04:45 AM
Halve rendering height, as already said.

yooyo
11-27-2006, 07:35 AM
Whatever you do, consider the following issues:
- Your engine MUST deliver a constant 50 or 60 fields per second (depending on PAL or NTSC).
- The TV out on today's gaming cards reports 50 or 60 Hz.
- When your render engine works in frame mode, the scanline converter chip on the card discards every even or odd row, depending on the current field status and order.
- If you want to render in field mode, then you must interleave the fields and deliver a constant 25 or 30 fps output. Each frame must be sent twice, or you must do careful timing in the app and call SwapBuffers every 20 or 16.666 ms.
- Finally... when your app starts rendering you can't be sure whether it starts on an even or odd field. So... sometimes the user may need to push a "switch fields" button (if your app provides such a function).

There is another option... render the interleaved frame in an offscreen buffer, apply an RGB -> YUV422 shader, grab the frame and send it to an overlay mixer using DirectShow.

pango
11-27-2006, 04:02 PM
Originally posted by ZbuffeR:
Halve rendering height, as already said.
Thanks for the reminder, but half-height rendering's image quality is not satisfactory for me, because I must combine the field images into a frame image. The combination copies the odd and even image lines into the frame image interleaved, so the frame image looks like this:
odd0
even0
odd1
even1
...

pango
11-27-2006, 04:09 PM
Originally posted by yooyo:
Whatever you do, consider the following issues:
- Your engine MUST deliver a constant 50 or 60 fields per second (depending on PAL or NTSC).
- The TV out on today's gaming cards reports 50 or 60 Hz.
- When your render engine works in frame mode, the scanline converter chip on the card discards every even or odd row, depending on the current field status and order.
- If you want to render in field mode, then you must interleave the fields and deliver a constant 25 or 30 fps output. Each frame must be sent twice, or you must do careful timing in the app and call SwapBuffers every 20 or 16.666 ms.
- Finally... when your app starts rendering you can't be sure whether it starts on an even or odd field. So... sometimes the user may need to push a "switch fields" button (if your app provides such a function).

There is another option... render the interleaved frame in an offscreen buffer, apply an RGB -> YUV422 shader, grab the frame and send it to an overlay mixer using DirectShow.
I don't use the display card to play the scene on TV. In my app I render the scene into an offscreen buffer (I use a pbuffer with FSAA), read the image back with glReadPixels(), and push the image to a video output card. So the scanline converter chip in the display card is of no use to me.

yooyo
11-28-2006, 01:33 AM
OK.. try this...
Render two half-height images into two off-screen buffers. Then render the final image into a full-size off-screen buffer, using a "masking texture" of size 1x2 pixels (0, 1). Use a shader to combine the two half-images into the big one. The idea is to stretch both fields 2x vertically and use a smart shader to choose which field's row is used, depending on whether the Y coordinate is even or odd.

In pseudocode:
- render the upper field into fbo0
- render the lower field into fbo1
- bind fbo0 to texture0 (upperfield in the shader)
- bind fbo1 to texture1 (lowerfield in the shader)
- bind mask_texture to texture2 (mask in the shader; enable nearest filtering and turn on repeat)
- activate the final fbo
- render a full-screen quad using the following fragment shader:


uniform sampler2D upperfield;
uniform sampler2D lowerfield;
uniform sampler2D mask;
/* the mask texture is 1x2 and looks like:
     0  (0.0 after normalization -> upper field)
   255  (1.0 after normalization -> lower field)
*/

void main(void)
{
    vec4 uf = texture2D(upperfield, gl_TexCoord[0].xy);
    vec4 lf = texture2D(lowerfield, gl_TexCoord[0].xy);

    // because of nearest filtering, m will only take the value 0.0 or 1.0
    float m = texture2D(mask, gl_TexCoord[0].xy).r;

    gl_FragColor = mix(uf, lf, m);
}

- grab the final fbo

- you need to play a bit with the texture coordinates, because the shader works with normalized texture coordinates and uses 2D textures. If you use rect textures it is much easier.
- you can use gl_FragCoord instead of the mask texture, but it works well only on NVidia cards. AFAIK, ATI cards have problems with gl_FragCoord.
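
For example, the mask texture itself could be created like this (just a sketch of what I mean, not tested):

#include <GL/gl.h>

/* 1x2 mask texture: nearest filtering, repeat wrapping */
GLuint createMaskTexture(void)
{
    static const GLubyte maskData[2] = { 0, 255 }; /* texel 0 -> upper field, texel 1 -> lower field */
    GLuint maskTex;

    glGenTextures(1, &maskTex);
    glBindTexture(GL_TEXTURE_2D, maskTex);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, 1, 2, 0,
                 GL_LUMINANCE, GL_UNSIGNED_BYTE, maskData);
    return maskTex;
}

When drawing the full-screen quad, give the mask a T coordinate that runs from 0 to outputHeight/2 instead of 0 to 1, so the 2-texel pattern repeats once per pair of output rows.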

ZbuffeR
11-28-2006, 02:27 AM
My idea is more along the lines of :

1) render the half-height scene to texture A (odd lines)
2) render the half-height scene to texture B (even lines)
3) render a quad to the frame buffer that merges the A and B textures according to texture I (black and white lines).

EDIT : as yooyo said in a much more precise way :)

pango
11-28-2006, 04:35 PM
Thanks yooyo and ZbuffeR, but I think a final image combined from two half-height images will not be satisfactory for me, because the FSAA quality will be bad. As we know, the FSAA algorithm needs the surrounding pixels; if I render a half-height image, then the row(N) pixel will be combined with the row(N-2) and row(N+2) pixels, not the row(N-1) and row(N+1) pixels, so the final image's AA quality will not be good.

That is the reason why I can't use half-height rendering. Do you think that is right?

yooyo
11-29-2006, 01:23 AM
Nope... you are wrong. First, the fields have different timestamps. So your render loop should be:
- update(time)
- render upperfield
- update(time + delta) // delta is 1000/50 or 1000/60 ms
- render lowerfield
- combine the upper and lower fields into the final image
- grab it and send it to the video-out card

So... AA should not access vertically surrounding pixels, because that leads to flickering on an interlaced output device, but accessing N-2 and N+2 is correct, because those pixels belong to the same timestamp.
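
In other words, a rough sketch of that loop (made-up names; updateScene, renderField, combineFields and sendToVideoOut stand in for your own functions):

extern void updateScene(double timeMs); /* placeholder: animate everything, camera included */
extern void renderField(int oddField);  /* placeholder: render one field into its off-screen buffer */
extern void combineFields(void);        /* placeholder: interleave the two fields into the full frame */
extern void sendToVideoOut(void);       /* placeholder: e.g. glReadPixels + video output card */

const double fieldPeriodMs = 1000.0 / 50.0; /* PAL: 20 ms; for NTSC use 1000.0 / 59.94 */

void renderOneInterlacedFrame(double timeMs)
{
    updateScene(timeMs);
    renderField(0); /* first field at time t */

    updateScene(timeMs + fieldPeriodMs);
    renderField(1); /* second field one field period later */

    combineFields();
    sendToVideoOut();
}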

Komat
11-29-2006, 02:30 AM
Originally posted by yooyo:
So... AA should not access vertically surrounding pixels, because that leads to flickering on an interlaced output device, but accessing N-2 and N+2 is correct, because those pixels belong to the same timestamp.
He is comparing the quality of "render a full-scale image and construct the field using only the odd/even lines" versus "render a half-sized image and use that as the field", and in that case there is a quality difference.

pango
11-29-2006, 03:56 PM
Originally posted by Komat:

Originally posted by yooyo:
So... AA should not access vertically surrounding pixels, because that leads to flickering on an interlaced output device, but accessing N-2 and N+2 is correct, because those pixels belong to the same timestamp.
He is comparing the quality of "render a full-scale image and construct the field using only the odd/even lines" versus "render a half-sized image and use that as the field", and in that case there is a quality difference.
So Komat, do you think what I said is right?

Komat
11-29-2006, 10:39 PM
Originally posted by pango:
So Komat, do you think what I said is right?
Partially. It is true that there will be quality degradation in the half-height render; however, the explanation was not entirely correct.

The FSAA does not operate between rows of the final image; it has several vertical samples within each row and operates on those. The degradation comes from the fact that if the half-scaled image is shown back at full size on the output device, each resulting "pixel" is effectively calculated from half as many vertical samples per unit of physical geometry compared with a pixel from an image that was rendered at full size.

Additionally, there is one thing that does operate between consecutive lines of the final image and that will degrade in the half-scaled rendering: mipmap level selection. Rendering the half-scaled image will likely result in the selection of smaller (in texture dimensions) mipmap levels compared to the full-scale image.

yooyo
11-30-2006, 02:12 AM
@komat:
You are right about mipmaps.

@pango:
I'm afraid you have to go the stencil way...

pango
12-04-2006, 11:06 PM
Hi yooyo,
I did a test that renders half-height field images and then combines the field images into a frame image, but the result is not satisfactory; the zigzag is obvious. Maybe the fault is that I misunderstood what you said, so I'm posting my pseudocode:
// render odd field
- setup odd field's camera projection p1;
- render scene;
- copy image to tex1;

// render even field
- setup even field's camera projection p2;
- render scene;
- copy image to tex2;

- use a pixel shader to combine tex1 & tex2 into frame image;


Because my program simulates a real-world camera, "p1" may not equal "p2" when the camera is zooming or panning, but "p1" can equal "p2" when the camera is not in action. I found that when "p1" equals "p2", the zigzag in the resulting image is especially obvious; when the camera is in action, because the whole scene is moving, the zigzag problem is not as visible, but it still exists.

ZbuffeR
12-05-2006, 12:41 AM
You should activate vsync.

Hmm, what about a screenshot of the still shot?
If you skip the pixel shader part and display tex1 and then tex2 quickly, do you see a big change? If not, then the problem is in the pixel shader. Post its code, and how you configure it.

yooyo
12-05-2006, 04:17 AM
@pango:
Post screenshot please.

Interlaced rendering is used to tweak the framerate by cutting down the vertical resolution. So... in the case of the PAL standard you have 25 interleaved frames per second, i.e. 50 fields per second. All fields have different timestamps. The time offset between fields is 20 ms. This means the first field should show the scene rendered at time 0 ms, the second field the scene rendered at time 20 ms, and so on (40, 60, 80, ...). Every two fields are interleaved, and the "composed" image is sent to the video-out card.

When I say "scene rendered at N ms" this means everything should be animated, even the viewer camera.

Unlike PAL (a frame every 1000/25 ms, so fields every 20 ms), the NTSC frame period is 1000/29.97 ms, so the time offset between fields is about 16.7 ms.

The zig-zag effect might be related to a wrong field order. Examine your output video device and check whether it works in Top Field First or Bottom Field First mode (TFF or BFF).

macarter
12-05-2006, 10:08 AM
Interlaced motion video cameras do not produce a frame image. They produce a sequence of even and odd fields. There is no reason to render the combined fields into a frame image. Load each field separately into the output device or combine them in the CPU.

If rendering is both synchronous to the display device and aware of which field is next to be displayed (even or odd) then only one field (even or odd) needs to be rendered with each time step. Apparently your display system does not inform you which field is displayed so you have to compute both.

The even and odd field camera projections p1 and p2 should not be equal, even for a static scene. If they were, then the even and odd lines would be identical. The odd lines (rows) should be offset vertically by half a line (row) from the even ones. This should be done with a skew transform.
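
As a rough illustration of that offset (my own sketch, using a plain translation of the projection rather than a skew; setFieldProjection and its parameters are made-up names):

#include <GL/gl.h>
#include <GL/glu.h>

/* fieldHeight is the height in pixels of the half-height render target,
   so one full-frame line corresponds to half a field pixel. */
void setFieldProjection(int oddField, int fieldHeight,
                        double fovY, double aspect, double zNear, double zFar)
{
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();

    if (oddField) {
        /* half a field pixel in NDC; the sign depends on the field order (TFF/BFF) */
        glTranslated(0.0, 1.0 / (double)fieldHeight, 0.0);
    }

    gluPerspective(fovY, aspect, zNear, zFar);
    glMatrixMode(GL_MODELVIEW);
}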

pango
12-06-2006, 04:20 PM
Originally posted by yooyo:
@pango:
Post screenshot please.

Interlaced rendering is used to tweak the framerate by cutting down the vertical resolution. So... in the case of the PAL standard you have 25 interleaved frames per second, i.e. 50 fields per second. All fields have different timestamps. The time offset between fields is 20 ms. This means the first field should show the scene rendered at time 0 ms, the second field the scene rendered at time 20 ms, and so on (40, 60, 80, ...). Every two fields are interleaved, and the "composed" image is sent to the video-out card.

When I say "scene rendered at N ms" this means everything should be animated, even the viewer camera.

Unlike PAL (a frame every 1000/25 ms, so fields every 20 ms), the NTSC frame period is 1000/29.97 ms, so the time offset between fields is about 16.7 ms.

The zig-zag effect might be related to a wrong field order. Examine your output video device and check whether it works in Top Field First or Bottom Field First mode (TFF or BFF).
Thanks yooyo, but my program's camera is not animated at every field time. My program simulates a studio camera: it simulates the camera's pan & zoom, but, as you know, the camera is not always panning or zooming, so if the pan or zoom angle has not changed from the previous time, the odd and even fields' camera params should be the same, and the rendered image should be the same as with frame rendering.

The field order in my app is right, and I also tested different field rendering orders (render odd in the top half and even in the bottom half, and render even in the top half and odd in the bottom half).

How do I post a screenshot in this forum?

def
12-07-2006, 03:55 AM
Originally posted by pango:
Because my program simulates a real-world camera, "p1" may not equal "p2" when the camera is zooming or panning, but "p1" can equal "p2" when the camera is not in action. I found that when "p1" equals "p2", the zigzag in the resulting image is especially obvious; when the camera is in action, because the whole scene is moving, the zigzag problem is not as visible, but it still exists.
You will have to define more clearly what you call "zigzag".
To insert an image you will have to upload it somewhere on the web and use a link inside your message. Like this: http://www.opengl.org/images/Logos/OGLA_sm.gif
:D
If p1 and p2 are identical and there is no movement in your scene, your interleaved frame will just have half the vertical resolution, because the two fields are identical; no zigzagging, as I understand the word, would be visible.
But even if the camera is not moving, zooming, panning etc., p1 and p2 are never the same. Think of the fields as full frames in which alternating lines are either black or not displayed. The camera needs to be offset for one of the fields.