Does anyone know how fragments are processed in GL_NV_fragment_program? I am asking because I am trying to use a calculated value in a subsequent equation in the same fragment program, but not for the same fragment — for a neighboring one. I would like to do everything in one pass and not have to save the intermediate result to a texture and then do a second pass to get access to the offset fragments. Any recommendations?
As a related question… I am using the NV30 software emulator to run my OpenGL code, because I don't have an FX card; my GeForce 3 does not support fragment programs. Can anyone tell me what the frame-rate penalty will be if I use two passes versus one, assuming all else is equal except that I split my fragment program instructions into two different programs? I.e., will my program run twice as slowly because I am doing two passes instead of one?
Fragments are processed individually. All registers are reset to 0 when the fragment program starts. There’s no way to communicate results to neighbouring fragments in the same pass. Sorry.
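For what it's worth, the standard workaround is exactly the two-pass approach you were hoping to avoid: render the intermediate values, copy them into a texture, then fetch the neighbour as an offset texel in the second pass. A rough sketch of what the second-pass fragment program might look like — the `offset` parameter (e.g. 1/width in s) and the final combine are purely illustrative, so adapt to your own math:

```
!!FP1.0
# Assumes the first pass's output has been copied into the texture
# bound to unit 0, and 'offset' is loaded as a program parameter.
DECLARE offset;
ADD R0, f[TEX0], offset;     # texcoord of the neighbouring texel
TEX R1, f[TEX0], TEX0, 2D;   # this fragment's first-pass value
TEX R2, R0, TEX0, 2D;        # the neighbour's first-pass value
ADD o[COLR], R1, R2;         # combine them however you need
END
```

Since NV_fragment_program allows texture lookups with arbitrary computed coordinates, the neighbour offset can even vary per fragment if you need that.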
Originally posted by al_bob: Fragments are processed individually. All registers are reset to 0 when the fragment program starts. There’s no way to communicate results to neighbouring fragments in the same pass. Sorry.
I guess it was wishful thinking that maybe we knew that it started processing fragments at the bottom left, frag[0,0] then proceeded left to right, frag[1,0], frag[2,0]…frag[width,0], then started on the next row with frag[0,1] etc. If I knew this then I could do some counting in my fragment program to keep track of neighbors of interest.
Rendering today is so fast because it’s massively parallel. There is no “this one first, then that one”. There is no constant number of pixels in flight at the same time and there is no constant ordering between parallel pixel batches that could be exposed.
That’s a good thing.
Originally posted by zeckensack: Rendering today is so fast because it’s massively parallel. There is no “this one first, then that one”. There is no constant number of pixels in flight at the same time and there is no constant ordering between parallel pixel batches that could be exposed.
That’s a good thing.
Should I just render my first pass to the frame buffer, copy it to a texture, and then do my second pass? Is there any faster way? In general, if I do multiple passes before displaying, is it best to render to the frame buffer and copy to a texture?
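Render-to-framebuffer plus copy is the usual answer on this generation of hardware. A pseudocode-style sketch of the call sequence (assumes a current GL context; `drawScene`, `drawFullScreenQuad`, and the program/texture ids are illustrative names, not real API):

```
/* Pass 1: render intermediate results into the back buffer. */
glBindProgramNV(GL_FRAGMENT_PROGRAM_NV, firstPassProgram);
drawScene();                       /* your geometry */

/* Copy the result into an already-allocated texture.
   glCopyTexSubImage2D reuses the texture's storage, so it is
   usually faster than glCopyTexImage2D, and both stay on the
   card, unlike a glReadPixels round trip through system memory. */
glBindTexture(GL_TEXTURE_2D, intermediateTex);
glCopyTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 0, 0, width, height);

/* Pass 2: draw a screen-sized quad sampling that texture. */
glBindProgramNV(GL_FRAGMENT_PROGRAM_NV, secondPassProgram);
drawFullScreenQuad();
```

One caveat: timings from the NV30 software emulator won't tell you much about the relative cost of the copy on real hardware, so treat any two-pass-versus-one-pass numbers you get from it with suspicion.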