
Fullscreen quad invoking fragment shader twice along triangle seam?



jfslocum
08-07-2012, 09:05 AM
I'm trying to implement linked-list based transparency. The final stage is to perform alpha blending on a linked list of fragments for each pixel. The problem I'm having is that, when I try to render a fullscreen quad, the seam between the two triangles appears to have some overlap, which causes the fragment shader to be run twice on those pixels. This is a major problem, because it means that I can not assume the shader will be run once for each pixel, and I can't assume that only one shader instance will be accessing a pixel's data structures at a time.

I'm using what I think is a pretty standard technique for drawing the quad: currently, I'm calling glDrawElements(GL_TRIANGLES) on two triangles defined in clip space.

I pass in the following vertices to my vertex buffer:
{
{1.0f, 1.0f, 0.5f, 1.0f},
{-1.0f, 1.0f, 0.5f, 1.0f},
{1.0f, -1.0f, 0.5f, 1.0f},
{-1.0f, -1.0f, 0.5f, 1.0f}
}

And my index buffer contains:
{0, 1, 2, 1, 2, 3}


And here's my vertex shader:



#version 420

layout(location=0) in vec4 in_position;
layout(location=1) in vec2 in_tex;

smooth out vec2 texture_coords;

void main()
{
    gl_Position = in_position;
    texture_coords = in_tex;
}



How do I know the shader is being run twice on the seam? I wrote a test shader that atomically increments a pixel in an image corresponding to the fragment's pixel coordinates. The diagonal of the image has all '2's while everywhere else is 1. When I change the order of the vertices, the '2's appear along the other diagonal.

I've tested this on a Radeon HD 7970 and HD 5450 on Ubuntu 11.04 and a GeForce GTX 670 on Mint 13, and it shows up in all of those cases. Curiously, when I try it on Windows, the problem does not appear.

I'm wondering if anyone has run into a similar problem before, has advice on how to define the triangles such that the seams are removed, or a different method of rendering the fullscreen quad.

aqnuep
08-07-2012, 10:25 AM
I have no idea how this could happen but why don't you render a full screen triangle instead of a full screen quad? Just render a triangle large enough that it covers all pixels. This way you definitely cannot experience any seams.

jfslocum
08-07-2012, 11:49 AM
I just tried that. I end up with 1 or 2 lines along which the fragment shader appears to run twice. I'm *guessing* that what's going on is the fixed function pipeline clips the triangle and converts it into several smaller ones that fill up the screen, but there still seems to be some overlap.

dukey
08-08-2012, 02:59 PM
Are you using an ortho projection? A perspective projection is the only thing I can think of that would cause this.

ScottManDeath
08-08-2012, 03:37 PM
If you have MSAA enabled, you might want to look into centroid sampling for texture coordinates.

tonyo_au
08-09-2012, 04:32 AM
I've tested this on a radeon HD 7970 and HD 5450 on Ubuntu 11.04 and a Geforce 670 GTX on Mint 13, and it shows up in all of those cases. Curiously, when I try it on windows, the problem does not appear.


I noticed the same thing on Windows, but that was a couple of driver releases ago. I haven't checked with the most recent drivers.

jfslocum
08-09-2012, 06:36 AM
ortho projection ? Perspective projection is the only thing I can think of that would cause this.

No projection at all; I'm passing the vertices in NDC. Take a look at my vertex shader.



If you have MSAA enabled, you might want to look into centroid sampling for texture coordinates.

I'm not using MSAA currently.



I noticed the same thing in Windows but it was a couple of releases of driver ago. I haven't checked with the most recent drivers

That's good to hear. Maybe this will be fixed in the future.


Thanks for the suggestions, guys.

Ilian Dinev
08-09-2012, 11:44 AM
Could you test with this in the fragment shader:


layout(early_fragment_tests) in;

jfslocum
08-09-2012, 01:16 PM
Could you test with this in the fragment shader:


layout(early_fragment_tests) in;


No apparent effect.

jfslocum
08-10-2012, 09:34 AM
I tried compiling with clang instead of gcc, and it's still the same. So I'm pretty confused as to what's going on. It seems strange that both NVIDIA and ATI drivers would have the same problem specifically on Linux.

ReaperSMS
08-10-2012, 09:54 AM
How exactly are you writing to the other image? If it's not through the blend unit, it might be incrementing whenever the fragment program is run, not whenever it actually produced a pixel. If so, that would explain the doubling up, as on most chips fragment programs get run in 2x2 blocks -- thus running the fragment program twice on every pixel along the diagonal.
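To make the 2x2 argument concrete, here is a minimal CPU sketch. It assumes a square, even-sized viewport split along the diagonal x + y = n, and it assumes (hypothetically) that every invocation in a 2x2 block gets counted, helper or not. The names `in_upper` and `pixels_run_twice` are made up for illustration and do not model any particular GPU.

```c
#include <assert.h>

/* 1 if the center of pixel (x, y) lies in the upper-right triangle of an
   n x n viewport split along x + y = n (ties go to the upper triangle). */
static int in_upper(int x, int y, int n) {
    return x + y + 1 >= n;
}

/* Counts pixels whose shader would be launched by BOTH triangles when the
   hardware shades whole 2x2 blocks, including uncovered helper pixels. */
int pixels_run_twice(int n) {
    assert(n > 0 && n % 2 == 0);
    int twice = 0;
    for (int by = 0; by < n; by += 2) {
        for (int bx = 0; bx < n; bx += 2) {
            int upper = 0, lower = 0;
            for (int dy = 0; dy < 2; dy++) {
                for (int dx = 0; dx < 2; dx++) {
                    if (in_upper(bx + dx, by + dy, n)) upper = 1;
                    else lower = 1;
                }
            }
            /* A block touched by both triangles is shaded twice in full. */
            if (upper && lower) twice += 4;
        }
    }
    return twice;
}
```

Under these assumptions an 8x8 viewport gives 16 doubly-launched pixels, a blocky band two pixels wide rather than a clean one-pixel line, so the shape of the artifact is a useful hint about whether helper invocations are involved.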

The clipping shouldn't be generating new triangles as long as your fullscreen tri stays within a reasonable range. IIRC, for +/- 1 NDC space, the vertices you need are (-1, -1) (3, -1) (-1, 3). http://www.gamedev.net/topic/568915-fullscreen-triangle/ has a bit more detail on it.
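As a quick sanity check on those vertices, the sketch below tests that every pixel center of a viewport falls inside the triangle (-1, -1), (3, -1), (-1, 3) in NDC. The function name `uncovered_pixel_centers` is hypothetical, and the check is pure geometry; it says nothing about how a given driver clips or rasterizes.

```c
#include <assert.h>

/* Returns how many pixel centers of a width x height viewport fall outside
   the NDC triangle (-1,-1), (3,-1), (-1,3). Its three edges are the lines
   x = -1, y = -1 and x + y = 2. */
int uncovered_pixel_centers(int width, int height) {
    assert(width > 0 && height > 0);
    int uncovered = 0;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            /* Map the pixel center to NDC in [-1, 1]. */
            double nx = (2.0 * x + 1.0) / width - 1.0;
            double ny = (2.0 * y + 1.0) / height - 1.0;
            int inside = nx >= -1.0 && ny >= -1.0 && nx + ny <= 2.0;
            if (!inside) uncovered++;
        }
    }
    return uncovered;
}
```

The hypotenuse x + y = 2 only touches the visible square at its corner (1, 1), so no interior edge crosses the screen and there is no seam to share.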

Alfonse Reinheart
08-10-2012, 10:46 AM
when I try to render a fullscreen quad, the seam between the two triangles appears to have some overlap, which causes the fragment shader to be run twice on those pixels. This is a major problem, because it means that I can not assume the shader will be run once for each pixel, and I can't assume that only one shader instance will be accessing a pixel's data structures at a time.

That should not be possible. All writes from any fragment shaders that execute outside of a triangle's boundaries should be discarded. What is the fragment shader you use to detect this?
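The expected behaviour can be checked on the CPU. The following is a minimal sketch, not any driver's actual rasterizer: it assumes a top-left style tie-break, which is one rule that satisfies the GL requirement that a fragment center on a shared edge is produced by exactly one of the two triangles. It uses the vertex and index data from the first post; the names `edge`, `owns_boundary`, `covers` and `pixels_not_hit_once` are hypothetical.

```c
#include <assert.h>

typedef struct { long x, y; } Pt;   /* window coordinates in half-pixel units */

/* Twice the signed area of (a, b, p); positive when p is left of a->b. */
static long edge(Pt a, Pt b, Pt p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

/* Assumed tie-break: an edge of a CCW triangle keeps points lying exactly
   on it only if it runs downward, or is horizontal and runs leftward. */
static int owns_boundary(Pt a, Pt b) {
    long dx = b.x - a.x, dy = b.y - a.y;
    return dy < 0 || (dy == 0 && dx < 0);
}

static int covers(Pt a, Pt b, Pt c, Pt p) {
    if (edge(a, b, c) < 0) { Pt t = b; b = c; c = t; }   /* force CCW order */
    Pt v[3] = { a, b, c };
    for (int i = 0; i < 3; i++) {
        Pt s = v[i], e = v[(i + 1) % 3];
        long w = edge(s, e, p);
        if (w < 0 || (w == 0 && !owns_boundary(s, e))) return 0;
    }
    return 1;
}

/* Number of pixels whose center is NOT covered by exactly one triangle. */
int pixels_not_hit_once(int width, int height) {
    assert(width > 0 && height > 0);
    Pt v[4] = { { 2L * width, 2L * height }, { 0, 2L * height },
                { 2L * width, 0 }, { 0, 0 } };
    int idx[6] = { 0, 1, 2, 1, 2, 3 };
    int bad = 0;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            Pt p = { 2L * x + 1, 2L * y + 1 };   /* pixel center */
            int hits = 0;
            for (int t = 0; t < 2; t++)
                hits += covers(v[idx[3 * t]], v[idx[3 * t + 1]],
                               v[idx[3 * t + 2]], p);
            if (hits != 1) bad++;
        }
    }
    return bad;
}
```

On a square viewport the pixel centers along the diagonal lie exactly on the shared edge, so that is the case where the tie-break decides ownership. With a consistent rule the function returns 0, which is the result the thread expects from conforming hardware.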

mbentrup
08-10-2012, 11:24 AM
That should not be possible. All writes from any fragment shaders that execute outside of a triangle's boundaries should be discarded. What is the fragment shader you use to detect this?

If MSAA is enabled, you would get two fragments with each having one half of the samples masked. You could enable per-sample shading plus atomic counters to see if MSAA is active.
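A rough CPU sketch of that effect follows, assuming 4x multisampling on a square n x n viewport split along x + y = n. The sample offsets are an assumption made for illustration (real positions are implementation-dependent), and the names `upper_sample_mask` and `pixels_with_two_fragments` are hypothetical.

```c
#include <assert.h>

/* Assumed 4x sample offsets inside a pixel, in eighths of a pixel. */
static const int SAMPLE_X[4] = { 3, 7, 1, 5 };
static const int SAMPLE_Y[4] = { 1, 3, 5, 7 };

/* Bit i is set when sample i of pixel (x, y) lies in the upper-right
   triangle, i.e. beyond the diagonal x + y = n. */
unsigned upper_sample_mask(int x, int y, int n) {
    unsigned mask = 0;
    for (int i = 0; i < 4; i++) {
        int sum = 8 * (x + y) + SAMPLE_X[i] + SAMPLE_Y[i];
        if (sum > 8 * n) mask |= 1u << i;
    }
    return mask;
}

/* Pixels where each triangle covers some, but not all, samples. Without
   per-sample shading, each triangle generates one fragment (one shader
   invocation) for such a pixel. */
int pixels_with_two_fragments(int n) {
    assert(n > 0);
    int count = 0;
    for (int y = 0; y < n; y++) {
        for (int x = 0; x < n; x++) {
            unsigned mask = upper_sample_mask(x, y, n);
            if (mask != 0u && mask != 0xFu) count++;
        }
    }
    return count;
}
```

With these assumed offsets, exactly the n pixels along the diagonal receive two fragments, each with a complementary half of the sample mask. No sample is written twice, but a counter incremented once per invocation would still read 2 on those pixels.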

Alfonse Reinheart
08-10-2012, 02:17 PM
Yes, but that's both normal and to be expected.

AlexN
08-10-2012, 02:53 PM
Is the overlapping region a single pixel wide, or is it sometimes a couple pixels wide and blocky? If a single pixel, that sounds like a precision issue. If more than one pixel and blocky, it's likely that you're seeing an artifact from the GPU invoking your fragment shader in 2x2 (or larger) groups of pixels for efficiency. Normally this is not visible because the extra fragments are discarded, but if you make accesses to main memory then there can be side effects.

Either way, the easiest solution may be to change how you render your full screen quad. Try using a single triangle that is larger than your viewport and fully contains it. This is typically more efficient anyway as it avoids the wasted fragments along the seam between two triangles due to the pixel grouping mentioned above.

Edit: I take back the bit about accessing main memory from a helper fragment having side effects. Looks like that is not supposed to happen, though there could be a bug you are hitting if so. From http://www.opengl.org/registry/specs/ARB/shader_image_load_store.txt

(22) If implementations run fragment shaders for fragments that aren't covered by the primitive or fail early depth tests (e.g., "helper pixels"), how does that interact with stores and atomics? RESOLVED: Stores will have no effect. Atomics will also not update memory. The values returned by atomics are undefined.

Dan Bartlett
08-10-2012, 04:02 PM
Does having them wound the same direction have any effect?

eg. Instead of:
{0, 1, 2, 1, 2, 3}
using
{0, 1, 2, 1, 3, 2}.

AFAICT if either way is used only one polygon should produce a fragment along the common edge, so you'll probably need to wait for a driver fix. Triangles wound the same direction may be a more tested path though.
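The winding of each triangle can be confirmed from the sign of its area. This small sketch uses the x and y coordinates from the first post; `triangle_winding` is a hypothetical helper name.

```c
#include <assert.h>

/* x, y of the four vertices from the first post (NDC). */
static const int VX[4] = { 1, -1, 1, -1 };
static const int VY[4] = { 1, 1, -1, -1 };

/* Returns +1 for counter-clockwise, -1 for clockwise, 0 for degenerate. */
int triangle_winding(int i0, int i1, int i2) {
    assert(i0 >= 0 && i0 < 4 && i1 >= 0 && i1 < 4 && i2 >= 0 && i2 < 4);
    /* Twice the signed area of the triangle. */
    int area2 = (VX[i1] - VX[i0]) * (VY[i2] - VY[i0])
              - (VX[i2] - VX[i0]) * (VY[i1] - VY[i0]);
    return (area2 > 0) - (area2 < 0);
}
```

With the original indices, triangle (0, 1, 2) is counter-clockwise and (1, 2, 3) is clockwise; swapping to (1, 3, 2) makes both counter-clockwise. With face culling disabled this should not change which pixels are covered, but with GL_CULL_FACE enabled and the default front-face setting the clockwise triangle would be culled.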

mhagain
08-11-2012, 06:56 AM
Have you tried this with drawing your fullscreen quad as a single triangle strip using glDrawArrays?

Alfonse Reinheart
08-11-2012, 09:37 AM
Either way, the easiest solution may be to change how you render your full screen quad. Try using a single triangle that is larger than your viewport and fully contains it. This is typically more efficient anyway as it avoids the wasted fragments along the seam between two triangles due to the pixel grouping mentioned above.


Have you tried this with drawing your fullscreen quad as a single triangle strip using glDrawArrays?

This was suggested in the second post of the thread, and he tried it in the third.

mhagain
08-11-2012, 10:24 AM
This was suggested in the second post of the thread, and he tried it in the third.

Sigh.

Triangle strip, not triangle.

As well as potentially providing a solution, a strip (not a single triangle) can provide useful diagnostic info for the OP too, such as confirmation of the MSAA theory mentioned above.

All of this is coming from the perspective of trying to help the OP rather than score points off other posters.

Alfonse Reinheart
08-11-2012, 10:54 AM
I missed a word; sorry about that.

Though I don't see how it matters if it's a strip or two triangles. I seriously doubt the rasterizer would treat it any differently. It's not like it even sees anything but individual triangles. Multisampling would work the same way as well, regardless of whether it's a strip or individual triangles.

Indeed, I would be very concerned about the validity of the rendering pipeline if turning it into a strip had any effect at all.

jfslocum
08-14-2012, 12:55 PM
That should not be possible. All writes from any fragment shaders that execute outside of a triangle's boundaries should be discarded. What is the fragment shader you use to detect this?

Yeah, that's why I concluded that the triangles overlap, because the fragment shader should not have any effect outside of the triangle.

This is my [abridged] frag code:



#version 420
#include <shaderCommon.h> //defs for 'red' 'green' etc
#include <gtest/AtomicCounter.h> //defs for binding locations

layout(binding = AC_ATOMIC_BINDING, offset = 0) uniform atomic_uint buffer_index_counter;

layout(binding = AC_IMAGE_EXCHANGE_UNIT, r32ui) coherent uniform uimage2D ex_image;

uniform uint clear_value;
uniform uint viewport_height;
uniform uint viewport_width;

out vec4 out_color;

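void main()
{
    ivec2 pixel_coords = ivec2(int(gl_FragCoord.x),
                               int(gl_FragCoord.y));

    // Count every invocation of this shader.
    uint count = atomicCounterIncrement(buffer_index_counter);
    // Swap this fragment's x coordinate into the image; val is the previous texel value.
    uint val = imageAtomicExchange(ex_image,
                                   pixel_coords,
                                   uint(gl_FragCoord.x));

    // A previous value equal to our own x means an earlier invocation already wrote this pixel.
    if((val == uint(gl_FragCoord.x)) && (val != clear_value)) {
        out_color = white;
    }
    else {
        out_color = green;
    }
}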


Note that prior to the shader invocation, ex_image contains `clear_value` in every pixel. So if this shader runs once on a pixel, it should output green, but if it runs twice, it should output white.
I would expect the output to be pure green, as the fragment shader should only be run once per pixel. Indeed, on Windows, I do get pure green.
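The detection logic can be mimicked on the CPU with a C11 atomic exchange standing in for imageAtomicExchange. This is only a sketch of the test's logic for a single texel; `shade_once` and `color_of_nth_invocation` are hypothetical names, and nothing here models the GPU's scheduling.

```c
#include <assert.h>
#include <stdatomic.h>

enum { GREEN = 0, WHITE = 1 };

/* One simulated invocation: swap frag_x into the texel and inspect the
   previous value, as the fragment shader does. */
static int shade_once(atomic_uint *texel, unsigned frag_x,
                      unsigned clear_value) {
    unsigned val = atomic_exchange(texel, frag_x);
    return (val == frag_x && val != clear_value) ? WHITE : GREEN;
}

/* Color produced by the n-th invocation on a texel that starts out
   holding clear_value. */
int color_of_nth_invocation(unsigned frag_x, unsigned clear_value, int n) {
    assert(n >= 1);
    atomic_uint texel;
    atomic_init(&texel, clear_value);
    int color = GREEN;
    for (int i = 0; i < n; i++)
        color = shade_once(&texel, frag_x, clear_value);
    return color;
}
```

The first invocation sees the clear value and returns green; a second invocation on the same texel sees its own x and returns white. One caveat follows from the condition itself: in a column where x happens to equal clear_value, a repeat invocation would still come out green, so the clear value should be chosen outside the range of valid x coordinates.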

https://dl.dropbox.com/u/11552862/White-line.png

Strangely, the line appears to be anti-aliased (try zooming way in using an image-editing program).



If MSAA is enabled, you would get two fragments with each having one half of the samples masked. You could enable per-sample shading plus atomic counters to see if MSAA is active.

This is definitely a line (heh) of thought worth pursuing. I'm not familiar with MSAA though. How would the expected number of samples differ between MSAA and a buffer where the triangles simply overlap?

And why might GL be running in MSAA mode when I create a context with a non-MS buffer and never explicitly enable it?



Is the overlapping region a single pixel wide, or is it sometimes a couple pixels wide and blocky? If a single pixel, that sounds like a precision issue. If more than one pixel and blocky, it's likely that you're seeing an artifact from the GPU invoking your fragment shader in 2x2 (or larger) groups of pixels for efficiency. Normally this is not visible because the extra fragments are discarded, but if you make accesses to main memory then there can be side effects.

Either way, the easiest solution may be to change how you render your full screen quad. Try using a single triangle that is larger than your viewport and fully contains it. This is typically more efficient anyway as it avoids the wasted fragments along the seam between two triangles due to the pixel grouping mentioned above.

Edit: I take back the bit about accessing main memory from a helper fragment having side effects. Looks like that is not supposed to happen, though there could be a bug you are hitting if so. From http://www.opengl.org/registry/specs/ARB/shader_image_load_store.txt

(22) If implementations run fragment shaders for fragments that aren't covered by the primitive or fail early depth tests (e.g., "helper pixels"), how does that interact with stores and atomics? RESOLVED: Stores will have no effect. Atomics will also not update memory. The values returned by atomics are undefined.

This is certainly something I can check for. I could render a triangle with the texture as a render target, then render again with the texture bound as an image that I write to.


Does having them wound the same direction have any effect?

eg. Instead of:
{0, 1, 2, 1, 2, 3}
using
{0, 1, 2, 1, 3, 2}.

AFAICT if either way is used only one polygon should produce a fragment along the common edge, so you'll probably need to wait for a driver fix. Triangles wound the same direction may be a more tested path though.


No effect.


Have you tried this with drawing your fullscreen quad as a single triangle strip using glDrawArrays?

I'm planning on preparing a self-contained test program to submit with a bug report. I'll give this a try tomorrow while I'm doing that.


Thanks for all the help and suggestions, guys.

Alfonse Reinheart
08-14-2012, 07:50 PM
And why might GL be running in MSAA mode when I create a context with a non-MS buffer and never explicitly enable it?

That's an easy question to answer: your driver settings panel has switches that can force applications to use anti-aliasing for the main framebuffer.

The way to prevent this is to create render targets manually yourself. But you should be able to detect it by calling `glGetIntegerv(GL_SAMPLE_BUFFERS, &value);` when the default framebuffer is bound to GL_DRAW_FRAMEBUFFER. The returned value should be 0 if multisampling is not available.

jfslocum
08-15-2012, 12:26 PM
`glGetIntegerv(GL_SAMPLE_BUFFERS);` returns 1.
`glIsEnabled(GL_MULTISAMPLE);` returns false.

Alfonse Reinheart
08-15-2012, 01:35 PM
So what happens if you create your own renderbuffers and render to those?