ARB_separate_shader_objects NVIDIA crash



Leith Bade
02-06-2011, 03:24 PM
OK, so I implemented the pipeline objects using a shared uniform block...

But now I get a segmentation fault!

It is triggered when I use glUseProgramStages, but the segmentation fault itself occurs at program exit, inside the NVIDIA OpenGL implementation, as its shared library gets unloaded.

Here is my stack trace:
#0 0x7ffff28b03e9 ??() (/usr/lib/libnvidia-glcore.so.270.18:??)
#1 0x7ffff28998eb ??() (/usr/lib/libnvidia-glcore.so.270.18:??)
#2 0x7ffff28af85f ??() (/usr/lib/libnvidia-glcore.so.270.18:??)
#3 0x7ffff294a738 ??() (/usr/lib/libnvidia-glcore.so.270.18:??)
#4 0x7ffff294a873 ??() (/usr/lib/libnvidia-glcore.so.270.18:??)
#5 0x7ffff28b08ea ??() (/usr/lib/libnvidia-glcore.so.270.18:??)
#6 0x7ffff2906ab1 ??() (/usr/lib/libnvidia-glcore.so.270.18:??)
#7 0x7ffff2aa7a30 ??() (/usr/lib/libnvidia-glcore.so.270.18:??)
#8 0x7ffff2a8937d ??() (/usr/lib/libnvidia-glcore.so.270.18:??)
#9 0x7ffff2a89231 ??() (/usr/lib/libnvidia-glcore.so.270.18:??)
#10 0x7ffff524ad35 ??() (/usr/lib/libGL.so.1:??)
#11 0x7ffff7dedcac _dl_fini() (dl-fini.c:248)
#12 0x7ffff4271c12 __run_exit_handlers(status=0) (exit.c:78)
#13 ( *__GI_exit(status=0) (exit.c:100)
#14 0x7ffff4257ac4 __libc_start_main(main=<value optimized out>, argc=<value optimized out>, ubp_av=<value optimized out>, init=<value optimized out>, fini=<value optimized out>, rtld_fini=<value optimized out>, stack_end=0x7fffffffe8f8) (libc-start.c:252)
#15 0x404ee9 _start() (../sysdeps/x86_64/elf/start.S:113)

As you can see I am running the latest BETA NVIDIA driver (270.18) on Ubuntu 9.04 x64, with a GeForce 9400 GT.

I am going to try switching to the latest stable driver (260.19.36) and see if it stays...

Has anyone else run into a bug or a crash with ARB_separate_shader_objects on NVIDIA (or ATI)?

Leith Bade
02-06-2011, 03:39 PM
Nope, still get the bug on 260.19.36:
#0 0x7ffff28c9eb9 ??() (/usr/lib/libnvidia-glcore.so.260.19.36:??)
#1 0x7ffff28b539b ??() (/usr/lib/libnvidia-glcore.so.260.19.36:??)
#2 0x7ffff28c920f ??() (/usr/lib/libnvidia-glcore.so.260.19.36:??)
#3 0x7ffff2963c88 ??() (/usr/lib/libnvidia-glcore.so.260.19.36:??)
#4 0x7ffff2963dc3 ??() (/usr/lib/libnvidia-glcore.so.260.19.36:??)
#5 0x7ffff28c9440 ??() (/usr/lib/libnvidia-glcore.so.260.19.36:??)
#6 0x7ffff291fbb1 ??() (/usr/lib/libnvidia-glcore.so.260.19.36:??)
#7 0x7ffff2abf510 ??() (/usr/lib/libnvidia-glcore.so.260.19.36:??)
#8 0x7ffff2aa128d ??() (/usr/lib/libnvidia-glcore.so.260.19.36:??)
#9 0x7ffff2aa1141 ??() (/usr/lib/libnvidia-glcore.so.260.19.36:??)
#10 0x7ffff524dcd5 ??() (/usr/lib/libGL.so.1:??)
#11 0x7ffff7dedcac _dl_fini() (dl-fini.c:248)
#12 0x7ffff4277c12 __run_exit_handlers(status=0) (exit.c:78)
#13 ( *__GI_exit(status=0) (exit.c:100)
#14 0x7ffff425dac4 __libc_start_main(main=<value optimized out>, argc=<value optimized out>, ubp_av=<value optimized out>, init=<value optimized out>, fini=<value optimized out>, rtld_fini=<value optimized out>, stack_end=0x7fffffffe8f8) (libc-start.c:252)
#15 0x404ee9 _start() (../sysdeps/x86_64/elf/start.S:113)

Where do we go to email NVIDIA about bugs in Linux OpenGL drivers (both stable and beta ones)?

Leith Bade
02-06-2011, 04:01 PM
OK, some other weirdness I have noticed with the output (before it crashes).

It seems that my output color uniform in the fragment shader is not set; it is outputting a 100% transparent triangle rather than the blue one it should.

If I don't glUseProgramStages my fragment program I get a black triangle so my program seems to compile OK.

Leith Bade
02-06-2011, 04:21 PM
I have done some tests:
I tried adding #extension ARB_separate_shader_objects : enable to my shaders but it did not change anything.

The NVIDIA README, under "Chapter 9. Known Issues", heading "libGL DSO finalizer and pthreads", talks about issues with multithreaded OpenGL (which I do not do) and mentions the environment variable "__GL_NO_DSO_FINALIZER". If I set this variable to "1" the issue goes away, which confirms that it is some sort of bug in NVIDIA's resource freeing code.
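
For anyone who wants the workaround applied without editing their shell setup, here is a minimal sketch of a launcher helper in C. It sets the variable and then replaces the current process with the real program, so the variable is already in the environment when libGL is loaded. The function names are made up for illustration, and the sketch assumes a POSIX system; only the variable name and the value "1" come from the README.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() under strict -std modes */
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: puts __GL_NO_DSO_FINALIZER=1 into the current
 * environment. Returns 0 on success and -1 on failure, as setenv() does. */
int set_no_dso_finalizer(void)
{
    return setenv("__GL_NO_DSO_FINALIZER", "1", 1);
}

/* Hypothetical helper: sets the variable, then replaces this process with
 * the program named by argv[0] (looked up on PATH). It only returns, with
 * -1, if the arguments are invalid or the exec fails. */
int exec_without_dso_finalizer(char *const argv[])
{
    if (argv == NULL || argv[0] == NULL)
        return -1;
    if (set_no_dso_finalizer() != 0)
        return -1;
    execvp(argv[0], argv);
    return -1; /* reached only when execvp() failed */
}
```

A small main() that passes &argv[1] to exec_without_dso_finalizer would turn this into a wrapper you can put in front of any OpenGL program. Note that this only suppresses the finalizer; it does not explain why the finalizer crashes.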

Leith Bade
02-06-2011, 04:59 PM
I have sent an email to linux-bugs@nvidia.com, so hopefully they will get back to me today or tomorrow.

Leith Bade
02-06-2011, 05:10 PM
I figured out what was causing problems with my color uniform: I was using glUniform instead of the new glProgramUniform.

The crash remains, but everything works until I exit the program. That is OK for now, since I can disable the crash with the environment variable mentioned above.

ZbuffeR
02-06-2011, 05:15 PM
Thanks for sharing your findings.

Leith Bade
12-06-2011, 04:18 PM
I am shocked to discover that THIS BUG HAS STILL NOT BEEN FIXED AFTER 10 MONTHS!!!

For reference, I am now using Ubuntu 10.10 with the current stable driver, version 290.10.

I have also now got a GeForce 480

Do NVIDIA even use a bug tracker for this stuff?

Alfonse Reinheart
12-06-2011, 05:22 PM
Did you submit a bug report on it, with a reproducible test case?

Leith Bade
12-06-2011, 06:25 PM
Yeah I did give them source code, shaders, crash logs, coredump of the crash etc.

In March I got this reply:
"Thanks for reporting this. The problem has been identified and fixed; however, the fix will not make it into the 270.xx driver series, but the next series after that. In the meantime, please continue to use __GL_NO_DSO_FINALIZER as a workaround."

I had assumed at the time it would be fixed in the 275 series and have not had a chance to check it since. Having looked through the changelogs for 270 through 290, I see no mention of this bug, yet plenty of other bugs were fixed.

If they have already got a patch to fix the bug why have they not merged it into their public drivers?

Leith Bade
12-08-2011, 07:24 PM
After talking with NVIDIA it turned out to be a bug in my code... I memcpy'd past the end of a mapped vertex buffer.

Sorry to NVIDIA for making it look like they didn't fix it. However, it would be nice to get more feedback from OpenGL for these sorts of easy-to-make mistakes.

I suggested to the NVIDIA developer adding a check for buffer overflows with the glMapBuffer calls when a debug context is enabled. The driver could then report the problem via GL_ARB_debug_output.
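
Until a driver offers that kind of check, the same guard can be done on the application side. Below is a minimal sketch in C of a bounds-checked copy. The name checked_copy and its parameters are hypothetical; in real code dst would be the pointer returned by glMapBuffer and dst_size the size the buffer was allocated with (for example, the size passed to glBufferData), which the application has to remember itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies n bytes from src into a destination region
 * of dst_size bytes, starting at byte offset `offset`.
 * Returns 1 on success. Returns 0, without writing anything, if a pointer
 * is NULL or the copy would run past the end of the region. */
int checked_copy(void *dst, size_t dst_size, size_t offset,
                 const void *src, size_t n)
{
    if (dst == NULL || src == NULL)
        return 0;
    /* Compare against dst_size - offset rather than computing offset + n,
     * so the check itself cannot overflow. */
    if (offset > dst_size || n > dst_size - offset)
        return 0;
    memcpy((unsigned char *)dst + offset, src, n);
    return 1;
}
```

Replacing a raw memcpy into the mapped pointer with a call like this turns an overrun into a return value that can be logged or asserted on, instead of silent corruption that only shows up when the driver cleans up at exit.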

Dark Photon
12-08-2011, 07:46 PM
"After talking with NVIDIA it turned out to be a bug in my code... I memcpy'd past the end of a mapped vertex buffer. ...

I suggested to the NVIDIA developer adding a check for buffer overflows with the glMapBuffer calls when a debug context is enabled. The driver could then report the problem via GL_ARB_debug_output."

Have you run valgrind on your code? What does it say for this case?