NVIDIA releases OpenGL 4.2 drivers



Khronos_webmaster
08-08-2011, 09:22 AM
NVIDIA is proud to announce the immediate availability of OpenGL 4.2 drivers for Windows and Linux.

You will need any one of the following Fermi-based GPUs to get access to the full OpenGL 4.2 and GLSL 4.20 functionality:
Quadro Plex 7000, Quadro 6000, Quadro 5000, Quadro 4000, Quadro 2000, Quadro 600
GeForce 500 series (GTX 590, GTX 580, GTX 570, GTX 560 Ti, GTX 560, GTX 550 Ti, GT 545, GT 530, GT 520)
GeForce 400 series (GTX 480, GTX 470, GTX 465, GTX 460 SE v2, GTX 460 SE, GTX 460, GTS 450, GT 440, GT 430, GT 420, 405)
For OpenGL 2 capable hardware, these new extensions are provided:
ARB_compressed_texture_pixel_storage (also in core OpenGL 4.2)
ARB_conservative_depth (also in core OpenGL 4.2)
ARB_internalformat_query (also in core OpenGL 4.2)
ARB_map_buffer_alignment (also in core OpenGL 4.2)
ARB_shading_language_420pack (also in core OpenGL 4.2)
ARB_texture_storage (also in core OpenGL 4.2)
For OpenGL 3 capable hardware, these new extensions are provided:
ARB_base_instance (also in core OpenGL 4.2)
ARB_shading_language_packing (also in core OpenGL 4.2)
ARB_transform_feedback_instanced (also in core OpenGL 4.2)
For OpenGL 4 capable hardware, these new extensions are provided:
ARB_shader_atomic_counters (also in core OpenGL 4.2)
ARB_shader_image_load_store (also in core OpenGL 4.2)
ARB_texture_compression_bptc (also in core OpenGL 4.2)
The drivers and extension documentation can be downloaded from http://developer.nvidia.com/object/opengl_driver.html

mbien
08-08-2011, 10:51 AM
does this driver support OpenCL 1.1?
(edit: yes it does)

Alfonse Reinheart
08-08-2011, 07:48 PM
The site seems to be down. Is this a server hitch or something else?

Piers Daniell
08-08-2011, 08:54 PM
It appears to be working for me now. Is it okay for you now?

Alfonse Reinheart
08-08-2011, 09:22 PM
Yeah, it's back up now. Thanks!

przemo_li
08-08-2011, 11:26 PM
GJ Nvidia!
Unfortunately you released mobile 400 too late for me :(

But keep going!

KRONOS
08-09-2011, 11:11 AM
Well, my program stopped working. I found out that the problem is in this call:

glVertexAttribPointer(0, 2, GL_FLOAT, false, 24, (void*)0);//GL_INVALID_OPERATION
And I checked to make sure I have a buffer object bound to GL_ARRAY_BUFFER before making the call.
I'm using a core 3.2 profile.

Piers Daniell
08-09-2011, 02:02 PM
Make sure you have a valid VAO bound. For core profiles the spec requires a VAO to be bound, otherwise it will result in INVALID_OPERATION when glVertexAttribPointer or other vertex functions are called. We used to not check for this, but now we do, to be more spec compliant. This doesn't apply to the compatibility profile.

KRONOS
08-09-2011, 02:20 PM
You're right, the spec requires it. The driver never complained before... :confused:

Chris Lux
08-10-2011, 02:06 AM
when using the binding layout qualifier for images as follows:


#version 420 core

layout(rgba16ui, binding = 0) writeonly uniform uimage2D _stuff;
...

the following glsl error is generated:


error C1315: can't apply layout to global variable '_stuff'


This is legal according to the GLSL 4.20 spec, but in the ARB_shader_image_load_store extension the binding is not listed.

Groovounet
08-10-2011, 02:43 AM
> You're right, the spec requires it. The driver never complained before... :confused:

It was a known and famous NVIDIA driver bug. It's good to see it fixed, even though VAOs are nothing but annoying most of the time.

A workaround and quick fix for your application is to create and bind a VAO at the beginning (right after the context creation); you can then forget about it.

Not the ultimate fix, but it gets your application running again with no other change.
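
The workaround described above amounts to two calls made once after context creation. A minimal sketch, assuming the application reaches its GL entry points through function pointers: the `VaoEntryPoints` struct and the `bind_default_vao` helper are hypothetical names used for illustration, while the two members stand for the real `glGenVertexArrays` and `glBindVertexArray` functions.

```cpp
#include <cassert>
#include <cstdint>

// Minimal stand-ins for the GL typedefs so the sketch compiles without GL headers.
using GLuint  = std::uint32_t;
using GLsizei = std::int32_t;

// Hypothetical table of entry points; in a real application these would be
// filled in by the GL loader with glGenVertexArrays and glBindVertexArray.
struct VaoEntryPoints {
    void (*GenVertexArrays)(GLsizei n, GLuint* arrays);
    void (*BindVertexArray)(GLuint array);
};

// Creates one VAO and leaves it bound. Intended to be called once, right
// after creating a core profile context.
inline GLuint bind_default_vao(const VaoEntryPoints& gl) {
    assert(gl.GenVertexArrays && gl.BindVertexArray);
    GLuint vao = 0;
    gl.GenVertexArrays(1, &vao);
    gl.BindVertexArray(vao);
    return vao;
}
```

With this in place, later calls such as glVertexAttribPointer have a VAO to record their state into, which is what the core profile requires.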

Groovounet
08-10-2011, 03:05 AM
A little bit of fun: the following image shows the Fermi rasterizer pattern using the atomic counter:
http://img714.imageshack.us/img714/3051/04152.png

Chris Lux
08-10-2011, 04:43 AM
I have the following problem with immutable textures: I create a mipmapped 2D texture using glTexStorage2D, requesting the full number of mipmap levels. After this I upload the data of the individual mipmap levels using glTexSubImage2D. The problem is that the resulting texture object does not have any mipmaps. When I replace the glTexStorage2D call with calls to glTexImage2D for all mipmap levels, the texture is correctly initialized with all mipmaps filled.

NOT working:


glTexStorage2D(object_target(),
               init_mip_levels,
               util::gl_internal_format(in_desc._format),
               in_desc._size.x, in_desc._size.y);
// no error reported
for (unsigned i = 0; i < init_mip_levels; ++i) {
    math::vec2ui lev_size = util::mip_level_dimensions(in_desc._size, i);
    const void* init_lev_data = in_initial_mip_level_data[i];
    glTexSubImage2D(object_target(),
                    i,
                    0, 0,
                    lev_size.x, lev_size.y,
                    gl_base_format,
                    gl_base_type,
                    init_lev_data);
}
// still no errors reported
// still no errors reported


working:


for (unsigned i = 0; i < init_mip_levels; ++i) {
    math::vec2ui lev_size = util::mip_level_dimensions(in_desc._size, i);
    glTexImage2D(object_target(),
                 i,
                 util::gl_internal_format(in_desc._format),
                 lev_size.x, lev_size.y,
                 0,
                 gl_base_format,
                 gl_base_type,
                 0);
}

// no error reported
for (unsigned i = 0; i < init_mip_levels; ++i) {
    math::vec2ui lev_size = util::mip_level_dimensions(in_desc._size, i);
    const void* init_lev_data = in_initial_mip_level_data[i];
    glTexSubImage2D(object_target(),
                    i,
                    0, 0,
                    lev_size.x, lev_size.y,
                    gl_base_format,
                    gl_base_type,
                    init_lev_data);
}
// still no errors reported
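
Both variants rely on a `util::mip_level_dimensions` helper that is not shown in the post. A sketch of what such a helper might compute, assuming the usual OpenGL rule that each level halves every dimension, rounds down, and clamps to 1. The `Size2` struct and the function body are illustrative only, not the poster's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the math::vec2ui type used in the post.
struct Size2 {
    std::uint32_t x;
    std::uint32_t y;
};

// Dimensions of mip level `level` for a texture whose level 0 is `base`:
// max(1, floor(base / 2^level)) on each axis.
inline Size2 mip_level_dimensions(Size2 base, unsigned level) {
    assert(level < 32);  // shifting a 32-bit value by 32 or more is undefined
    return { std::max<std::uint32_t>(1u, base.x >> level),
             std::max<std::uint32_t>(1u, base.y >> level) };
}
```

For a 256 x 64 base level this yields 32 x 8 at level 3 and 2 x 1 at level 7, since the shorter axis stops shrinking once it reaches 1.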

mfort
08-10-2011, 09:16 AM
> I have the following problem with immutable ...


Well, it works in my code. I've implemented it today.

The only roadblock was the incompatible internal format. It requires a sized format, such as GL_RGBA8 instead of GL_RGBA. The documentation states that correctly. (My fault.)

I am going to test the performance changes ....
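
As a sketch of the point about sized formats: a small lookup that maps a few unsized base formats to sized internal formats that glTexStorage2D accepts. The enum values are the standard OpenGL constants, redefined locally so the snippet compiles without GL headers. The `to_sized_format` and `require_sized_format` helpers are hypothetical, and picking the 8-bit normalized variants is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

using GLenum = std::uint32_t;

// Standard OpenGL enum values, redefined here only to keep the sketch
// self-contained. Do not combine with real GL headers, which define these
// names as macros.
constexpr GLenum GL_RED  = 0x1903, GL_RG  = 0x8227, GL_RGB  = 0x1907, GL_RGBA  = 0x1908;
constexpr GLenum GL_R8   = 0x8229, GL_RG8 = 0x822B, GL_RGB8 = 0x8051, GL_RGBA8 = 0x8058;

// Returns a sized internal format for an unsized base format. Formats that
// are already sized pass through; anything this sketch does not know maps to 0.
inline GLenum to_sized_format(GLenum format) {
    switch (format) {
        case GL_RED:  return GL_R8;
        case GL_RG:   return GL_RG8;
        case GL_RGB:  return GL_RGB8;
        case GL_RGBA: return GL_RGBA8;
        case GL_R8: case GL_RG8: case GL_RGB8: case GL_RGBA8: return format;
        default:      return 0;
    }
}

// Variant for call sites that must never hand an unknown format to the driver.
inline GLenum require_sized_format(GLenum format) {
    const GLenum sized = to_sized_format(format);
    assert(sized != 0 && "format not covered by this sketch");
    return sized;
}
```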

Chris Lux
08-10-2011, 09:38 AM
Could you check that you really see trilinear or anisotropic filtering on your tests? I always see the base level as if I had only a single level texture in every test I did using sampler objects and plain texture parameters.

Groovounet
08-10-2011, 09:48 AM
Works well for me with the following code, but I haven't used a sampler object with it yet:


gli::texture2D Image = gli::load(TEXTURE_DIFFUSE);

glPixelStorei(GL_UNPACK_ALIGNMENT, 1);

glGenTextures(1, &TextureName);
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, TextureName);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_R, GL_RED);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_G, GL_GREEN);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_B, GL_BLUE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_A, GL_ALPHA);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_BASE_LEVEL, 0);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAX_LEVEL, 1000);
glTexStorage2D(GL_TEXTURE_2D, GLint(Image.levels()), GL_RGBA8, GLsizei(Image[0].dimensions().x), GLsizei(Image[0].dimensions().y));

for(std::size_t Level = 0; Level < Image.levels(); ++Level)
{
    glTexSubImage2D(
        GL_TEXTURE_2D,
        GLint(Level),
        0, 0,
        GLsizei(Image[Level].dimensions().x),
        GLsizei(Image[Level].dimensions().y),
        GL_BGRA, GL_UNSIGNED_BYTE,
        Image[Level].data());
}

glPixelStorei(GL_UNPACK_ALIGNMENT, 4);

Chris Lux
08-10-2011, 10:24 AM
Ok, I triple checked, with and without sampler objects, using essentially the code by Groovounet. I still do not get trilinear filtering (checked with colored mip levels); I can clearly see massive texture aliasing and only see the data from level 0. When switching to the glTexImage loop of my original post everything works as expected.

I am on Windows 7 x64 using the 280.28 driver. The context is a 4.2 core profile context (also checked with compatibility).

Alfonse Reinheart
08-10-2011, 12:29 PM
> It requires sized format.

It does? Oh thank God. It's about time they shoved those unsized formats out the door.

Chris Lux
08-10-2011, 01:28 PM
Ok, Groovounet my friend ;), you have the exact same problem... you did not test what i was describing:

1. your dds image did not contain mipmaps!
2. you didn't even try to enable trilinear filtering!

you can find attached a modified sample and sample image which clearly shows the problem i described.

the fun part is this:


#if 1
glTexStorage2D(GL_TEXTURE_2D, GLint(Image.levels()), GL_RGBA8, GLsizei(Image[0].dimensions().x), GLsizei(Image[0].dimensions().y));
#else
for(std::size_t Level = 0; Level < Image.levels(); ++Level)
{
    glTexImage2D(
        GL_TEXTURE_2D,
        GLint(Level),
        GL_RGBA8,
        GLsizei(Image[Level].dimensions().x),
        GLsizei(Image[Level].dimensions().y),
        0,
        GL_BGRA, GL_UNSIGNED_BYTE,
        0);
}
#endif

Groovounet
08-10-2011, 02:20 PM
LOL, I made a couple of tests and I was about to write that I had the same problem but you beat me to it! :p

pbrown
08-10-2011, 02:54 PM
> when using the binding layout qualifier for images as follows:
>
> #version 420 core
>
> layout(rgba16ui, binding = 0) writeonly uniform uimage2D _stuff;
> ...
>
> the following glsl error is generated:
>
> error C1315: can't apply layout to global variable '_stuff'
>
> This is legal according to the GLSL 4.20 spec, but in the ARB_shader_image_load_store extension the binding is not listed.

It is legal. The "binding" part isn't covered in the ARB_shader_image_load_store extension, because the binding feature for samplers and images is in ARB_shading_language_420pack.

I'll be investigating further tomorrow, but it appears that the problem involves a combination of qualifiers. A test shader I wrote based on your example above, fails with the declaration as above, but compiles successfully without the "writeonly" qualifier.

mfort
08-11-2011, 12:19 AM
> Could you check that you really see trilinear or anisotropic filtering on your tests?
I've visually verified that mipmapping and anisotropic filtering work fine with glTexStorage2D. But there is a difference between your code and mine. I load only the base level and generate all the others using glGenerateMipmap. I do not use sampler objects.

Chris Lux
08-11-2011, 12:52 AM
> I've visually verified that mipmaping and anisotropic filtering works fine with glTexStorage2D. But there is a difference between your and my code. I load only the base level and generate all the others using glGenerateMipmap. I do not use sampler objects.
Yes. I can confirm that glGenerateMipmap works!

Further this allows for the following workaround to be able to load custom mipmap levels:



glTexStorage2D(object_target(),
               init_mip_levels,
               util::gl_internal_format(in_desc._format),
               in_desc._size.x, in_desc._size.y);
// no error reported
for (unsigned i = 0; i < init_mip_levels; ++i) {
    math::vec2ui lev_size = util::mip_level_dimensions(in_desc._size, i);
    const void* init_lev_data = in_initial_mip_level_data[i];
    glTexSubImage2D(object_target(),
                    i,
                    0, 0,
                    lev_size.x, lev_size.y,
                    gl_base_format,
                    gl_base_type,
                    init_lev_data);
    // WORKAROUND /////////////////////////////////////
    if (i == 0) glGenerateMipmap(object_target());
}
// still no errors reported


Ugly, yes, but a workaround to play with immutable textures until we get a bug fix. Which, I am sorry to say, lately takes very long, but let's hope for the final OpenGL 4.2 driver!

Edit: Until we require mipmapped integer textures... Then this workaround can not work, as glGenerateMipmap does not work on integer textures. So back to the old way to init our textures for now :p.

@nvidia: which release is targeted for the OpenGL 4.2 final implementation? r285?

-chris

mfort
08-11-2011, 01:43 AM
I can also report that using glTexStorage2D has a very negative impact on performance.

It looks like the glTexStorage2D does not allocate the mipmap levels at all. Later when calling glGenerateMipmap the driver realizes that there are no mipmap levels and allocates them by doing the same patch work as the older drivers. Essentially loading the texture back to memory, reallocating it with mipmaps and loading it back again. This round trip takes about 40ms for some large textures.

@nvidia: could you confirm this bug?

Groovounet
08-11-2011, 02:10 AM
@mfort It's maybe really early to expect performance.

pbrown
08-12-2011, 08:52 AM
@Chris Lux: I've dug into our GLSL compiler and root caused your issue related to the "binding" layout qualifier and have a fix undergoing testing. As I mentioned above, you may be able to get further with current drivers if you omit the "writeonly" qualifier.

I expect that one of my colleagues will be looking at the TexStorage* issue described here soon.

Chris Lux
08-13-2011, 02:44 AM
> @Chris Lux: I've dug into our GLSL compiler and root caused your issue related to the "binding" layout qualifier and have a fix undergoing testing. As I mentioned above, you may be able to get further with current drivers if you omit the "writeonly" qualifier.
great to hear...


> I expect that one of my colleagues will be looking at the TexStorage* issues described here soon.
fixed that ;). there are two issues that need to be addressed, the mipmap access problem and the performance problem. the texture storage should immediately be allocated at the glTexStorageXD call to be an improvement over the old way to allocate texture.

will there be an updated 4.2 dev driver or will we have to wait for the public release?

Piers Daniell
08-15-2011, 10:15 AM
We'll fix these issues asap. There will be a new developer driver released as soon as these issues are fixed.

Piers Daniell
08-24-2011, 11:34 AM
Updated OpenGL 4.2 drivers can be found at the usual location:
http://developer.nvidia.com/opengl-driver

The new 280.36 driver addresses at least the following issues reported in this thread:
1) glTexStorage has been fixed to correctly create all mipmap levels.
2) Storage is allocated upfront when glTexStorage is called to address a performance issue that Chris reported.
3) The problem with mixing the binding layout qualifier and writeonly with images has been fixed so the shader text "layout(rgba16ui, binding = 0) writeonly uniform uimage2D _stuff;" now compiles correctly.

Chris Lux
08-25-2011, 01:28 AM
Thanks for the update Piers!

In my tests the bugs do not occur anymore ;). Great work with the quick fixes...

mfort
08-30-2011, 07:42 AM
Thanks Piers. Driver 280.36 is an improvement.
But there is still some weird performance issue. When I use glTexStorage2D with no mipmaps I get a decent performance improvement over R275, mostly because the first glTexSubImage after creating the texture does not suffer any slowdown (more here: http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Main=58030&Number=301023).

But when I enable the mipmaps, creating textures with glTexStorage2D is about 10 times slower than calling glTexImage2D(..., NULL) for each level. The slowdown in previous drivers was caused by glGenerateMipmap. In this driver any simple call to OGL stalls a bit (5-7ms).

Alfonse Reinheart
08-30-2011, 10:24 AM
What do you mean by "enable the mipmaps"? The way glTexStorage2D should be used is that it is pretty much the first thing you call on a texture object.

mfort
08-30-2011, 12:09 PM
> What do you mean by "enable the mipmaps"? The way glTexStorage2D should be used is that it is pretty much the first thing you call on a texture object.
I "enabled" it in my app. In the end it means I requested allocation of a certain number of mipmap levels in OGL and later used mipmapped texture sampling. So nothing special actually.

Chris Lux
08-30-2011, 01:07 PM
I still have issues with the binding layout specifier. For example a simple shader:



#version 420 core

in per_vertex {
    vec3 tex_coord;
} v_in;

// attribute layout definitions ///////////////////////////////////////////////////////////////////
layout(location = 0, index = 0) out vec4 out_color;

layout(binding = 0) uniform sampler2D tex_image;
layout(binding = 1) uniform sampler3D tex_volume;

void main()
{
    //vec4 c = texture(tex_image, v_in.tex_coord.xy);
    vec4 c = texture(tex_volume, v_in.tex_coord);

    out_color = c;
}


On the client side I just bind the textures with the corresponding sampler objects to units 0 and 1, without setting the corresponding uniform values to 0 and 1. When using the 2D sampler in the shader I get results, but when using the 3D texture, and thus eliminating the 2D sampler in the link process, I get no results.

Alfonse Reinheart
08-30-2011, 04:24 PM
BTW, you should know that if a fragment shader has only one user-defined output variable, it will automatically be bound to location=0, index=0. So there's no need to explicitly state it.

Chris Lux
08-31-2011, 12:47 AM
I know that, but i like it explicit.

Piers Daniell
09-01-2011, 09:06 AM
@mfort
> But when I enable the mipmaps then creating textures with glTexStorage2D is about 10 times slower then calling glTexImage2D(..., NULL) for each level.

With the latest 280.36 driver the glTexStorage2D will allocate the hardware surfaces upfront instead of deferring it to the glTexSubImage2D calls. The older style glTexImage2D(,,NULL) calls won't create the hardware surface.

Overall, calling glTexStorage2D and then glTexSubImage2D for each level should give about the same performance as glTexImage2D(,,NULL) followed by glTexSubImage2D for each level. Both should be faster than using glGenerateMipmap or glTexParameter(GL_GENERATE_MIPMAP, TRUE). The advantage of glTexStorage2D is that the surface allocation happens immediately and the texture is immutable (except for image data).

If this is not the performance you're seeing in your app, could you paste some code/pseudo-code to show where performance is falling short? Thanks.

Piers Daniell
09-01-2011, 09:07 AM
@Chris Lux
I'll investigate this bug and respond soon.

Chris Lux
09-01-2011, 11:28 AM
> @Chris Lux
> I'll investigate this bug and respond soon.

Thanks for your efforts, fortunately it was a problem on my end. Everything now seems to work as expected...

-chris

Chris Lux
09-01-2011, 12:30 PM
> With the latest 280.36 driver the glTexStorage2D will allocate the hardware surfaces upfront instead of deferring it to the glTexSubImage2D calls. The older style glTexImage2D(,,NULL) calls won't create the hardware surface.
I am also seeing some strange performance hits when using TexStorage. In the following log example i am generating two volume textures and one 1D texture:
- using TexStorage3D to allocate the texture image including mipmaps
- then i upload the first level using TexSubImage3D
- after that i use GenerateMipmap to generate the missing mipmaps

As you can see the first volume is completed very fast. Then the second takes a huge amount of time for TexStorage3D. What is _most_ interesting is that TexStorage+TexSubImage on the 1D texture after the two volume textures takes also extremely long...

This was tested on Windows 7 x64, GeForce GTX 480 1.5GiB, 280.36. The times were taken using QueryPerformanceCounter.



volume_data::volume_data(): loading raw volume...
> allocating texture storage (dimensions: (501 401 576), format: R_8, mip-level: 10, size : 110.358MiB)...
> allocating texture storage done. (elapsed time: 0.000s)
> uploading source mip-level 0...
> uploading source mip-level 0 done. (elapsed time: 0.059s)
> generating mip-levels 1 - 10 ...
> generating mip-levels 1 - 10 done. (elapsed time: 0.000s)
volume_data::volume_data(): loading raw volume done.

volume_data::volume_data(): generating pre-multiplied volume...
> generating color and alpha lookup tables...
> generating color and alpha lookup tables done. (elapsed time: 0.000s)
> starting pre-multiplication...
> pre-multiplication done. (elapsed time: 0.430s)
> allocating texture storage (dimensions: (501 401 576), format: RGBA_8, mip-level: 10, size : 441.433MiB)...
> allocating texture storage done. (elapsed time: 7.278s)
> uploading source mip-level 0 ...
> uploading source mip-level 0 done. (elapsed time: 3.884s)
> generating mip-levels 1 - 10 ...
> generating mip-levels 1 - 10 done. (elapsed time: 0.002s)
volume_data::volume_data(): generating pre-multiplied volume done.

volume_data::volume_data(): generating color map...
> generating color map texture data...
> generating color map texture data done. (elapsed time: 0.001s)
> allocating texture storage and uploading texture data (dimensions: 256, format: RGBA_8, mip-level: 1, size : 1.000KiB)...
> allocating texture storage and uploading texture data done. (elapsed time: 10.247s)
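
The sizes in this log can be cross-checked with a little arithmetic. The sketch below uses hypothetical helper names to compute the level count of a full mip chain and the byte size of a single level for a 501 x 401 x 576 volume. The numbers suggest that the reported 110.358 MiB and 441.433 MiB are the sizes of level 0 alone at 1 and 4 bytes per texel; that is an inference from the figures, not something stated in the post, and the calculation ignores any padding the driver may add.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper type for a 3D texture extent.
struct Size3 {
    std::uint64_t x;
    std::uint64_t y;
    std::uint64_t z;
};

// Levels in a full mip chain: floor(log2(largest dimension)) + 1.
inline unsigned full_mip_level_count(Size3 s) {
    std::uint64_t m = std::max({s.x, s.y, s.z});
    unsigned levels = 1;
    while (m > 1) { m >>= 1; ++levels; }
    return levels;
}

// Bytes occupied by one mip level, with each axis halved per level and
// clamped to 1. Driver padding and alignment are not modelled.
inline std::uint64_t level_bytes(Size3 base, unsigned level, std::uint64_t bytes_per_texel) {
    assert(level < 64);
    const std::uint64_t w = std::max<std::uint64_t>(1, base.x >> level);
    const std::uint64_t h = std::max<std::uint64_t>(1, base.y >> level);
    const std::uint64_t d = std::max<std::uint64_t>(1, base.z >> level);
    return w * h * d * bytes_per_texel;
}

inline double to_mib(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

For example, `full_mip_level_count({501, 401, 576})` gives 10, matching the level count in the log, and `to_mib(level_bytes({501, 401, 576}, 0, 1))` comes out at roughly 110.358.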

mfort
09-02-2011, 04:53 AM
@Piers - I've also done some investigation into glTexStorage. The situation is more complicated than I thought. I was able to reproduce this performance problem in my test app. The key thing that hurts the performance is binding the texture object between glTexStorage and glTexSubImage.

This works fast:


glTexStorage2D(GL_TEXTURE_2D, numLevels, GL_RGBA8, textureWidth, textureHeight);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, texturePBO);
glTexSubImage2D(target, 0, 0, 0, textureWidth, textureHeight,GL_BGRA_EXT, GL_UNSIGNED_BYTE, NULL);


This is slow:



glTexStorage2D(GL_TEXTURE_2D, numLevels, GL_RGBA8, textureWidth, textureHeight);

glBindTexture(target, textureObj); // This makes it slow

glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, texturePBO);
glTexSubImage2D(target, 0, 0, 0, textureWidth, textureHeight,GL_BGRA_EXT, GL_UNSIGNED_BYTE, NULL);


I'd like to note that there is no such slowdown when creating the texture by calling glTexImage2D(..., NULL) for each mipmap level.

I believe there are some sanity checks inside glBindTexture but due to some bug the driver does something that should not be done.

BTW, I measure the time from creating the texture up to drawing a test triangle. This way I am sure that all the asynchronous processing is really done. When I do not rebind the texture it takes 0.5ms. When I rebind it, it takes 7.5ms. My test hw is a GTX260 on WinXP x64. I get similar results on other computers with different NV cards. Driver 280.36.
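
Measurements like these only mean something if the timer stops after the queued work has really finished. A sketch of that pattern using `std::chrono`: the `time_ms` helper and its `wait_until_done` parameter are hypothetical names. In a GL application the second callable would typically be something like a call to `glFinish()` or, as described in this post, drawing with the texture and waiting for the result.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Runs `work`, then `wait_until_done`, and returns the elapsed wall-clock
// time in milliseconds. The second callable is where the caller forces any
// deferred or asynchronous processing to complete before the clock stops.
inline double time_ms(const std::function<void()>& work,
                      const std::function<void()>& wait_until_done) {
    assert(work && wait_until_done);
    const auto start = std::chrono::steady_clock::now();
    work();
    wait_until_done();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Keeping the synchronization step explicit makes it harder to time only the submission of a call that the driver defers.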

Piers Daniell
09-02-2011, 09:43 AM
@Chris
Any chance I could get a more detailed repro for your case?

@mfort
I think I can easily recreate your case. I'll investigate soon.

Piers Daniell
09-08-2011, 10:49 AM
@mfort
I was able to reproduce this issue and I have now fixed the bug. I'll try to get a new driver released soon. Hopefully this same fix will help Chris too.

Chris Lux
09-09-2011, 12:16 AM
> @mfort
> I was able to reproduce this issue and I have now fixed the bug. I'll try to get a new driver released soon. Hopefully this same fix will help Chris too.

Hello Piers,
i sent you a PM a few days ago with a repro binary and some source code. Did you get to take a look?

robotech_er
09-12-2011, 11:00 AM
when i place memoryBarrier() in shader code, the driver reports: "error C7531: global function memoryBarrier requires "#extension GL_EXT_shader_image_load_store : enable" before use", even though i do specify the version number like this at the very beginning:

#version 420 compatibility

is it a bug or do i miss something? thanks in advance.
i use the 32bit version of this driver (280.36).

Alfonse Reinheart
09-12-2011, 11:08 AM
That's not the first time I've seen NVIDIA's drivers mistake core functionality for something in an extension. I had to use `#extension GL_EXT_gpu_shader4 : enable` just to get gl_PrimitiveID to work in a geometry shader.

Piers Daniell
09-12-2011, 04:03 PM
I have posted updated OpenGL 4.2 developer preview drivers to the usual location: http://developer.nvidia.com/opengl-driver

The new version is 280.47 for Windows and 280.10.01.04 for Linux.

Amongst other things, this new driver addresses the following:
1) glTexStorage performance issue should be fixed.
2) Atomic counter performance has been substantially improved.
3) Issue with gl_PerVertex interface block redeclaration in vertex shader has been fixed.
4) Fixed issue with atomic counters and glBufferData(,,NULL), where the buffer object wasn't created properly.

Enjoy!

Piers Daniell
09-12-2011, 04:39 PM
@robotech_er
Thanks for the bug report. This problem with memoryBarrier() is an oversight. It will be fixed in a future driver.

@Alfonse Reinheart
Hmm, gl_PrimitiveID has apparently been broken since GLSL 150. Will be fixed in a future driver.

Piers Daniell
09-12-2011, 04:40 PM
@Chris Lux
Sorry I haven't got back to you on this yet. I had some trouble downloading from dropbox at work. If you get a chance, please see if 280.47 fixes this issue. In the meantime I'll try to get your package when I'm outside the work firewall.

robotech_er
09-12-2011, 10:05 PM
@Alfonse Reinheart & @Piers Daniell, thanks.

another weird thing: using imageLoad() to do a VTF-like thing, at first everything works right:

#version 420 compatibility

layout(r32ui) readonly uniform uimage2D tex_height;

void main()
{
    ivec2 itex_coord = ivec2(gl_Vertex.xz);
    float height = float(imageLoad(tex_height, itex_coord).x);
    vec4 newVertexPos = gl_Vertex * vec4( 100.0, 1.0, 100.0, 1.0) + vec4( 0.0, height, 0.0, 0.0);
    gl_Position = gl_ModelViewProjectionMatrix * newVertexPos;
}

then i need another imageLoad operation after the original one:

#version 420 compatibility

layout(r32ui) readonly uniform uimage2D tex_height;
layout(r32ui) readonly uniform uimage2D tex_offset; // newly added image

void main()
{
    ivec2 itex_coord = ivec2(gl_Vertex.xz);
    float height = float(imageLoad(tex_height, itex_coord).x);

    uint val = imageLoad(tex_offset, itex_coord).x; // the second image access; if commented out, everything works right

    vec4 newVertexPos = gl_Vertex * vec4( 100.0, 1.0, 100.0, 1.0) + vec4( 0.0, height, 0.0, 0.0);
    gl_Position = gl_ModelViewProjectionMatrix * newVertexPos;
}

and now everything messes up. it seems like the return value of the first imageLoad() is some kind of undefined value, but obviously the two imageLoad() operations are totally unrelated. why? thanks.
i have updated the driver to the latest 280.47.

Chris Lux
09-13-2011, 12:09 AM
@Piers Daniell

I found the problem with my example. The main problem with the timing is that by default the NVIDIA driver enables threading optimizations. This defers the execution of some calls. In my example the GenerateMipmap call takes very long for the volume textures and it seems it is deferred until the next TexStorage call. After disabling the threading optimizations GenerateMipmap shows the expected longer execution times. I think I can work around this by building the mipmaps myself in a real world application (this being only a test).

Then again THANKS Piers for the great efforts bringing us beta drivers after fixing issues. I think this is something that nvidia should keep up, always having some OpenGL developer drivers available after some serious fixes or additions of new features.

Edit: Any news on the planned availability of GL_ARB_cl_event and cl_khr_gl_event extensions?

Thanks
-chris

Piers Daniell
09-13-2011, 07:59 AM
The first beta from the new r285 family of drivers has just been posted to nvidia.com as version 285.27. Note that this specific driver does not contain the gl_PerVertex fix, the atomic counter performance improvement or the glTexStorage performance fix that are in 280.47. So if you need these fixes, please stay with 280.47. The next r285 driver will contain all these fixes. Otherwise the new 285.27 driver has all the OpenGL 4.2 goodness that 280.36 has.

robotech_er
09-13-2011, 08:37 PM
> another weird thing, use imageLoad() do vtf like thing ...

i am sure now that it is not a driver problem. the same code works fine in another app.

Sincere apologies for the waste of time....

Mahesh Kondraju
09-18-2011, 06:32 PM
Hi,

Just want to know whether the Nvidia GeForce GT 555M supports the new OpenGL 4.2 drivers?

cheers,
Mahesh Kondraju

Aleksandar
09-19-2011, 12:56 AM
Why don't you try and find out by yourself? :)

GT 555M is based either on GF108 or on GF106 (only Lenovo Y570p/Y560p uses GF108, but it's the slowest model), so it supports GL4.2 completely. The only problem that might arise is that only desktop drivers exist for GL4.2 so far. Try to download tweaked drivers from laptopvideo2go, or wait until the regular ones are published.

megaes
09-19-2011, 02:30 PM
Hi,

Setting the binding point for a UBO through GLSL doesn't work for me. Is it my problem or a driver bug? I'm using the core profile and "classic" shader initialization. My OS is Vista x64.

Piers Daniell
09-23-2011, 08:35 AM
Could you post a minimal shader and GL call sequence that shows the problem.

megaes
09-24-2011, 08:02 AM
> Could you post a minimal shader and GL call sequence that shows the problem.
Ok. Here is my pseudo code for GL 4.1

There are 2 UBOs, which are declared in shader


layout(std140,row_major) uniform Common {
    float fTime;
    mat4 mProjection;
};
layout(std140,row_major) uniform Transformations {
    mat4 mWorldView[1024];
};

After linking of shaders i manually specify binding points for my UBOs in GL code


glUniformBlockBinding(m_hGLShaderProgram,glGetUniformBlockIndex(m_hGLShaderProgram,"Common"),0);
glUniformBlockBinding(m_hGLShaderProgram,glGetUniformBlockIndex(m_hGLShaderProgram,"Transformations"),1);

But in GL 4.2 there is ability to specify binding points directly in shader. So my shader for GL 4.2 looks like



layout(std140,row_major,binding=0) uniform Common {
    float fTime;
    mat4 mProjection;
};
layout(std140,row_major,binding=1) uniform Transformations {
    mat4 mWorldView[1024];
};

And for me binding through the shader doesn't work; i still have to specify the binding points through GL code.

megaes
11-03-2011, 09:55 PM
Excuse me, but what about my problem?! :)

Ok, i know, my English sucks, so i will explain the problem in other words:

I have an OpenGL context with the OpenGL 4.2 core profile, i don't use GL_ARB_separate_shader_objects, and i can't set the uniform block binding index through GLSL using the binding layout qualifier.

The version 4.20 binding doesn't work for me:
http://www.opengl.org/wiki/Uniform_Buffer_Object

I am working under Vista x64 and i have a GTX 460.

Groovounet
11-04-2011, 02:29 AM
> Excuse me, but what about my problem?! :)
>
> Ok, i know, my English sucks, i will explain the problem using another words:
>
> I have an OpenGL context with OpenGL 4.2 core profile, i don't use GL_ARB_separate_shader_objects and i can't set
> the uniform block binding index through GLSL using the binding layout qualifier
>
> Version 4.20 binding doesnt work for me
> http://www.opengl.org/wiki/Uniform_Buffer_Object
>
> I am working under Vista x64 and i have 460GTX

Strange, that's something that works for me I believe... you brought me a doubt now. Erm.

megaes
11-04-2011, 08:17 AM
> Strange, that's something that works for me I believe... you brought me a doubt now. Erm.

But you use GL_ARB_separate_shader_objects and i don't.