Core slower than Compatibility (Linux)

Hi everyone.
I’ve run into an issue with the AMD Linux drivers (11.05). When I initialise my test application with a 4.1 Core profile context, I get approximately 44 frames per second. When I initialise the same application with a Compatibility profile context, I get 52 fps. The difference is striking (roughly 18%) for exactly the same code apart from the context creation. My only thought is that the Core profile does more API validation, which is why it is slower than the Compatibility profile. The final visual output is identical; the only difference is the context creation code.

[CORE]
OpenGL renderer: AMD Radeon HD 6300 series graphics
OpenGL version: 4.1.11005 Core profile context
OpenGL Shading Language: 4.10
fps: 44

[COMPATIBILITY]
OpenGL renderer: AMD Radeon HD 6300 series graphics
OpenGL version: 4.1.11005 Compatibility Profile Context
OpenGL Shading Language: 4.10
fps: 52

The profile selection code is derived from the Wiki source:
http://www.opengl.org/wiki/Tutorial:_OpenGL_3.0_Context_Creation_(GLX)
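For reference, here is a minimal sketch of the kind of profile selection that tutorial describes; it is not my actual code, and the function name, variables, and the assumption that GLX_ARB_create_context_profile is present are mine. The only thing that changes between the two runs is the profile bit:

```c
/* Minimal sketch (not the actual application code): request a GL 4.1 context
 * and flip only the profile bit between runs. Assumes a valid Display *dpy,
 * a chosen GLXFBConfig fbcfg, and GLX_ARB_create_context_profile support. */
#include <GL/glx.h>
#include <GL/glxext.h>

static GLXContext create_context(Display *dpy, GLXFBConfig fbcfg, int use_core)
{
    PFNGLXCREATECONTEXTATTRIBSARBPROC glXCreateContextAttribsARB =
        (PFNGLXCREATECONTEXTATTRIBSARBPROC)glXGetProcAddressARB(
            (const GLubyte *)"glXCreateContextAttribsARB");

    int attribs[] = {
        GLX_CONTEXT_MAJOR_VERSION_ARB, 4,
        GLX_CONTEXT_MINOR_VERSION_ARB, 1,
        GLX_CONTEXT_PROFILE_MASK_ARB,
            use_core ? GLX_CONTEXT_CORE_PROFILE_BIT_ARB
                     : GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB,
        None
    };

    /* Direct rendering requested; no shared context. */
    return glXCreateContextAttribsARB(dpy, fbcfg, 0, True, attribs);
}
```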

Does anyone have any ideas on how I can prevent the Core profile from being slower than the Compatibility profile?

As awful as this sounds: does it really matter? I can see where one uses the core profile in development builds, but in general for release, whatever makes it go faster, eh?

For what it is worth, NVIDIA has made repeated noises that, as far as performance is concerned, a core profile not only does not help but can potentially hurt. I don’t remember AMD saying that, but it looks like their Linux driver behaves that way too.

On the other hand, I target GL3/4 core profile because:

  1. I seem to remember seeing/reading that Mac OS X with GL3 is core profile only.

  2. When GL3 (or higher) is brought to a new platform, it will likely be core profile (this is what the EGL spec advises). Though I freely admit that I am staring at the embedded world, and whether that turns out to be GL3 or an eventual GLES3, I have no clue.

But what I release to the world all depends. Shudders… different startup code paths depending on GPU and driver version.

It appears as if the driver writers took the shortcut of implementing the Core profile ABOVE the legacy driver: it checks the API calls, then forwards each call to the legacy layer. The extra checks and indirection will never be faster than calling the legacy layer directly.
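To illustrate the layering being speculated about here, a purely hypothetical toy sketch; none of these names come from any real driver, and the real rule set would of course be far larger:

```c
/* Purely illustrative: a "core" entry point that validates the call and then
 * forwards it to the existing legacy entry point, which does no such check. */
#include <stdio.h>

#define FAKE_GL_DEPTH_TEST   0x0B71   /* allowed in core   */
#define FAKE_GL_ALPHA_TEST   0x0BC0   /* removed from core */
#define FAKE_GL_INVALID_ENUM 0x0500

static void legacy_glEnable(unsigned int cap)
{
    printf("legacy path enables 0x%04X\n", cap);
}

static int core_profile_allows(unsigned int cap)
{
    return cap != FAKE_GL_ALPHA_TEST;  /* stand-in for the real core rules */
}

static void core_glEnable(unsigned int cap)
{
    /* Extra validation plus one more level of indirection on every entry
     * point: the overhead the post above is describing. */
    if (!core_profile_allows(cap)) {
        fprintf(stderr, "error 0x%04X: enum not in core profile\n",
                FAKE_GL_INVALID_ENUM);
        return;
    }
    legacy_glEnable(cap);
}

int main(void)
{
    core_glEnable(FAKE_GL_DEPTH_TEST); /* forwarded to the legacy layer */
    core_glEnable(FAKE_GL_ALPHA_TEST); /* rejected by the core layer    */
    return 0;
}
```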

I guess market realities meant they could release a Core profile driver in one month this way, versus six months doing it properly.

I think the big case for a GL3/4 core profile is new hardware vendors, those that do not already have a GL implementation on which to build. But on the desktop it is AMD or NVIDIA… though I guess in theory you could say Intel as well for GL3 now… but each of these already has a GL implementation.

The embedded world, if it ever gets GL3, then we might see something nicer… maybe…

Can you provide us with a link or a citation for this claim, please?

http://www.slideshare.net/Mark_Kilgard/gtc-2010-opengl is one; see page 97.

Thanks!

If it’s slower, that’s the implementation’s responsibility, not the specification’s.

This is just an illegitimate claim. The problem might be that only Apple has a genuinely separate OpenGL 3.2 core implementation available.

Yes, I’ve found Apple less forgiving of non-compliance with the spec. One thing I’ve found is that my lighting equations come out oversaturated (always white) on Lion (core profile), while on the exact same hardware under Boot Camp and Linux (core profile) everything is fine. The exact same shader (minus the in/attribute conversion and the fragment colour output) works fine on all other hardware, both fixed-function and core profile. But that is another issue altogether, which I cannot debug at this point in time (Linux is the production platform).


How do you know that that’s not a problem with 10.7? It’s too new to dismiss such things as being the fault of older, more reliable implementations.


I was trying to be diplomatic :) But 9 times out of 10 I do end up discovering the problem is in my own code. I’m just playing the odds (my code is wrong 90% of the time), but I agree with you that the other platforms have well-tested, mature implementations. The shaders are stock from the ES 2.0 book, and they also work on the iPhone, so I do strongly suspect Apple’s core profile is doing something funny.
