
View Full Version : MRT : Massive performance hit on ATI hardware



bobvodka
03-18-2007, 07:54 AM
I'm currently doing a project which does some GPGPU via OpenGL, and I appear to have found a path which causes a pretty massive slowdown when using MRT on my X1900XT (WinXP x64, Cat 7.2).

The part which is sucking up all the resources is the main pass of the algorithm; it makes 4 reads from one texture, 1 read from another, and writes out 2 vec4s to two 32-bit floating-point RGBA textures.

When the render target and source textures are 40×40 in size, the max speed of the program is approx. 40fps.

I'm not doing a great deal of ALU ops, so I figured I was bandwidth limited somewhere. As I couldn't reduce the number of reads, I instead turned off one of the writes; the fps shot up to around 800...
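Some back-of-envelope numbers (a sketch; the only inputs are the frame rates above and 16 bytes per texel for GL_RGBA32F) show why raw write bandwidth can't be the culprit:

```cpp
// Back-of-envelope helpers for the numbers quoted above.

// Bytes written per pass: size x size texels, 16 bytes per texel
// (GL_RGBA32F = 4 channels x 4-byte float), times the attachment count.
int mrtWriteBytes(int size, int targets)
{
    const int bytesPerTexel = 4 * 4;
    return size * size * bytesPerTexel * targets;
}

// Frame time in milliseconds at a given frame rate.
double frameMs(double fps)
{
    return 1000.0 / fps;
}
```

mrtWriteBytes(40, 2) gives 51200 bytes, roughly 50 KiB per pass, while frameMs(40.0) - frameMs(800.0) is 23.75 ms of unexplained cost per frame; 50 KiB of writes shouldn't take anywhere near 24 ms on this class of hardware, so the bottleneck had to be something other than bandwidth.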

From here I refactored the program, pushed in 2 extra passes, and reduced all outputs to single render targets; the final fps came out at ~710.

Now, while I'm happy with the improvement, I'm left wondering why on earth I took such a massive performance loss in the first place. I'm pretty sure the same kind of thing isn't seen in D3D, so is this another sign of poor ATI OpenGL drivers? Or maybe it was something I did wrong?

Relevant C++ code is included below. I've omitted the shaders simply because the new ones do the same as the old, just split over an extra couple of passes, which leads me to believe the problem is either in my setup or some other state management:


// All textures created for RTT are set up with this function
void SetupRenderTarget(const int size)
{
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F_ARB, size, size, 0, GL_RGBA, GL_FLOAT, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
}

void TLMFullShader::GenerateTLMData()
{
    glPushAttrib(GL_COLOR_BUFFER_BIT | GL_VIEWPORT_BIT); // Save the clear colour and viewport

    // Set up the view for orthographic projection.
    camera_.setMatricies();
    // Switch to the render target
    rendertarget_.activate();

    glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
    glViewport(0, 0, vertsperedge_, vertsperedge_);
    glClampColorARB(GL_CLAMP_VERTEX_COLOR_ARB, GL_FALSE);
    glClampColorARB(GL_CLAMP_READ_COLOR_ARB, GL_FALSE);
    glClampColorARB(GL_CLAMP_FRAGMENT_COLOR_ARB, GL_FALSE);

    // Set up for pass 1
    glActiveTexture(GL_TEXTURE1);
    glBindTexture(GL_TEXTURE_2D, energySource_); // Bind the energy source map
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D, drivingMap_);   // Bind the driving map texture

    // Set up the MRT outputs
    GLenum pass1RT[2] = { GL_COLOR_ATTACHMENT0_EXT, GL_COLOR_ATTACHMENT1_EXT };
    rendertarget_.attachRenderTarget(heightMap_, 0);
    rendertarget_.attachRenderTarget(energySink_, 1);
    glDrawBuffers(2, pass1RT);
    rendertarget_.checkStatus();
    pass1_.use();
    pass1_.sendUniform("energySource", 1); // sampler for the energy source map
    pass1_.sendUniform("drivingMap", 0);
    pass1_.sendUniform("step", 1.0f / float(vertsperedge_));
    // Draw quad here
    DrawQuad(0.0f, 0.0f, 1.0f, 1.0f);

    // Now copy the height map to a VBO for later rendering
    glBindBuffer(GL_PIXEL_PACK_BUFFER_ARB, heightBuffer_);
    glReadBuffer(GL_COLOR_ATTACHMENT0_EXT);
    glReadPixels(0, 0, vertsperedge_, vertsperedge_, GL_BGRA, GL_FLOAT, NULL); // copy to the bound PBO
    glBindBuffer(GL_PIXEL_PACK_BUFFER_ARB, 0);
    rendertarget_.detachRenderTarget(heightMap_, 0);
    rendertarget_.detachRenderTarget(energySink_, 1);

    ... // The remaining passes have no real effect on fps; the code above causes the major speed hit
}
Blend is off, depth test is also off.
'rendertarget' is just a thin wrapper over an FBO
'pass1' is just a thin-ish wrapper over a GLSL Program object.
glReadPixels isn't the problem as the speed hit goes away when the DrawQuad() call is commented out with everything else enabled.

So, dodgy drivers?
Hardware limit?
Bad setup?
Other ideas?

Malte Clasen
03-22-2007, 07:11 AM
Originally posted by bobvodka:
I'm currently doing a project which does some GPGPU via OpenGL, and I appear to have found a path which causes a pretty massive slowdown when using MRT on my X1900XT (WinXP x64, Cat 7.2).
Same here. The drivers recompile _all_ shaders each time the number of render targets changes, so in your case the shaders are recompiled twice per frame. This is a known problem (at least since 10/2006), but devrel@ati.com told me that they won't fix it.
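The effect is easy to model (a toy simulation of the behaviour described above, not actual driver code): if every change in the active draw-buffer count invalidates the compiled shaders, a frame that alternates between an MRT pass and single-target work pays two recompiles every frame, while an all-single-target frame pays none after warm-up:

```cpp
// Toy model: the driver recompiles all shaders whenever the number of
// active draw buffers changes (the behaviour described above).
struct DriverModel
{
    int bufferCount = 1; // driver starts out with a single draw buffer
    int recompiles = 0;

    void setDrawBuffers(int n)
    {
        if (n != bufferCount)
        {
            bufferCount = n;
            ++recompiles; // every shader gets recompiled on this transition
        }
    }
};

// Steady-state recompiles per frame for a repeating sequence of passes,
// where each pass uses the given number of draw buffers.
int recompilesPerFrame(const int* passBuffers, int passCount, int frames)
{
    DriverModel driver;
    for (int f = 0; f < frames; ++f)
        for (int p = 0; p < passCount; ++p)
            driver.setDrawBuffers(passBuffers[p]);
    return driver.recompiles / frames;
}
```

With the original layout (one 2-target pass, then single-target work) the sequence {2, 1} yields 2 recompiles per frame; the refactored all-single-target sequence {1, 1, 1} yields 0, which is consistent with the 40 fps vs ~710 fps gap bobvodka measured.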

Btw, can anyone here say whether this also applies to XP x86 or Vista x86/x64?

Malte

bobvodka
03-22-2007, 08:05 AM
Ouch... well, that certainly explains it. Thanks for clearing that up.

Kinda sucks that they aren't going to fix it either as it basically makes MRT unusable on ATI hardware.

This could be the straw which sends me back to NV...

Jan
03-22-2007, 12:39 PM
WTF?

I haven't used MRT so far. Does this mean that ATI drivers ALWAYS recompile all shaders when you use MRT and switch the buffers?

That would make it useless, indeed. And I was hoping to do deferred rendering sometime... :(

Jan.

bobvodka
03-22-2007, 12:49 PM
Well, to directly quote the reply I've just got from ATI/AMD's Devrel;



You are correct, the problem with using MRTs in OpenGL is a known issue and we have no plans to fix it in our current driver. Although there is a chance that this could be implemented in a different way in our Vista driver, we do not have any official word. Thus, I would suggest that you find another way to implement your project.
So, while it might be fixed in the Vista drivers (though honestly, I wouldn't count on it), you can pretty much count it out for XP's drivers.

Congrats to AMD, they just lost a graphics card customer and I'll be happy to advise others against AMD/ATI branded cards in the future :)

Korval
03-22-2007, 01:12 PM
You could just wait for Longs Peak to see if that fixes it.

bobvodka
03-22-2007, 01:22 PM
I could; however, I wanted to start using it now, not in 6 to 8 months' time when drivers appear and are working.

knackered
03-22-2007, 01:41 PM
Thus, I would suggest that you find another way to implement your project.
Or drop support for ATI cards, like we have.
Seriously, I can't even begin to imagine nvidia taking that attitude with developers.

Korval
03-22-2007, 04:05 PM
To be fair, how many GL applications use MRT anyway? The main GL applications (Doom3 engine, Maya, XSI, some CAD stuff) don't. So really, there is little incentive to proceed with fixing it.

Especially with Longs Peak coming around the corner.

bobvodka
03-22-2007, 04:20 PM
OK, accepted, but then why even support the MRT extension to GLSL? Because it's practically unusable in its current state.

It's not even like the non-power-of-two extension, where if you stick to a few guidelines it's usable on current hardware...

Korval
03-22-2007, 04:35 PM
OK, accepted, but then why even support the MRT extension to GLSL?
Is it in OpenGL 2.1? If so, there's your answer.

Jan
03-22-2007, 04:39 PM
If I remember correctly, ATI was the first to support MRT at all. Didn't they even have an ATI-specific extension (ATI_draw_buffers, or so)?

That was long before OpenGL 2.1. Pretty pointless, IMO.

Jan.

Eric Lengyel
03-22-2007, 05:45 PM
Yes, ATI was the first to have MRT, and it was exposed as an ATI-specific extension.

ATI's attitude in the above post from their devrel is the same inexcusable attitude I've gotten from them for years now. OpenGL and people who use it seem to be nothing more than an annoyance to ATI. Getting an existing feature to work well or correctly is hard enough, but try getting them to implement an important feature like EXT_packed_depth_stencil or EXT_framebuffer_multisample and you'll learn a lesson in futility. (I did see that EXT_packed_depth_stencil is in the extensions string under Vista, but their implementation is completely broken and unusable.)

Nvidia, on the other hand, has continually shown amazing support for OpenGL. Just look at all those awesome G80 extensions! And they actually work the way they're supposed to. Life would be much better if I could just drop ATI support like knackered did.

Korval
03-22-2007, 06:15 PM
Here's the funny thing.

ATi/AMD have two chairpersons on subgroups within the Khronos OpenGL ARB. They hold the chairs for the Ecosystem and Shading Language subgroups.

And yet they have what is probably the weakest implementation of said shading language.

I've always been under the assumption that ATi is busy preparing for Longs Peak. That once writing a GL implementation becomes more manageable, they'll get better drivers.

At this point though, I don't know. Will they even bother to support Longs Peak? Will they still support 2.1 after LP hits?

ATi's a big question mark, and their ineptitude is really holding OpenGL back.

Humus
03-22-2007, 10:24 PM
Let me clarify the answer from devrel above. There's no plan to fix it in the current driver, that is, the legacy driver that currently ships on XP. However, the new driver that currently ships on Vista is a totally redesigned driver built from scratch: the famous "OpenGL rewrite" that's been rumoured on the net for quite a while. I don't know if the same problem exists in that driver; my gut feeling is that it probably doesn't, but I haven't tried it so I can't say for sure right now. If it does, then it will certainly be fixed in that driver.

This new driver currently only ships on Vista, but soon enough it'll ship for all platforms and hardware, and the legacy driver will be retired. The legacy driver has been on the backburner for quite a while now, while the majority of the driver team has been working on the new one. Since this project hasn't been public until very recently, it could easily have been perceived from the outside that ATI stopped putting effort into OpenGL, while the truth was actually the opposite.

Rewriting a driver from scratch is a major undertaking, and the project has been going on for a couple of years. During this time guys like me have had a hard time defending the fact that certain issues would not get immediate attention. While it certainly did not help me do my job, in the long run the new driver will be a better foundation to build our GL implementation on and won't have some of the architectural problems of the legacy driver. The situation should improve now that the new driver is out in the wild and developers start using it.

Basically, what I'm saying is that while I easily understand that it could have looked that way, it's certainly not the case that ATI doesn't care about OpenGL, it's just that rewriting the driver has been a massive task and unfortunately (like most software projects) has taken longer than originally projected. The good news is that the new driver is now out there and I encourage everyone trying it out on Vista if you get a chance.

Jan
03-23-2007, 03:13 AM
That's very good news.

I really hope that LP is out soon and that ATI's driver will support it shortly after that. OpenGL 2.1 is a mess, and for me it does not make sense to begin writing a new renderer with it.

Jan.

knackered
03-23-2007, 03:41 AM
A very frank, honest and encouraging answer, humus. Thanks.

Overmind
03-23-2007, 04:06 AM
This new driver currently only ships on Vista, but soon enough it'll ship for all platforms and hardware and the legacy driver will be retired.
Does that mean we'll get usable linux drivers within finite time? :p

bobvodka
03-23-2007, 09:06 AM
Basically, what I'm saying is that while I easily understand that it could have looked that way, it's certainly not the case that ATI doesn't care about OpenGL, it's just that rewriting the driver has been a massive task and unfortunately (like most software projects) has taken longer than originally projected. The good news is that the new driver is now out there and I encourage everyone trying it out on Vista if you get a chance.
Well, to be honest, someone somewhere should have said something to the developers; heck, even that reply from devrel could have been better than the vague 'yeah, it might be fixed, we don't know' which I got. It comes across as not caring, and in this game PR and talking to the devs is worth a lot; a more definitive answer such as 'there is a new version in the works, which will be released soon and for all platforms, and which will address this issue' would have been enough.

Now, to be fair, I've just checked the state of things with the Vista driver, and while it leaves me with some screen corruption (which, given the newness of the drivers and the architecture, is fair enough... and it goes away when a repaint is forced by dragging a window around), MRT, FBO and GLSL appear to work at a decent framerate (although I think there might be a z-buffer issue in the 7.2 drivers, as I'm pretty sure I could see into a cube I shouldn't have been able to see into; I'll swap to XP x64 to confirm at some point), so I'll refrain from yelling about this for now.

Still, this is a good example of why it's important to talk to the developers; let's face it, if I hadn't kicked up a fuss like this we wouldn't even know now, would we?

Humus
03-23-2007, 06:21 PM
Originally posted by Overmind:
Does that mean we'll get usable linux drivers within finite time? :p

The new driver is cross-platform, and Linux support has been one of the important goals from the start of the project, rather than an afterthought as in the legacy driver. I'm not going to give any guarantees, as I personally haven't even tried the new driver on Linux yet, but since it's built from the same code from the start I think it's reasonable to expect roughly the same quality as the Windows driver. Of course, the driver model differs between OSes and there are other OS-specific peculiarities, so there may still be Linux-specific issues in the future (as well as Windows-specific ones, of course).

Humus
03-23-2007, 06:42 PM
Originally posted by bobvodka:
Well, to be honest, someone somewhere should have said something to the developers; heck, even that reply from devrel could have been better than the vague 'yeah, it might be fixed, we don't know' which I got. It comes across as not caring, and in this game PR and talking to the devs is worth a lot; a more definitive answer such as 'there is a new version in the works, which will be released soon and for all platforms, and which will address this issue' would have been enough.

Now, to be fair, I've just checked the state of things with the Vista driver, and while it leaves me with some screen corruption (which, given the newness of the drivers and the architecture, is fair enough... and it goes away when a repaint is forced by dragging a window around), MRT, FBO and GLSL appear to work at a decent framerate (although I think there might be a z-buffer issue in the 7.2 drivers, as I'm pretty sure I could see into a cube I shouldn't have been able to see into; I'll swap to XP x64 to confirm at some point), so I'll refrain from yelling about this for now.

Still, this is a good example of why it's important to talk to the developers; let's face it, if I hadn't kicked up a fuss like this we wouldn't even know now, would we?

I mostly agree. We have been communicating this to tier-1 developers, though not to the general public. The rumour of the "OpenGL rewrite" has been out there for years, but like many other internal processes, we don't always comment on what's going on inside the company. I agree that we should have been more forthcoming in telling the less privileged developers as well, such as on forums like this. I have several times raised concerns internally about short-term versus long-term goals; it's been a hard time for me, as I've had to balance the interests of ATI against the interests of the developers I work with.

One reason why the project has mostly been kept secret for so long is that occasionally deadlines were pushed back. That's not unheard of in the software industry, but it makes it a bad idea to talk too much about the new driver when schedules can change and promises would be broken. Also, telling a developer that a bug will be fixed in the new driver is not useful if that driver won't be released for another year. It's also a bad idea to reveal such an internal project before we know its status: in the event that the project failed, was cancelled, and focus shifted back to the original driver, it would not look good from the outside if we had told the public that a new redesigned driver was in the works.

Since the driver was just recently added to the Vista driver package (previously no ICD was included at all since the legacy driver doesn't support Vista) we have now started telling people about it. I agree that it would have been useful for people to know earlier about this project. However, we didn't want to give any promises until it was actually shipped.

Jan
03-24-2007, 03:21 AM
I think this secrecy is very understandable. It does make sense not to reveal such things too early.

The only problem is the answer from developer relations. It should have been more like "we know about it and we are working on fixing it, but it may take some time". Maybe your PR guys and the ones writing the responses should have a word; wrong, misleading, or badly phrased responses can be very damaging. You know, developers are the guys who are often asked what cards to buy...

Jan.

PkK
03-24-2007, 04:57 AM
Does that mean we'll get usable linux drivers within finite time? :p
There are the free DRI drivers for everything up to the X850 XT. GLSL support will probably appear next month (there's a GLSL branch in Mesa which is to be merged soon).
Since ATI does not give them any documentation, they had to reverse-engineer everything.
The drivers work on Linux, FreeBSD and probably some more free OSes.

Philipp