
How to get around limitations of glslang on Radeon 9700



djharr
03-12-2004, 01:13 PM
I am trying to write a shader for the Radeon 9700 using glslang, but I am continually running into a problem where the shader runs in software because I have exceeded the number of available ALU instructions. What are some ways of breaking the thing up into smaller pieces? If you create a couple of modules and compile them separately, then link them and call the functions from main, does that allow you to have a slightly bigger program? I would go ahead and write this in a couple of passes, but I have no place to store the intermediate calculation values, and I would hate to have to go to all the trouble of initializing another float pbuffer to hold the intermediates. I just spent a bunch of time getting a double buffered pbuffer working so I could avoid the speed hit of a context switch, and I would really like to avoid using another buffer. I suppose if I could get my pbuffer to have three or more buffers, that might work, but I have never seen any code that actually works with that.

In any case, any pointers would be highly appreciated.

David

V-man
03-13-2004, 09:04 AM
Update your driver often and maybe you'll get lucky :)

I wrote a simple program and I exceeded the number of temporaries. I tried to rearrange a long equation I had but still no dice.

The native number of temps is supposed to be 32 for FP, and I don't think my program should have exceeded that.

Exceeding the maximum number of ALU instructions is easy on Radeon (for FP). It's only 64.
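
For a rough feel of how many passes a 64-instruction ALU budget forces, here is a minimal Python sketch. It assumes the computation is already broken into ordered stages with estimated ALU costs; the function name `split_into_passes` and the example costs are hypothetical and were not measured on any real shader or driver. It groups consecutive stages greedily so that no pass exceeds the budget.

```python
from typing import List

ALU_BUDGET = 64  # the fragment-program ALU limit quoted in the post above


def split_into_passes(stage_costs: List[int], budget: int = ALU_BUDGET) -> List[List[int]]:
    """Greedily group consecutive stages so each pass fits within the budget.

    stage_costs holds estimated ALU instruction counts per stage, in order.
    These are illustrative estimates, not numbers reported by a driver.
    """
    passes: List[List[int]] = []
    current: List[int] = []
    used = 0
    for cost in stage_costs:
        if cost > budget:
            # A single stage this large cannot fit in any pass as-is.
            raise ValueError(f"stage needs {cost} instructions, budget is {budget}")
        if used + cost > budget:
            # Close the current pass and start a new one.
            passes.append(current)
            current, used = [], 0
        current.append(cost)
        used += cost
    if current:
        passes.append(current)
    return passes


# Hypothetical stage costs, purely for illustration.
example_passes = split_into_passes([30, 30, 20, 50, 10, 40])
print(len(example_passes), example_passes)
```

Because the stages must stay in order, the greedy grouping gives the fewest passes for that ordering, but it says nothing about temporaries or about where the intermediate values between passes get stored.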

Pierre B.
03-22-2004, 02:45 PM
You can try Ashli; it is a compiler from ATI that lets you virtualize resources such as register usage or the number of instructions.

http://www.atitech.ca/developer/ashli.html

Pierre B.

djharr
03-22-2004, 03:57 PM
Yeah, I looked at Ashli. The problem is that I am writing to and from float pbuffers (doing numerical, rather than graphical calculations), and I didn't see any good way to impart that information to the program. When I tried to plug my fragment program in anyway, it crashed, hard. So, I am back to looking at about 7-8 passes, using a lot of auxiliary buffers to hold intermediate results. Yuck.

David
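
On the question of intermediate storage, here is a CPU-side Python sketch of the ping-pong pattern, where two buffers swap read and write roles on every pass. It is only an analogy for a double-buffered pbuffer: the name `run_ping_pong` and the per-element kernels are hypothetical, and it assumes each pass needs only the output of the pass immediately before it. If a later pass has to read an older intermediate, extra buffers are still required, which may well be the situation described above.

```python
from typing import Callable, List


def run_ping_pong(data: List[float], passes: List[Callable[[float], float]]) -> List[float]:
    """Run a chain of per-element passes using only two buffers.

    Each pass reads from the source buffer and writes to the other one,
    then the two roles swap. No pass ever reads and writes the same buffer.
    """
    buffers = [list(data), [0.0] * len(data)]
    src = 0  # index of the buffer holding the latest results
    for kernel in passes:
        dst = 1 - src
        for i, value in enumerate(buffers[src]):
            buffers[dst][i] = kernel(value)
        src = dst  # the freshly written buffer becomes the next input
    return buffers[src]


# Three illustrative passes: add 1, multiply by 2, subtract 3.
result = run_ping_pong(
    [1.0, 2.0],
    [lambda x: x + 1.0, lambda x: x * 2.0, lambda x: x - 3.0],
)
print(result)
```

The number of buffers stays at two no matter how many passes are chained, as long as the dependency between passes is strictly linear.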