Updated GLSL noise implementation



StefanG
09-15-2005, 08:51 AM
I finally got around to including a very nice contribution from "mrbill" in my GLSL noise() implementations, first published December 2004 in this forum. Because the new version is a bit faster and also simpler to use, as it avoids one texture lookup, here's the link again.

My GLSL noise implementation (http://www.itn.liu.se/~stegu/simplexnoise/)

The zip archive includes a precompiled Windows binary for testing, with full source of course. Reports on frame rates for current hardware would be appreciated. Leave the window at its default size, read the number in the title bar and post a reply here, along with information on which graphics card you used.

Humus
09-15-2005, 02:50 PM
1104 fps on X850XT. I got a message about "Failed to locate gradTexture uniform variable", but it appeared to work anyway.

Heady
09-15-2005, 10:37 PM
Hmmm, I get only 0.2 fps.
It doesn't run in hardware mode on my 9600XT.
I get the same error that Humus got.

PS. The "old" version looks very strange; I see a lot of "facets".

sqrt[-1]
09-15-2005, 11:11 PM
990 fps on a GeForce 6800 (plain)

Fastian
09-15-2005, 11:16 PM
Runs in software on 9700 pro. I guess the program needs more instructions than the R300 generation can handle :(

StefanG
09-16-2005, 12:41 AM
Originally posted by Humus:
I got a message about "Failed to locate gradTexture uniform variable"

Yeah, the main program behaves like that, sorry. That texture is only used for 4D noise. It is never accessed for 2D and 3D noise and is thus optimised away by the SL compiler.

StefanG
09-16-2005, 12:53 AM
Originally posted by Fastian:
Runs in software on 9700 pro. I guess the program instructions are greater than R300 gen can handle

I suspect the problem is with the large amount of dependent texture lookups. ATI R300 has a rather tight limit for that, while most Nvidia chips are a bit more flexible in that particular respect.

Low-end GeForce 5xxx series cards can run this (although not very fast), but many ATI 9xxx cards have problems. I know that, and I am sorry to say that I have no solution to offer.

You could try the 2D noise instead. Just comment out the active "n=snoise()" line at the very end of the .frag file, and uncomment one of the 2D versions instead. 2D noise might work, because it does not make quite as many dependent texture lookups as 3D and 4D noise.

StefanG
09-16-2005, 01:07 AM
Originally posted by Heady:
PS. the "old"-version looks very strange, I see a lot of "facets".

This is, I'm sad to say, another ATI-specific problem. People report various problems with the 9xxx series running this quite complicated shader. I have no ATI hardware myself, but if anyone has a fix, I'd be happy to update my code.

BTW, the old version is what I posted here in December 2004, I just kept it on the web for reference. Please don't use it, it's slower and uses one extra texture unit.

Fastian
09-16-2005, 02:54 AM
Originally posted by StefanG:
You could try the 2D noise instead. Just comment out the active "n=snoise()" line at the very end of the .frag file, and uncomment one of the 2D versions instead. 2D noise might work, because it does not make quite as many dependent texture lookups as 3D and 4D noise.

Tried the snoise 2D function. Getting around 137 fps on my 9700 pro at a resolution of 1360X1024.
Nice :)

Hlz
09-16-2005, 04:40 PM
235 fps on a GeForce FX 5900 (135 fps for the old demo). Nice boost :-).

Humus
09-16-2005, 06:29 PM
Originally posted by StefanG:
I suspect the problem is with the large amount of dependent texture lookups. ATI R300 has a rather tight limit for that, while most Nvidia chips are a bit more flexible in that particular respect.

It's the same limitation on R420 so that's not it. It's also not a limit on "dependent texture lookups" as is a common misunderstanding. It's a limit on number of indirections. You can have up to 31 texture fetches on R300, which also means you can have up to 31 dependent texture reads. But the indirection chain can't be longer than 4.

Anyway, the improvements for the R420 are longer shaders and no limit on texture fetches. Since the number of texture fetches doesn't seem to be more than 31, I have to say it's instruction count.
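
To make the distinction between "number of texture fetches" and "length of the indirection chain" concrete, here is a rough Python sketch (not shader code, and not taken from the demo). The fetch names and the two example dependency graphs are made up for illustration, the limits are simply the R300 figures quoted in this post, and the way the chain is counted (fetches along the longest dependency path) is a simplifying assumption. A driver may count indirections differently.

```python
from functools import lru_cache

# Limits as quoted in the post for R300; treat them as the poster's figures.
MAX_FETCHES = 31
MAX_CHAIN = 4


def longest_fetch_chain(deps):
    """Length of the longest chain of texture fetches that feed one another.

    deps maps a (hypothetical) fetch name to the names of the fetches whose
    results are used to compute its texture coordinates.
    """
    @lru_cache(maxsize=None)
    def chain(name):
        return 1 + max((chain(d) for d in deps[name]), default=0)

    return max((chain(name) for name in deps), default=0)


def fits(deps):
    """True if both the fetch count and the chain length are within limits."""
    return len(deps) <= MAX_FETCHES and longest_fetch_chain(deps) <= MAX_CHAIN


# 20 dependent reads that all hang off one shared first fetch: chain length 2.
wide = {"perm": ()}
wide.update({f"grad{i}": ("perm",) for i in range(20)})

# Only 5 fetches, but each one needs the previous result: chain length 5.
deep = {"a": (), "b": ("a",), "c": ("b",), "d": ("c",), "e": ("d",)}

print(len(wide), longest_fetch_chain(wide), fits(wide))  # 21 2 True
print(len(deep), longest_fetch_chain(deep), fits(deep))  # 5 5 False
```

Under these assumptions, a shader with many dependent reads can still fit as long as they do not stack on top of each other, while a handful of fetches chained end to end would not.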

StefanG
09-18-2005, 07:00 AM
Originally posted by Hlz:
235 fps on a GeForce FX 5900 (135 fps for the old demo). Nice boost :-).

The speedup is not quite that spectacular. The old demo used 4D noise by default, the new one uses 3D noise. The new version is about 20% faster and uses one texture unit less. Sorry about the confusion.
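
For reference, the arithmetic behind the two reported numbers can be sketched as below. The helper name is made up for illustration. The 235 and 135 fps figures are the ones reported in this thread, and they compare 3D noise (new default) against 4D noise (old default), so the result is the apparent speedup only, not the like-for-like figure of about 20%.

```python
def percent_faster(new_fps, old_fps):
    """Relative frame rate gain of new over old, in percent."""
    return (new_fps / old_fps - 1.0) * 100.0


# Figures from the thread: new demo (3D noise) vs. old demo (4D noise).
apparent = percent_faster(235.0, 135.0)
print(f"apparent speedup: {apparent:.0f}%")  # apparent speedup: 74%

# The same numbers as time per frame, in milliseconds.
print(f"{1000.0 / 135.0:.2f} ms -> {1000.0 / 235.0:.2f} ms")  # 7.41 ms -> 4.26 ms
```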

StefanG
09-18-2005, 07:18 AM
Originally posted by Humus:
It's also not a limit on "dependent texture lookups"

Sorry for my sloppy wording. I meant "number of texture indirections". I actually knew that, it's just that I wasn't used to thinking along the right lines. Thanks for the correction! Looking at it, I realise that the indirection chains are never longer than two steps, even for 4D noise, so I agree, that can't be it.

So, I suppose 3D and 4D noise now hit the 160-instruction limit. Sad. My old version (using a second texture lookup for some things that are now implemented in code) ran fine. I wasn't aware that I had made the demo incompatible with older ATI chips. On my Nvidia hardware, the ASM output for 3D noise amounts to less than 70 instructions, so I thought it would be OK. Apparently I am doing something in the code that is not ATI friendly and requires more instructions on ATI hardware.

PigeonRat
09-27-2005, 06:02 PM
3D classic noise works fine. 3D simplex noise, however, gets about 0.1 fps, as it does for the other R300 users in this thread.

StefanG
09-28-2005, 06:45 AM
OK, that pins it down to a shader length problem. My old version should run on R300, because it uses a small 1D texture as a lookup table for the simplex traversal order instead of Mr. Bill's clever but instruction-intensive function for it. The old version is still in the directory I linked to above.
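
The trade-off between a lookup table and computed code for the simplex traversal order can be illustrated with a small Python sketch for the 3D case. This is not the code from either version of the demo and not Mr. Bill's function. The function names, the table layout and the tie-breaking rule are assumptions made for illustration. Both variants return the two intermediate corner offsets of the simplex that contains the point (x0, y0, z0), measured from the cell origin.

```python
from itertools import product

# Table variant: three comparisons form a 3-bit index into a small table.
# This stands in for the "small 1D texture" idea. Entries 1 and 6 can never
# occur (they would need contradictory comparison results), so they are None.
ORDER_TABLE = [
    ((0, 0, 1), (0, 1, 1)),  # 0: z largest, then y, then x
    None,                    # 1: impossible
    ((0, 1, 0), (0, 1, 1)),  # 2: y largest, then z, then x
    ((0, 1, 0), (1, 1, 0)),  # 3: y largest, then x, then z
    ((0, 0, 1), (1, 0, 1)),  # 4: z largest, then x, then y
    ((1, 0, 0), (1, 0, 1)),  # 5: x largest, then z, then y
    None,                    # 6: impossible
    ((1, 0, 0), (1, 1, 0)),  # 7: x largest, then y, then z
]


def offsets_from_table(x0, y0, z0):
    index = 4 * (x0 >= y0) + 2 * (y0 >= z0) + (x0 >= z0)
    return ORDER_TABLE[index]


def offsets_computed(x0, y0, z0):
    """Arithmetic variant: rank each component, then threshold the ranks."""
    # Rank = number of other components this one beats (ties: x, then y, then z).
    rx = (x0 >= y0) + (x0 >= z0)
    ry = (y0 > x0) + (y0 >= z0)
    rz = (z0 > x0) + (z0 > y0)
    first = (int(rx == 2), int(ry == 2), int(rz == 2))   # step along the largest
    second = (int(rx >= 1), int(ry >= 1), int(rz >= 1))  # plus the second largest
    return first, second


# Both variants agree on a grid of sample points, including ties.
samples = [0.0, 0.25, 0.5, 0.75]
agree = all(
    offsets_from_table(*p) == offsets_computed(*p)
    for p in product(samples, repeat=3)
)
print(agree)  # True
```

The table variant costs a few comparisons plus one lookup (a texture fetch in a shader), while the computed variant needs no lookup but more arithmetic. That is the general shape of the trade-off described in the post, although the actual instruction counts depend on the real shader code and the compiler.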