GLSL noise fail? Not necessarily!

I read this somewhat disheartening summary in the slides from the recent GDC presentation by Bill Licea-Kane:

“Noise - Fail!”

However, that does not need to be the case any longer. Recent development by Ian McEwan at Ashima Arts has given us a new take on hardware-friendly noise in GLSL:

https://github.com/ashima/webgl-noise

It might not seem like much, but his algorithm has all the hardware-friendly properties you want, some of which my old GLSL simplex noise demo was missing. In summary, it’s fast, it’s a simple include (no dependencies on texture data or uniform arrays), it runs in GLSL 1.20 and up (OpenGL 2.1, WebGL), and it scales well to massively parallel execution because there are no memory access bottlenecks.
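To give a feel for how little is involved, here is a minimal usage sketch. The function name snoise and its signature follow the linked repository as far as I recall them, and the varying vTexCoord is just a made-up example name, so check the repository for the exact details:

    // Minimal GLSL 1.20 fragment shader sketch.
    #version 120

    // ... paste the simplex noise source from the repository above here;
    //     it provides (as I recall): float snoise(vec3 v);

    varying vec3 vTexCoord;  // hypothetical interpolated texture coordinate

    void main() {
        // One function call per fragment; no textures, no uniform arrays.
        float n = snoise(vTexCoord * 4.0);              // scaling sets the feature size
        gl_FragColor = vec4(vec3(0.5 + 0.5 * n), 1.0);  // map [-1,1] to a grey level
    }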

Concerning this, I would like to get in touch with some people in the Khronos GLSL workgroup. I was last involved in this around 2003-2004, and my contact list is badly outdated. Are any of the good people in the GLSL WG reading this? My email address is “stegu@itn.liu.se”, if you want to keep this private. Just please respond, as I think this is great news.

/Stefan Gustavson

Great stuff. I will try it. Thanks for the info.

For those who want an easy to run demo:

http://www.itn.liu.se/~stegu/simplexnoise/GLSL-noise-ashima.zip

With the default window size, the sphere covers about 70K pixels, so multiply the frame rate by 70,000 to get the number of noise samples per second. On my ATI Radeon HD 4850, I get 5700 FPS, which translates to about 400 Msamples/s. Whee!

Windows and Linux compatible source code. (Untested on Linux, but it should compile and run without changes.) Windows binary (.exe) supplied for your convenience. Uses only OpenGL 2.1 and GLSL 1.20, so it should compile under MacOS X 10.5 as well, if you either run it from the command line or create an application bundle and change the file paths for the shader files to point to the right place, e.g. “…/…/…/GLSL-ashimanoise.frag” instead of “GLSL-ashimanoise.frag”. You also need the library GLFW to compile the demo yourself (see www.glfw.org).

I tried the Windows binary on Linux through Wine and it ran great. The only “problem” is that the compiler issues a warning for the fragment shader: WARNING: 0:252: warning(#288) Divide by zero error during constant folding.


 vec4 ip = 1.0 / vec4(pParam.w*pParam.w*pParam.w, 
                       pParam.w*pParam.w, 
                       pParam.w,0.);

Of course, changing the last component of the vec4 to any number other than 0 removes the warning and does not change the noise texture on the sphere.

Good catch. Of course the 0. should be 1., although it does not really come into play in the calculations. I’ll make sure to tell Ian and ask him to update his code as well.
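For reference, this is the corrected initialization as I understand the fix, i.e. the same expression as in the warning above with only the last component changed:

     vec4 ip = 1.0 / vec4(pParam.w*pParam.w*pParam.w,
                           pParam.w*pParam.w,
                           pParam.w, 1.0);  // 1.0 instead of 0. silences the
                                            // divide-by-zero warning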

When I run GLSL-ashimanoise demo I get ‘Fragment shader compile error:’ message but program runs ok.
(Windows 7, Geforce GTX 470).

That is probably the bug mentioned above. I have now fixed that in the demo. I have also cleaned up the GLSL code a little.

My own C code was also cleaned up. It’s still a hack, but it’s not quite as ugly anymore.

Nice work.

“Noise - Fail!”

So the “return 0.0” wasn’t due to IP issues but plain old laziness? Interesting twist.

Because of my long-standing interest in noise, I have had some insight into the painful and drawn-out process of implementing noise() in GLSL. I would venture a guess that the problems have not been primarily due to licensing or patent issues, but rather due to the lack of a good enough candidate, and a resulting fear of premature standardization.

A noise() function that gets implemented as part of GLSL needs to be very hardware friendly. Previous attempts have been lacking in at least some respects. Ian’s code removes two memory accesses in a very clever way, by introducing a permutation polynomial and creating an elegant mapping from an integer to a 2D, 3D or 4D gradient. This is the first time I have seen a clear candidate for a standard noise() function that both runs well as stand-alone shader code and scales well to a massively parallel hardware implementation.
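To give a flavor of the trick, here is an illustrative sketch of a permutation polynomial in the spirit of Ian’s code. The exact constants live in the repository; I am quoting them from memory, so treat this as a sketch rather than the canonical source:

    // A permutation polynomial hashes lattice indices with pure arithmetic,
    // working modulo 289 instead of reading a permutation table from memory:
    vec3 mod289(vec3 x)  { return x - floor(x * (1.0 / 289.0)) * 289.0; }
    vec3 permute(vec3 x) { return mod289(((x * 34.0) + 1.0) * x); }
    // The hashed value is then mapped arithmetically to a 2D, 3D or 4D
    // gradient direction, again without any table or texture lookup.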

Also, a standard noise() implementation will need to remain reasonably stable over time. You can’t expect people to create real time shaders using one version of noise() only to have them look different when a slightly better but different version shows up in the next generation of hardware.

In short, a standard noise() needs to be hardware and software friendly, to enable an efficient implementation in silicon but also allow for a shader fallback with good performance. A standard also needs to be good enough to keep around for a long time. This code delivers on both counts, I think.

Very interesting.

one version of noise() only to have them look different when a slightly better but different version shows up

To me this is the biggest problem.
Indeed, I saw some problems in RenderMan-compliant pipelines because noise() was implemented differently.

Having a ‘custom noise done in GLSL code’ makes your shader deterministic and more portable.

Having a ‘custom noise done in GLSL code’ makes your shader deterministic and more portable.

Agreed, and this is what this code does well, right now. The fact that it gets by with GLSL 1.20 amazes me, because that means direct portability across the board: OpenGL, WebGL, OpenGL ES.

For me, the ideal situation would be to have a choice between a “shader noise”, where you had complete control and perfect repeatability across platforms, and a ten times faster “hardware noise” that came at about the same cost as a texture lookup, to use for shaders with lots of noise components. In software shading, it is both common and useful to have dozens of noise components in a shader.

(Anybody from the Khronos GLSL workgroup reading this?)

a ten times faster “hardware noise” that came at about the same cost as a texture lookup, to use for shaders with lots of noise components.

I seriously doubt that IHVs are going to start building hardware components for doing noise computations. Hardware makers have been moving away from dedicated hardware functionality for some time, to the point where texture unit hardware on some platforms doesn’t even do filtering logic anymore.

Anybody from the Khronos GLSL workgroup reading this?

And what if they are? The ARB only controls the specification, and the specification already has noise functions and explains what they should do. The ARB cannot make implementers implement the noise function with any particular algorithm; the IHVs have to do that themselves.

The reason I want to get in touch with the GLSL workgroup was mentioned in my original post. I had a long and interesting discussion with them a few years back on hardware-friendly noise algorithms. At the time, it stumbled partly on not having a good enough algorithm to recommend as the standard, partly on not having enough processing power in a typical low-end GPU. Circumstances have now changed, and I would like them to at least consider reopening the discussion. The GLSL specification is currently way too unspecific on what to implement for noise(), and nobody seems to want to take the first step. I think a recommendation and a reference implementation would do some good towards making that happen.

You are of course right in saying that nobody can force HW manufacturers to support noise in GLSL, but at least they can be handed a well-proven algorithm that is recommended as the standard, and get a reference implementation in GLSL code with good performance that could be silently included by the driver when a call to noise() is found in a shader. The market could take it from there. If people start using noise for visual complexity, hardware will be designed to speed it up. Seeing how tremendously important noise is in software shading, I think it could be pretty useful for hardware shading as well.

You are probably right in counting out a custom hardware “noise unit”, at least for the near future, but one way forward from here would simply be to allow a shift in balance between texture lookup bandwidth and ALU speed for upcoming generations of GPU hardware. Procedural texturing means you no longer have to scale the texture access bandwidth with the number of GPU execution units, because many execution units can be put to good use without using any texture bandwidth at all.

(Edit: checking my mail logs, it seems I actually had that noise discussion with the OpenGL ES workgroup, not the GLSL workgroup. Sorry for any confusion. My point still stands: now is a good time to discuss this in more detail.)

I think a recommendation and a reference implementation would do some good towards making that happen.

Neither of which is appropriate for the OpenGL specification. Saying that algorithm X must be used limits GL implementations, and the OpenGL spec does its best to avoid that kind of specificity. The most the spec says is that the results should be no different than if algorithm X were used. This is why the spec is generally lax about what anisotropic filtering really does, and why it leaves a lot of leeway for multisampling implementations.

I’m all for the IHVs using this for their noise function (though they’ll have to work out how compatible it is with the Artistic License 2.0). It would be great if they would actually go and implement the noise functions rather than have them return zero. But I’m not in favor of the ARB putting in the spec itself that this algorithm must be used for noise functions. That sets a dangerous precedent, and also may run afoul of IP.

The market could take it from there.

That’s something I don’t understand. Sure, noise-based textures can be useful in certain circumstances. But noise is not going to be the basis of most textures in, for example, games. So even if you had a single-cycle noise function, it’s still not going to produce results that are as good for most cases as a well-drawn texture.

Procedural texturing means you no longer have to scale the texture access bandwidth with the number of GPU execution units, because many execution units can be put to good use without using any texture bandwidth at all.

It also means that you have to do anisotropic filtering yourself. For many textures (diffuse maps and such), I’m not sure I consider that a good tradeoff.

Now, one thing that could make it something of a good tradeoff is the current and upcoming series of on-CPU GPUs (Intel’s various “bridges” and AMD’s Fusion). Texture memory bandwidth takes a significant hit there, so the best way to compensate is to increase shader complexity. These platforms also really impact deferred rendering.

You seem to underestimate the utility of procedural shading.
True, I am a procedural geek and may be biased the other way, but you can’t fully emulate a procedural pattern with a drawn texture. (The reverse is also true - they solve different problems.) Procedural patterns give you unlimited resolution, infinite texture size without tiling, arbitrary non-repeating pattern animation, enormous flexibility and variation without redrawing a bitmap, analytic derivatives to simplify and improve anisotropic filtering, and a very compact means for expressing visually complex patterns. Noise is seldom used by itself, but it is a very good complement to bitmap textures and more regular procedural patterns like contours and periodic patterns. Turbulent phenomena like water, smoke, fire, stone, dirt and mountains can also be done better if cheap procedural noise is in the toolbox. And you can save a lot on storage and texture memory bandwidth if you use it right. (I could go on, but you get the picture.)
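As a simple illustration of noise as a building block rather than an end in itself, here is a sketch of the classic fractal sum (fBm, or turbulence) that many of those effects start from. It assumes the snoise function from the code discussed above; the octave count and gains are arbitrary example values:

    // Fractal sum ("fBm") sketch: several octaves of simplex noise combined
    // into a turbulent pattern. Assumes float snoise(vec3 v) is available
    // from the included noise source.
    float fbm(vec3 p) {
        float sum = 0.0;
        float amp = 0.5;
        for (int i = 0; i < 5; i++) {
            sum += amp * snoise(p);
            p   *= 2.0;   // double the frequency for each octave
            amp *= 0.5;   // halve the amplitude for each octave
        }
        return sum;
    }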

Noise is a fundamental and very heavily used part of software shading, and the visual complexity of offline rendered SFX is very much due to procedural noise. Software shaders use drawn textures too, but procedural methods are very popular as an alternative and a complement.

All I’m saying is nothing has happened for a decade now, so perhaps the spec needs a more clear pointer to what should actually be implemented for the currently broken part of GLSL that is noise(). No vendor is implementing it to spec. I would like that to change, and I would be willing to spend significant amounts of work on it.

Regarding your concerns for IP problems, this is software, so patents are not a concern. If the current license is unsuitable, copyrights can be renegotiated (the author is very much in the loop here) or worked around by a re-implementation. There are no trade secrets, because the code is published openly. The underlying math is not protectable. “Noise” is a generic word and not a registered trademark. What problems do you see? And why should we avoid discussing how to make things better just because there might be IP concerns? If we are so afraid of taking a step forward, we will never get anywhere.

infinite texture size without tiling

Admittedly, I’m not exactly fully versed on noise functions, but don’t the various different methods of computation become less stable as you get farther from the origin? It would be interesting to see how far you can go from the origin with this noise function before it starts not returning good results.

All I’m saying is nothing has happened for a decade now, so perhaps the spec needs a more clear pointer to what should actually be implemented for the currently broken part of GLSL that is noise(). No vendor is implementing it to spec.

True, but they’re not doing it because the specification is wrong or bad or poorly specified. Everyone knows what the noise functions should do, and the spec provides a reasonable description of this while allowing IHVs the freedom to implement different algorithms.

They initially didn’t implement them because older hardware was flat-out incapable of it. Even using this algorithm, I don’t think there are enough ALU instructions on a Radeon 9700 or even a GeForce FX to actually make it work. Now on actually good hardware, it’s more a matter of nobody caring. Work is spent on things people use, not things people might use.

Also, GLSL is not a decade old. It has only been around for half a decade.

To be honest, I would go so far as to say that “noise” is not something that IHVs should be providing at all. It’s just too high-level. In general, you want your noise-based images to be cross-platform. And the OpenGL spec will not (and should not) guarantee a specific noise implementation.

You seem to underestimate the utility of procedural shading.[…]

One should add that, if the noise functions had actually been implemented from the start, they would likely be widely used. We’d be discussing the need for consistency across hardware/vendors instead, because noise would be so important to us.

don’t the various different methods of computation become less stable as you get farther from the origin

Even if they do, it would still be at much larger scales than a noise lookup image would currently allow.

concerns for IP problems

Ken Perlin’s patent probably covers this as it is pretty broad, and I’m sure he specifically registered it so that others could implement noise functionality freely. My earlier comment was only half serious; I don’t think IP is an issue.

This is kind of a chicken-and-egg problem. Before noise is available nobody will use it, and there is no way of knowing for certain what people will do with it if it is made available. However, looking at its vast popularity with the RenderMan crowd, it seems like a pretty good idea to just go ahead and implement it. Recommending an algorithm and providing a reference implementation with good performance is a good start. Even if Khronos might not be the entity to formally decide on such matters, it is now time to open up the discussion. (Which is exactly what we are doing now, by the way.) My concern here is that classic Perlin noise was never standardized, and I have seen a lot of the problems caused by that. We could spare the GLSL crowd from repeating the same mistakes.

I was unclear on that “decade of not having noise in hardware”. GLSL is not yet a decade old, but shader-capable hardware was introduced in 2002. I have been doing hardware accelerated procedural textures since then.

Your remark on “infinite size” is certainly correct. The size of the useful support domain ultimately depends on floating-point precision or fixed-point range, but that is the case even for vertex position data and ordinary interpolated texture coordinates, so I was equating “floating point precision limited” with “infinite” to get a point across without going into a lot of detail. My apologies if I came across as bending the truth or being sloppy.

One should add that, if the noise functions had actually been implemented from the start, they would likely be widely used.

Noise simply was not practical until relatively recent hardware (GL 3.x level). Even if the hardware could have done it before, it would have taken up most of your available ALUs, killing performance. And even on modern hardware, you’d need a fairly beefy GPU to be able to use it freely without dropping performance.

This is kind of a chicken and egg problem. Before noise is available nobody will use it

I disagree with that to an extent. If someone wanted to use noise, they have been free to implement it, whether with this algorithm or with another. And while this algorithm is certainly less resource-intensive than previous examples, it’s still going to compile down to a lot of ALU instructions.

Therefore, use of it is primarily governed by performance. If this algorithm spurs people to use more noise, it will only be because it is faster than previous ones.

If this algorithm spurs people to use more noise, it will only be because it is faster than previous ones.

Not quite. Please read my original post. First and foremost, this version is a lot more convenient to use, as it is a pure computation without lookup tables. You just include a piece of code in your shader and call a function - no textures to create and load, no uniform arrays to initialize. This is a big improvement over previous versions. It is a true novelty and what I would consider the key feature. The algorithm is actually somewhat slower than my old demo on current mid-range hardware, but it scales better to the massive parallelism in today’s high-end hardware, where memory bandwidth is a bottleneck.

you’d need a fairly beefy GPU to be able to use it freely without dropping performance.

Please be reasonable in your demands on a noise algorithm. Noise can be very useful even if it competes for resources with other rendering tasks. It simply makes some things look better, and it can be worth the effort. Hardware rendering is mostly a tradeoff between quality and speed, and procedural shading is not a magic exception. Noise is available as one possible tool when building a shader, but of course it requires some resources.

I agree that until now, we have not quite seen the levels of GPU performance where you could allow routine use of procedural noise, but the situation is improving rapidly, and memory is becoming the bottleneck, further adding to the benefits of procedural shading.

Before you criticize the algorithm for requiring too many ALU resources to be useful, please look at the code. The number of computations required is not as large as you may think. Benchmarking this particular implementation on a GeForce GTX 560, I clocked it at around 500 million 3D noise samples per second, with no texture resources being used. That gives plenty of headroom for other, more traditional shading tasks as well, don’t you think?

I stand firmly by my opinion that procedural shading is a smart thing to do in many situations, and that using it more would create a slightly different and easier path forward for future GPU hardware. Texture bandwidth could become less of a problem.