First and foremost, this version is a lot more convenient to use, as it is a pure computation without lookup tables. You just include a piece of code in your shader and call a function - no textures to create and load, no uniform arrays to initialize. This is a big improvement over previous versions.
If you're making a high-performance application, you're going to be inconvenienced in many ways anyway. The overhead of setting up a texture or uniform array will be negligible compared to the general issues of managing a high-performance rendering engine.

Or, to put it another way, the inconvenience of using textures or uniform arrays or whatever is not the reason why noise functions have not gained widespread use in shaders. Performance is the reason.

Please be reasonable in your demands on a noise algorithm. Noise can be very useful even if it competes for resources with other rendering tasks. It simply makes some things look better, and it can be worth the effort. Hardware rendering is mostly a tradeoff between quality and speed, and procedural shading is not a magic exception. Noise is available as one possible tool when building a shader, but of course it requires some resources.
All I'm saying is that the resources and performance it requires are not yet paid for by the quality improvements. Not for applications that need every GPU cycle they can get.

Before you criticize the algorithm for requiring too many ALU resources to be useful, please look at the code. The number of computations required is not as huge as you may think. Benchmarking this particular implementation on a GeForce GTX 560, I clocked it at around 500 million 3D noise samples per second, with no texture resources being used. That gives plenty of headroom for other more traditional shading tasks as well, don't you think?
Let's take your 500 million samples per second number. Divide that by 60 frames per second; you get about 8.3 million samples per frame. Divide that by a quite common 1920x1080 resolution (roughly 2.07 million pixels), and you get about 4 samples per image pixel. It's even worse if you go up to 2560x1600 (roughly 4.1 million pixels), where you drop to about two samples per pixel.
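To make that arithmetic easy to redo for other numbers, here is a minimal sketch in Python. The function name and parameters are my own, made up for illustration. It assumes the quoted benchmark figure is the GPU's entire noise throughput and that nothing else is running on it.

```python
def noise_samples_per_pixel(samples_per_second, fps, width, height):
    """Noise samples available per screen pixel per frame.

    Hypothetical helper: assumes the whole benchmarked throughput
    is spent on noise and nothing else.
    """
    samples_per_frame = samples_per_second / fps
    return samples_per_frame / (width * height)


# 500 million samples/s is the benchmark figure quoted above.
budget_1080p = noise_samples_per_pixel(500e6, 60, 1920, 1080)
budget_1600p = noise_samples_per_pixel(500e6, 60, 2560, 1600)

print(f"1920x1080: {budget_1080p:.2f} samples per pixel")  # 4.02
print(f"2560x1600: {budget_1600p:.2f} samples per pixel")  # 2.03
```

Plug in a different frame rate or resolution and the budget scales accordingly; at 30 frames per second, for instance, every figure doubles.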

That pretty much requires deferred rendering now, since you can't afford to have more than 4x overdraw. It also means that you don't have the resources to do much anisotropic filtering, so you're going to get quite a bit of aliasing in your texture.

And this doesn't even take into account processor resources dedicated to other things, like lighting, vertex processing, and so forth. So in order to use even 1 noise sample per image pixel, you have to sacrifice 25% of the hardware's shader resources.
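The same numbers can be turned around to show the share of the GPU that a given noise budget eats. This is only a rough sketch: the function name is hypothetical, and it treats the 500 million samples per second benchmark as 100% of the shader hardware, which is a simplification.

```python
def shader_fraction_for_noise(noise_per_pixel, fps, width, height,
                              samples_per_second):
    """Fraction of benchmarked throughput used by noise sampling.

    Hypothetical helper: treats samples_per_second as the full
    capacity of the shader hardware, ignoring overdraw.
    """
    samples_needed = noise_per_pixel * width * height * fps
    return samples_needed / samples_per_second


cost_1080p = shader_fraction_for_noise(1, 60, 1920, 1080, 500e6)
cost_1600p = shader_fraction_for_noise(1, 60, 2560, 1600, 500e6)

print(f"1 sample/pixel at 1920x1080: {cost_1080p:.1%}")  # 24.9%
print(f"1 sample/pixel at 2560x1600: {cost_1600p:.1%}")  # 49.2%
```

Any overdraw multiplies these figures, since every shaded fragment pays for its noise samples whether or not it ends up visible.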

The GTX 560 is upper-midgrade hardware; most graphics hardware is considerably slower. Obviously, graphics hardware gets faster all the time, but the performance from noise simply isn't there yet. Not unless you focus solely on the high end.

So I stand by my statement: "you'd need a fairly beefy GPU to be able to use it freely without dropping performance."