Per-pixel shading without using hardware

Hi, I am a bit confused about per-pixel shading, by that I mean modifying the colour of individual pixels. I would like to learn to do this without special buffers or special hardware-supported routines. For example, to anti-alias my scene in a 100x100 pixel window, I want every pixel to have the average colour of the pixels around it. Illustration with a piece of code would be very much appreciated.

Keer

Here’s one method you could use:
(i) glReadPixels the screen
(ii) Perform a filter on each pixel using a 2D convolution filter
(iii) glDrawPixels the result
This would be extremely slow though - definitely not real-time. The imaging subset of OpenGL 1.2 supports 2D convolution filters in hardware, but only on high-end boards (not on consumer graphics cards). I can't think of a method that doesn't involve extensions, though.
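
Something like this, as a rough untested sketch (a 3x3 box filter stands in for a general convolution kernel; WIN_W/WIN_H are placeholders for your 100x100 window, and the glRasterPos call assumes an orthographic projection with the origin at the lower-left corner):

```c
#include <GL/gl.h>
#include <string.h>

#define WIN_W 100
#define WIN_H 100

void blur_frame(void)
{
    static GLubyte src[WIN_H][WIN_W][3];
    static GLubyte dst[WIN_H][WIN_W][3];
    int x, y, c;

    /* tightly packed RGB rows (100*3 bytes is not 4-byte aligned) */
    glPixelStorei(GL_PACK_ALIGNMENT, 1);
    glPixelStorei(GL_UNPACK_ALIGNMENT, 1);

    /* (i) read the frame back to system memory */
    glReadPixels(0, 0, WIN_W, WIN_H, GL_RGB, GL_UNSIGNED_BYTE, src);

    /* (ii) 3x3 box filter; border pixels are copied unchanged */
    memcpy(dst, src, sizeof dst);
    for (y = 1; y < WIN_H - 1; ++y)
        for (x = 1; x < WIN_W - 1; ++x)
            for (c = 0; c < 3; ++c) {
                int sum = 0, dy, dx;
                for (dy = -1; dy <= 1; ++dy)
                    for (dx = -1; dx <= 1; ++dx)
                        sum += src[y + dy][x + dx][c];
                dst[y][x][c] = (GLubyte)(sum / 9);
            }

    /* (iii) write the filtered image back */
    glRasterPos2i(0, 0); /* assumes ortho projection in pixel units */
    glDrawPixels(WIN_W, WIN_H, GL_RGB, GL_UNSIGNED_BYTE, dst);
}
```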

Hope that helps.

Enabling FSAA in the control panel of your graphics board (if it supports it, that is) is also an option. The program itself can't control it, so it might not be a very good solution, but it's there anyway.

What graphics hardware do you have? If you have a GeForce 1 or 2, you can activate NV20 emulation with the GeForce Tweak Utility (search the web for it); then you get all the features of the GF3 in software.

The problem with it is that everything except vertex programs is EXTREMELY slow (but probably still faster than reading the framebuffer and doing the operation yourself). Don't be surprised when you get frame times of 2-3 seconds per frame.

I have a GeForce 256 actually, but I am looking for a solution that will work on any graphics card, which is why I am trying to avoid anything hardware-related. I find it hard to believe that with today’s processors (> 1 GHz) we cannot emulate hardware with software solutions.

Working at the pixel level (as opposed to the polygon level) is probably the only way to get high-quality imaging, and that is why I want to learn how to do it. I like to believe shadows, reflections, bump mapping, and anti-aliasing should all be possible without specialised hardware…

I will look into glReadPixels and glDrawPixels to see what sort of framerate I get there.

Don't be fooled by a high clock frequency. A high clock frequency does not have to mean a fast CPU. If the CPU takes several cycles just to execute a single instruction, because of a very bad architecture, 1 GHz can be REALLY bad. But that's not the problem here. Today's graphics hardware is highly specialized to do only one thing: draw triangles as fast as possible. It uses a highly optimized architecture, and cannot be compared with a generic architecture like i386 or whatever it's called.

If you have a hard time believing that software cannot emulate hardware at acceptable speed, just run a Direct3D demo demonstrating some cool features, select the reference drivers, and see for yourself. The difference in speed between the reference drivers and your hardware drivers is the difference in speed between the actual hardware and the emulator.

The things you described (shadows, reflections, bump mapping) are all resource-demanding tasks, and practically require specialized hardware to be useful.

“using a very bad architecture, 1 GHz can be REALLY bad.” Yep, just like Intel’s pos P4 :stuck_out_tongue:

"The difference in speed between the reference drivers and your hardware drivers is the difference in speed between the actual hardware and the emulator." Not necessarily. The speed (or lack thereof) of the reference driver reflects the inefficiency of the CPU when it comes to 3D graphics. Just as the CPU can't handle graphics well, though, the GPU can't handle the CPU's instructions. Each has a purpose, so use it that way. If you're worried about not being able to run a 3D program on other machines, then write your own rasterizer, because D3D's reference driver just SUCKS. I have seen software raytracers that run much faster than D3D's reference driver ever will. And OpenGL runs much faster than D3D.

You can tell OpenGL to run in software mode… I think it has something to do with GENERIC_ACCELERATED. I've never really needed to use that constant for anything other than making sure someone has hardware OpenGL support, so I wouldn't know. But if you really wanna make sure it runs in software on your GeForce 256, just enable the accumulation buffer in the PFD. :stuck_out_tongue: That way, you'll get a real clue as to how slow CPU 3D graphics can really be. If you want the ultimate in compatibility, use DirectDraw and make some crappy 2D engine. :stuck_out_tongue:
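
For what it's worth, this is roughly how you'd check for the generic (software) implementation on Win32. DescribePixelFormat and the PFD_GENERIC_* flags are standard wgl-era API; the helper name is made up:

```c
#include <windows.h>

/* Returns 1 if the current pixel format is handled by Microsoft's
 * generic (software) OpenGL implementation, 0 otherwise.
 * hdc is assumed to already have a pixel format selected. */
int is_software_opengl(HDC hdc)
{
    PIXELFORMATDESCRIPTOR pfd;
    int fmt = GetPixelFormat(hdc);

    DescribePixelFormat(hdc, fmt, sizeof(pfd), &pfd);

    /* PFD_GENERIC_FORMAT alone -> pure software rendering.
     * PFD_GENERIC_FORMAT with PFD_GENERIC_ACCELERATED -> MCD driver.
     * Neither flag -> full ICD hardware driver. */
    if ((pfd.dwFlags & PFD_GENERIC_FORMAT) &&
        !(pfd.dwFlags & PFD_GENERIC_ACCELERATED))
        return 1;
    return 0;
}
```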

As for the antialiasing, what I would like to be able to do is have OpenGL draw to an invisible memory DC, then read the pixels and process them as desired. If you want to do something like nVidia's HRAA, you could use two memory contexts (the second one would be a pixel bigger on both axes), draw to one, and make a procedure that calls glLoadIdentity(); with glTranslatef(0.5f, 0.5f, 0.5f);. Be sure that your drawing coordinate bounds match the size of your memory DC though, or it won't work quite like that. Just make sure that the slightly bigger DC draws with a half-pixel offset to the upper left.

Anyway, once you've done that, read a pixel from the first DC, and then read the four pixels that surround it from the second DC. Now, here's the trick, and the reason that the HRAA technique makes the image blurry: they averaged all the pixels as they are, but the pixels from the 2nd DC should be 1/4th the significance each, because they're also used in three other final pixels. Do that, and you'll have "perfect," "speedy" AA. :stuck_out_tongue:
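
In code, that weighted mix would look something like this (a sketch with made-up names: 'centre' holds the first DC's pixels, 'offset' the half-pixel-shifted DC, one pixel bigger on both axes; centre gets weight 1/2 and each offset sample 1/8, i.e. 1/4 significance halved again when averaged against the centre):

```c
#define W 100
#define H 100

typedef unsigned char u8;

void quincunx_mix(const u8 centre[H][W][3],
                  const u8 offset[H + 1][W + 1][3],
                  u8 out[H][W][3])
{
    int x, y, c;

    for (y = 0; y < H; ++y)
        for (x = 0; x < W; ++x)
            for (c = 0; c < 3; ++c) {
                /* the four half-pixel-shifted samples around (x, y) */
                int corners = offset[y][x][c]     + offset[y][x + 1][c]
                            + offset[y + 1][x][c] + offset[y + 1][x + 1][c];
                /* centre * 1/2 + each corner * 1/8 */
                out[y][x][c] = (u8)((4 * centre[y][x][c] + corners) / 8);
            }
}
```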

Patches, that last bit on writing to and reading from an array of pixel data is exactly the sort of thing I am trying to do. Where do I start? I need pointers to some code that shows how to read and write pixel data from a memory location to the screen.

If all you want is a way to write to the framebuffer, you should check out TinyPTC and OpenPTC. They are lightweight toolkits for getting framebuffer access. Then you can generate pixels using any algorithm you like (photon mapping, NPR, scan-line polygon renderers) and just chuck them at the screen using PTC.
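
If I remember the TinyPTC examples right, the whole API is just ptc_open/ptc_update/ptc_close, so a minimal program looks roughly like this (sketch from memory; the 32-bit pixel component order depends on how the library is built):

```c
#include "tinyptc.h"

#define WIDTH  320
#define HEIGHT 200

static int buffer[WIDTH * HEIGHT]; /* 32 bits per pixel */

int main(void)
{
    int x, y;

    if (!ptc_open("pixel test", WIDTH, HEIGHT))
        return 1;

    for (;;) {
        /* fill the buffer with any per-pixel computation you like */
        for (y = 0; y < HEIGHT; ++y)
            for (x = 0; x < WIDTH; ++x)
                buffer[y * WIDTH + x] = (x ^ y) * 0x010101; /* grey XOR pattern */

        /* blit to the window; TinyPTC handles window events inside
           ptc_update, as far as I recall */
        ptc_update(buffer);
    }

    ptc_close();
    return 0;
}
```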

Woohoo sounds great! I’m looking for it right now.

Hm, I can find the Linux version, but all links to the Win32 port seem to be dead. gaffer.org itself is not answering at all. Can someone send me the latest Win32 code? It must be only around 550 KB.

keer210@hotmail.com

For a fast blur (I use this):
(i) glReadPixels
(ii) do the blur on these pixels
(iii) glTexSubImage2D(…) to update a texture with the blurred pixels
(iv) draw a textured quad

Obviously it makes a huge difference what size area you want to process. Fullscreen will probably be too slow, but I use it on 128x128 images at 100+ fps on a slowish machine.
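
Roughly like this (untested sketch; assumes a 128x128 region, a texture already created with glTexImage2D and passed in as 'tex', an orthographic projection in pixel units, and blur128() standing in for whatever CPU blur you like):

```c
#include <GL/gl.h>

#define N 128

static GLubyte buf[N][N][3];

extern void blur128(GLubyte pixels[N][N][3]); /* any CPU blur (assumed) */

void blur_region(GLuint tex)
{
    glPixelStorei(GL_PACK_ALIGNMENT, 1);
    glPixelStorei(GL_UNPACK_ALIGNMENT, 1);

    glReadPixels(0, 0, N, N, GL_RGB, GL_UNSIGNED_BYTE, buf); /* (i)   */
    blur128(buf);                                            /* (ii)  */

    glBindTexture(GL_TEXTURE_2D, tex);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, N, N,
                    GL_RGB, GL_UNSIGNED_BYTE, buf);          /* (iii) */

    /* (iv) draw a textured quad over the region */
    glEnable(GL_TEXTURE_2D);
    glBegin(GL_QUADS);
        glTexCoord2f(0, 0); glVertex2f(0, 0);
        glTexCoord2f(1, 0); glVertex2f(N, 0);
        glTexCoord2f(1, 1); glVertex2f(N, N);
        glTexCoord2f(0, 1); glVertex2f(0, N);
    glEnd();
    glDisable(GL_TEXTURE_2D);
}
```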

Ok, gaffer.org is up again and I downloaded OpenPTC. It seems quite a powerful library in terms of what it can do in software mode, but it does look slower than OpenGL, judging by the examples. I'll see what I can do with it. Thanks.

What you will end up with if you do this kind of thing is a blur, definitely not FSAA. FSAA is quite hard to perform in software mode. There are, however, some quick and dirty (and quite easy) ways to implement what you could call FSAA.

The easiest way is to render the scene to a back buffer (one subsample), then move the camera slightly and re-render the scene into another buffer. Finally, mix the two buffers and bring the result up to the frame buffer. I don't know how to do that in OpenGL though (at some very rare times I miss Direct3D a bit). But expect a frame drop of around 60 to 75%.

This is in fact the real way to do good antialiasing (except that most professional tools will use at the very least 16 subsamples).
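
One way to do the multi-pass jitter in OpenGL is the accumulation buffer, which, as mentioned earlier in the thread, is software on consumer cards of this era, consistent with the frame drop above. A sketch, assuming your pixel format has accum bits and that set_projection()/draw_scene() are your own routines:

```c
#include <GL/gl.h>

#define WIN_W  100
#define WIN_H  100
#define PASSES 2 /* professional tools would use 16+, as noted above */

extern void set_projection(void); /* your usual gluPerspective/glOrtho (assumed) */
extern void draw_scene(void);     /* your scene rendering (assumed) */

void render_antialiased(void)
{
    /* sub-pixel jitter offsets, in pixels */
    static const float jitter[PASSES][2] = {
        {  0.25f,  0.25f },
        { -0.25f, -0.25f }
    };
    int i;

    glClear(GL_ACCUM_BUFFER_BIT);
    for (i = 0; i < PASSES; ++i) {
        glMatrixMode(GL_PROJECTION);
        glLoadIdentity();
        /* shift clip space by a fraction of a pixel:
           one pixel is 2.0/width in normalized device coordinates */
        glTranslatef(jitter[i][0] * 2.0f / WIN_W,
                     jitter[i][1] * 2.0f / WIN_H, 0.0f);
        set_projection();
        glMatrixMode(GL_MODELVIEW);

        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        draw_scene();
        glAccum(GL_ACCUM, 1.0f / PASSES); /* add this pass, scaled */
    }
    glAccum(GL_RETURN, 1.0f); /* write the average back to the colour buffer */
}
```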

Another method is to create a buffer bigger than you need (for rendering 800x600, use 800x1200 or 1600x1200; this is supersampling), render into that one, and "shrink" it using any method you want. By the way, there are very good conversion matrices used by wavelet codecs, but they are slow, might we say. A pure average method, Pf = (P1 + P2)/2, gives quite good results, but tends to blur edges.
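
The pure-average shrink, for the 2x2 case (1600x1200 down to 800x600), would be something like this (a sketch; tightly packed RGB assumed):

```c
typedef unsigned char u8;

/* Average each 2x2 block of src (sw x sh pixels, packed RGB)
 * into one pixel of dst (sw/2 x sh/2 pixels, packed RGB). */
void downsample_2x2(const u8 *src, int sw, int sh, u8 *dst)
{
    int x, y, c;
    int dw = sw / 2;

    for (y = 0; y < sh / 2; ++y)
        for (x = 0; x < dw; ++x)
            for (c = 0; c < 3; ++c) {
                const u8 *p = src + ((2 * y) * sw + 2 * x) * 3 + c;
                int sum = p[0] + p[3]                 /* two pixels in this row */
                        + p[sw * 3] + p[sw * 3 + 3];  /* two in the row below   */
                dst[(y * dw + x) * 3 + c] = (u8)(sum / 4);
            }
}
```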

While I am here, if someone has some code about rendering into a buffer, reading a buffer, modifying a buffer and so on, I must admit I am quite interested.

kha