FFT/drawpixel speed test



JoeMac
02-18-2002, 07:34 AM
Looking for volunteers to test a piece of code for speed. It combines an FFT (high pass) with read/drawpixels, so don't be surprised if the scene looks a little odd. I have to recommend a video card to run this on, so any tests would be appreciated. Specifically, I'd like to test on a Radeon 7500/GeForce 2 and up.
Instructions: Run Maze2.exe, type in any initials (up to 8 chars), then use 'D' as the input file. The Escape key (or entering the opposite alcove) will exit the maze.
Download at http://www.cs.dal.ca/~macinnwj/downloads/FFTMaze.zip
Thanks!
Joe

Diapolo
02-18-2002, 07:48 AM
It showed 0,5 fps on my GeForce3 with 27.30?

Diapolo

Sundy
02-18-2002, 11:38 PM
I get 0.18 fps on my Geforce2 MX
-Sundar

JoeMac
02-19-2002, 06:22 AM
I get the picture. I know my Fourier transform isn't very fast, but I was hoping that some cards would deal with the read/draw pixels a little better. Any Radeons out there?
Joe

yakuza
02-19-2002, 07:24 AM
I've got a Radeon 7200 (aka 64DDR vivo), but I'm at work now. I will test for you when I get home this evening.

edit: With Humus' input below, I doubt the feedback on my 7200 is necessary. :)

[This message has been edited by yakuza (edited 02-19-2002).]

Humus
02-19-2002, 08:02 AM
0.45 fps on a Radeon 8500.

jwatte
02-19-2002, 02:20 PM
Why are you reporting only the card name? Chances are, the read is reasonably fast, and the bottleneck is the FFT; thus your CPU and memory speeds would matter much more.

Have you profiled the FFT to measure how long it takes? Compared it to the readpixels? And, to make sure it's readpixels, and not rendering, have you put a glFinish() before you start timing ReadPixels?
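A minimal sketch of the per-phase timing jwatte describes, written without any OpenGL dependency so it can be compiled on its own. The names `time_phase_ms`, `FrameTiming` and `profile_frame` are hypothetical; in the real program the `drain` callable would wrap `glFinish()`, `read` would wrap the `glReadPixels` call, and `fft` would wrap the transform.

```cpp
#include <chrono>
#include <functional>

// Runs `phase` once and returns its wall-clock duration in milliseconds.
double time_phase_ms(const std::function<void()>& phase) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    phase();
    const auto stop = clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Per-frame cost of the two phases under suspicion (hypothetical helper type).
struct FrameTiming {
    double read_ms;
    double fft_ms;
};

// `drain` stands in for glFinish(), `read` for the glReadPixels call and
// `fft` for the CPU-side transform; all three are assumptions of this sketch.
FrameTiming profile_frame(const std::function<void()>& drain,
                          const std::function<void()>& read,
                          const std::function<void()>& fft) {
    drain();  // let pending rendering finish so it is not billed to the read
    FrameTiming t;
    t.read_ms = time_phase_ms(read);
    t.fft_ms = time_phase_ms(fft);
    return t;
}
```

Averaging the two fields over a few dozen frames should show directly whether the read or the FFT dominates on a given machine.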

JoeMac
02-19-2002, 03:36 PM
Part of the bottleneck is the FFT, but the following is what I get on my 1.2 GHz Athlon with a Radeon VE:
No draw/readpixels or FFT - 85 fps (monitor refresh)
draw/readpixels, no FFT - 3 fps
draw/readpixels and FFT - 0.25 fps
Good point with machine differences on the FFT though. I've reposted the same program with read/drawpixels without the FFT at:
http://www.cs.dal.ca/~macinnwj/downloads/FFTDemo.zip

Hopefully this will be a better test of the card's role.
Thanks to everyone who tested (and will re-test) :)
Joe

thewizard75
02-19-2002, 08:37 PM
System:
Dual PIII 800 MHz
Elsa Gladiac (Geforce 2 GTS 32MB)

FFTMaze: 0.238 fps
FFTDemo: 6.9-7.5 fps (oscillates)

Hope this helps...

JoeMac
02-20-2002, 06:17 AM
Yakuza: I'd like to see the numbers anyway. If the 7200 does nearly the same job, I'll save the extra money. It's starting to look like none of these cards do a decent job on glReadPixels.
Joe

yakuza
02-20-2002, 06:23 AM
Ok, sorry I didn't get a chance to try it last night, I kind of figured it was unnecessary. But I'll be glad to try it tonight, and post my results.

barthold
02-20-2002, 09:32 AM
How big are the images you are trying to read and write with draw/readPixels? The bigger the better: the overhead of setting up a transfer is pretty much constant, and thus less noticeable for bigger images. Make sure you pick an image format that is supported by the hardware directly so that the driver doesn't have to convert it.
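The constant-overhead point can be put as a simple model: transfer time is a fixed setup cost plus the payload divided by the bandwidth. The sketch below uses hypothetical function names, and the figures suggested in the usage note (1 ms setup, 200 MB/s) are illustrative assumptions, not measurements from any card.

```cpp
// Time for one transfer: a fixed setup cost plus payload over bandwidth.
double transfer_seconds(double bytes, double setup_seconds, double bytes_per_second) {
    return setup_seconds + bytes / bytes_per_second;
}

// Throughput actually achieved once the setup cost is included.
double effective_bytes_per_second(double bytes, double setup_seconds,
                                  double bytes_per_second) {
    return bytes / transfer_seconds(bytes, setup_seconds, bytes_per_second);
}
```

With an assumed 1 ms setup cost and 200 MB/s of raw bandwidth, a 64x64 RGB image (12,288 bytes) gets roughly 12 MB/s effective, while a 1024x768 RGB image (2,359,296 bytes) gets roughly 184 MB/s.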

Wildcat 5510:
FFTdemo 7.9 fps
FFTmaze 0.29 fps

This is on a 1.4 GHz P4

Barthold
3Dlabs

heiman
02-21-2002, 03:23 AM
Dell Dual 600 MHz
Wildcat 4110
FFTMaze 0.19 FPS
FFTDemo 6.2 - 6.6 FPS

Dell 8100 Inspiron 1.2 GHz
nVidia GeForce2Go
FFTMaze 0.36 FPS
FFTDemo 10.2 FPS

This is the first time that I have seen a laptop with a GeForce2Go outperform a desktop workstation with a Wildcat card.

Regards,
Scott

[This message has been edited by heiman (edited 02-21-2002).]


jwatte
02-21-2002, 09:57 AM
Make sure you ReadPixels with the GL_BGRA format, and the appropriate packing/alignment modes set so that the hardware can just dump a stream of data into your buffer (this means reading the entire texture, too, for alignment).
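For byte-sized components, the pack alignment rounds each row up to a multiple of the alignment value. The sketch below (hypothetical helper names) computes the resulting row stride and buffer size, assuming GL_UNSIGNED_BYTE data and no custom GL_PACK_ROW_LENGTH.

```cpp
#include <cstddef>

// Bytes occupied by one row of pixels after rounding up to `alignment`.
// Assumes byte-sized components and the default row length.
std::size_t packed_row_bytes(std::size_t width, std::size_t bytes_per_pixel,
                             std::size_t alignment) {
    const std::size_t raw = width * bytes_per_pixel;
    return (raw + alignment - 1) / alignment * alignment;
}

// Total buffer size needed to hold `height` such rows.
std::size_t packed_image_bytes(std::size_t width, std::size_t height,
                               std::size_t bytes_per_pixel, std::size_t alignment) {
    return packed_row_bytes(width, bytes_per_pixel, alignment) * height;
}
```

A 640-pixel-wide RGB row is 1920 bytes, already a multiple of 4, so alignments of 1 and 4 give the same layout at that width. With 4 bytes per pixel every row is a multiple of 4 whatever the width.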

yakuza
02-21-2002, 12:11 PM
Alright, I tested it this morning before I went to work.

Radeon 7200
Athlon Tbird 700@900
Maze: 0.26 fps
Demo: 7.6 fps

Hope this helps. http://www.opengl.org/discussion_boards/ubb/smile.gif

JoeMac
02-21-2002, 01:40 PM
Thanks for the tests everyone.
jwatte, Barthold, I've posted my processing code. I actually use GL_RGB, and as far as I can tell, it should be supported. Any advice on optimizing would be welcome.
Joe

int
CApp::ProcessPixels()
{
    //prep the array to read the pixels
    //for(int i=0;i<(4*640*480);i++)
    // Pixels[i]=0;

    // Check the buffer before GL writes into it, not after.
    if (Pixels == NULL)
        return 0;

    // Read the back buffer as tightly packed RGB bytes.
    glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
    glPixelStorei(GL_PACK_ALIGNMENT, 1);
    glReadBuffer(GL_BACK);
    glReadPixels(0, 0, WinX, WinY, GL_RGB, GL_UNSIGNED_BYTE, Pixels);

    glDrawBuffer(GL_BACK);
    glPushMatrix(); // modelview (assumes GL_MODELVIEW is the current matrix mode)
    glMatrixMode(GL_PROJECTION);
    glPushMatrix();
    // now set to 2D
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glOrtho(0, ScrWidth, 0, ScrHeight, -1, 1);

    GLubyte Temp = 0;
    // take the average of the three channels to convert to grey scale
    for (int i = 0; i < (WinX * WinY); i++)
    {
        Temp = (Pixels[(i * BYTES_PER_PIXEL)]
              + Pixels[(i * BYTES_PER_PIXEL) + 1]
              + Pixels[(i * BYTES_PER_PIXEL) + 2]) / 3;
        //GPixels[i%WinX][i/WinX]=Temp;
    }
    // FFT here
    // put the filtered values back into Pixels
    for (int i = 0; i < (WinX * WinY); i++)
    {
        //Temp = GPixels[i%WinX][i/WinX];
        Temp = FFImage->data[i % WinX][i / WinX];
        Pixels[(i * BYTES_PER_PIXEL)] = Temp;
        Pixels[(i * BYTES_PER_PIXEL) + 1] = Temp;
        Pixels[(i * BYTES_PER_PIXEL) + 2] = Temp;
    }
    // disable textures, and draw to back buffer
    TextureMgr::StopTextures();
    glRasterPos2i(0, 0);
    glDrawPixels(WinX, WinY, GL_RGB, GL_UNSIGNED_BYTE, Pixels);
    glMatrixMode(GL_PROJECTION);
    glPopMatrix();
    glMatrixMode(GL_MODELVIEW);
    glPopMatrix();

    return 0; // the code as posted had no return statement; 0 is a placeholder
}
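The per-pixel work in the function above can be tried in isolation from OpenGL. The sketch below is a stand-in, not the posted code: it fuses the two loops, leaves out the FFT, and uses a hypothetical `average_to_gray` helper that works on a tightly packed RGB byte buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Replaces each tightly packed RGB pixel with the average of its three
// channels, written back to all three. Trailing bytes that do not form a
// whole pixel are left untouched.
void average_to_gray(std::vector<std::uint8_t>& rgb) {
    for (std::size_t i = 0; i + 2 < rgb.size(); i += 3) {
        // Sum in a wider type so the addition cannot wrap around at 255.
        const unsigned sum = static_cast<unsigned>(rgb[i]) + rgb[i + 1] + rgb[i + 2];
        const std::uint8_t gray = static_cast<std::uint8_t>(sum / 3);
        rgb[i] = gray;
        rgb[i + 1] = gray;
        rgb[i + 2] = gray;
    }
}
```

Timing this loop alone on a 640x480 buffer gives a baseline for the CPU-side cost that is unrelated to the card.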

[This message has been edited by JoeMac (edited 02-27-2002).]

jwatte
02-21-2002, 02:54 PM
"supported" and "optimal" are two different things. Try reading as GL_BGRA, or as GL_BGR with a pixel size of 4 bytes. That just might make it faster.

pleopard
02-21-2002, 07:40 PM
Late but here it is ...

ASUS A7M266 / AMD TBird 1.4 / 512Mb DDR / GF3

FFTMaze : ~0.5
FFTDemo : ~8.6

JoeMac
02-22-2002, 05:06 AM
Only a fraction of an fps difference after going through all the combinations that I could think of, including GL_LUMINANCE, with 1- to 4-byte alignment as appropriate.
Joe

Sundy
02-22-2002, 08:32 AM
Do not even think about using OpenGL if you want to read back the frame buffer... er.. unless you want to take a snapshot of your framebuffer for a screenshot or something.
-Sundar

zed
02-22-2002, 12:27 PM
One thing is clear: the read/draw pixels is NOT the bottleneck, the FFT code is.
Any half-decent card (e.g. a TNT1) can do a readpixels of 10 million pixels a second (a very conservative number).

barthold
02-27-2002, 09:18 AM
I do not know what the FFT program does exactly, but the read/drawPixels numbers look a bit slow to me. A Wildcat 5110 can do drawPixels at around 500 MB/sec (megabytes) for a variety of pixel formats. It'll do readPixels at about 200 MB/sec.

Now say you're drawing and reading a full screen (1024x768 RGB). That is roughly 2.5 MB of data. Reading and drawing that should take 2.5/200 + 2.5/500 seconds, which works out to about 57 fps.

Getting 8 fps on FFTdemo is way off from that number, which makes me wonder what is going on. The effectiveness of draw/readPixels will go down if you read many small images (there is overhead to set up the DMA transfers, turn around the bus, etc.). Maybe there is some data copying going on in FFTdemo, or extra computation that should not be there. I would suggest using VTune to see where most of the time is spent.

Barthold

JoeMac
02-27-2002, 05:38 PM
Well, with the function above, I get 8 fps; without it I get 60+ (synced). And this is 640x480, GL_RGB for the window and the readpixels.
I realize that readpixels has a high bandwidth, but I'm guessing that it's the setup cost (latency) that's killing me, since it has to be done every refresh. I will test a few times and see what happens, but does anyone see anything in my code that shouldn't be there?
Joe

zeckensack
02-27-2002, 06:03 PM
Are you entirely sure that you have a plain RGB buffer without destination alpha? That would cause a format mismatch and force the driver to do pixel conversions. Just a thought ...

zeckensack
02-27-2002, 06:20 PM
Anyway
*FFTMaze 0.19 fps
*FFTDemo 1.95~2.25 fps

On Radeon DDR 32, Athlon 700

Rob The Bloke
02-27-2002, 06:37 PM
Dual 2 GHz Xeon, 1 GB RAM, FireGL 2, WinXP

Crashes.

barthold
02-27-2002, 09:29 PM
Try a test loop that does readPixels into a buffer, then immediately calls drawPixels on the same buffer. Then flush and call swap. That should give you a good measure of how fast you can read and draw images. I would also leave the (un)pack alignment at its default (which is 4, I believe).

The code you posted has some computation in it, like computing the average etc. Take all that out.

Barthold