PCI Express 2.0, glCompressedTexImage2DARB



niklasod
05-05-2009, 10:46 AM
I have tried glTexSubImage2D on both a PCI Express 2.0 and a PCI Express 1.0 system, and I don't get any performance increase on the 2.0 system.

I use Nvidia Quadro FX cards. Any idea why it is not faster?

niklas

MalcolmB
05-05-2009, 11:01 AM
Isn't the compression done on the CPU with that command? If so, most of the cost will be the CPU work, not the upload, and a faster PCIe bus won't make much difference.

_NK47
05-07-2009, 02:21 AM
How do you measure performance, anyway?

Dark Photon
05-07-2009, 06:07 AM
I have tried glTexSubImage2D on both a PCI Express 2.0 and a PCI Express 1.0 system, and I don't get any performance increase on the 2.0 system.

I use Nvidia Quadro FX cards. Any idea why it is not faster?
You are not bus-bandwidth bound, or you got cheated on the PCIe 2.0 system. ;) My bet is on the former.

But seriously, on the latter, make sure that you're comparing apples to apples. That is, make sure the first system is 16-lane PCIe 1.0 and the other system is 16-lane PCIe 2.0. Also verify that the card you're plugging in supports that many lanes on that bus standard.

Now back to the former (you're not bus-bandwidth bound): MalcolmB already gave you what is likely the biggest reason. When you ask the driver to do compression, it does it on the CPU. Meanwhile your PCIe bus is just sitting there, bored, patiently waiting for you to pump some data.

Also, you likely won't be able to tap the extra bandwidth of your new bus using just the gl*TexSubImage calls directly to a texture, even when your input data is already in the right format. You'll probably have to use transfers through PBOs (http://www.opengl.org/registry/specs/ARB/pixel_buffer_object.txt) to get the bandwidth up to 3.2 GB/sec+ on PCIe v1.0 (16-lane) or 6.4 GB/sec+ on PCIe v2.0 (16-lane).

niklasod
05-08-2009, 02:08 PM
Hi, thanks for all the answers, and sorry for the confusion.

My topic title is wrong. I do not have the CPU compress any data, and to my knowledge it uses GL_RGB, so it should not have to reshuffle any data.

I measure the time before and after the blocking call and compare it across two separate systems: one that reports 2.0 in GPU-Z (http://www.techpowerup.com/gpuz/) and one that reports 1.0.
They take exactly the same time (about 3 ms for my texture). I have used the same code on different cards, and I do get much better performance on Quadro cards than on GeForce cards, but no improvement from the higher bus speed.

Here is a code snippet:
glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE8, gdat.camerax, gdat.cameray, 0, GL_LUMINANCE, GL_UNSIGNED_BYTE, NULL);


I have tried GL_RED, GL_GREEN and GL_BLUE, and they all have the same performance. So I was hoping that a faster bus would help.