-
Fast PBO with NVIDIA
Hi,
I need to speed up my PBO transfers on a P4 machine and a Xeon machine. Does anyone have an opinion on a fast memcpy that we can use with PBOs on Windows, compiling with Visual C++ 2003? Is it a good idea to use memcpy_amd on Intel architecture?
Regards,
elezaza
-
Advanced Member
Frequent Contributor
Re: Fast PBO with NVIDIA
You can try calling glBufferSubData instead of the Map/memcpy/Unmap sequence.
The driver can be optimized to detect the CPU and decide which internal memcpy to use for the transfer.
In my video player test app, switching from the Map/memcpy/Unmap sequence to glBufferSubData lowered average CPU time by 2-5%.
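For illustration, the two upload paths compared above might look like the sketch below. Only glBindBuffer, glMapBuffer, glUnmapBuffer and glBufferSubData are real GL entry points; the PboFns table and the upload_* names are made up for this sketch, and it assumes the PBO has already been created and sized with glBufferData.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Enum values from the OpenGL headers, repeated so the sketch compiles alone.
const unsigned int PIXEL_UNPACK_BUFFER = 0x88EC; // GL_PIXEL_UNPACK_BUFFER
const unsigned int WRITE_ONLY          = 0x88B9; // GL_WRITE_ONLY

// Hypothetical table of entry points, shaped like the real GL functions.
// In a real program, fill it from wglGetProcAddress or an extension loader
// (on 32-bit Windows the pointer types also need the APIENTRY convention).
struct PboFns {
    void  (*bindBuffer)(unsigned int target, unsigned int buffer);
    void  (*bufferSubData)(unsigned int target, std::ptrdiff_t offset,
                           std::ptrdiff_t size, const void* data);
    void* (*mapBuffer)(unsigned int target, unsigned int access);
    unsigned char (*unmapBuffer)(unsigned int target);
};

// Variant 1: map the buffer, copy with memcpy, unmap.
bool upload_mapped(const PboFns& gl, unsigned int pbo,
                   const void* pixels, std::size_t bytes) {
    assert(pixels != 0);
    gl.bindBuffer(PIXEL_UNPACK_BUFFER, pbo);
    void* dst = gl.mapBuffer(PIXEL_UNPACK_BUFFER, WRITE_ONLY);
    if (!dst) return false;                 // mapping can fail
    std::memcpy(dst, pixels, bytes);        // the application picks the memcpy
    // glUnmapBuffer returns GL_FALSE if the buffer contents were lost.
    return gl.unmapBuffer(PIXEL_UNPACK_BUFFER) != 0;
}

// Variant 2: hand the source pointer to the driver and let it do the copy.
void upload_subdata(const PboFns& gl, unsigned int pbo,
                    const void* pixels, std::size_t bytes) {
    assert(pixels != 0);
    gl.bindBuffer(PIXEL_UNPACK_BUFFER, pbo);
    gl.bufferSubData(PIXEL_UNPACK_BUFFER, 0,
                     static_cast<std::ptrdiff_t>(bytes), pixels);
}
```

Both variants leave the PBO bound to the unpack target, so a glTexSubImage2D call with a null offset can follow directly. Which one is faster depends on the driver, so it is worth timing both on the target machines.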
For readback, you can use glGetBufferSubData instead of the Map/memcpy/Unmap sequence. It gives a slight speedup.
Readback benchmarks on my test machine (NV 6800GT, FW 76.45, P4-3.2/HT):
Map/memcpy/Unmap = 476 MB/sec
glGetBufferSubData = 484 MB/sec
yooyo
-
Senior Member
OpenGL Pro
Re: Fast PBO with NVIDIA
If you want a fast memory copy, use MMX instructions with prefetched reads and non-cached writes (the MOVNTQ instruction, if I'm not mistaken). AMD has a paper on it. Note that it only makes sense for very large memory regions, at least 100 MB.
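A rough sketch of the kind of copy described above, assuming an x86 CPU with SSE2. It swaps the MMX MOVNTQ store for its SSE2 counterpart, MOVNTDQ (the _mm_stream_si128 intrinsic), and falls back to plain memcpy where SSE2 is not available. The name stream_copy and the 256-byte prefetch distance are illustrative guesses, not taken from the AMD paper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define STREAM_COPY_HAVE_SSE2 1
#endif

// Hypothetical helper: copies n bytes from src to dst (non-overlapping),
// writing with non-temporal stores so the destination bypasses the cache.
void stream_copy(void* dst, const void* src, std::size_t n) {
    assert(dst != 0 && src != 0);
#ifdef STREAM_COPY_HAVE_SSE2
    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);

    // Streaming stores need a 16-byte aligned destination, so copy the
    // unaligned head (0..15 bytes) the ordinary way first.
    std::size_t head = (16 - (reinterpret_cast<std::uintptr_t>(d) & 15)) & 15;
    if (head > n) head = n;
    std::memcpy(d, s, head);
    d += head; s += head; n -= head;

    // Main loop: 64 bytes per iteration.
    for (; n >= 64; d += 64, s += 64, n -= 64) {
        // Prefetch ahead only while the address is still inside the source.
        if (n >= 256 + 64)
            _mm_prefetch(reinterpret_cast<const char*>(s + 256), _MM_HINT_NTA);
        __m128i a = _mm_loadu_si128(reinterpret_cast<const __m128i*>(s));
        __m128i b = _mm_loadu_si128(reinterpret_cast<const __m128i*>(s + 16));
        __m128i c = _mm_loadu_si128(reinterpret_cast<const __m128i*>(s + 32));
        __m128i e = _mm_loadu_si128(reinterpret_cast<const __m128i*>(s + 48));
        _mm_stream_si128(reinterpret_cast<__m128i*>(d), a);
        _mm_stream_si128(reinterpret_cast<__m128i*>(d + 16), b);
        _mm_stream_si128(reinterpret_cast<__m128i*>(d + 32), c);
        _mm_stream_si128(reinterpret_cast<__m128i*>(d + 48), e);
    }
    _mm_sfence();            // make the streaming stores visible before returning
    std::memcpy(d, s, n);    // remaining tail (fewer than 64 bytes)
#else
    std::memcpy(dst, src, n);  // no SSE2: plain copy
#endif
}
```

Usage is the same as memcpy, for example stream_copy(mapped_ptr, frame, frame_bytes). Whether it beats the compiler's memcpy depends on the CPU and the buffer size, so measure before adopting it.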
-
Junior Member
Regular Contributor
Re: Fast PBO with NVIDIA
Do MMX and similar instructions have any meaningful application to memory-mapped vertex/pixel buffers? I would assume they are only useful for system-memory transfers?