OpenGL vs DirectX: vertex buffer performance



jimmiwalker2
07-21-2004, 08:25 AM
Hi all!

I have run several OpenGL rendering tests and found that vertex buffer objects are what I have to use (how surprising :) ). But I have a little problem: if I write the very same program using DirectX vertex buffers, d3d is clearly much, much faster.


glBindBufferARB(GL_ARRAY_BUFFER_ARB, vertexbufferid);
glVertexPointer(3, GL_FLOAT, 0, 0);

glBindBufferARB(GL_ARRAY_BUFFER_ARB, colorbufferid);
glColorPointer(3, GL_FLOAT, 0, 0);

glBindBufferARB(GL_ELEMENT_ARRAY_BUFFER_ARB, indexbufferid);

That's what I do in the init after uploading the data to video memory, and in every frame I simply call:


glDrawElements(GL_TRIANGLES, polycount, GL_UNSIGNED_INT, 0);

On ATI cards, d3d is 0 to 50% faster; on nVidia, the d3d vertex buffer can be up to 3-4 times faster! Is there a faster way to do this with OpenGL, or are the d3d drivers really this much faster?

The data layout in the buffers is the same in both cases. Also, it looks like ogl is much more CPU dependent.

If I want performance, should I use d3d instead? I really don't want to do that, please help me...

skynet
07-21-2004, 09:22 AM
glColorPointer(3, GL_FLOAT, 0, 0);

Try something different here. Either supply 4-byte encoded RGBA or use another (generic) array for supplying the colors. I bet this will solve your problem. Also, try to put both attributes into the same buffer.
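
A minimal sketch of that conversion, assuming the colors currently live as three floats per vertex in the 0..1 range. The function names channel_to_byte and pack_colors_rgba8 are made up for illustration, and alpha is simply forced to opaque:

```c
#include <assert.h>
#include <stddef.h>

/* Convert one float channel to a byte, clamping to [0, 1].
 * The first test is written so that NaN also maps to 0. */
unsigned char channel_to_byte(float c)
{
    if (!(c > 0.0f)) return 0;
    if (c >= 1.0f) return 255;
    return (unsigned char)(c * 255.0f + 0.5f);  /* round to nearest */
}

/* rgb:  count * 3 floats (input)
 * rgba: count * 4 bytes  (output, alpha set to 255) */
void pack_colors_rgba8(const float *rgb, unsigned char *rgba, size_t count)
{
    assert(rgb != NULL && rgba != NULL);
    for (size_t i = 0; i < count; ++i) {
        rgba[4 * i + 0] = channel_to_byte(rgb[3 * i + 0]);
        rgba[4 * i + 1] = channel_to_byte(rgb[3 * i + 1]);
        rgba[4 * i + 2] = channel_to_byte(rgb[3 * i + 2]);
        rgba[4 * i + 3] = 255;
    }
}
```

The packed array would then be uploaded to the color buffer and described with glColorPointer(4, GL_UNSIGNED_BYTE, 0, 0), which shrinks the color data from 12 to 4 bytes per vertex.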

JustHanging
07-21-2004, 10:11 AM
Also, make sure you create your vbo as static, it should give the best performance in your case. And of course if we're talking about a couple of polys and hundreds of frames per second, just forget it and test with bigger data...

-Ilkka

jimmiwalker2
07-21-2004, 10:17 AM
skynet: thanks, using unsigned char for colors helped a lot.

How do I put both attributes into the same buffer?

(FYI: It's tested with a model of 44000 vertices.)

-NiCo-
07-21-2004, 12:53 PM
Look for InterleavedArrays() in the OpenGL specification. I believe C4UB_V3F is the right format for your application.

Greetz,

Nico

jimmiwalker2
07-21-2004, 02:33 PM
Thank you all. Well, the d3d port wasn't done by me, and the one who did it used interleaved arrays. Now that my code uses them as well, it seems OpenGL has taken the lead: on nVidia cards there is only a small performance gain, but on Radeon OpenGL really beats DirectX in this case.

l_belev
07-21-2004, 06:02 PM
Note that you are not limited to the predefined interleaved combinations (in fact they are rarely used in practice, since the standard makers don't bother to add new ones as new vertex attributes are defined).

You can just use the normal glVertexPointer/glColorPointer/etc. functions and give them appropriately adjusted pointers and strides. This way you can hand-tune the exact format of your vertex data.

SeskaPeel
07-22-2004, 02:02 AM
And be sure to use up-to-date drivers for nvidia boards. Their older drivers (version 40) did not optimize VBOs correctly, whereas version 50 fixed a lot of problems.

SeskaPeel.

jimmiwalker2
07-22-2004, 04:01 AM
Originally posted by SeskaPeel:
And be sure to use up-to-date drivers for nvidia boards. Their older drivers (version 40) did not optimize VBOs correctly, whereas version 50 fixed a lot of problems.

Yes, that was one of our problems. After updating to the current driver, everything ran fine.

Btw: how do I do a custom interleaved format with glXxxPointer? I mean, when I set a pointer, I must use 0 to access the currently bound buffer. Where can I give the offset?

Jesse Hall
07-22-2004, 07:56 AM
Interleaved arrays example:

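struct Vertex {
    float pos[3];            /* bytes 0..11 */
    unsigned char color[4];  /* bytes 12..15 */
};

/* The last argument is a pointer type, so the byte offset needs a cast. */
glVertexPointer(3, GL_FLOAT, sizeof(Vertex), (const GLvoid *)0);
glColorPointer(4, GL_UNSIGNED_BYTE, sizeof(Vertex), (const GLvoid *)12);

The color values start 12 bytes into each Vertex, and there are sizeof(Vertex) bytes between consecutive color values.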

Jesse Hall
07-22-2004, 08:02 AM
Sorry, I guess you probably understood that. With VBO, you don't have to specify 0 as the offset. If a VBO is bound, the "pointer" parameter is used as an offset from the beginning of the VBO. So pointer==0 means start at the beginning, and pointer==12 means start 12 bytes into the VBO. Using a VBO or not is determined by whether a VBO is bound, not by the pointer parameter.
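
Hard-coding 12 works for this struct, but the offsets can also be taken from the compiler. A small sketch, assuming the same Vertex layout as in the example above and no OpenGL headers; buffer_offset, check_vertex_layout and the three accessor functions are hypothetical helper names, not part of OpenGL:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the Vertex struct in the interleaved arrays example. */
typedef struct Vertex {
    float pos[3];
    unsigned char color[4];
} Vertex;

size_t vertex_stride(void)       { return sizeof(Vertex); }
size_t vertex_pos_offset(void)   { return offsetof(Vertex, pos); }
size_t vertex_color_offset(void) { return offsetof(Vertex, color); }

/* Turn a byte offset into the "pointer" argument that the gl*Pointer
 * calls expect while a VBO is bound. */
const void *buffer_offset(size_t bytes)
{
    return (const void *)(uintptr_t)bytes;
}

/* Sanity check that the layout matches the numbers used in the post
 * (assumes 4-byte floats, which holds on common platforms). */
void check_vertex_layout(void)
{
    assert(offsetof(Vertex, color) == 12);
    assert(sizeof(Vertex) == 16);
}

/* Intended use (needs a GL context and a bound VBO, so only a comment):
 *   glVertexPointer(3, GL_FLOAT, (GLsizei)vertex_stride(),
 *                   buffer_offset(vertex_pos_offset()));
 *   glColorPointer(4, GL_UNSIGNED_BYTE, (GLsizei)vertex_stride(),
 *                  buffer_offset(vertex_color_offset()));
 */
```

Using offsetof keeps the calls correct if a field is later added in front of color.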

jimmiwalker2
07-22-2004, 08:29 AM
Well, now I understand :) Thank you.

martinho_
07-22-2004, 11:40 AM
Using glDrawRangeElements is always faster than using glDrawElements, since you give more info to the driver.

jimmiwalker2
07-22-2004, 02:47 PM
Originally posted by martinho_:
Using glDrawRangeElements is always faster than using glDrawElements, since you give more info to the driver.

Well, I tried glDrawRangeElements on models from 200 polys to 500 thousand polys, but it didn't seem any faster to me.

Korval
07-22-2004, 04:34 PM
DrawRangeElements is not always faster. It may be faster, depending on the implementation. Worst-case, it goes at the same speed as DrawElements.

SirKnight
07-22-2004, 04:39 PM
Originally posted by jimmiwalker2:

Originally posted by martinho_:
Using glDrawRangeElements is always faster than using glDrawElements, since you give more info to the driver.

Well, I tried glDrawRangeElements on models from 200 polys to 500 thousand polys, but it didn't seem any faster to me.

Well, it COULD be in certain cases, but there's no guarantee that it will always give better performance. Although it's probably safe to go ahead and always use it, so that when your app does do things right there will be an increase in performance. I haven't seen any difference either, but I still use DrawRange... anyway. :D

-SirKnight

SirKnight
07-22-2004, 04:41 PM
Korval beat me to it. You want a cookie now? :D

-SirKnight

jimmiwalker2
07-22-2004, 05:47 PM
In what cases is DrawRangeElements faster? What does the driver use the information about the number of vertices for? I believe you that it can be faster, but why exactly?

nystep
07-22-2004, 10:54 PM
DrawRangeElements used to be faster with classic vertex arrays. When you draw the contents of a vertex array, the driver first copies it from main memory into a faster memory zone. DrawRangeElements lets the driver skip data during that copy. But it's no longer useful with static VBOs, since all the data is already in video memory.

have a nice day,
nystep

Ysaneya
07-22-2004, 11:18 PM
glDrawRangeElements is useful when you are CPU limited. It saves the driver from having to parse the index array on the CPU to find the maximum index, which is otherwise required because when you specify a vertex pointer via glVertexPointer, you don't specify the number of vertices to send to the video card.

Since the size is already specified when creating a VBO, glDrawRangeElements is pretty much obsolete with VBOs.

Y.

martinho_
07-23-2004, 08:45 AM
Originally posted by Korval:
DrawRangeElements is not always faster. It may be faster, depending on the implementation. Worst-case, it goes at the same speed as DrawElements.

Ok, I should have said that DrawRangeElements is always faster than or as fast as DrawElements. In any case, there is no reason to use DrawElements. Personally, I have seen impressive speed gains with classic arrays and DrawRangeElements.

martinho_
07-23-2004, 09:09 AM
Originally posted by Ysaneya:
Since the size is already specified when creating a VBO, glDrawRangeElements is pretty much obsolete with VBOs.

Originally posted by nystep:
But it's no longer useful with static VBOs since all the data is already in video memory.

Neither of you is right. In the NVIDIA paper "Using Vertex Buffer Objects" you can read things like this:

This combination of memory usage can help the memory manager balance between three kinds of memory: system, AGP and video.

So the buffer is not always in video memory.

And in the same paper:


Use glDrawRangeElements Instead of glDrawElements
Using range elements is more efficient for two reasons:

- If the specified range can fit into a 16-bit integer, the driver can optimize the
format of indices to pass to the GPU. It can turn a 32-bit integer format into a
16-bit integer format. In this case, there is a gain of 2.

- The range is precious information for the VBO manager, which can use it to
optimize its internal memory configuration.

So it's always better to use DrawRangeElements, because at least it will be as fast as DrawElements.
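
The first point in that excerpt describes something the driver may do. An application can also store 16-bit indices itself when every index fits. A hedged sketch of that idea (it is not taken from the NVIDIA paper, and narrow_indices_u16 is a hypothetical name):

```c
#include <assert.h>
#include <stddef.h>

/* Copy 32-bit indices into a 16-bit array.
 * Returns 1 on success, or 0 if some index does not fit in 16 bits
 * (dst is then only partially written and should be discarded). */
int narrow_indices_u16(const unsigned int *src, unsigned short *dst, size_t count)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (src[i] > 0xFFFFu)
            return 0;
        dst[i] = (unsigned short)src[i];
    }
    return 1;
}
```

The narrowed array would be drawn with GL_UNSIGNED_SHORT instead of GL_UNSIGNED_INT. A model of 44000 vertices, like the one tested earlier in the thread, stays below the 65536 limit.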

knackered
07-23-2004, 09:22 AM
You should perhaps take a hint from d3d (yet again). In d3d there simply *isn't* the option to *not* specify range information in a DrawIndexedPrimitive call. Also, let's face it, it's trivial information for the app to gather. Just do it, and forget about it.
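
Gathering that range really is a one-pass scan over the index array. A minimal sketch, assuming 32-bit indices as in the original post; IndexRange and compute_index_range are illustrative names, and the result would normally be computed once at load time rather than every frame:

```c
#include <assert.h>
#include <stddef.h>

/* Smallest and largest vertex index referenced by an index array. */
typedef struct IndexRange {
    unsigned int start;
    unsigned int end;
} IndexRange;

/* Scan the index array once and return its min/max.
 * Requires at least one index. */
IndexRange compute_index_range(const unsigned int *indices, size_t count)
{
    assert(indices != NULL && count > 0);
    IndexRange r = { indices[0], indices[0] };
    for (size_t i = 1; i < count; ++i) {
        if (indices[i] < r.start) r.start = indices[i];
        if (indices[i] > r.end)   r.end = indices[i];
    }
    return r;
}
```

The result maps directly onto the start and end arguments: glDrawRangeElements(GL_TRIANGLES, r.start, r.end, count, GL_UNSIGNED_INT, 0).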

CRAPtor
07-25-2004, 11:06 AM
Definitely, Direct3D has hugely better vertex buffer performance.

Humus
07-25-2004, 02:18 PM
Uhm, no. Done correctly, performance is about the same on both APIs. Done wrong, performance sucks on both APIs.

CRAPtor
07-28-2004, 09:27 AM
Originally posted by Humus:
Uhm, no. Done correctly, performance is about the same on both APIs. Done wrong, performance sucks on both APIs.

You're right, both APIs have about the same performance.