
View Full Version : VAO probs :)



Ozzy
03-15-2002, 12:14 AM
I'm using strips and i'm implementing VAO into my rasteriser.

Unfortunately, my vertices look like this: ;))

typedef struct SVR_VERTEX
{
    VR_SHORT x;       // $00
    VR_SHORT y;       // $02
    VR_SHORT z;       // $04
    VR_SHORT rienz;   // $06

    VR_CHAR nx;       // non-transformed normal (-128,127) $08
    VR_CHAR ny;       // $09
    VR_CHAR nz;       // $0a
    VR_CHAR rien;     // $0b

    VR_BYTE r;        // Fantastic colors.. $0c
    VR_BYTE g;        // $0d
    VR_BYTE b;        // $0e
    VR_BYTE a;        // $0f

    union {
        VR_UV texCoord[4];       // $10
        struct {
            VR_FLOAT u0, v0;     // $10,$14
            VR_FLOAT u1, v1;     // $18,$1c
            VR_FLOAT u2, v2;     // $20,$24
            VR_FLOAT u3, v3;     // $28,$2c
        };
    };

} VR_VERTEX;

As you can see, the *only* floating-point values I use are my texCoords.. :)
Also note that the primitive size can vary depending on the texture channels used (minPrimSize $10, maxPrimSize $30).
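To make the stride arithmetic explicit, here is a tiny sketch of what a lookup like `vrRasterPrimSize[]` (used in the rendering code below) presumably computes; `vertexStride` is a hypothetical helper name, and the 0x10 base comes from the offsets in the struct above:

```c
#include <stddef.h>

/* Sketch: stride of VR_VERTEX for a given number of active UV channels.
   The fixed part (position + normal + color) is 0x10 bytes, and each
   enabled UV channel appends two floats (8 bytes): 0x10 (min) .. 0x30 (max). */
static size_t vertexStride(unsigned uvChannels)
{
    return 0x10 + (size_t)uvChannels * 2 * sizeof(float);
}
```

So a one-channel vertex is 0x18 bytes, and with all four channels it reaches the maxPrimSize of 0x30.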

-------------------------------------------------------------------------------------------------------------

Below is the code needed to store prims onboard (simply using GL_STATIC_ATI):

size = sizePrim * pInfos->nb;
pInfos->object = glNewObjectBufferATI(size, pList, GL_STATIC_ATI);

-------------------------------------------------------------------------------------------------------------

and the rendering stuff..
static VR_VOID displayVertexListGlATI(VR_LONG handle, VR_PRIM_TYPES type, VR_DWORD start, VR_DWORD nb)
{
    VR_VERTEX *pList;
    VR_LIST_INFOS *pInfos;
    VR_DWORD sizePrim, offset, maxChannels, channel;
    VR_DWORD object;

    pInfos = &pListInfos[handle];
    pList = (VR_VERTEX*) pInfos->pList;

    if (pList == NULL) return;

    sizePrim = vrRasterPrimSize[pInfos->bits.uvChannels];
    object = (VR_DWORD) pInfos->plistCache;

    glEnableClientState(GL_VERTEX_ARRAY);
    glArrayObjectATI(GL_VERTEX_ARRAY, 3, GL_SHORT, sizePrim, object, 0);

    if (pInfos->bits.useColors) {
        glEnableClientState(GL_COLOR_ARRAY_EXT);
        glArrayObjectATI(GL_COLOR_ARRAY_EXT, 4, GL_UNSIGNED_BYTE, sizePrim, object, 0x0c);
    }
    if (pInfos->bits.useNormals) {
        glEnableClientState(GL_NORMAL_ARRAY_EXT);
        glArrayObjectATI(GL_NORMAL_ARRAY_EXT, 3, GL_BYTE, sizePrim, object, 0x8);
    }

    if (pInfos->bits.useTexCoords) {
        if (pInfos->bits.uvChannels > rasterInfos.caps.nbTextureUnits)
            maxChannels = rasterInfos.caps.nbTextureUnits;
        else
            maxChannels = pInfos->bits.uvChannels;
        offset = 0x10;
        for (channel = 0; channel < maxChannels; channel++) {
            glClientActiveTextureARB(GL_TEXTURE0_ARB + channel);

            glMatrixMode(GL_TEXTURE);
            glLoadMatrixf((float*)&currentMatrix[channel]);

            glEnableClientState(GL_TEXTURE_COORD_ARRAY);
            glArrayObjectATI(GL_TEXTURE_COORD_ARRAY, 2, GL_FLOAT, sizePrim, object, offset);
            offset += 8;
        }
    }

    glDrawArrays(primtypes[type], start, nb);

    glDisableClientState(GL_VERTEX_ARRAY);

    if (pInfos->bits.useColors)
        glDisableClientState(GL_COLOR_ARRAY_EXT);

    if (pInfos->bits.useNormals)
        glDisableClientState(GL_NORMAL_ARRAY_EXT);

    if (pInfos->bits.useTexCoords) {
        for (channel = 0; channel < maxChannels; channel++) {
            glClientActiveTextureARB(GL_TEXTURE0_ARB + channel);
            glDisableClientState(GL_TEXTURE_COORD_ARRAY);
        }
    }
}

-------------------------------------------------------------------------------------------------------------

Ok, nothing terrific! :) I'm using glDrawArrays because my prims are triangle strips, and I think I'm using all the glArrayObjectATI stuff correctly..
So what? Well, nothing is displayed! What I suspect is that the VAO mechanism has problems with my vertex data structure: when the structure is not using FLOAT values (the most common way around in sample code etc..), it doesn't work..
Anyway, I have also modified the simpleVAO sample from ATI to use SHORT values for coords, and it seems that the same problem occurs. :)=

Has anybody noticed this before? I'm using the latest beta drivers from ATI devrel..

thx!

ad!:=) cool 'gallery' to look at for coffee &| cigarette time -> http://www.orkysquad.org/main.php

Ozzy
03-16-2002, 06:36 AM
eh? did i miss anything? :)

Ozzy
04-18-2002, 09:43 PM
For those interested, if any...

I've played with a modified version of the simpleVAO sample to make sure that it was *not* a silly bug of mine ;)

Here are the results:
vertex coords with GL_INT work (thus integers work!)
but:
vertex coords with GL_SHORT: no display (misinterpreted data??)
vertex coords with GL_BYTE: crash!
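Given those results, one possible stopgap (assuming the driver only mishandles non-float vertex positions, as observed) is to widen the shorts to floats once at load time and upload the float copy instead; the helper name here is made up for illustration:

```c
#include <stddef.h>

/* Workaround sketch: expand GL_SHORT positions to GL_FLOAT at load time
   so the VAO path only ever sees floats. No normalization is applied,
   since vertex coordinates (unlike normals) are not rescaled by GL. */
static void expandShortsToFloats(const short *src, float *dst, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        dst[i] = (float)src[i];
}
```

This costs 2x the VRAM for positions, which defeats the point of using shorts, but it at least isolates whether the data type is the trigger.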

Finally some feedback from ATI:
"I just ran the sample and found that with everything running normally I get nothing rendered, is this what you get? Further testing I found that our own SW ICD renders properly. Further still I found that without TCL everything renders properly. This indicates to me that it is a driver bug and I've submitted the bug formally as EPR 64242 for you. I'll let you know the results of the driver team investigation as I get updates."

just have to wait for next beta drivers and keep fingers crossed! ;))

Ysaneya
04-18-2002, 10:48 PM
I'll add that to my list of driver bugs, between "alpha buffer doesn't work with blending" and "tex env combine scale is bugged". I'm the first to admit ATI's response is very good, but unfortunately their drivers are still incredibly buggy, to the point that I cannot use any serious advanced feature in our application. There's still a lot of work left, ATI!

Y.

davepermen
04-18-2002, 10:56 PM
possibly you should just get some other drivers.. as far as I know, Humus has no problems with the pixel-pipeline stuff, meaning fragment shaders and blending and render-to-texture etc.. and the scaling bug is solved too, I think..

VAO is evolving, I guess, so it's not yet finished..

kieranatwork
04-18-2002, 11:01 PM
Evolving, yes that's the word. How polite.
It's been 'evolving' for quite a while now, hasn't it?

Ozzy
04-18-2002, 11:21 PM
i was *not* discussing ATI's driver development roadmap, guys! ;)
*only* VAO is the subject, eh? :)

Ysaneya
04-19-2002, 05:39 AM
Yes, these bugs have been fixed in the beta drivers. These drivers are not public yet, so it makes no difference for the customer, heh?

Back to the topic (sorry Ozzy). I never got VAO to work under Win98. In Win2k it works, but AFAIK it crashes my machine when using dynamic arrays under special conditions. I have not debugged it yet, so it *might* be a code problem, not a driver one. Though I wouldn't be that surprised..

Y.

Ozzy
09-18-2002, 08:01 AM
For those interested, here is my end of the story concerning the ATIVertexArrayObject bug, plus a performance test vs. VertexArrayRangeNV.
http://www.orkysquad.org/main.php?id=lir...ember&year=2002 (http://www.orkysquad.org/main.php?id=lire&templ=templCode&nom=vince&cat=Code&month=September&year=2002)

that's it. :)

PH
09-18-2002, 08:12 AM
What exactly are you saying :) ? Are you saying that VAO is slower than not using VAO, or that VAR is simply faster?
My own tests show VAO (on 8500, not LE) to be at least comparable to the GeForce3's VAR in terms of performance.

One thing I noticed in the past: using the 8500 in a low-end system (P3 500MHz, AGP 2x) shows poor performance compared to the GeForce3. I was initially disappointed with my 8500 until I upgraded the rest of the system.

Ozzy
09-18-2002, 08:38 AM
I'm saying that VAO is far slower than VAR but faster than CVA (hopefully, for an implementation which is supposed to store vertices onboard).
Now, maybe you're right and the CPU is too involved in the VAO implementation.
Anyhow, who knows? But my question would be: why should the CPU be so involved when the geometry is stored onboard and is then supposed to become a GPU problem? ;)

Btw, upgrading your system generally gives a significant performance boost when using CVA, because the data are loaded over the bus to the board. But that isn't the case when using static data onboard.

That's it! And here are the perfs on my system. :)



[This message has been edited by Ozzy (edited 09-18-2002).]

Ozzy
09-18-2002, 09:06 AM
I just want to add that these benchmarks show abnormal results when you consider that VAO was written as an answer to NV's VAR.
Now, considering the CPU overhead, maybe VAO stores data only in AGP memory instead of VRAM, but this is pure speculation; only ATI could answer. Moreover, I would like to know why it is taking them so long to fix this silly bug in the VAO. :((

V-man
09-18-2002, 11:07 AM
The whole issue is not being able to use GL_SHORT, GL_INT, GL_BYTE? Only float?

Whoaaa! deja vu, baby!

I think that came up almost a year ago.

Anyway, why not use floats for vertex & tex? The card will probably convert to floats anyway.

V-man

Ozzy
09-18-2002, 11:55 AM
For sure I know about this strange feeling of deja vu! Just read the date of the first post ;)

1 float = 4 bytes, 1 short = 2 bytes, 1 byte = 1 byte.

Now multiply by however many vertices you want: which of them will take less memory?
This is quite an important consideration when you need to store static data. I admit the importance of floating-point values for calculations, but definitely *not* for storage.
Anyhow, vertex arrays are supposed to handle different data types, that's all; this has been an unresolved bug for a long time now.
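The arithmetic behind that storage argument can be made concrete with a toy helper (the name and the 4-components-per-position layout are assumptions, mirroring the x/y/z/pad fields of the VR_VERTEX struct earlier in the thread):

```c
#include <stddef.h>

/* Sketch: bytes needed to store x,y,z plus one pad component per vertex,
   as in the VR_VERTEX layout, for a given component size. */
static size_t positionBytes(size_t vertexCount, size_t componentSize)
{
    return vertexCount * 4 * componentSize;
}
```

For 100,000 static vertices that is 1.6 MB of positions as floats versus 0.8 MB as shorts, which is exactly the onboard-memory saving being argued for.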

Humus
09-18-2002, 12:39 PM
What most likely happens is that it goes into software mode and converts shorts and bytes into floats; I'd guess the hardware doesn't support those formats natively.
I have noticed remarkable performance increases going from normal vertex arrays to VAO, but then I'm only using floats.

Ozzy
09-18-2002, 01:07 PM
Originally posted by Humus:
What most likely happens is that it goes into software mode and converts shorts and bytes into floats; I'd guess the hardware doesn't support those formats natively.

Yep, possibly; in that case the conversion should be performed while creating the VertexArrayObject. Btw, GeForce boards do support these native formats, which keeps more VRAM free for data and increases performance on primitive processing as well. :)=

Originally posted by Humus:
I have noticed remarkable performance increases going from normal vertex arrays to VAO, but then I'm only using floats.

I agree 100% with that! VAO is around twice as fast as CVA. Now, still comparing with Nvidia's CVA implementation with lighting enabled, the 8500LE gets really poor results, even compared to a TornadoMX200! Of course, the Radeon8500 has other interesting features with its pixel pipes etc., but I'm speaking about *raw* speed with single texturing (but with lighting), that sort of thing. ;)
Anyhow, the theory that all the geometry passed to VAO resides in AGP mem instead of VRAM would be a good explanation:
1) AGP transfers break parallel processing.
2) The transfer is obviously done by the CPU.
3) It takes a direct performance hit depending on the system.

Humus
09-18-2002, 07:54 PM
Originally posted by Ozzy:
Yep possibly, this should be performed while creating the VertexArrayObject then.

The driver doesn't know the format when you upload the data so it can't possibly preprocess anything.

Ozzy
09-18-2002, 10:00 PM
that's right! ;) Oops...
Thus, if the data format doesn't match, that could be the bottleneck.
Then, are there any details on the optimal format to use (floating-point values for each field? size? padding?)
which would prevent the software conversion?

Ozzy
09-18-2002, 10:34 PM
from the specs then.
---------------------------------------------
Implementation Notes

For maximum hardware performance, all vertex arrays except for
color and secondary color should always be specified to use float
as the component type. Color and secondary color arrays may be
specified to use either float or 4-component unsigned byte as the
component type.
---------------------------------------------
So I will make a few more tests and change colors from bytes to floats, but I think it will *not* change anything.... ;))


[This message has been edited by Ozzy (edited 09-19-2002).]
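Following that spec note, a VAO-friendly layout would promote positions and normals to floats while keeping colors as 4 unsigned bytes. A hypothetical sketch (this is *not* Ozzy's actual struct; names and the single UV channel are illustrative):

```c
#include <stddef.h>

/* Sketch of a layout matching the spec's advice: float for everything
   except colors, which stay 4 x unsigned byte. One UV channel shown. */
typedef struct
{
    float x, y, z;            /* $00,$04,$08 */
    float nx, ny, nz;         /* $0c,$10,$14 */
    unsigned char r, g, b, a; /* $18..$1b    */
    float u0, v0;             /* $1c,$20     */
} VAO_VERTEX;
```

At 36 bytes per vertex this is noticeably fatter than the short-based VR_VERTEX, which is the memory trade-off being complained about in this thread.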

Ozzy
09-19-2002, 04:23 AM
Ok, after a few last tests, it seems that using GL_FLOAT for colors is slower than GL_UNSIGNED_BYTE! :)
Moreover, I've also done a few tests using concurrent primitives (same handle with multiple displays) to avoid the overhead of primitive initialisation, and the gain is about 4% with 61% of the prims in the scene being concurrent. Note that in the same test with the VAR implementation, the gains are *not* significant.

the end.

Ysaneya
09-19-2002, 05:49 AM
Do you have the latest drivers? Because, hum.. ATI and drivers... last time I worked with the beta drivers, I found that the OpenGL part was outdated. I now prefer to work with the "official" Catalyst drivers :)

Y.

Ozzy
09-19-2002, 07:07 AM
Muahahaha! ;)) (nice thought, thx)

I was using the latest Catalyst stuff (Sept release).

Btw, it's too bad that it's again a driver implementation problem that is ruining the performance of what I think is, at last, a pretty nice piece of hw.
In fact, we should ask a D3D expert for his opinion on the 8500LE board when using static data with the Microsoft API
(if static geometry is possible with D3D).

[This message has been edited by Ozzy (edited 09-19-2002).]

Ysaneya
09-19-2002, 08:06 AM
When I got my Radeon 8500, that's the first thing I tested. I'm no D3D expert, but I can still compile and execute some stuff from the SDK :)

Anyway, the results were quite incredible.. I didn't achieve more than 10 M tri/sec (all static) with OpenGL, but the optimized-mesh sample from D3D was running at a smooth 40 M tri/sec. Talk about a difference :p

I'm convinced that the Radeon is one of the most powerful cards here, it's just that you don't see it in OpenGL...

Y.

Korval
09-19-2002, 10:06 AM
For maximum hardware performance, all vertex arrays except for color and secondary color should always be specified to use float
as the component type. Color and secondary color arrays may be specified to use either float or 4-component unsigned byte as the
component type.

I would imagine (read: hope) that this has changed for the 9700.

Humus
09-19-2002, 03:04 PM
Originally posted by Ysaneya:
I'm convinced that the Radeon is one of the most powerfull cards here, it's just that you don't see it in OpenGL ...

I'm not sure what you guys do to keep your Radeons' performance down ;), but I have never had any performance problems with polygon throughput. In fact, I once wrote a demo doing vertex skinning in Direct3D, and I did the same thing in OpenGL: same model, same effect. There was no significant performance difference between the OpenGL and D3D versions. It was something like 430 fps (OpenGL) vs. 425 fps (D3D) when looking away from the model, and 270 fps (OpenGL) vs. 280 fps (D3D) when looking straight at it.

Ozzy
09-19-2002, 09:42 PM
Humus, the point is *not* to keep that 8500 board's performance down!! :))
I used a Celeron 300 for the tests, ok..
I know that it has a direct impact on dynamic vertex throughput! But you certainly also know that it should have no impact on performance when using resident geometry..

When overclocking that Celeron to 450MHz I noticed an *interesting boost*, so we can definitely say that the data are not stored onboard and that the CPU is too involved in the VAO implementation.

Concerning D3D vs GL on the same system, it sounds pretty logical to get approximately the same results ;) Moreover, for GL only, I think public 3D benchmarks are quite good for evaluating perfs using standard/classic mechanisms (without dynamic lighting).
What I want to say is that there shouldn't be much difference using CVA (without lighting) on a Radeon or on a GeForce, because sending the data through the bus *is* the bottleneck.
I insist on the dynamic lighting because using the fixed functions has always been a problem (in terms of perfs) with ATI drivers!
It is really expensive and, again, far from the results of the NV implementations.

In fact, the trouble is that the game we're working on was designed with T&L in mind. And while it runs smoothly at a stable 60FPS on a low-CPU config with a GF256, it is not the same story with the previous generation of ATI boards, which were supposed to be much more powerful than the first GF256 series!

Conclusion: if you can't store (statically) and process the data in VRAM, you lose all the advantages of what is called a T&L board.

And finally, ehehe Korval, it could be fun to test that Radeon9700!! Can anyone rent a 9700 to me? Contact me! ;)) Come on, ATI dudes... :)

tchOo

Humus
09-19-2002, 10:47 PM
Well, Ysaneya stated that the situation would be significantly better in D3D, and I showed that in my experience that's not true: I've been able to reach very similar results in both APIs. I'm also pretty sure that the data IS stored locally. I went back and fired up that old OpenGL app I talked about in the previous post. It still ran at roughly 430 fps when looking away and 270 fps when looking at the model. Then I disabled VAO and let it run through the standard vertex array path, and performance dropped to a constant 82 fps regardless of viewing angle. So it seems pretty obvious to me that the vertices are stored onboard; otherwise I can't see how I could get that huge performance boost from using VAO. Especially when the performance also matches D3D's at a similar task.

Ysaneya
09-19-2002, 11:34 PM
To be precise: at the time I did that test, I had a PIII 500 MHz.

Since then I have completely changed my system; I now have an Athlon 1.4 with 512 MB of RAM. Here's a test application to benchmark Radeons with 4 lights (note: it was not done by me, credit goes to whoever did it..):
http://www.fl-tw.com/opengl/texbug.exe

My results are:

Arrays : 4.5 FPS, 1228560 TPS, 2.18+2.11 Mb/s
CVA : 9.8 FPS, 2539024 TPS, 4.76+4.61 Mb/s
Lists : 29.5 FPS, 7412312 TPS, 14.25+13.83 Mb/s
Lists N/L : 103.9 FPS, 25595000 TPS, 50.20+48.71 Mb/s
Streaming VAO : 28.0 FPS, 7043744 TPS, 13.53+13.13 Mb/s
Static VAO : 34.4 FPS, 8558968 TPS, 16.61+16.12 Mb/s

Y.

harsman
09-19-2002, 11:55 PM
Wow, that's weird. My paltry GeForce2 MX beats your Radeon when using vanilla vertex arrays or CVAs. My results:

AMD K7/MMX/3DNOW 899 Mhz
OpenGL ICD on NVIDIA Corporation GeForce2 MX/AGP/3DNOW! [1.3.1]

Arrays : 23.2 FPS, 5815184 TPS, 11.23+10.89 Mb/s
CVA : 23.1 FPS, 5815184 TPS, 11.18+10.84 Mb/s
Lists : 24.1 FPS, 6019944 TPS, 11.66+11.32 Mb/s
Lists N/L : 40.9 FPS, 10156096 TPS, 19.74+19.15 Mb/s
Streaming VAO : unable to initialize
Static VAO : unable to initialize

Ozzy
09-20-2002, 01:26 AM
Originally posted by Humus:
. So it seams pretty much obviuos to me that the vertices are stored onboard, otherwise I can't see how I could get that huge performance boost by using VAO. Especially when the performance also matches the performance of D3D at a similar task.

Well, I'm not convinced of that.
Look Humus, when you use VAR with data in AGP mem, it's faster than CVA. But it's also slower than using VRAM. :)
This could explain VAO's performance vs CVA.

I really think the VAO implementation loads data into AGP mem instead of VRAM. The GPU would then only deal with a kind of cache (in VRAM) to process the vertices.

Moreover, if the data really *are* stored in VRAM, then performance is incredibly slow for such a mechanism.

Again, this is pure speculation; only the guys at ATI could answer. Anyhow, I think it's a bit ridiculous for a T&L board to have its performance vary *drastically* with CPU and/or bus frequencies. Something is wrong in the design somewhere :))

Would someone be nice enough to test the little app with a 9700 and the latest drivers? Please. ;)

Ozzy
09-20-2002, 01:39 AM
Originally posted by harsman:
Wow, thats weird. My paltry GeForce 2MX beats your radeon when using vanilla vertex arrays or CVAs

i think you get it, man! ;) Too bad that VAR is not tested in this app.

Humus
09-20-2002, 05:10 AM
Not sure what to make of this little app (why is it called texbug, btw?); results from a Radeon 8500:

Arrays : 34.1 FPS, 8436112 TPS, 16.48+15.99 Mb/s
CVA : 33.3 FPS, 8231352 TPS, 16.11+15.63 Mb/s
Lists : 34.4 FPS, 8640872 TPS, 16.63+16.14 Mb/s
Lists N/L : 126.0 FPS, 31000664 TPS, 60.86+59.05 Mb/s
Streaming VAO : 33.6 FPS, 8395160 TPS, 16.21+15.73 Mb/s
Static VAO : 34.8 FPS, 8640872 TPS, 16.83+16.33 Mb/s

Pretty much the same performance regardless of mode, except without lighting.
It would be interesting to see the source of this app. Anyway, there should be no difference in the number of bytes pulled from VRAM/AGP (or wherever it's stored) between the Lists and Lists N/L modes. Since there's a huge performance difference, the bottleneck is obviously something else; it may be as simple as the T&L unit not being any faster. That the performance is similar across all the other modes supports that theory.
I'm also not surprised that a GF2MX beats the Radeon in some tests; the GF2MX has the T&L engine of the GF2. In pretty much everything, the GF2 beat the Radeon when it came to T&L throughput, even though on paper the Radeon should be faster.

[This message has been edited by Humus (edited 09-20-2002).]

harsman
09-20-2002, 05:34 AM
I thought Ysaneya had a Radeon 8500, not a regular Radeon 1. You seem to get much better scores, but still low performance from VAO, even with static data.

Ysaneya
09-20-2002, 05:38 AM
No, I really have an 8500; I just didn't mention it again since I already did a few times :) No, really, no joke, it's really an 8500 :)

Anyway, i have uploaded the main source file if you wanna see it: http://www.fl-tw.com/opengl/texbug.cpp

Why is it called texbug? Can't remember. I thought i had renamed it to demonstrate a texturing bug with the first drivers release, but since there is no texture, i dunno...

Y.

Ysaneya
09-20-2002, 07:24 AM
I updated my driver from Catalyst 2.2 to 2.3, and I got a roughly 5x improvement (!!) with regular vertex arrays.. sigh!!

Here are my results now:
Arrays : 20.3 FPS, 5119000 TPS, 9.79+9.49 Mb/s
CVA : 26.4 FPS, 6634224 TPS, 12.78+12.40 Mb/s
Lists : 34.6 FPS, 8640872 TPS, 16.73+16.23 Mb/s
Lists N/L : 128.2 FPS, 31492088 TPS, 61.92+60.07 Mb/s
Streaming VAO : 33.1 FPS, 8149448 TPS, 15.98+15.51 Mb/s
Static VAO : 35.1 FPS, 8640872 TPS, 16.94+16.44 Mb/s

Y.

Ozzy
09-20-2002, 09:46 AM
Interesting how *small* the difference between streaming and static VAO is. ;)
eheheh

any cool 9700 owner out there? :)

josip
09-20-2002, 11:20 AM
I've got a 9700, but I'm really not that cool (really).

Here are my results (using Catalyst 2.3 drivers):

GenuineIntel processor/MMX/SSE/SSE2 1794 Mhz
OpenGL ICD on ATI Technologies Inc. Radeon 9700 x86/SSE2 [1.3.3302 WinXP Release]

Arrays : 40.0 FPS, 9869432 TPS, 19.35+18.77 Mb/s
CVA : 12.4 FPS, 3153304 TPS, 5.98+5.80 Mb/s
Lists : 58.4 FPS, 14456056 TPS, 28.20+27.36 Mb/s
Lists N/L : 237.0 FPS, 58315648 TPS, 114.50+111.09 Mb/s
Streaming VAO : 56.0 FPS, 13800824 TPS, 27.05+26.24 Mb/s
Static VAO : 58.4 FPS, 14456056 TPS, 28.19+27.35 Mb/s

Ysaneya
09-20-2002, 12:22 PM
CVA 3 times slower than regular vertex arrays? *Hem*. No Comment.

Y.

PH
09-20-2002, 12:59 PM
What about alignment? It might be a problem in the code. ATI should write a document about the proper use of VAO (and element arrays).

Ozzy
09-20-2002, 09:08 PM
Nice config, Josip! thx!
But don't hide yourself, we can guess who you are. ;))
Hey Ysaneya, could you grab a GeForce from the retailer at the corner and implement VAR in your little app? It would be really interesting to check perfs now on the same kind of system. :)=

rk
09-21-2002, 01:36 AM
here are my results on a 9700

AMD K7/MMX/SSE/3DNOW 1540 Mhz
OpenGL ICD on ATI Technologies Inc. Radeon 9700 x86/MMX/3DNow!/SSE [1.3.3302 WinXP Release]

Arrays : 23.2 FPS, 5733280 TPS, 11.21+10.88 Mb/s
CVA : 11.6 FPS, 2989496 TPS, 5.61+5.45 Mb/s
Lists : 57.7 FPS, 14169392 TPS, 27.86+27.03 Mb/s
Lists N/L : 319.8 FPS, 78586888 TPS, 154.51+149.91 Mb/s
Streaming VAO : 54.7 FPS, 13432256 TPS, 26.41+25.62 Mb/s
Static VAO : 58.2 FPS, 14292248 TPS, 28.10+27.26 Mb/s

Korval
09-21-2002, 11:50 AM
I've got a question: what do "Lists" and "Lists N/L" mean?

And if CVAs are that much slower than even normal vertex arrays, why doesn't ATi just no-op the glLock and Unlock calls?

josip
09-21-2002, 12:32 PM
Originally posted by Korval:
I've got a question: what do "Lists" and "Lists N/L" mean?

And, if CVA's are that much slower than even normal vertex arrays, why doesn't ATi just no-op the glLock and Unlock calls?

"Lists" are lit, compiled tri-strips while "Lists N/L" is the same, except it's unlit.

Ozzy
09-21-2002, 09:39 PM
Come on dudes! we need more results!
anyone with GF & Radeon on the same system? ;)

Ysaneya
09-22-2002, 02:20 AM
As I said, it's an application I downloaded somewhere, and I don't have the full source code. Can't recompile it, heh.

However, I'm writing a similar benchmark (actually I'll try to keep the same test, to have comparable results), but I will extend it to different vertex formats (interleaved or not), a variable number of lights, and obviously the GF extensions (VAR). Hopefully it's done sometime tonight or tomorrow..
Y.

josip
09-22-2002, 06:47 PM
Originally posted by Ozzy:
Come on dudes! we need more results!
anyone with GF & Radeon on the same system? ;)

I swapped out my 9700 for my old (!) 128MB GF4 Ti 4200 with the 40.41 drivers:

GenuineIntel processor/MMX/SSE/SSE2 1794 Mhz
OpenGL ICD on NVIDIA Corporation GeForce4 Ti 4200/AGP/SSE2 [1.4.0]

Arrays : 31.5 FPS, 7821832 TPS, 15.23+14.77 Mb/s
CVA : 32.1 FPS, 7944688 TPS, 15.49+15.03 Mb/s
Lists : 32.1 FPS, 7944688 TPS, 15.50+15.03 Mb/s
Lists N/L : 60.2 FPS, 14865576 TPS, 29.09+28.22 Mb/s
Streaming VAO : unable to initialize
Static VAO : unable to initialize

Ysaneya
09-23-2002, 06:31 AM
I have finished implementing the new (hopefully improved) benchmark. I'll post it in a new thread, so look for it.. :)

Y.