A revival of an old problem that won't stop haunting me. The situation:
A large terrain takes a lot of memory, so saving memory wherever possible is a good thing. That includes storing positions as bytes, or at least shorts, instead of floats.
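Just to illustrate what I mean, a minimal sketch with plain vertex arrays (the data here is made up):

#include <GL/gl.h>

/* positions as shorts: 2 bytes per component instead of 4 for floats */
static const GLshort quad_pos[4 * 3] = {
    0, 0, 0,   1, 0, 0,   1, 0, 1,   0, 0, 1
};

void draw_with_vertex_array(void)
{
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_SHORT, 0, quad_pos); /* short positions, classic VA */
    glDrawArrays(GL_QUADS, 0, 4);
    glDisableClientState(GL_VERTEX_ARRAY);
}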
This approach works well with plain vertex arrays, but slows to a crawl with VBOs. So I tested a few things, and the result was: using anything but float in a vertex buffer is horribly slow. I was aware that a conversion to float would have to happen somewhere, so a small slowdown wouldn't have surprised me, but we're talking about a factor of 40 here (1200 fps down to 30 fps).
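The VBO path is the same drawing code with the data uploaded into a buffer object first; a sketch, assuming the ARB_vertex_buffer_object entry points have already been obtained:

#include <GL/gl.h>
#include <GL/glext.h>

void upload_and_bind(const GLshort *positions, GLsizeiptrARB bytes)
{
    GLuint vbo;

    glGenBuffersARB(1, &vbo);
    glBindBufferARB(GL_ARRAY_BUFFER_ARB, vbo);
    glBufferDataARB(GL_ARRAY_BUFFER_ARB, bytes, positions, GL_STATIC_DRAW_ARB);

    glEnableClientState(GL_VERTEX_ARRAY);
    /* same call as with a vertex array, but the pointer is now an offset
       into the buffer; GL_SHORT here is what kills performance on my 9800 */
    glVertexPointer(3, GL_SHORT, 0, (const GLvoid *)0);
}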
So I wrote a simpler test case: just a quad, once as float in the VB, then as byte. This time the byte version wouldn't even show up. I double-checked all the sizes and, just to be sure, tried it with short. This time it worked AND was as fast as the float version. So I tried using short in the terrain again -> crawl.
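In hindsight, the blank screen at least has an explanation: GL_BYTE isn't among the types glVertexPointer accepts (only GL_SHORT, GL_INT, GL_FLOAT and GL_DOUBLE are), so that call simply generates GL_INVALID_ENUM. The test boiled down to something like this (a sketch, names made up):

#include <GL/gl.h>
#include <GL/glext.h>

void draw_quad(GLuint vbo, GLenum type) /* GL_FLOAT or GL_SHORT */
{
    static const GLfloat quad_f[4 * 2] = { 0, 0,  1, 0,  1, 1,  0, 1 };
    static const GLshort quad_s[4 * 2] = { 0, 0,  1, 0,  1, 1,  0, 1 };

    glBindBufferARB(GL_ARRAY_BUFFER_ARB, vbo);
    if (type == GL_FLOAT)
        glBufferDataARB(GL_ARRAY_BUFFER_ARB, sizeof(quad_f), quad_f,
                        GL_STATIC_DRAW_ARB);
    else
        glBufferDataARB(GL_ARRAY_BUFFER_ARB, sizeof(quad_s), quad_s,
                        GL_STATIC_DRAW_ARB);

    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(2, type, 0, (const GLvoid *)0);
    glDrawArrays(GL_QUADS, 0, 4);
}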
OK, so maybe the vertex position is just supposed to be float. Imagine my surprise when I added a VB with colors as unsigned bytes and it slowed down again.
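And unlike byte positions, GL_UNSIGNED_BYTE is explicitly legal for glColorPointer. Roughly what I added (color_vbo is a buffer filled the same way as above):

#include <GL/gl.h>
#include <GL/glext.h>

void bind_byte_colors(GLuint color_vbo)
{
    glBindBufferARB(GL_ARRAY_BUFFER_ARB, color_vbo);
    glEnableClientState(GL_COLOR_ARRAY);
    /* 4 bytes per vertex instead of 16 for four floats,
       yet this alone is enough to trigger the slowdown */
    glColorPointer(4, GL_UNSIGNED_BYTE, 0, (const GLvoid *)0);
}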
As ATI didn't bother to reply over the last two weeks (they probably have a lot to do, making their drivers work with every single game), I uploaded a test program, and today I got the chance to try it on a different machine.
The result: it works perfectly fine on NVIDIA cards (a Quadro FX and even a GF4, though I wonder how), while various ATI cards, including my 9800, slow down when switching to short.
Am I missing an important line in the VBO spec that says something like "you're supposed to use floats, and anything else will result in undefined behaviour"? This forces me to use four times the memory actually needed (even more, really, but there aren't any 5-bit types ;-) ). It's frustrating to watch something occupy a 64 MB buffer when less than 16 MB would do, because only the float version has a chance of working.
What's so extremely different between VAs and VBOs, apart from where the data is stored and who manages the memory? What is ATI doing so completely differently from NVIDIA in their VBO implementation? And why shouldn't I cast a pointer acquired by mapping the buffer to another type? The cast seems to be ignored: no matter how much explicit casting you use, the buffer keeps being treated as the original type.
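For clarity, this is the kind of cast I mean. As far as I can tell, the C cast only changes how the CPU writes the bytes; how GL reads them back is still dictated by the type passed to glVertexPointer (a sketch, hypothetical names):

#include <string.h>
#include <GL/gl.h>
#include <GL/glext.h>

void fill_with_shorts(GLuint vbo, const GLshort *src, size_t count)
{
    GLshort *dst;

    glBindBufferARB(GL_ARRAY_BUFFER_ARB, vbo);
    /* glMapBufferARB returns a void*; the cast affects only what the
       CPU writes, not how GL interprets the buffer when drawing */
    dst = (GLshort *)glMapBufferARB(GL_ARRAY_BUFFER_ARB, GL_WRITE_ONLY_ARB);
    if (dst) {
        memcpy(dst, src, count * sizeof(GLshort));
        glUnmapBufferARB(GL_ARRAY_BUFFER_ARB);
    }
}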
Did anyone see similar effects? Are they supposed to be like that? And if ATI drivers really cause so much less trouble than a few years ago, how awful must it have been to develop for ATI cards back then?