
Yay! ARB_vertex_buffer_object supported in ATI's Catalyst 3.4 drivers!



Ostsol
05-15-2003, 01:44 PM
www.ati.com (http://www.ati.com)

Finally. . . time for some fun! Down with vendor specific extensions! http://www.opengl.org/discussion_boards/ubb/wink.gif

NitroGL
05-15-2003, 02:05 PM
Cool. Now all I need is my new nForce2 mobo, and I'll be all set! http://www.opengl.org/discussion_boards/ubb/smile.gif

skynet
05-15-2003, 03:41 PM
Well... that CounterStrike/Half-Life problem is STILL NOT "fixed" :-( OK, the screen doesn't stay black anymore and you can switch back and forth between the desktop and the game, BUT those screen switches take ages, and _when_ I get back into the game, the refresh rate drops back to an annoying 60Hz, even though the game started at the expected 100Hz. ATI should take a look at how well nVidia solves this.

On VBO: it seems ATI didn't expect people to create thousands of VBO objects. Well, I do (I'll reduce that some day, I promise). After claiming around 6000-6600 buffer objects I get OUT_OF_MEMORY, even though the total memory needed for all those objects fits well within 48 megs. The Radeon 9700 in use has 128 megs of onboard memory plus an additional 64 megs of AGP memory, so my request SHOULD be fulfilled. All buffer objects use either GL_STATIC_DRAW_ARB or GL_DYNAMIC_DRAW_ARB. The same code worked without any overflow on a GF2 with 32 megs (until the latest 43.51 drivers broke VBO on GF2-class cards).
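A minimal sketch of this kind of allocation pattern, assuming the ARB_vertex_buffer_object functions are available (via GL_GLEXT_PROTOTYPES or wglGetProcAddress-loaded pointers of the same names); the buffer count and per-buffer size are only illustrative, not my real data layout:

#include <stdio.h>
#include <GL/gl.h>
#include <GL/glext.h>

#define NUM_BUFFERS  6600          /* illustrative count, as in the post above */
#define BUFFER_BYTES (8 * 1024)    /* ~8 KB each, roughly 50 MB in total */

static GLuint buffers[NUM_BUFFERS];

void create_buffers(void)
{
    int i;
    glGenBuffersARB(NUM_BUFFERS, buffers);      /* generates names only */

    for (i = 0; i < NUM_BUFFERS; i++) {
        glBindBufferARB(GL_ARRAY_BUFFER_ARB, buffers[i]);
        glBufferDataARB(GL_ARRAY_BUFFER_ARB, BUFFER_BYTES, NULL,
                        GL_STATIC_DRAW_ARB);    /* or GL_DYNAMIC_DRAW_ARB */

        if (glGetError() == GL_OUT_OF_MEMORY) {
            printf("GL_OUT_OF_MEMORY after %d buffer objects\n", i);
            break;
        }
    }
    glBindBufferARB(GL_ARRAY_BUFFER_ARB, 0);
}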

Aside from that, a question: which is better (faster), optimizing for fewer texture switches (glBindTexture()) or fewer vertex buffer switches (glBindBuffer() plus glXXXPointer() plus glLoadMatrix())?

Korval
05-15-2003, 04:06 PM
Which is better (faster), optimizing for fewer texture switches (glBindTexture()) or fewer vertex buffer switches (glBindBuffer() plus glXXXPointer() plus glLoadMatrix())?

I imagine that glBindTexture will always hurt more, since you're talking about a possible texture upload, as video cards can't use textures in AGP directly. By contrast, video cards can directly use AGP vertex data just fine.

That being said, you'll need to benchmark it to know for sure.


It seems ATI didn't expect people to create thousands of VBO objects. Well, I do (I'll reduce that some day, I promise). After claiming around 6000-6600 buffer objects I get OUT_OF_MEMORY, even though the total memory needed for all those objects fits well within 48 megs.

If each VBO has to be page-aligned (typically 4K), the minimum data taken up by 6600 VBO's is 25.8MB. That's hardly trivial.

Also, each VBO consumes some amount of resources, just to manage them. Let's say that ATi's VBO's require a 1K-large struct for this purpose. That's 6.6MB right there.

Lastly, it is entirely possible that ATi stores each of these VBO management structs in some kind of static array. It makes memory management a bit easier. As such, I would suspect that pathological misuse of the API (such as yours) would result in an OUT_OF_MEMORY error.

In short, I consider this to be much more your fault than theirs. You are misusing the API. Fix it.


ATI should take a look at how well nVidia solves this.

I'm sure that ATi people have seen that nVidia drivers handle this situation perfectly. Of course, that doesn't actually help them solve the, apparently, non-trivial problem. For all we know, Half-Life is doing something wrong, and nVidia simply had an "if(HalfLife)" in their driver somewhere to solve this case.

I don't care about switching to the desktop. As long as you can actually exit a game and re-enter without it dying (or other adverse effects like changing the refresh rate), I don't care. I'll find out when I get home.

[This message has been edited by Korval (edited 05-15-2003).]

Ostsol
05-15-2003, 05:07 PM
Originally posted by skynet:
ATI should take a look at how well nVidia solves this.
Probably a proprietary fix. . . j/k! http://www.opengl.org/discussion_boards/ubb/wink.gif


On VBO: it seems ATI didn't expect people to create thousands of VBO objects. Well, I do (I'll reduce that some day, I promise). After claiming around 6000-6600 buffer objects I get OUT_OF_MEMORY, even though the total memory needed for all those objects fits well within 48 megs. The Radeon 9700 in use has 128 megs of onboard memory plus an additional 64 megs of AGP memory, so my request SHOULD be fulfilled. All buffer objects use either GL_STATIC_DRAW_ARB or GL_DYNAMIC_DRAW_ARB. The same code worked without any overflow on a GF2 with 32 megs (until the latest 43.51 drivers broke VBO on GF2-class cards).
Back with ATI_vertex_array_object I was running into OUT_OF_MEMORY at 32 MB. http://www.opengl.org/discussion_boards/ubb/frown.gif

gator
05-15-2003, 05:16 PM
3.4 is great and all, but did ATI skip a version?

I thought the last released was 3.2.
They never released 3.3?

Ostsol
05-15-2003, 05:31 PM
Originally posted by gator:
3.4 is great and all, but did ATI skip a version?

I thought the last released was 3.2.
They never released 3.3?
Yep, there was a WHQL test that prevented 3.3 from being certified. The test was apparently scheduled to be removed, but not soon enough. Rather than wait for it to be removed, ATI decided to skip 3.3's release entirely and start working on 3.4. Technically, it could still have been called 3.3, just for continuity, but I guess ATI decided "3.4" was more appropriate since it contained not only what 3.3 was set to achieve, but also what ATI had planned for the release after.

jwatte
05-15-2003, 06:26 PM
We've had members report bugs with "Catalyst 3.3" drivers to us, although we haven't had those drivers ourselves, from which I draw the conclusion that someone leaked the drivers and claimed they were real. Perhaps ATI just didn't want to confuse people by re-using that name. If that was the reason, I'd consider it a smart and savvy one, but I have no idea whether that's the case.

Ostsol
05-15-2003, 07:27 PM
If they were leaked, then the person who leaked them probably won't be getting any more betas. When ATI first introduced the Catalyst beta program, they stated that each driver would be made traceable back to the person it was originally given to. I believe there is also an NDA that beta testers must adhere to.

More likely, though, what many people claimed were the Catalyst 3.3s was actually a set of ATI drivers released by Dell. They are, in fact, not the 3.3s, but they are newer than the 3.2s. They were also never WHQL certified, whereas all Catalyst drivers released to the public are.

[This message has been edited by Ostsol (edited 05-15-2003).]

cass
05-15-2003, 08:00 PM
During the VBO design, we agreed that allocating lots of relatively small VBOs would be inexpensive.



If each VBO has to be page-aligned (typically 4K), the minimum data taken up by 6600 VBO's is 25.8MB. That's hardly trivial.

Also, each VBO consumes some amount of resources, just to manage them. Let's say that ATi's VBO's require a 1K-large struct for this purpose. That's 6.6MB right there.

Lastly, it is entirely possible that ATi stores each of these VBO management structs in some kind of static array. It makes memory management a bit easier. As such, I would suspect that pathological misuse of the API (such as yours) would result in an OUT_OF_MEMORY error.

In short, I consider this to be much more your fault than theirs. You are misusing the API. Fix it.


I disagree. If the app can manage data efficiently by putting many arrays into a larger VBO, then the driver could do the same sort of management with very light-weight VBOs.

Again, the intent of these buffer objects is that they would be very lightweight. That is easy for the driver to manage, and it still provides the driver the flexibility to move buffers around to different memory. If you put lots of arrays into a single buffer, the driver must keep that single buffer contiguous, even if it need not be, because it would be legal for you to access any part of it. That's an unnecessary and useless constraint.

Thanks -
Cass

cass
05-15-2003, 08:04 PM
Also, the idea that each VBO requires a 1k structure for overhead is ludicrous. Where did you come up with that number?

Korval
05-15-2003, 08:59 PM
During the VBO design, we agreed that allocating lots of relatively small VBOs would be inexpensive.

Lots of small VBOs, I can understand. 6000+? I'm sorry, but I'm willing to consider that pathological misuse of the API, and it is perfectly fine if an implementation wants to fail on that. I wouldn't imagine that you could allocate 6000 small textures either, even if they were 32x32. If you can, that's great. But I'm highly unwilling to call it a bug if the implementation fails to do so.

If each buffer held as few as 385 32-byte vertices, 6000 buffers would take over 70MB of memory.


I disagree. If the app can manage data efficiently by putting many arrays into a larger VBO, then the driver could do the same sort of management with very light-weight VBOs.

Of course, light-weight VBO's are a poor use of the API, as the glDraw* overhead starts to win out. As such, it doesn't really make sense to encourage someone to misuse the API in such a fashion.

Not only that, it complicates the implementation quite a bit more, and unnecessarily so. This tends to lead to more bugs.


Also, the idea that each VBO requires a 1k structure for overhead is ludicrous. Where did you come up with that number?

Admittedly, 1K is a bit much, but I was giving ATi the benefit of the doubt.

Tom Nuydens
05-15-2003, 11:42 PM
Just because you're using thousands of VBOs doesn't necessarily mean that they're lightweight (i.e. near-empty). You could just as well be rendering a scene with 50 million polys, for example. It also doesn't mean that you intend to use all of them every frame.

Nowhere is it said that a VBO has to live in AGP or video memory. If they don't fit in there, they should just be shifted to system memory. That's what happens with textures, and it works just fine for those.

-- Tom

velco
05-16-2003, 12:14 AM
Originally posted by Ostsol:
www.ati.com (http://www.ati.com)

Finally. . . time for some fun! Down with vendor specific extensions! http://www.opengl.org/discussion_boards/ubb/wink.gif

No new GNU/Linux drivers though ... again http://www.opengl.org/discussion_boards/ubb/frown.gif

~velco

cass
05-16-2003, 04:02 AM
Originally posted by Korval:
Of course, light-weight VBO's are a poor use of the API, as the glDraw* overhead starts to win out. As such, it doesn't really make sense to encourage someone to misuse the API in such a fashion.

Not only that, it complicates the implementation quite a bit more, and unnecessarily so. This tends to lead to more bugs.


???

Small light-weight VBOs are an intentional aspect of the API design. They simplify the implementation in a number of really important ways: 1) buffer renaming, 2) buffer mapping, 3) synchronization, 4) buffer migration/duplication.

It's not unreasonable use of the API to have several VBOs for each object in your application (e.g. one VBO for position, one VBO for tangent/binormal/normal, one VBO for texture coords, ...).
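As a hedged illustration of that layout (the buffer names posBuf/nrmBuf/texBuf/idxBuf and indexCount are hypothetical), each per-attribute buffer is simply bound in turn before its gl*Pointer call, and with a buffer bound the "pointer" argument becomes a byte offset into that buffer:

glBindBufferARB(GL_ARRAY_BUFFER_ARB, posBuf);           /* positions */
glVertexPointer(3, GL_FLOAT, 0, (const GLvoid *)0);

glBindBufferARB(GL_ARRAY_BUFFER_ARB, nrmBuf);           /* normals */
glNormalPointer(GL_FLOAT, 0, (const GLvoid *)0);

glBindBufferARB(GL_ARRAY_BUFFER_ARB, texBuf);           /* texture coords */
glTexCoordPointer(2, GL_FLOAT, 0, (const GLvoid *)0);

glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_NORMAL_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);

glBindBufferARB(GL_ELEMENT_ARRAY_BUFFER_ARB, idxBuf);   /* indices in a VBO too */
glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_SHORT, (const GLvoid *)0);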

I fail to see how large heavy-weight VBOs with lots of app-managed arrays inside them makes things easier for the driver or the app developer.

Thanks -
Cass

V-man
05-16-2003, 06:39 AM
To the original poster who is getting out-of-memory errors:
What is your AGP aperture size?
I suggest you set it to the maximum available.

On a side note, a similar message was posted some time ago, with that person hitting a limit of 26.x MB.

skynet
05-16-2003, 07:26 AM
To clarify some things:
I know that allocating that many buffers can cause problems. I'm just annoyed that the card with the most memory drops out first (I tried a 128 MB AGP aperture size; it didn't help) :-( Maybe it's just a little setting in the driver source like #define MAX_VBO 6666 that prevents it from working ;-)
I'm developing a per-pixel-lighting graphics "engine". One of the main goals is to store all geometry in AGP/video memory and to cache as many intermediate results as possible, like L and H vectors. I prefer switching VBOs over batching geometry (with the CPU). Like Cass mentioned, my model data is "scattered" across different VBO objects. There may be static data like texture coordinates and indices, dynamic data like vertices and normals, and intermediate data like L and H vectors. Additionally there is shadow-model data and its corresponding indices. I try to cache as much as possible. For instance, if neither the light nor the object has moved, but the camera did, I can keep the L-vecs VBO and the shadow model and only need to update the H-vecs VBO.
This should somewhat explain where that many VBO objects come from. Of course I do not access all of them within a single frame, but the needed buffer binds can still pile up beyond 1000 binds per frame. In the future I will reduce the number of binds and the number of VBO objects in general. But restricting me to, say, a few dozen VBO objects would throw me back to the days of VAR (though this time I could have more than one buffer: one for static, one for dynamic data :-)) with my own memory management of the large buffer.
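A rough sketch of that selective-update idea, with a hypothetical buffer name (hVecBuf) and a hypothetical computeHalfVectors() helper doing the CPU-side math; only the half-vector buffer is respecified when just the camera has moved:

if (cameraMoved && !lightMoved && !objectMoved) {
    glBindBufferARB(GL_ARRAY_BUFFER_ARB, hVecBuf);
    /* Respecify the whole store so the driver may discard the old data
       rather than synchronize with draws that might still be using it. */
    glBufferDataARB(GL_ARRAY_BUFFER_ARB, numVerts * 3 * sizeof(GLfloat),
                    NULL, GL_DYNAMIC_DRAW_ARB);
    GLfloat *h = (GLfloat *)glMapBufferARB(GL_ARRAY_BUFFER_ARB,
                                           GL_WRITE_ONLY_ARB);
    if (h) {
        computeHalfVectors(h, numVerts, eyePos, lightPos);  /* CPU work */
        glUnmapBufferARB(GL_ARRAY_BUFFER_ARB);
    }
    /* The L-vector VBO (GL_STATIC_DRAW_ARB) and the shadow-model data
       are left untouched. */
}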

To Cass:
I'd be interested in some kind of performance FAQ for ARB_fragment_program, ARB_vertex_program and ARB_vertex_buffer_object. I am really interested in recommended application behaviour, recommended program layout (should texture sample instructions be scattered across the program or not, etc.) and pitfalls to avoid.

Korval
05-16-2003, 09:21 AM
Nowhere is it said that a VBO has to live in AGP or video memory. If they don't fit in there, they should just be shifted to system memory. That's what happens with textures, and it works just fine for those.

Actually, it is said somewhere. Because the spec says that glBindBuffer should be a light-weight operation, the binding of a buffer should not necessitate a copy operation. Otherwise, it is no longer light-weight, and therefore violates the spec.

As such, if an implementation were to page out to system memory, I would not consider it a well-thought-out implementation, or even a valid one. VBO's should remain in hardware memory. Some reside in video, others in AGP. But nothing should fall back to memory where the fastest method of transfer is to copy the verts to AGP and then send them. This is simply unreasonable.

Besides, if I understand correctly, textures never get paged out to system memory. The farthest away they get paged is to AGP.


Small light-weight VBOs are an intentional aspect of the API design. They simplify the implementation in a number of really important ways: 1) buffer renaming, 2) buffer mapping, 3) synchronization, 4) buffer migration/duplication.

The point I was making was not the fault of VBO's, but simply a fact of the length of time that glDraw* takes to render anything. If I recall correctly (granted, it was a D3D document, but I assume the performance advice transfers over to GL on the same hardware), the idea is to use as few glDraw* calls as you can for smaller objects, as these calls tend to dominate. Obviously, having low-poly objects means that the performance of glDraw* takes precedence over the actual rendering time of the mesh.


It's not unreasonable use of the API to have several VBOs for each object in your application (e.g. one VBO for position, one VBO for tangent/binormal/normal, one VBO for texture coords, ...).

That's reasonable only to the extent that you're going to be doing some mixing-and-matching of these VBO's. If you aren't, what's the point? It's much easier, not only for the implementation, but for the person using the API, to have as few VBO's as possible per renderable object. Also, if these things ever get paged out to system memory, it is faster, as having few VBO's means that the larger ones have to stick around.


I fail to see how large heavy-weight VBOs with lots of app-managed arrays inside them makes things easier for the driver or the app developer.

I'm not suggesting having a lot of app-managed arrays inside a VBO. What I'm suggesting is that VBO's should be concatenated if at all reasonably possible, and that having 6000+ VBO's seems very much like a poor use of the API, and a poor use of a hardware resource.


There may be static data like texture coordinates and indices, dynamic data like vertices and normals, and intermediate data like L and H vectors. Additionally there is shadow-model data and its corresponding indices.

I have to ask: why are you computing the L&H vectors on the CPU? Isn't that what ARB_vertex_program is for?

I don't know what a "shadowmodel" refers to, but if this relates to stencil shadows, there are ways to use your regular meshes to do shadowing. I don't know too much about these ways, as I prefer the shadow map approach, but they do exist.

Here's a question. Much like glGenTextures, you create some number of buffer objects and then tell OpenGL how much space you want for them. Can you successfully generate, say, 8000 VBO's without allocating room for them? Or is it the actual allocation phase where the implementation reports a problem?
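One way to answer that question empirically (a sketch, again assuming the ARB entry points are available) is to split the two phases and check for errors after each:

enum { N = 8000 };
GLuint ids[N];
int i;

glGenBuffersARB(N, ids);                       /* phase 1: names only, no storage */
printf("after glGenBuffersARB: error = 0x%x\n", glGetError());

for (i = 0; i < N; i++) {                      /* phase 2: attach real storage */
    glBindBufferARB(GL_ARRAY_BUFFER_ARB, ids[i]);
    glBufferDataARB(GL_ARRAY_BUFFER_ARB, 4096, NULL, GL_STATIC_DRAW_ARB);
    if (glGetError() == GL_OUT_OF_MEMORY) {
        printf("allocation failed at buffer %d\n", i);
        break;
    }
}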

cass
05-16-2003, 10:55 AM
Originally posted by Korval:

Actually, it is said somewhere. Because the spec says that glBindBuffer should be a light-weight operation, the binding of a buffer should not necessitate a copy operation. Otherwise, it is no longer light-weight, and therefore violates the spec.


The spec states in the issues section that glBindBuffer should be light-weight. It did not say (as far as I'm aware) that binding of a buffer should not necessitate a copy operation. Using a buffer (not merely binding it) may indeed require the driver to copy it somewhere. This is not a spec violation. Specifically it is an implementation detail that is invisible to the user, and would therefore never be specified.



The point I was making was not the fault of VBO's, but simply a fact of the length of time that glDraw* takes to render anything. If I recall correctly (granted, it was a D3D document, but I assume the performance advice transfers over to GL on the same hardware), the idea is to use as few glDraw* calls as you can for smaller objects, as these calls tend to dominate. Obviously, having low-poly objects means that the performance of glDraw* takes precedence over the actual rendering time of the mesh.


This is true for D3D, but not OpenGL. Draw calls are expensive in D3D because they are done in kernel mode. GL calls are in user mode, and therefore relatively lightweight.

Indeed this is the fundamental difference between the way D3D handles vertex buffers and the way OpenGL decided to do it. Draw calls and gl*Pointer() calls in OpenGL are lightweight, so it makes sense to make VBOs lightweight. In D3D, draw calls and setting up vertex buffers are expensive, so their vertex buffers are larger and heavier-weight, with lots of complex methods for "locking" (or mapping) regions and rules for what you will overwrite and what can be discarded. Most of that complication goes away if you just have small lightweight VBOs.



That's reasonable only to the extent that you're going to be doing some mixing-and-matching of these VBO's. If you aren't, what's the point? It's much easier, not only for the implementation, but for the person using the API, to have as few VBO's as possible per renderable object. Also, if these things ever get paged out to system memory, it is faster, as having few VBO's means that the larger ones have to stick around.


It's not just mixing and matching. It's also when some arrays are dynamic and some are not. You want to keep those separate. In any case, you want to give the driver the opportunity to lay these things out in memory the best possible way. The driver may decide to keep a small area in video memory for static VBOs and move them in/out based on LRU statistics. Further, if you make one large buffer that holds vertex data for many vertex arrays, then you have to be careful how you synchronize updates to that buffer with draw calls. This "accidental" synchronization burden can really hurt performance. With multiple buffer objects, this synchronization burden just goes away.


I'm not suggesting having a lot of app-managed arrays inside a VBO. What I'm suggesting is that VBO's should be concatenated if at all reasonably possible, and that having 6000+ VBO's seems very much like a poor use of the API, and a poor use of a hardware resource.


Again, I simply disagree. You'll wind up getting better performance on NVIDIA drivers if you don't do unnecessary concatenation. Sure, keep arrays that are "interleaved" in the same VBO, but don't spend energy trying to make a VBO heavyweight. Takes more time for the app to do, makes life more complicated for the driver. That's a lose/lose proposition.

Thanks -
Cass




[This message has been edited by cass (edited 05-16-2003).]

Tom Nuydens
05-16-2003, 11:03 AM
Originally posted by Korval:
As such, if an implementation were to page out to system memory, I would not consider it a well-thought-out implementation, or even a valid one. VBO's should remain in hardware memory. Some reside in video, others in AGP. But nothing should fall back to memory where the fastest method of transfer is to copy the verts to AGP and then send them. This is simply unreasonable.

Besides, if I understand correctly, textures never get paged out to system memory. The farthest away they get paged is to AGP.

Sure they do -- try it. I've successfully run apps that used 512 MB worth of textures, which is more than the size of my video memory and my AGP aperture combined.

It seems only natural to me that the drivers are free to put your data wherever they please. The VBO spec explicitly says so in the context of element buffers, but IMHO generalizing this to vertex buffers as well would be the logical thing to do.

If you have so much vertex data that it doesn't fit in video/AGP memory, chances are that you won't need all of it in the course of a single frame anyway. If that's the case, binding a "non-resident" VBO isn't half as bad as you make it sound. The driver could employ an LRU caching scheme just like it does for textures, and transfer the VBO to video memory just once, throwing out the least recently used one to make room.

You say that an implementation that behaves like this would be an invalid one. My opinion is the exact opposite: I think any implementation that doesn't do this is inflicting an unnecessary burden on the developer by forcing him to fall back to a standard vertex array code path.

Note that I'm not taking sides here: I have no idea how either NVIDIA or ATI deal with this situation, so for all I know they could both just report GL_OUT_OF_MEMORY and be done with it.

-- Tom

vincoof
05-16-2003, 11:30 AM
Originally posted by Korval:
I have to ask: why are you computing the L&H vectors on the CPU? Isn't that what ARB_vertex_program is for?


Yes, vertex programs can compute them, but then they have to be computed every time you render the model. If you render the model multiple times while the L&H vectors don't change, you're better off computing them on the CPU once and reusing the values.

You will say that the L&H vectors change very often, since they depend on the camera, the light, and the objects, which can all move, and you'd be right. But even in that case it can still be worthwhile to compute and store the vectors on the CPU side, especially for multipass rendering.

ehart
05-16-2003, 12:53 PM
Just a quick note on VBO sizes. We have made changes in the driver that should improve cases with massive numbers of VBO's in the next couple driver releases.

I still however do not advocate microscopic VBO's, as they will likely be somewhat inefficient. As someone mentioned above, memory management takes resources and typically has allocation granularities. I doubt anyone writes their app to dynamically allocate every chunk of memory individually; typically, block allocations are used when applicable.

-Evan

Ostsol
05-16-2003, 01:52 PM
Will anything be done about the 32 MB limit that one runs into? Even with VBO I still cannot create a single vertex array of more than 32 MB or several arrays whose total is greater than 32 MB. Even with such a limit, shouldn't any data that cannot fit be pushed over into AGP memory? As it is, this does not appear to be the case. . .

cass
05-16-2003, 04:31 PM
Originally posted by ehart:

Just a quick note on VBO sizes. We have made changes in the driver that should improve cases with massive numbers of VBO's in the next couple driver releases.

I still however do not advocate microscopic VBO's, as they will likely be somewhat inefficient. As someone mentioned above, memory management takes resources and typically has allocation granularities. I doubt anyone writes their app to dynamically allocate every chunk of memory individually; typically, block allocations are used when applicable.

-Evan

That's good to hear, Evan. Of course, I agree with the advice on microscopic VBOs. The gl*Pointer() validation overhead becomes dominant below some threshold - this is true even if you have a single VBO with lots of microscopic arrays in it. You should really coalesce microscopic arrays -- or render in immediate mode.
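A sketch of what such coalescing might look like (sizes, data pointers, and vertex counts are hypothetical, and the client-state enables are omitted): two tiny meshes packed into one buffer object and addressed by byte offset.

GLuint packedBuf;
GLintptrARB offsetA = 0;
GLintptrARB offsetB = meshABytes;                /* mesh B packed right after A */

glGenBuffersARB(1, &packedBuf);
glBindBufferARB(GL_ARRAY_BUFFER_ARB, packedBuf);
glBufferDataARB(GL_ARRAY_BUFFER_ARB, meshABytes + meshBBytes,
                NULL, GL_STATIC_DRAW_ARB);       /* allocate once, then fill pieces */
glBufferSubDataARB(GL_ARRAY_BUFFER_ARB, offsetA, meshABytes, meshAVerts);
glBufferSubDataARB(GL_ARRAY_BUFFER_ARB, offsetB, meshBBytes, meshBVerts);

/* Later: draw each small mesh from its offset within the shared buffer. */
glVertexPointer(3, GL_FLOAT, 0, (const GLvoid *)offsetA);
glDrawArrays(GL_TRIANGLES, 0, meshAVertCount);
glVertexPointer(3, GL_FLOAT, 0, (const GLvoid *)offsetB);
glDrawArrays(GL_TRIANGLES, 0, meshBVertCount);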

Thanks -
Cass

Humus
05-16-2003, 05:21 PM
Originally posted by velco:
No new GNU/Linux drivers though ... again http://www.opengl.org/discussion_boards/ubb/frown.gif

~velco

There are new Linux drivers for ATi cards here: http://www.schneider-digital.de/html/download_ati.html

Unfortunately they don't support VBO, but they are much better than the previous release.

Korval
05-20-2003, 09:47 AM
This is true for D3D, but not OpenGL. Draw calls are expensive in D3D because they are done in kernel mode. GL calls are in user mode, and therefore relatively lightweight.

I had always assumed that the paper was discussing a hardware problem, not a software/API one. Granted, until I read that paper, I had assumed that a glDrawElements call without indices in AGP/video memory would only do a quick index copy to AGP, add a few tokens to the card's command stream, and return, which, of course, didn't account for the problems with D3D's calls. I figured that, for reasons that would require intimate hardware knowledge, there needed to be some explicit synchronization event or something of that nature.

Hmmm... this changes much...


Specifically it is an implementation detail that is invisible to the user, and would therefore never be specified.

What I would like to have is consistent performance. Regular vertex arrays do give consistent performance... consistently slow.

I would much rather see the driver throw an error or something than have it page VBO's out to system memory. Why? Because that does me little good.

One of the primary purposes behind extensions like VBO is to prevent that system-to-AGP/video memory copy that takes place with regular vertex arrays. Now, you're basically saying that VBO may, or may not, prevent that copy. It all depends.

It would be very nice if there was an explicit way to let the driver know not to page out VBO's to system memory.


In any case, you want to give the driver the opportunity to lay these things out in memory the best possible way.

The best possible way to lay things out is to put all static VBOs into video memory and all non-static ones into AGP. Putting either into system memory does precious little for performance.


If you render the model multiple times while the L&H vectors don't change, you're better off computing them on the CPU once and reusing the values.

I don't know about that. By computing them on the GPU:

1) you save the bandwidth of sending them. That's 6 fewer floats, or 24 fewer bytes, per vertex. This bandwidth could go to more texture fetches

2) you get more consistent performance. The worst case of the CPU approach is (likely) worse than the worst case of the GPU approach. Obviously, the best-case CPU is better than the best-case GPU (for vertex T&L, not transfer), and the worst-case GPU is the same as the best-case GPU. So, while you may be getting less performance, you're at least getting consistent performance per frame, which is often better than sometimes-good/sometimes-bad performance.

3) you don't have to create dynamic or streaming VBO's. They can all be purely static data. And, therein, lies the possibility for greater vertex throughput (or, at least, more vertex bandwidth).

4) you get more time on the CPU for those sorts of tasks.

5) GPU's get faster faster than CPU's. As such, relying now on shader performance makes things easier in the future.


Will anything be done about the 32 MB limit that one runs into? Even with VBO I still cannot create a single vertex array of more than 32 MB or several arrays whose total is greater than 32 MB.

That, I find to be completely unacceptable. I can live with only being able to allocate, at most, a few thousand VBOs, but not being able to allocate more than 32 total MB of memory? No, that is just unacceptable and must be rectified.

[This message has been edited by Korval (edited 05-20-2003).]

NitroGL
05-20-2003, 09:58 AM
Originally posted by Korval:
That, I find to be completely unacceptable. I can live with only being able to allocate, at most, a few thousand VBOs, but not being able to allocate more than 32 total MB of memory? No, that is just unacceptable and must be rectified.

I just tried allocating a 32MB VBO, and it works just fine. 64MB crashed the GL ICD, but that's because it ran out of memory (my app uses a LOT of texture memory).

Edit:
That's with the latest 3.4 drivers BTW.

[This message has been edited by NitroGL (edited 05-20-2003).]

Tom Nuydens
05-20-2003, 10:16 AM
Originally posted by NitroGL:
I just tried allocating a 32MB VBO, and it works just fine. 64MB crashed the GL ICD, but that's because it ran out of memory (my app uses a LOT of texture memory).

First of all it shouldn't crash your ICD, it should report GL_OUT_OF_MEMORY.

Second, that indicates a big difference between ATI's implementation and NVIDIA's. For laughs, I tried to allocate VBOs in a loop to see how far it would go, and it kept going without GL errors until Windows told me it was going to resize the swap file to make room http://192.48.159.181/discussion_boards/ubb/smile.gif

-- Tom

NitroGL
05-20-2003, 02:06 PM
Originally posted by Tom Nuydens:
First of all it shouldn't crash your ICD, it should report GL_OUT_OF_MEMORY.

It does, but my program doesn't stop on out of memory errors. It crashes when I try to get a pointer to the memory (my program = dumb imp http://192.48.159.181/discussion_boards/ubb/smile.gif).
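For what it's worth, that particular crash can be guarded against: if the allocation failed with GL_OUT_OF_MEMORY, glMapBufferARB should return NULL, so checking both before writing avoids dereferencing a bad pointer. A minimal sketch, with a hypothetical already-generated buffer name bigBuf:

glBindBufferARB(GL_ARRAY_BUFFER_ARB, bigBuf);
glBufferDataARB(GL_ARRAY_BUFFER_ARB, 64 * 1024 * 1024, NULL,
                GL_STATIC_DRAW_ARB);

if (glGetError() != GL_NO_ERROR) {
    /* allocation failed: shrink the request or fall back to plain vertex arrays */
} else {
    void *ptr = glMapBufferARB(GL_ARRAY_BUFFER_ARB, GL_WRITE_ONLY_ARB);
    if (ptr != NULL) {
        /* ... fill vertex data ... */
        glUnmapBufferARB(GL_ARRAY_BUFFER_ARB);
    }
}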

Tom Nuydens
05-20-2003, 11:10 PM
Originally posted by NitroGL:
It does, but my program doesn't stop on out of memory errors. It crashes when I try to get a pointer to the memory (my program = dumb imp http://192.48.159.181/discussion_boards/ubb/smile.gif).

Ah, good news for the rest of us http://192.48.159.181/discussion_boards/ubb/smile.gif

I still think it's illogical not to let VBOs spill over into system memory, though. Did ATI_vertex_array_object exhibit the same behaviour? Does it work if you allocate those 64 MB in several small VBOs instead of a single large one?

-- Tom

[This message has been edited by Tom Nuydens (edited 05-21-2003).]

vincoof
05-21-2003, 03:45 AM
Originally posted by Korval:
GPU's get faster faster than CPU's

Yup, you're right. I'm not saying that loading up the CPU is always a good thing. I'm just explaining why it's not such a bad idea in certain cases.

ehart
05-21-2003, 07:29 AM
Just a quick follow-up to my earlier post. I believe the size cap issue is also resolved for future drivers.

-Evan

Ostsol
05-21-2003, 10:30 AM
Originally posted by ehart:

Just a quick follow-up to my earlier post. I believe the size cap issue is also resolved for future drivers.

-Evan
Ah. . . that's great news, too! http://192.48.159.181/discussion_boards/ubb/cool.gif Thanks!

Korval
05-21-2003, 10:54 AM
I believe the size cap issue is also resolved for future drivers.

Didn't you guys just release a driver last week? Considering that it "is resolved" currently, how is it not in current drivers?

Humus
05-21-2003, 11:06 AM
WHQL takes time unfortunately.

Edit: kehziah beat me to it.

[This message has been edited by Humus (edited 05-21-2003).]

kehziah
05-21-2003, 11:06 AM
Just a guess: the drivers that were released last week left the dev department several weeks ago; they had to get through the whole QA and WHQL thing before being released to the public. The issues mentioned here were not fixed at that time.

skynet
05-21-2003, 01:07 PM
Didn't you guys just release a driver last week? Considering that it "is resolved" currently, how is it not in current drivers?


I really hope we don't have to wait another TWO months until new drivers get out. I don't care much about WHQL anyway, since it doesn't check OpenGL (extension) conformance. Releasing non-WHQL drivers (even beta ones, like those infamous "leaked" Detonators from nVidia) would help both us and ATI: they get faster feedback and we get faster fixes!

Kaboodles
05-21-2003, 02:56 PM
Originally posted by ehart:

I still however do not advocate microscopic VBO's, as they will likely be somewhat inefficient.


How small is microscopic?


Pete