PDA

View Full Version : Translations within one batch



holdeWaldfee
06-06-2006, 01:39 PM
I am currently looking into ways to translate subsets of vertices within one draw batch, in order to reduce the number of batches.
But I couldn't find much useful documentation on this yet.
So it would be nice if you could share your sources (online, books, etc.) on the topic. Thanks!

I find it very frustrating that there is still no straightforward technology for this.
Especially considering that it is an increasingly fundamental performance problem.
Something like hardware-/API-handled translation trees would be extremely useful.
Using vertex programs for this is just a mess, in my opinion.


By the way, do you know about any plans by ATI/NV to implement 32-bit depth-buffer support?
It really sucks that we still have to use more than eight-year-old (!) precision that is comparable to a -100 dioptre sight.


But HEY! Why worry about things that really matter when we have cool-sounding stuff like HDR!?

Obli
06-06-2006, 01:55 PM
Originally posted by holdeWaldfee:
I am looking into the possibilities to translate subsets of vertices within one draw batch right now in order to reduce the amount of batches.

Depending on the context, you could pre-transform this or add another stream containing compressed per-vertex displacement. What are you trying to do?

Originally posted by holdeWaldfee:
Especially considering that this is an increasing fundamental performance problem.
Something like Hardware-/API handled translation trees would be extremely useful.
Using vertex programs for this is just a mess in my opinion.

If you mean something like instancing, I believe it's fundamentally different from your problem (and I'm not sure what you're doing).

Originally posted by holdeWaldfee:
By the way, do you know about any plans by ATI/NV to implement 32 Bit depth-buffer support?
It really sucks that we still have to use over 8 year old (!) precision that is equal to a -100 dioptres sight.

I agree a higher-precision Z-buffer would be great, considering other parts of the pipe have been widely overhauled, but I fear this isn't going to happen soon. It's likely you'll have 48-bit Z when GPUs start supporting double precision, but I don't think this is on the horizon.

Originally posted by holdeWaldfee:
But HEY! Why worry about things which really matter if we have cool sounding stuff like HDR!?

In my opinion, HDR is one of the most important things recently introduced. I suggest you take a look at High Dynamic Range Imaging by Debevec and others. I believe HDR improves the user experience far more than a higher-precision Z would.

holdeWaldfee
06-06-2006, 02:06 PM
To explain a bit better what I meant:

Right now I have to break up a model into a lot of batches, because I want to buffer the vertex data on the graphics card without updating modified vertices (for performance reasons).
This results in several tens of thousands of batches in my planned scenes - and that's a huge performance problem, as you can imagine.

What I want is one draw call for a whole model.
Right now I would have to use a vertex program to make this possible.
But I haven't found a good way to do this yet.

holdeWaldfee
06-06-2006, 02:15 PM
In my opinion, HDR is one of the most important things recently introduced. I suggest you to take a look at High Dynamic Range Imaging by Debevec and others. I believe HDR improves the user experience much more than a higher precision Z by sure means.

Well, I disagree. ;-)

HDR only got a good reputation with users because the first implementations of bloom effects came together with HDR. Comparing HDR and non-HDR scenes with equal content would produce a lot of yawning users, in my opinion.

zeoverlord
06-06-2006, 05:47 PM
Originally posted by holdeWaldfee:

In my opinion, HDR is one of the most important things recently introduced. I suggest you to take a look at High Dynamic Range Imaging by Debevec and others. I believe HDR improves the user experience much more than a higher precision Z by sure means.Well, I disagree. ;-)

HDR only got a good reputation by users because the first implementations of bloom effects came together with HDR. Differing between HDR- and non-HDR scenes with equal content would cause a lot of yawning users in my opinion.

We haven't even begun to see the full potential of HDR yet.
And yes, with equal content the difference between HDR and non-HDR is not that great, but if you make HDR-only content and effects, then you will start seeing amazing stuff.

32-bit depth buffer support - don't they already have that? Either way, unless you totally screw up the zfar/znear ratio, even a 1024-bit z-buffer won't make it look better.

ZbuffeR
06-06-2006, 06:17 PM
I don't think there are >24 bit zbuffers (on consumer/gamer video cards).

holdeWaldfee
06-06-2006, 06:27 PM
32bit depthbuffer support, don't they allready have that?

They only have 24-bit depth + 8-bit stencil.


, either way, unless you totaly screw up the zfar/znear ratio, even a 1024bit z-buffer won't make it look better.

It is an exponential increase in precision - each extra bit doubles the number of depth steps. So it would matter.

32-bit depth precision would finally allow a natural viewing range on non-billboard objects without very noticeable z-fighting.

Humus
06-06-2006, 08:37 PM
More than 24bit Z isn't very useful as long as the vertex pipe is only FP32. That's a 23bit mantissa.
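Humus's figure is easy to check: float32 has a 23-bit mantissa, so positions computed in an FP32 vertex pipe carry at best about 1 part in 2^23 of relative precision, which caps what extra depth-buffer bits could buy. A quick sketch (assuming NumPy is available):

```python
import numpy as np

# ULP spacing of float32 at 1.0 is the machine epsilon, 2^-23:
# the relative precision cap set by a 23-bit mantissa.
eps32 = float(np.spacing(np.float32(1.0)))

# A vertex coordinate near 1000 (metres, say) is representable only
# to about 2^-14 ~ 0.06 mm - depth derived from it can't do better.
step_at_1000 = float(np.spacing(np.float32(1000.0)))

print(eps32, step_at_1000)
```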

Brolingstanz
06-06-2006, 08:52 PM
And how much will a 32 bit zbuffer cost? You're talking about a different architecture here, one that is likely more expensive to fabricate than the current one. Not usually a good thing for consumer boards.

Plus, you don't have industry heavyweights lobbying for this, least not as far as I can tell. You need to get big ISVs doing the one-legged 32bit zbuffer rain dance (if the price isn't right, might not make any difference).

Zulfiqar Malik
06-07-2006, 04:05 AM
I seriously believe that although many of the recent advancements in in-game visuals look pretty nice, they are not an alternative to good gameplay. If HDR were used in some smart gameplay, like HL2 used physics to improve its gameplay (rather than just having dumb ragdolls lying all around), I would be much happier.
And yes, I want a 32-bit depth buffer, but I also agree that a 24-bit depth buffer is sufficient for more than 95% of tasks, especially on consumer-grade hardware.

holdeWaldfee
06-07-2006, 09:48 AM
The development of gameplay is clearly going towards much larger scenes with longer view distances.
Try to draw an object with layered faces (like trees) at znear*2000 and you definitely get very noticeable z-fighting.
You can see this in countless games, where artists ran into the problem and had to remove content because of it.


Anyway, can anyone help me out a bit on the translation issue?

Thanks!

tamlin
06-07-2006, 12:29 PM
Could the effect you need perhaps be accomplished using vertex_blend (http://oss.sgi.com/projects/ogl-sample/registry/ARB/vertex_blend.txt) ?

Overmind
06-08-2006, 08:15 AM
You can't compare HDR and non-HDR with equal content.

That's the whole point with HDR, that you can have arbitrary content (in terms of light sources), while without HDR you have to design around the limitations (e.g. avoid saturation).

And content specifically designed for HDR (without the "old" limitations) looks a lot better. Especially, it looks a lot better near the camera. IMHO this should be more important than optimizing far away parts of the scene.

HDR is not about making existing scenes look better, it's about removing design limitations, so the issue is not so different from increasing the depth buffer precision. Things like bloom or diffuse environment mapping are side effects.

Of course I'm not saying that we don't need higher precision depth buffer, I'm just saying where the priorities should be. I'd rather see fast float MRT implementations (preferably 32 bit with blending) so things like deferred shading (for extremely high light source count, again lifting restrictions on the content designers) become feasible, than having a better depth buffer, which will "just" increase possible view distance.

holdeWaldfee
07-11-2006, 06:42 AM
Originally posted by tamlin:
Could the effect you need perhaps be accomplished using vertex_blend (http://oss.sgi.com/projects/ogl-sample/registry/ARB/vertex_blend.txt) ?

Thanks for bringing this back to my mind.

It seems, however, that this old extension has rather poor IHV support.

holdeWaldfee
07-11-2006, 07:27 AM
I am not saying that things like HDR are useless.
My opinion however is that there are much more important things to do right now.

You can retouch graphics with fragment programs and whatnot all you want; it doesn't make fundamental flaws like z-fighting go away.
Look at new games like Oblivion, Battlefield 2, Crysis and so on. They all have really bad z-fighting problems, even though the artists already limited their content to reduce them.

Why not start with the fundamental things?

Humus
07-11-2006, 08:09 PM
Do you have any screenshots of that? I'm not aware of any Z-fighting problems in any of those games. In fact, I can't remember when I last saw a game have Z-fighting problems. But I'm sure you'll be delighted about the DXGI_FORMAT_D32_FLOAT format that's in DX10.

And yeah, HDR >> 32bit Z :)
I've even started playing with HDR in photography. It's wicked cool!

holdeWaldfee
07-12-2006, 12:43 PM
Originally posted by Humus:
Do you have any screenshots of that? I'm not aware of any Z-fighting problems in any of those games. In fact, I can't remember when I last saw a game have Z-fighting problems.

For example in Battlefield2:

http://img377.imageshack.us/img377/3070/14kk1.jpg (http://imageshack.us)
http://img163.imageshack.us/img163/9223/27by4.jpg (http://imageshack.us)
http://img163.imageshack.us/img163/948/37kv2.jpg (http://imageshack.us)

It's far worse in motion of course.

Try to draw things like detailed trees at high ranges.
You get horrible z-fighting.

It is a HUGE issue if you want to make a game with wide open scenes (and the market demands this more and more).
And it becomes even worse if you want to put more detail into the scenes.


But I'm sure you'll be delighted about the DXGI_FORMAT_D32_FLOAT format that's in DX10.

I don't have much information about D3D10.
Does this really result in higher depth-buffer precision?
Will it be available for OGL too?

knackered
07-12-2006, 02:53 PM
those screenshots have one thing in common, can you spot it children?
there are tricks to avoid these problems. For instance, if you're going to render with a very narrow field of view, you should apply a uniform scale to the scene to bring distant objects into a higher-precision part of the depth range (seeing as any z-fighting on foreground objects isn't going to be noticed at that point).

RigidBody
07-12-2006, 02:59 PM
Originally posted by knackered:
those screenshots have one thing in common, can you spot it children?

yes. they are all displayed two posts above this one. eh, eh, eeeehhh... :D

c'mon don't make it so thrilling, knackered---TELL ME WHAT IT IS!!!

knackered
07-12-2006, 03:02 PM
they are all SNIPER ZOOOOOOOOMED.

RigidBody
07-12-2006, 03:04 PM
oh. yes. really, they are.

i have to admit i liked the sniper mode in unreal tournament. but then, in the level with the two towers floating in space, i blew so many heads that it gave me nightmares...

RigidBody
07-12-2006, 03:11 PM
and by the way: if you see something like the above through only one of your eyes, you certainly shouldn't shout that much ;)

<whisper on> snaaaaiiiiiiipeeeeeeeeeer mooooooooouuuuuuuuud<whisper off>

holdeWaldfee
07-12-2006, 06:09 PM
Eh? You get the same problems with a normal FOV.

holdeWaldfee
07-12-2006, 06:51 PM
Take a tree with foliage made of more than 1000 triangles.
Draw it at z-near * 10000 (z-near = 10 cm, tree at 1000 m) -> totally ugly, horrible z-fighting!

Not to mention intersecting or layered terrain meshes.


It IS a HUGE limitation if you want to create open scenes with longer view distances.
And that is definitely a huge market, because many people are just sick of shoe-box shooter games.

holdeWaldfee
07-12-2006, 07:06 PM
Originally posted by knackered:
those screenshots have one thing in common, can you spot it children?
there are tricks to avoid these problems, like if you're going to render with a very narrow field of view you should apply a uniform scale to the scene to bring distant objects into a higher precision part of the depth range (seeing as though any z fighting on foreground objects isn't going to be noticed at that point).

It wouldn't solve the problem as described anyway - but isn't it the case that w-buffering isn't supported in OGL?

Won
07-12-2006, 07:51 PM
Does OpenGL specify the depth buffering implementation? In any case, I don't think any modern hardware uses w-buffering, and I think it was removed from D3D (shooting from the hip...could be wrong).

So z-buffering has this problem where most of the precision is piled up near the near plane, depending on how close it is to the eye point. You don't effectively address this by throwing more bits at it. What you really want is a floating-point depth representation.

But doing the naive thing (mapping 0.0 to zNear) is actually really bad, because floating point values have most of their precision near 0.0, compounding the problem. You really want to map zNear to 1.0 and zFar to 0.0 (effectively, a 1-z buffer). If you do this, 16-bit floats could work well. You'd probably want to dispense with the sign bit and tweak the representation in other ways (e.g. exponent bias).

-Won

PS It should be self-evident (without even considering the overused HDR effect du jour) that 8 bits per color channel is pretty paltry. For one, it prevents you from using a linear representation for color.
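Won's reasoning can be illustrated numerically. A standard projection crowds its output toward 1.0 at the far plane, exactly where float spacing is coarsest, while the reversed ("1-z") mapping puts far depths near 0.0, where floats are densest, so the two non-linearities roughly cancel. A sketch (the plane distances are made-up numbers, and real hardware quantization differs in detail; NumPy assumed):

```python
import numpy as np

near, far = 0.1, 10000.0  # made-up plane distances

def depth_standard(z):
    # Conventional mapping: znear -> 0.0, zfar -> 1.0.
    return (far / (far - near)) * (1.0 - near / z)

def depth_reversed(z):
    # Reversed ("1-z") mapping: znear -> 1.0, zfar -> 0.0.
    return 1.0 - depth_standard(z)

z = 9000.0  # a distant surface
d_std = np.float32(depth_standard(z))
d_rev = np.float32(depth_reversed(z))

# ULP spacing of the stored float32 value: smaller = finer depth steps.
print(np.spacing(d_std), np.spacing(d_rev))
```

The reversed value sits near 0.0, where float32 steps are orders of magnitude finer than near 1.0, which is exactly why a float "1-z" buffer works so well.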

Korval
07-12-2006, 08:31 PM
You really want to map zNear to 1.0 and zFar to 0.0 (effectively, a 1-z buffer).

Know what's worse than seeing z-fighting at a distance? Seeing it up close.

I prefer it at a distance.

In any case, this can be "solved" by simply dividing the world up into 2 passes: one for the stuff far away, and one for the stuff closer. Obviously, a z-buffer clear would happen between them. And you'd need a clipping plane to guarantee that nothing unpleasant happens between the two images. And you couldn't use the whole range of the forward plane.

zed
07-12-2006, 09:04 PM
Try to draw a object with layered faces (like for trees) at znear*2000 and you definitely get very noticable z-fighting.

with 24bit depth
near = 1 meter
far = 1,000,000 meters

Z = 2000 == ~0.2 meter resolution. I doubt that at 2 km distance you can notice 20 cm. True, billboarded stuff might not look 100% OK, but then again billboards are a hack anyway, so what do you expect?
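zed's arithmetic checks out. For a standard fixed-point z-buffer the stored value is d(z) = (far/(far-near)) * (1 - near/z), so one quantization step corresponds to an eye-space step of roughly z² (far-near)/(far·near·2^bits). A quick sketch (the formula is an approximation of the hardware behaviour):

```python
def zbuffer_resolution(z, near, far, bits=24):
    """Approximate eye-space resolution of a fixed-point z-buffer at depth z.

    One quantization step of d(z) = (far/(far-near)) * (1 - near/z)
    corresponds to roughly z^2 * (far - near) / (far * near * 2^bits).
    """
    return z * z * (far - near) / (far * near * float(2 ** bits))

# zed's numbers: near = 1 m, far = 1,000,000 m, surface at 2000 m.
res = zbuffer_resolution(2000.0, 1.0, 1.0e6)
print(round(res, 3))  # ~0.238 m, i.e. roughly the quoted 0.2 m

# With the 10 cm near plane mentioned elsewhere in the thread, it is ~10x worse.
res_near10cm = zbuffer_resolution(2000.0, 0.1, 1.0e6)
print(round(res_near10cm, 2))  # ~2.38 m
```

Note how the result is dominated by the near-plane distance, which is why both sides of this argument can be right depending on how close the camera may get.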

rgpc
07-12-2006, 10:48 PM
Originally posted by holdeWaldfee:
For example in Battlefield2:
I'm curious - what hardware/drivers are you seeing these issues on? I play BF2 quite a bit - and mostly as a sniper and I've never noticed the zfighting before. Maybe it's the detail level I have (I currently use a 5900 - so I have the detail turned down - but I did have it turned up on the w'end and didn't notice any issues).

That being said, the great thing about BF2 is the gameplay (I enjoy it without the eye candy). It has significant numbers of animated objects and performs well even on older hardware. I somehow doubt that doing translations within a single batch would be too much of an issue for the BF2 developers.

However, I would think that you could do transformations in the same way that they are done with skeletal animations (through vertex programs). I haven't ever done this but I would think that it would do what you want.

holdeWaldfee
07-13-2006, 06:32 AM
Originally posted by Korval:
In any case, this can be "solved" by simply dividing the world up into 2 passes: one for the stuff far away, and one for stuff closer. Obviously, a z-buffer clear would happen between them. And you'd need a clipping plane to guarente that nothing unpleasant happens between the two images. And you couldn't use the whole range of the forward plane.

I often heard this suggestion in forums, and I even tried it out in our project.

But what I get is very bad z-fighting at the seam of the two regions.
Did anyone get good results with this?

PS.: No problems with cockpits and stuff, but terrain scenes really seem to be problematic.

holdeWaldfee
07-13-2006, 06:48 AM
Originally posted by zed:
with 24bit depth
near = 1 meter
far = 1,000,000 meters

Z = 2000 == ~0.2 meter resolution, i doubt at 2km distance u can notice 20cm distance. true billboarded stuff mightnt look 100% ok, but then again billboards are a hack anyway so what do u expect

Z-near can't be 1 meter. That would cause very bad clipping when the camera gets close to objects. 10-20 cm is the maximum.

Try to make a scene where you "add" water to the terrain, so that there is ground below the water surface.
Now fly above this at a distance of 2 km --> horrible z-fighting between the water and terrain meshes.

Of course, you could always apply horrible LOD hacks to reduce the problem, but then you would need a lot of them, because there are MANY different cases where this happens.

holdeWaldfee
07-13-2006, 07:52 AM
Originally posted by rgpc:
I'm curious - what hardware/drivers are you seeing these issues on?

Radeon X800 something. It is driver independent.


I somehow doubt that doing translations within a single batch would be too much of an issue for the BF2 developers.

One tank in BF2 has around ~50 translation matrices.
So that means 50 additional batches per instance (!) just because of the movable parts.

And we want much more detailed vehicles in the future.


However, I would think that you could do transformations in the same way that they are done with skeletal animations (through vertex programs). I haven't ever done this but I would think that it would do what you want.

Yes, that's matrix paletting.
I find it quite inefficient to have to specify the matrix index for every single vertex, though.
It would be really good to have straightforward API functionality to handle this problem.

knackered
07-13-2006, 07:54 AM
Originally posted by holdeWaldfee:

Originally posted by knackered:
those screenshots have one thing in common, can you spot it children?
there are tricks to avoid these problems, like if you're going to render with a very narrow field of view you should apply a uniform scale to the scene to bring distant objects into a higher precision part of the depth range (seeing as though any z fighting on foreground objects isn't going to be noticed at that point). It wouldn't solve the problem anyway as described, but isn't it that w-buffering isn't supported in OGL?

Well, the thing is, Mr. holdeWaldfee, that it does solve the problem as described; otherwise I would not have wasted my time suggesting it.

holdeWaldfee
07-13-2006, 08:05 AM
Originally posted by knackered:
well the thing is, Mr. HoldeWaldfee, is that it does solve the problem as described, otherwise I would not have wasted my time suggesting it.

The problem happens with a normal FOV too, so this wouldn't solve it. It might reduce the effect when you have a narrow FOV.

And I thought w-buffering isn't supported very well under OGL?

knackered
07-13-2006, 08:12 AM
w-buffering isn't supported at all under OpenGL.
The best you can do is squash distant objects into higher-precision parts of the z-buffer. Most outdoor scenes don't have very intricate models (such as a teapot) in them, so you can afford to lose some foreground precision.
You basically want to keep your units as metres, but get better precision.

holdeWaldfee
07-13-2006, 08:27 AM
What do you mean by this?
How could I move distant objects closer to z-near?
By doing custom projection stuff?

knackered
07-13-2006, 09:07 AM
something like:-

glMatrixMode(GL_PROJECTION);
glPushMatrix();
glScalef(0.1f, 0.1f, 0.1f);
glMatrixMode(GL_MODELVIEW);
glPushMatrix();
glScalef(0.1f, 0.1f, 0.1f);
glMultMatrixf(m_cameraTransform);
drawScene();
glMatrixMode(GL_PROJECTION);
glPopMatrix();
glMatrixMode(GL_MODELVIEW);
glPopMatrix();

everything gets smaller, and nearer, but your camera also moves slower.
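Why the trick helps (a simplified model: the geometry is scaled down while the projection's depth range stays fixed): z-buffer resolution degrades roughly with z², so scaling the scene by s moves a surface to depth s·z where the resolution is s² times finer, while the separation between two nearby surfaces only shrinks by s - a net s-fold win. A numeric sketch with made-up distances:

```python
def depth_value(z, near, far):
    # Normalized output of a standard z-buffer mapping (znear -> 0, zfar -> 1).
    return (far / (far - near)) * (1.0 - near / z)

def quantized(z, near, far, bits=24):
    # Which of the 2^bits discrete depth steps the surface lands on.
    return round(depth_value(z, near, far) * (2 ** bits - 1))

near, far = 1.0, 1.0e6

# Unscaled: two surfaces 0.2 m apart at 2000 m land on (almost) the same step.
steps_unscaled = quantized(2000.2, near, far) - quantized(2000.0, near, far)

# Scene scaled by 0.1, depth range unchanged: the pair sits at 200.0 / 200.02 m.
steps_scaled = quantized(200.02, near, far) - quantized(200.0, near, far)

print(steps_unscaled, steps_scaled)
```

The scaled pair ends up several quantization steps apart where the unscaled pair shared a step - which is the z-fighting going away. The catch, as discussed below, is what the scale does to near-plane clipping and foreground precision.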

holdeWaldfee
07-13-2006, 09:41 AM
Hmm...

I think this would cause problems if you have a scene with a horizon, since you can't apply this to the terrain.
But I am sure it would help for a game in space.

knackered
07-13-2006, 11:02 AM
not with you - why would it be different for a terrain?
the far clip plane gets scaled inwards too, so everything is still clipped in the same place.
this works fine in my flight simulators; the only problem I came across was z-fighting in the cockpit, where I have co-planar polys all over the shop... the simple fix for that was to draw the cockpit in a second pass, reversing the depth test.
God in heaven, why don't you just try it in your application... I take it you've got an application to try it in?

holdeWaldfee
07-13-2006, 11:32 AM
I will test this tomorrow.

I can't figure out how this could help.
What is the point if everything is scaled down?
Wouldn't the ratio between znear and zfar stay the same?

rgpc
07-13-2006, 05:46 PM
Originally posted by holdeWaldfee:
One tank in BF2 has around ~50 translation matrices.
So that means 50 additional batches per instance (!) just because of the movable parts.

And we want much more detailed vehicles in the future.

50! I think you're a bit (a lot) over-enthusiastic there.

holdeWaldfee
07-14-2006, 05:57 AM
For the M1A2:

18 drive wheels
20 track segments
1 turret
1 main gun mantlet
1 main gun barrel
1 commander hatch
1 commander gun
10 antenna segments

Komat
07-14-2006, 08:04 AM
That can be solved by using HW skinning (one bone for each part) instead of having the parts drawn by separate calls. With a sufficient number of vertex constants, the entire tank can be drawn in one batch.
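Komat's suggestion - one rigid "bone" per movable part, with each vertex carrying the palette index of its part's matrix so the whole tank goes down in one draw call - can be sketched on the CPU side with NumPy. In practice the same lookup would live in a vertex program with the palette in constant registers; all names here are illustrative:

```python
import numpy as np

def translation(tx, ty, tz):
    m = np.eye(4)
    m[:3, 3] = [tx, ty, tz]
    return m

# Matrix palette: one rigid transform per movable part (hull, turret, ...).
palette = np.stack([
    np.eye(4),                  # 0: hull (static)
    translation(0.0, 0.5, 0.0)  # 1: turret, raised half a metre
])

# Vertex data: homogeneous position plus the palette index of its part.
positions = np.array([
    [0.0, 0.0, 0.0, 1.0],   # hull vertex
    [1.0, 1.0, 0.0, 1.0],   # turret vertex
])
bone_index = np.array([0, 1])

# One "draw call": every vertex is transformed by its own part's matrix.
skinned = np.einsum('vij,vj->vi', palette[bone_index], positions)
print(skinned[1][:3])  # turret vertex moved by its part's matrix
```

The per-vertex index is exactly the extra attribute holdeWaldfee objects to below; the palette itself is shared by the whole batch.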

holdeWaldfee
07-14-2006, 09:43 AM
Yup, I tried to implement this.
But the problem was that my meshes use indexed vertices.
So it is possible that a vertex is linked to multiple translation matrices.

Does anyone know about good solutions for this?

Tin Whisker
07-14-2006, 10:00 AM
Maybe you need to separate your meshes into transform groups. If the meshes are individually transformable, they probably shouldn't be sharing vertices to start with, being separate pieces and all.

Is that what you mean? :)

holdeWaldfee
07-14-2006, 10:14 AM
This would work of course.
I would then have redundant vertices, however, which is what I wanted to avoid.

It really would be cool if it were possible to find out the index value of the current vertex inside the vertex program.
You could then get rid of the per-vertex matrix index data and specify index ranges for the sub-objects.
And you wouldn't need redundant vertices anymore.

Tin Whisker
07-14-2006, 10:38 AM
Yea, I suppose. But how many vertices are actually shared in the case of the tank? Each part seems to me to be logically distinct. Although you may have something special going on that we're not aware of.

I think the next gen hardware will be capable of doing some limited index manipulation in the vertex shader stage. Don't know much about it, save what I've gleaned from this and that.

Komat
07-14-2006, 02:03 PM
Originally posted by Tin Whisker:
Yea, I suppose. But how many vertices are actually shared in the case of the tank? Each part seems to me to be logically distinct. Although you may have something special going on that we're not aware of.
One thing that may reasonably share vertices in an indexed mesh is parts that are used multiple times on a single tank, like wheels or track segments.

zed
07-15-2006, 07:10 AM
For the M1A2:

18 drive wheels
20 track segments
1 turret
1 main gun mantlet
1 main gun barrel
1 commander hatch
1 commander gun
10 antenna segments

As Komat pointed out, you do the vertex calculations yourself. Thus that tank will be drawn in one draw call, not 50 (draw calls are defined by material, not vertex movement).
Your options are: calc on the GPU or the CPU.

One thing that may reasonably share vertices in indexed mesh are the parts that are used a multiple times on single tank like wheels or track segments.

Yes, but artists don't make a tank with just one wheel (nitpick, if you can call it that) plus transformation data; they will in fact model the tank complete.

Komat
07-15-2006, 09:48 PM
Originally posted by zed:
yes but artists dont make a tank with just one wheel (nitpick, if u can call it that) plus transformation data, they will in fact model the tank complete

They usually don't do that explicitly. However, if the modeling package keeps track of instancing done during the modeling process (e.g. that all wheels were created as instances of the first wheel), the exporting tool may take advantage of this information and store the wheel geometry only once.

Korval
07-16-2006, 12:54 PM
They usually dont do that explicitly however if the modeling package keeps track about instancing done during the modeling process (e.g. that all wheels were created as instances of the first wheel), the exporting tool may take advantage of this information to store the wheel geometry only once.

Do you really think that that's going to buy something, performance-wise? Vertices will still have to be T&L'ed. Fragments will still have to be shaded. And the cost of instancing (by throwing on a new batch for the wheels) will probably outweigh any minor performance improvements from bandwidth savings.

Komat
07-16-2006, 04:23 PM
Originally posted by Korval:
Do you really think that that's going to buy something, performance-wise?

Performance-wise, no. The memory savings from instancing are the important thing. Although it is likely not the case for the tank mentioned here, the difference between having, say, 10 MB of geometry for instanced data and 60 MB for non-instanced data may be important.

tanzanite
07-17-2006, 06:23 AM
OT:

Originally posted by knackered:
glMatrixMode(GL_PROJECTION);
glPushMatrix();
glScalef(0.1f, 0.1f, 0.1f);
/.../

Sorry to jump in OT, but... HOW does that help? I mean, I kind of trust you that it works, but WHY? As I have often had precision problems, I have often tried to find something to solve or mitigate them - with the result that nothing could be done by just adjusting the numbers. Now I tried this again, on paper (I can't do it in GL for quite some time), and it didn't help at all. (a: I'm bad at math, b: limited-precision math doing something weird, c: ???)

Could you explain?

Tin Whisker
07-17-2006, 01:11 PM
One thing that may reasonably share vertices in indexed mesh are the parts that are used a multiple times on single tank like wheels or track segments.

Totally.

I think we're having 2 different conversations here, one about mesh vertex duplication and one about mesh instancing. I was referring to the former and your comment seems to be in regard to the latter. "Sharing vertices" to me means 2 distinct meshes having vertices in common that could be merged or welded, materials and other considerations aside. Obviously wheels and treads or any other exact duplicate of geometry is a candidate for instancing.

Komat
07-17-2006, 02:39 PM
Originally posted by Tin Whisker:

I think we're having 2 different conversations here, one about mesh vertex duplication and one about mesh instancing. I was referring to the former and your comment seems to be in regard to the latter.

You are right. For the tank example, the usefulness of simple vertex sharing between the independently movable rigid parts seems somewhat limited to me, as long as you wish to avoid seams and holes in the geometry when the parts move. For this reason I talked about wheel instancing, which is currently incompatible with that sharing. On DX10-class hardware it should IMHO be possible (using the geometry shader, the generated primitive ID and a buffer lookup) to draw the entire tank geometry in a single call and simultaneously instance selected parts of the geometry (e.g. the wheels) from shared vertices.