Hardware T&L Tutorials/Suggestions/Code

I am just beginning with OpenGL, so understand that even though this question is somewhat above my head right now, that is where I want to get to as quickly as possible. Any help would be great. Thanks.

-Crow

E-Mail: crows_home@bigfoot.com

As I understand it HW T&L is automatic, so long as you let OpenGL do the transformations.

i.e.

Use the LoadMatrix, MultMatrix, etc. functions and don’t do any transformations yourself.
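A minimal sketch of that approach, assuming a plain OpenGL 1.x immediate-mode program; the triangle data and the angle parameter are just placeholders:

    #include <GL/gl.h>

    /* Set up the modelview matrix through OpenGL's matrix calls and submit
     * untransformed object-space vertices; the driver (and T&L hardware, if
     * present) applies the transform. */
    void draw_triangle(float angle_deg)
    {
        glMatrixMode(GL_MODELVIEW);
        glLoadIdentity();
        glTranslatef(0.0f, 0.0f, -5.0f);          /* transform stays on the GL side */
        glRotatef(angle_deg, 0.0f, 1.0f, 0.0f);

        glBegin(GL_TRIANGLES);                    /* vertices stay in object space  */
        glVertex3f(-1.0f, -1.0f, 0.0f);
        glVertex3f( 1.0f, -1.0f, 0.0f);
        glVertex3f( 0.0f,  1.0f, 0.0f);
        glEnd();
    }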

Actually, to my knowledge, hardware T&L is not a feature that automatically works whenever it is available; you must set it to use the T&L driver. I have multiple applications that are OpenGL applications and yet do not use my T&L driver for OpenGL. If I am wrong, please correct me.

-Crow

E-Mail: crows_home@bigfoot.com

If HW T&L is supported by your driver, you will get HW T&L, and there is nothing you have to do to enable it. But you have to use proper data formats to be able to USE the HW T&L. For example, you must pass vertices as floats, not doubles or integers (on a GeForce at least); otherwise you fall back on software transformation.

And by the way, how do you know your applications don’t use the HW T&L feature?
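A minimal sketch of the data-format advice given in this post (note that the replies below dispute it): submit vertex data through the float entry points rather than the double-precision ones.

    #include <GL/gl.h>

    /* Per the claim above, float vertices (glVertex3f) would stay on the
     * hardware path, while double vertices (glVertex3d) would fall back to
     * software transform.  The triangle is placeholder data. */
    void submit_float_vertices(void)
    {
        glBegin(GL_TRIANGLES);
        glVertex3f(-1.0f, -1.0f, 0.0f);   /* float entry point                 */
        glVertex3f( 1.0f, -1.0f, 0.0f);
        glVertex3f( 0.0f,  1.0f, 0.0f);   /* glVertex3d(-1.0, -1.0, 0.0) would */
        glEnd();                          /* submit doubles instead            */
    }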

No, there is no specific data format required for T&L. T&L is totally automatic. Some formats may waste performance because we’ll do conversions, but T&L will occur in HW no matter what format you use.

  • Matt

Oh really? Now that’s great.

The thing is, I think I read it at your developer site, but maybe I either interpreted it wrong or it was something else I was thinking about. So I’m sorry for my first post about data formats.

Oh well, learning new things every day…

Mind if I ask, then, why, when I run an application in OpenGL and again in DirectX, neither of them uses my T&L? It is obvious, because I have run T&L projects before and it has worked.
What about that?

-Thanks,
Crow

E-Mail: crows_home@bigfoot.com

In Direct3D, you have to use the T&L HAL instead of the “regular” HAL to get HW T&L.

In OpenGL, you will get hardware assist if it’s there, although your efficiency will go up if you use glVertexPointer, glColorPointer, and similar functions (core in OpenGL 1.1, and available before that as the EXT_vertex_array extension), and will go up even more if you take the trouble to use vertex_array_range and correctly load your vertex data in there (assuming you don’t need to read it back).
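A hedged sketch of the vertex-array path mentioned above, with placeholder data: hand OpenGL pointers to your vertex and color arrays once, then draw from them instead of issuing per-vertex calls.

    #include <GL/gl.h>

    static const GLfloat positions[] = {      /* x, y, z per vertex */
        -1.0f, -1.0f, 0.0f,
         1.0f, -1.0f, 0.0f,
         0.0f,  1.0f, 0.0f,
    };
    static const GLfloat colors[] = {         /* r, g, b per vertex */
        1.0f, 0.0f, 0.0f,
        0.0f, 1.0f, 0.0f,
        0.0f, 0.0f, 1.0f,
    };

    void draw_with_vertex_arrays(void)
    {
        glEnableClientState(GL_VERTEX_ARRAY);
        glEnableClientState(GL_COLOR_ARRAY);

        glVertexPointer(3, GL_FLOAT, 0, positions);
        glColorPointer(3, GL_FLOAT, 0, colors);

        /* One call submits all the vertices; the driver can feed them to the
         * card far more efficiently than per-vertex immediate-mode calls. */
        glDrawArrays(GL_TRIANGLES, 0, 3);

        glDisableClientState(GL_COLOR_ARRAY);
        glDisableClientState(GL_VERTEX_ARRAY);
    }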

Okay. Take this for an example. I have run Unreal Tournament and Hitman: Codename 47 in both OpenGL and DirectX mode. In neither was there an improvement in the display through OpenGL over DirectX. I can note an example where OpenGL was used: Quake 3: Arena. I have run the game with and without a T&L-capable card in the same machine, with no other changes, and the frame rate was nearly doubled.
Can you then please explain to me, or show me coding proof from an OpenGL manual, that it automatically uses T&L when available? I am not trying to be difficult; I just want evidence so that I know in which direction to head. For me to understand what I am doing, I need to understand where it is coming from.
Thanks again for all your help. If you have any other suggestions, please post them, as I am open to anything.

-Thanks,
Crow

E-Mail: crows_home@bigfoot.com

It’s not a fair comparison. Not all apps will show T&L as having an impact – something else may be limiting performance. Also, if an app sets up identity matrices, the app is effectively performing T&L for the driver, and there’s nothing left for the driver or hardware to do.

HW T&L is always automatic.

It’s not mentioned in OpenGL books because it’s an implementation detail generally considered to be irrelevant to application programmers.

  • Matt

Originally posted by Crow:
Okay. Take this for an example. I have run Unreal Tournament and Hitman: Codename 47 in both OpenGL and DirectX mode. In neither was there an improvement in the display through OpenGL over DirectX. I can note an example where OpenGL was used: Quake 3: Arena. I have run the game with and without a T&L-capable card in the same machine, with no other changes, and the frame rate was nearly doubled.
Can you then please explain to me, or show me coding proof from an OpenGL manual, that it automatically uses T&L when available? I am not trying to be difficult; I just want evidence so that I know in which direction to head. For me to understand what I am doing, I need to understand where it is coming from.
Thanks again for all your help. If you have any other suggestions, please post them, as I am open to anything.

-Thanks,
Crow

Have you disabled the vertical sync?

I am replying to this from work, so understand that I do not have direct access to my PC, but to my knowledge V-Sync is not turned on. I believe I have it shut off, but as I am not right at the PC I cannot say so with 100% certainty; I am fairly positive that is correct, though.
Also, to my knowledge, you have to build a vertex buffer array for the information to be handled by the OpenGL T&L engine. Is this not so? If it is true, then applications that do not do this will not use the T&L.
Again, thanks for all the posts. I am always open to suggestions and other information.

Thanks,
Crow

E-Mail: crows_home@bigfoot.com

I just wanted to hammer it home once more, as Matt said:

HW T&L is always automatic.

Whether you use vertex buffers or NOT, the transform is done by the hardware.

The only thing is that the way you pass data to OpenGL affects the T&L efficiency, but the work is always done by the hardware!

But if you want to take advantage of HW T&L, you need to optimize your code.

I just ran into that situation: our app constantly runs at 30 fps in every resolution I tried because it is CPU bound.

csaba

Well thanks everyone for your replies and information. This has been most interesting and informative. I look forward to further posts later.

Thanks,
Crow

E-Mail: crows_home@bigfoot.com

>Also, to my knowledge you have to build a vertex buffer array for the information to be handled by the OpenGL T&L engine.

This is not true.

Vertex buffers (DX or their moral OpenGL equivalents) will let the card slurp in more vertexes in the same amount of time than any other mechanism of transferring vertexes to the card. However, no matter how the vertex data goes to the card, the card will run the vertex through its transform matrix, just because that’s how the card is built. Using glRotatef() or similar functions to affect the transform matrix will change how the hardware transforms your vertex, no matter whether that vertex is transferred indirectly through a glVertexPointer() call or directly with glVertex3f().

Now, if an application sets the transform matrix to identity, and does all the transforms itself, the card will still do “hardware transform” on the vertex data; it will just be useless. So, in this case, the performance of the CPU will be your limiting factor. I believe some games still do this, because the transform speed of some OpenGL drivers has traditionally been, how should I put it, “sub-optimal.”
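A minimal sketch of that identity-matrix case; transform_point() stands in for whatever math the application runs on the CPU and is hypothetical.

    #include <GL/gl.h>

    /* Hypothetical app-side transform routine. */
    extern void transform_point(const float in[3], float out[3]);

    void draw_pretransformed(const float *obj_verts, int count)
    {
        glMatrixMode(GL_MODELVIEW);
        glLoadIdentity();                 /* hardware still multiplies by this,
                                             it just accomplishes nothing      */
        glBegin(GL_TRIANGLES);
        for (int i = 0; i < count; ++i) {
            float v[3];
            transform_point(&obj_verts[i * 3], v);   /* CPU does the real work */
            glVertex3fv(v);
        }
        glEnd();
    }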

However, with hardware T&L cards (i.e. a Radeon or GeForce) you’ll need highly optimized SSE code running on a Pentium IV to touch the transform bandwidth of the card, and even so, I believe the GeForce2 Ultra may still have that not-yet-released CPU licked. So, yeah, hardware T&L is automatic if you use the API correctly, and it is a very nice thing to have, especially for static geometry, where you put all the relevant data in AGP (or even card) memory and don’t have to pollute your cache by touching anything related to the data at all.
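A hedged sketch of that static-geometry setup using the GL_NV_vertex_array_range extension: the Windows entry points are fetched through wglGetProcAddress, the token value and signatures follow the extension spec, and the allocation hints are just commonly used values; error checking is omitted.

    #include <windows.h>
    #include <string.h>
    #include <GL/gl.h>

    #ifndef GL_VERTEX_ARRAY_RANGE_NV
    #define GL_VERTEX_ARRAY_RANGE_NV 0x851D
    #endif

    typedef void  (APIENTRY *PFNGLVERTEXARRAYRANGENVPROC)(GLsizei length, const void *pointer);
    typedef void *(APIENTRY *PFNWGLALLOCATEMEMORYNVPROC)(GLsizei size, GLfloat readFreq,
                                                         GLfloat writeFreq, GLfloat priority);

    /* Copy static vertex data into AGP/video memory once, then point the
     * vertex array at that copy so the card can pull vertices directly. */
    void setup_static_geometry(const GLfloat *verts, GLsizei bytes)
    {
        PFNWGLALLOCATEMEMORYNVPROC  wglAllocateMemoryNV_ =
            (PFNWGLALLOCATEMEMORYNVPROC)wglGetProcAddress("wglAllocateMemoryNV");
        PFNGLVERTEXARRAYRANGENVPROC glVertexArrayRangeNV_ =
            (PFNGLVERTEXARRAYRANGENVPROC)wglGetProcAddress("glVertexArrayRangeNV");

        /* readFreq 0 with a mid-range priority is a commonly used hint for AGP
         * memory that the CPU writes but never reads back. */
        GLfloat *agp = (GLfloat *)wglAllocateMemoryNV_(bytes, 0.0f, 0.0f, 0.5f);

        memcpy(agp, verts, bytes);                 /* fill the range once      */

        glVertexArrayRangeNV_(bytes, agp);         /* declare the range to GL  */
        glEnable(GL_VERTEX_ARRAY_RANGE_NV);

        glEnableClientState(GL_VERTEX_ARRAY);
        glVertexPointer(3, GL_FLOAT, 0, agp);      /* draw from the AGP copy   */
    }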