VBO Triangle Strip



Jeffg
01-06-2011, 12:16 PM
I have a triangle strip I'm using to create a line. It uses 8 vertices (so 6 triangles) to create it. Now I have 1000 lines, for a total of 8000 vertices (6000 triangles). I place these into a VBO and draw them with GL_TRIANGLE_STRIP, and the geometry is created. However, the strip doesn't know that each group of 8 vertices is its own "strip" and that it should start a new strip with the next 8. It treats all 8000 as one big strip. How do I fix this to tell it to do 1000 strips using 8000 vertices?

Code looks like this:

gl.glEnableClientState(GL.GL_NORMAL_ARRAY);
gl.glEnableClientState(GL.GL_VERTEX_ARRAY);
gl.glEnableClientState(GL.GL_COLOR_ARRAY);
gl.glEnableClientState(GL.GL_TEXTURE_COORD_ARRAY);
gl.glEnableVertexAttribArray(vr.gpuProgVolumeLine_other);

gl.glBindBuffer( GL.GL_ARRAY_BUFFER, lnormal_vbo[0] );
gl.glNormalPointer(GL.GL_FLOAT,0,0);
gl.glBindBuffer(GL.GL_ARRAY_BUFFER, ltex_vbo[0] );
gl.glTexCoordPointer(2, GL.GL_FLOAT, 0, 0);
gl.glBindBuffer( GL.GL_ARRAY_BUFFER, lvertex_vbo[0] );
gl.glVertexPointer(3,GL.GL_FLOAT,0,0);
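gl.glBindBuffer( GL.GL_ARRAY_BUFFER, lvertex_vbo[1] ); // assumes lvertex_vbo[1] holds the "other" endpoint data
gl.glVertexAttribPointer(vr.gpuProgVolumeLine_other, 3, GL.GL_FLOAT, false, 0, 0); // last argument is a byte offset into the bound VBO, not a buffer id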
gl.glBindBuffer( GL.GL_ARRAY_BUFFER, cvertex_vbo[0] );
gl.glColorPointer(3,GL.GL_FLOAT,0,0);
gl.glDrawArrays( GL.GL_TRIANGLE_STRIP, 0, lvertexlength);

gl.glDisableClientState(GL.GL_NORMAL_ARRAY);
gl.glDisableClientState(GL.GL_VERTEX_ARRAY);
gl.glDisableClientState(GL.GL_TEXTURE_COORD_ARRAY);
gl.glDisableClientState(GL.GL_COLOR_ARRAY);
gl.glDisableVertexAttribArray(vr.gpuProgVolumeLine_other);

mhagain
01-06-2011, 12:29 PM
Don't use triangle strips, they're so 1998. Use GL_TRIANGLES with indexes instead and you can draw any kind of geometry you want.
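For the 8-vertex strips in the question, the indexed GL_TRIANGLES approach could look roughly like the sketch below. It keeps the same 8 vertices per line and only adds 18 indices per line. The class and method names are made up for illustration, and it assumes the vertices for each line are stored consecutively in the VBO in strip order.

```java
// Hypothetical helper: expands N independent 8-vertex strips into a
// GL_TRIANGLES index list. The vertex data itself is unchanged.
final class StripToTriangles {
    static final int VERTS_PER_STRIP = 8;
    static final int TRIS_PER_STRIP = VERTS_PER_STRIP - 2; // 6 triangles

    static int[] buildIndices(int stripCount) {
        int[] indices = new int[stripCount * TRIS_PER_STRIP * 3];
        int out = 0;
        for (int s = 0; s < stripCount; s++) {
            int base = s * VERTS_PER_STRIP; // first vertex of this line
            for (int t = 0; t < TRIS_PER_STRIP; t++) {
                // Odd triangles swap their first two vertices so that the
                // winding order matches what GL_TRIANGLE_STRIP would produce.
                boolean odd = (t & 1) == 1;
                indices[out++] = base + t + (odd ? 1 : 0);
                indices[out++] = base + t + (odd ? 0 : 1);
                indices[out++] = base + t + 2;
            }
        }
        return indices;
    }
}
```

The resulting array would go into a GL_ELEMENT_ARRAY_BUFFER and be drawn in one call with glDrawElements(GL_TRIANGLES, indices.length, GL_UNSIGNED_INT, 0); the normal, texcoord, color and attribute arrays stay at 8 entries per line.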

Alfonse Reinheart
01-06-2011, 12:29 PM
How do I fix this to tell it to do 1000 strips using 8000 vertexes?

You could use Primitive Restart (http://www.opengl.org/wiki/Vertex_Specification#Primitive_Restart). Or you could just use a bunch of triangles instead of strips and render it in one call.
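A rough sketch of the primitive restart route, assuming each line's 8 vertices sit consecutively in the VBO: build an index buffer that walks the vertices in order and drops a special restart index between strips. The class name and the choice of 0xFFFFFFFF as the restart value are illustrative assumptions, not something from the original code.

```java
// Hypothetical helper: index buffer for N 8-vertex strips separated by a
// restart index, to be drawn as a single GL_TRIANGLE_STRIP.
final class RestartIndices {
    static final int VERTS_PER_STRIP = 8;
    static final int RESTART = 0xFFFFFFFF; // stored as -1 in a Java int

    static int[] build(int stripCount) {
        if (stripCount <= 0) {
            return new int[0];
        }
        // 8 indices per strip plus one restart between consecutive strips.
        int[] indices = new int[stripCount * (VERTS_PER_STRIP + 1) - 1];
        int out = 0;
        for (int s = 0; s < stripCount; s++) {
            for (int v = 0; v < VERTS_PER_STRIP; v++) {
                indices[out++] = s * VERTS_PER_STRIP + v;
            }
            if (s < stripCount - 1) {
                indices[out++] = RESTART; // ends this strip, starts the next
            }
        }
        return indices;
    }
}
```

On the GL side this would need glEnable(GL_PRIMITIVE_RESTART) and glPrimitiveRestartIndex(0xFFFFFFFF) before a glDrawElements(GL_TRIANGLE_STRIP, indices.length, GL_UNSIGNED_INT, 0) call. Primitive restart is core from OpenGL 3.1 (older NVIDIA drivers expose it through NV_primitive_restart), so check that the target hardware supports it.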

ugluk
01-06-2011, 12:55 PM
What happened in 1998? Strips ain't totally out yet. They're used on mobile devices AFAIK and he's using Java, so maybe he's writing for Android. Why aren't triangle fans ever mentioned?

Alfonse Reinheart
01-06-2011, 01:09 PM
Why aren't triangle fans ever mentioned?

Because they wouldn't be useful in this case. He's drawing lines composed of 8 vertices (6 triangles). It would take 2 triangle fans to make that work.

Dark Photon
01-06-2011, 03:05 PM
What happened in 1998? Strips ain't totally out yet. They're used on mobile devices AFAIK
Are you saying there are GL-ES mobile devices out there that don't have a post-T&L vertex cache? Really? Which ones?

(I don't know anything about specific mobile GPU capabilities, so I'm asking if you'll educate me here.)

Alfonse Reinheart
01-06-2011, 03:11 PM
Are you saying there are GL-ES mobile devices out there that don't have a post-T&L vertex cache? Really? Which ones?

No, but I think he's saying that there are GL-ES mobile devices that have limited memory. And strips take up about half the index size of triangle lists.

And using strips doesn't turn off the post-T&L cache. You may not be able to use it quite as efficiently, but there are algorithms that generate efficient strips for the post-T&L cache.
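To put numbers on the index-size point, here is a small back-of-the-envelope sketch. The helper names are made up, and it assumes strips are separated by one restart index each (degenerate-triangle stitching would cost slightly more).

```java
// Hypothetical helper comparing index counts for strips versus triangle lists.
final class IndexCount {
    // One index per vertex, plus one restart index between consecutive strips.
    static long stripIndices(long strips, long vertsPerStrip) {
        return strips * (vertsPerStrip + 1) - 1;
    }

    // A strip of n vertices holds n - 2 triangles, each needing 3 indices.
    static long listIndices(long strips, long vertsPerStrip) {
        return strips * (vertsPerStrip - 2) * 3;
    }
}
```

For the 1000 lines of 8 vertices in this thread that gives 8,999 strip indices against 18,000 list indices, so about half. For one long strip the ratio approaches one third, for example 1,002 against 3,000 indices for a 1,002-vertex strip.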

Jeffg
01-06-2011, 03:57 PM
Thanks for the great response. Not developing for mobile. I'm not sure how I would approach the triangles, since there is a lot going on with it. I'm not sure how to translate the normals, textures, vertexAttribs, and vertex shader to apply to 18 vertices instead of 8. Also, I used 1000 lines as an example, but the reason I'm doing this is because I'm trying to draw many more lines than that and the performance of GL_LINES is slow on most systems (not hardware accelerated). I could be running 20,000 lines, so if I were to do full triangles, that would be 360,000 vertices (that doesn't count normals, textures, colors, or the attribs). I'm currently running on 40,000 vertices since I'm using GL_LINES, but I wanted to try using a texture to simulate a volumetric line (described here (http://sebastien.hillaire.free.fr/demos/simplevolumeline/vline.htm)). That would put it at 160,000 vertices, but I was hoping that cached textures would be a lot faster, which has been my experience (I've run 500,000 textured sprites with no issue, but 20,000 lines kills it).

At a basic level, this is what is going on. To render a volumetric line you send the line start point 4 times and then the line end point 4 times. Then the vertex shader handles the extrusion of the vertices along the line direction and the orthogonal direction in screen space, based on the weight you give to each vertex. The opposite endpoint is passed to the shader using glVertexAttrib4f.


gl.glBegin(gl.GL_TRIANGLE_STRIP);

gl.glNormal3fv(vline_vertexOffset[0], 0);
gl.glTexCoord2fv(vline_texCoord[0], 0);
gl.glVertexAttrib4f(this.gpuProgVolumeLine_other, p2[0], p2[1], p2[2], 1.0f);
gl.glVertex3fv(p1, 0);
gl.glNormal3fv(vline_vertexOffset[1], 0);
gl.glTexCoord2fv(vline_texCoord[1], 0);
gl.glVertexAttrib4f(this.gpuProgVolumeLine_other, p2[0], p2[1], p2[2], 1.0f);
gl.glVertex3fv(p1, 0);
gl.glNormal3fv(vline_vertexOffset[2], 0);
gl.glTexCoord2fv(vline_texCoord[2], 0);
gl.glVertexAttrib4f(this.gpuProgVolumeLine_other, p2[0], p2[1], p2[2], 1.0f);
gl.glVertex3fv(p1, 0);
gl.glNormal3fv(vline_vertexOffset[3], 0);
gl.glTexCoord2fv(vline_texCoord[3], 0);
gl.glVertexAttrib4f(this.gpuProgVolumeLine_other, p2[0], p2[1], p2[2], 1.0f);
gl.glVertex3fv(p1, 0);

gl.glNormal3fv(vline_vertexOffset[4], 0);
gl.glTexCoord2fv(vline_texCoord[4], 0);
gl.glVertexAttrib4f(this.gpuProgVolumeLine_other, p1[0], p1[1], p1[2], 1.0f);
gl.glVertex3fv(p2, 0);
gl.glNormal3fv(vline_vertexOffset[5], 0);
gl.glTexCoord2fv(vline_texCoord[5], 0);
gl.glVertexAttrib4f(this.gpuProgVolumeLine_other, p1[0], p1[1], p1[2], 1.0f);
gl.glVertex3fv(p2, 0);
gl.glNormal3fv(vline_vertexOffset[6], 0);
gl.glTexCoord2fv(vline_texCoord[6], 0);
gl.glVertexAttrib4f(this.gpuProgVolumeLine_other, p1[0], p1[1], p1[2], 1.0f);
gl.glVertex3fv(p2, 0);
gl.glNormal3fv(vline_vertexOffset[7], 0);
gl.glTexCoord2fv(vline_texCoord[7], 0);
gl.glVertexAttrib4f(this.gpuProgVolumeLine_other, p1[0], p1[1], p1[2], 1.0f);
gl.glVertex3fv(p2, 0);

gl.glEnd();

Vertex shader:

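attribute vec4 other;      // opposite endpoint of the line
uniform float lineWidth;
varying vec4 color;

void main() {
    gl_TexCoord[0] = gl_MultiTexCoord0;
    color = gl_Color;

    vec4 vMVP = gl_ModelViewProjectionMatrix * gl_Vertex;
    vec4 otherMVP = gl_ModelViewProjectionMatrix * other;

    // Line direction after the perspective divide, scaled by the line width.
    vec2 lineDirProj = lineWidth * normalize( (vMVP.xy/vMVP.w) - (otherMVP.xy/otherMVP.w) );

    // Flip the direction when the clip-space w signs of the two endpoints differ.
    if( sign(otherMVP.w) != sign(vMVP.w) ) {
        lineDirProj = -lineDirProj;
    }

    // Extrude along the line direction, weighted by gl_Normal.x.
    vMVP.x = vMVP.x + lineDirProj.x * gl_Normal.x;
    vMVP.y = vMVP.y + lineDirProj.y * gl_Normal.x;

    // Extrude perpendicular to the line direction, weighted by gl_Normal.y.
    vMVP.x = vMVP.x + lineDirProj.y * gl_Normal.y;
    vMVP.y = vMVP.y - lineDirProj.x * gl_Normal.y;

    gl_Position = vMVP;
}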
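For the VBO version, the immediate-mode calls above could be packed into flat arrays along these lines. This is only a sketch: the class name, the method signature and the 3-floats-per-vertex layout are assumptions, chosen to match the glVertexPointer(3, ...) and 3-component glVertexAttribPointer calls from the first post.

```java
// Hypothetical helper: writes one line's 8 vertices into flat arrays that can
// be uploaded to the position VBO and the "other" attribute VBO.
final class VolumeLinePacker {
    static final int VERTS_PER_LINE = 8;
    static final int FLOATS_PER_VERT = 3;

    static void pack(float[] p1, float[] p2, int lineIndex,
                     float[] positions, float[] other) {
        int base = lineIndex * VERTS_PER_LINE * FLOATS_PER_VERT;
        for (int v = 0; v < VERTS_PER_LINE; v++) {
            // First 4 vertices sit at p1 and point at p2; last 4 are reversed.
            float[] self = (v < 4) ? p1 : p2;
            float[] opposite = (v < 4) ? p2 : p1;
            int offset = base + v * FLOATS_PER_VERT;
            System.arraycopy(self, 0, positions, offset, FLOATS_PER_VERT);
            System.arraycopy(opposite, 0, other, offset, FLOATS_PER_VERT);
        }
    }
}
```

The vline_vertexOffset and vline_texCoord values are the same for every line, so those arrays would simply repeat the same 8 entries once per line.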

ugluk
01-07-2011, 07:47 AM
One example, the (in)famous:

PowerVR MBX Lite

but I think there are others. No post T&L cache. Its VBOs are more cosmetic than anything else, too: there is no difference from using client arrays. So maybe your claim that compression also plays a role here is not far-fetched.

Again, I wonder why triangle fans are never mentioned. I use them now and again.

mhagain
01-07-2011, 10:11 AM
http://home.comcast.net/~tom_forsyth/blog.wiki.html


While I'm having a rant - strippers. STOP IT. Stop writing academic papers about generating the ultimate strips. It's all totally pointless - pretty much every bit of hardware has indexed primitives (except the PS2, which has a bunch of completely bonkers rules that are nothing like standard strips anyway). The ultimate stripper will get you one vertex per triangle. But even a very quick and dirty indexer will get you that, and good indexer ... will get close to 0.65 vertices per triangle for most meshes with a 16-entry FIFO vertex cache. The theoretical limit for a regular triangulated plane with an infinitely large vertex cache is 0.5 verts/tri (think of a regular 2D grid - there's twice as many triangles as vertices), so that's not too shabby.

(snip)

All those academics still writing papers on this - go find something more useful to spend your time on. We've had indexed primitive hardware for <$50 for almost a decade now. Welcome to the 21st century.

It's good as a general rule; of course there is going to be crazy-assed hardware which is an exception, but you should understand the situation and understand that you're coding specifically for hardware that is an exception; otherwise just forget about strips.
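The "vertices per triangle" figure in the quote can be measured for any index list with a small simulation of a FIFO post-T&L cache. The sketch below is a simplified model with made-up names; real hardware caches differ in size and replacement policy, so treat the number as an estimate only.

```java
import java.util.ArrayDeque;

// Hypothetical helper: counts how many vertices would be transformed per
// triangle for a GL_TRIANGLES index list, given a FIFO vertex cache.
final class FifoCacheSim {
    static double verticesPerTriangle(int[] indices, int cacheSize) {
        ArrayDeque<Integer> fifo = new ArrayDeque<>();
        int misses = 0;
        for (int index : indices) {
            if (!fifo.contains(index)) {
                misses++;                 // vertex has to be transformed
                fifo.addLast(index);
                if (fifo.size() > cacheSize) {
                    fifo.removeFirst();   // evict the oldest entry
                }
            }
            // On a hit a FIFO cache does not reorder its entries.
        }
        return (double) misses / (indices.length / 3);
    }
}
```

Running it on the index list you actually draw with, at a cache size of 16, gives a figure that can be compared with the 0.65 and 0.5 numbers quoted above.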

Alfonse Reinheart
01-07-2011, 10:46 AM
It's good as a general rule

Sorry but I don't buy it. It's a nice starting point, true, but however much the post-T&L cache matters, you can't just pretend memory doesn't matter. And if you've got lots of meshes in memory, the 2-3 times as many indices do add up.

It's a performance optimization. A memory vs. performance tradeoff. Of course, memory access itself is performance; reading more indices takes bandwidth that could have been used for textures or whatever.

Forsyth's notion that triangle lists are categorically better than strips is simply wrong. At least, not without some performance tests that deal with issues besides the post-T&L cache.

mhagain
01-07-2011, 06:22 PM
That's quite true, and that's why I specifically said that it's "good as a general rule". Of course there are always gonna be special situations that disprove the general rule, and you should always benchmark your own code to see which tradeoff is most suitable for your own use case. Forsyth does tend to know what he's talking about though.

Dark Photon
01-07-2011, 07:54 PM
One example, the (in)famous:

PowerVR MBX Lite

but I think there are others. No post T&L cache.
Interesting. Thanks. Had no clue such dinosaurs were still out there.

BTW, agree with the comments about Forsyth and indexed triangles. Everyone should try and bench with their apps, but I think once you've tried decent indexed triangles, you won't go back:

* http://home.comcast.net/~tom_forsyth/papers/fast_vert_cache_opt.html

Been faster for quite a while. And Forsyth's alg is easy to code up. There are several implementations of it out there on the net.

Remember: memory "space" is cheap. It's memory "bandwidth" that's expensive. And many apps are vertex bound, so reducing vertex shader executions pays dividends there.

But of course, if you're not vertex bound, don't sweat this yet. As with all things pipelined, profile, and always optimize the bottleneck. You do want to see an improvement, right? And you otherwise might be making things worse and not know it.

ugluk
01-08-2011, 03:50 AM
I can't really decide whether it lacks the cache because it's a dinosaur. It could be lacking the cache because Steve Jobs wanted a cheap GPU in the iPhone/iPod touch, or because of the power budget savings, or maybe both. Anyway, the triangle strips recommendation is almost everywhere in Apple's GL development docs.

V-man
01-08-2011, 10:01 AM
Jeffg, you should render everything with glDrawRangeElements.
The only exception is when you are rendering points, which don't benefit from indices since no vertex is reused. Use glDrawArrays in that case.