Thread: What is best practice for batch drawing objects with different transformations?

  1. #10
    Member Regular Contributor
    Join Date
    Jan 2011
    Paris, France
    Quote Originally Posted by openlearner
    As we were talking about pre-multiplying vertex transformations before submitting them to the shader, I'm wondering what you mean in this context. Are you saying even these matrix calculations are pre-processed?

    In some 3D model formats, such as .MD2, the vertex and matrix data are pre-processed to minimize the size of the model data:
    Code :
    // vertex
    typedef struct {
        unsigned char   v[3];                // compressed vertex (x, y, z) coordinates
        unsigned char   lightnormalindex;    // index to a normal vector for the lighting
    } vertex_t;

    // texture coordinates
    typedef struct {
        short   s;
        short   t;
    } texCoord_t;

    // triangle
    typedef struct {
        short   index_xyz[3];    // indexes to triangle's vertices
        short   index_st[3];     // indexes to vertices' texture coordinates
    } triangle_t;

    // frame
    typedef struct {
        float       scale[3];       // scale values
        float       translate[3];   // translation vector
        char        name[16];       // frame name
        vertex_t    verts[1];       // first vertex of this frame
    } frame_t;

    // header, TexCoord, Meshes, Vertices, anorms and frame are assumed
    // to be filled in by the model loader (not shown)
    glBegin( GL_TRIANGLES );
    // draw each triangle
    for( int i = 0; i < header.num_tris; i++ )
    {
        // draw triangle #i
        for( int j = 0; j < 3; j++ )
        {
            // k is the frame to draw
            // i is the current triangle of the frame
            // j is the current vertex of the triangle
            glTexCoord2f(
                (float)TexCoord[ Meshes[i].index_st[j] ].s / header.skinwidth,
                (float)TexCoord[ Meshes[i].index_st[j] ].t / header.skinheight );

            glNormal3fv( anorms[ Vertices[ Meshes[i].index_xyz[j] ].lightnormalindex ] );

            glVertex3f(
                (Vertices[ Meshes[i].index_xyz[j] ].v[0] * frame[k].scale[0]) + frame[k].translate[0],
                (Vertices[ Meshes[i].index_xyz[j] ].v[1] * frame[k].scale[1]) + frame[k].translate[1],
                (Vertices[ Meshes[i].index_xyz[j] ].v[2] * frame[k].scale[2]) + frame[k].translate[2] );
        }
    }
    glEnd();

    Full explanations of the .MD2 format are available online. One of them notes:

    Code :
    You may have noticed that v[3] contains vertex' (x,y,z) coordinates and because
    of the unsigned char type, these coordinates can only range from 0 to 255. In fact these 3D
    coordinates are compressed (3 bytes instead of 12 if we would use float or vec3_t). To uncompress it,
    we'll use other data proper to each frame. lightnormalindex is an index to a precalculated
    normal table. Normal vectors will be used for the lighting.

    => So here we can clearly say that the input vertex and matrix data are pre-processed: the vertex coordinates are stored in 3 bytes rather than 3 floats, the normal is stored as an index into a precalculated table, and the matrix data is reduced to scaling and translation only.

    Note, however, that in this 3D model format the vertex/normal/texel/matrix data are not pre-multiplied.
    (They are pre-processed to reduce the size of the stored data, but the scale and translation are still applied to each vertex at draw time.)
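
    To make the distinction concrete, here is a minimal sketch of the per-vertex decompression, reusing the vertex_t and frame_t layouts quoted above. The helper name md2_decompress_vertex is my own and is not part of any MD2 loader. Calling it for every vertex at draw time is the "pre-processed" case; calling it once at load time and storing the floats would be the "pre-multiplied" case.

```c
#include <assert.h>

/* Same layouts as the vertex_t and frame_t structs of the .MD2 format. */
typedef struct {
    unsigned char v[3];              /* compressed (x, y, z), each 0..255 */
    unsigned char lightnormalindex;  /* index into the normal table */
} vertex_t;

typedef struct {
    float    scale[3];      /* per-frame scale values */
    float    translate[3];  /* per-frame translation vector */
    char     name[16];      /* frame name */
    vertex_t verts[1];      /* first vertex of this frame */
} frame_t;

/* Hypothetical helper: expand one compressed vertex to three floats
 * using out = v * scale + translate on each axis. */
void md2_decompress_vertex(const frame_t *frame, const vertex_t *in,
                           float out[3])
{
    assert(frame && in && out);
    for (int axis = 0; axis < 3; axis++) {
        out[axis] = (float)in->v[axis] * frame->scale[axis]
                  + frame->translate[axis];
    }
}
```

    For instance, with scale (0.5, 0.25, 2.0) and translate (-10, 0, 1), the packed bytes (20, 255, 3) expand to (0, 63.75, 7).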

    Quote Originally Posted by tonyo_au
    I don't think it was directly related to the batch size; I think it is more related to the number of buffers I had - I had 7000+ (not a good idea) but with small batch sizes I think the gpu was basically idle as it had very little work to do with each render call.

    I run on ATI 5870, nVidia Quadro 5000 and GTX 580 - the frame rate is different on each but the percentage change is similar

    Yes, in this case you are totally CPU limited, not GPU limited, as the following quote explains:
    Code :
    Yes, at < 130 tris/batch (avg) you are
    - completely,
    - utterly,
    - totally,
    - 100%
    – CPU limited!
    • CPU is busy doing nothing, but submitting batches!

    I think one good solution would be something like a "primitive transformation restart": special index values stored among the batch's indices would signal that the current primitive carries "transformation indices" rather than true vertex indices.

    For a triangle batch, the first index could be an index into a translation table, the second an index into a rotation table, and the third an index into a scaling table, for example.
    (With quad batches, the fourth index could be used to handle homogeneous coordinates, for example.)

    => We could certainly use negative indices to indicate that the current primitive is in fact a transformation primitive.
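
    As a rough illustration of this idea, the sketch below emulates it on the CPU. This is not an existing OpenGL feature, and every name in it (vec3, mat3, transform_tables, table_slot, expand_batch) is hypothetical. It assumes a negative index n refers to table slot -(n + 1), that a triple is a transformation primitive only when all three of its indices are negative, and that each vertex is scaled, then rotated, then translated.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } vec3;
typedef struct { float m[9]; } mat3;   /* row-major 3x3 rotation */

typedef struct {
    const vec3 *translations;
    const mat3 *rotations;
    const vec3 *scales;
} transform_tables;

/* Assumed encoding: negative index n maps to slot -(n + 1), so -1 is slot 0. */
size_t table_slot(int index)
{
    return (size_t)(-(index + 1));
}

/* Scale, then rotate, then translate a single point. */
vec3 apply_transform(const vec3 *t, const mat3 *r, const vec3 *s, vec3 p)
{
    vec3 q = { p.x * s->x, p.y * s->y, p.z * s->z };
    vec3 out = {
        r->m[0] * q.x + r->m[1] * q.y + r->m[2] * q.z + t->x,
        r->m[3] * q.x + r->m[4] * q.y + r->m[5] * q.z + t->y,
        r->m[6] * q.x + r->m[7] * q.y + r->m[8] * q.z + t->z
    };
    return out;
}

/* Walk a triangle index stream. An all-negative triple selects the
 * translation, rotation and scale entries used for the triangles that
 * follow; any other triple is emitted as three transformed vertices.
 * Returns the number of vertices written to out. */
size_t expand_batch(const int *indices, size_t index_count,
                    const vec3 *vertices, const transform_tables *tables,
                    vec3 *out)
{
    static const vec3 zero = { 0.0f, 0.0f, 0.0f };
    static const vec3 one  = { 1.0f, 1.0f, 1.0f };
    static const mat3 identity = { { 1, 0, 0,  0, 1, 0,  0, 0, 1 } };
    const vec3 *t = &zero;
    const vec3 *s = &one;
    const mat3 *r = &identity;
    size_t written = 0;

    assert(index_count % 3 == 0);
    for (size_t i = 0; i < index_count; i += 3) {
        const int *tri = &indices[i];
        if (tri[0] < 0 && tri[1] < 0 && tri[2] < 0) {
            /* transformation primitive: switch the current transform */
            t = &tables->translations[table_slot(tri[0])];
            r = &tables->rotations[table_slot(tri[1])];
            s = &tables->scales[table_slot(tri[2])];
            continue;
        }
        for (int j = 0; j < 3; j++) {
            assert(tri[j] >= 0);  /* mixed-sign triples are not handled */
            out[written++] = apply_transform(t, r, s, vertices[tri[j]]);
        }
    }
    return written;
}
```

    With this encoding, a stream such as { -1, -1, -1, 0, 1, 2, -2, -1, -2, 0, 1, 2 } draws the same triangle twice with two different transforms in a single batch. Keep in mind that glDrawElements only accepts unsigned index types, so a real implementation would need some other way to reserve the special values.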
    Last edited by The Little Body; 04-27-2013 at 06:05 PM.
