different timer_query results on ATI and NVidia; NVidia sync?

Hi,

I have tested GLUT timing and timer queries on an ATI Radeon HD 5850 and an NVidia GTX 660 to measure the time spent on the CPU and on the GPU.

One of the test results is:
ATI : CPU: 805 ms, GPU: 156 ms
NVidia: CPU: 215 ms, GPU: 215 ms

Of course the NVidia card is newer and faster, but the point is: why does the GPU time on the NVidia card sync with the CPU time?
In all tests the NVidia card syncs with the CPU. Why?

The code I use:

Definitions:


#define QUERY_BUFFERS 2
#define QUERY_COUNT 1

unsigned int queryID[QUERY_BUFFERS][QUERY_COUNT];
unsigned int queryBackBuffer = 0, queryFrontBuffer = 1;
GLuint64 elapsed_time;

void genQueries() {

    glGenQueries(QUERY_COUNT, queryID[queryBackBuffer]);
    glGenQueries(QUERY_COUNT, queryID[queryFrontBuffer]);
}

void swapQueryBuffers() {

    if (queryBackBuffer) {
        queryBackBuffer = 0;
        queryFrontBuffer = 1;
    }
    else {
        queryBackBuffer = 1;
        queryFrontBuffer = 0;
    }
}

void BeginQuery()
{
    glBeginQuery(GL_TIME_ELAPSED,queryID[queryBackBuffer][0]);
}

void EndQuery()
{
    glEndQuery(GL_TIME_ELAPSED);
}

void FinishQuery()
{
    glGetQueryObjectui64v(queryID[queryFrontBuffer][0], GL_QUERY_RESULT, &elapsed_time);
}

GLuint64 GetQuery()
{
    return elapsed_time;
}

Draw Call:

void display(void)
{
    double timestart = glutGet(GLUT_ELAPSED_TIME);
    BeginQuery();

    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    drawScene();

    EndQuery();
    swapQueryBuffers();
    FinishQuery();

    milliSecPerFrame_CPU = glutGet(GLUT_ELAPSED_TIME) - timestart;
    showInfo();

    glutSwapBuffers();
}

First, your code has a bug: glutGet is called after glGetQueryObjectui64v (inside FinishQuery). With GL_QUERY_RESULT, glGetQueryObjectui64v blocks until the result is available, so the CPU measurement also includes the time spent waiting for the GPU. If the CPU and GPU measurements start at the same time, they will therefore deliver the same result.
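
A minimal sketch of one way to avoid the stall, not a definitive fix: stop the CPU timer before touching the query, and only read a query once GL_QUERY_RESULT_AVAILABLE reports that it is ready. It also reads the query issued in the previous frame, because in your code swapQueryBuffers runs before FinishQuery, so FinishQuery ends up waiting on the query that was just ended. The names timedFrame, cpu_ms, gpu_ns, frames and the USE_REAL_GL switch are made up for this example. The stand-in block exists only so the sketch compiles without a GL context; a real build is assumed to use an extension loader such as GLEW and to have created the queries with glGenQueries.

```c
#include <assert.h>
#include <stdint.h>

void drawScene(void);          /* the scene function from the question */

#ifdef USE_REAL_GL             /* real build: compile with -DUSE_REAL_GL */
#include <GL/glew.h>
#include <GL/glut.h>
#else
/* Hypothetical stand-ins so the sketch runs without a GPU. */
typedef unsigned int GLuint, GLenum;
typedef int GLint;
typedef uint64_t GLuint64;
enum { GL_TIME_ELAPSED = 0x88BF, GL_QUERY_RESULT = 0x8866,
       GL_QUERY_RESULT_AVAILABLE = 0x8867, GLUT_ELAPSED_TIME = 700 };
static int fake_clock_ms = 0;      /* fake wall clock in milliseconds */
static int fake_result_ready = 0;  /* does the fake driver have a result? */
static int glutGet(GLenum what) { (void)what; return fake_clock_ms; }
static void glBeginQuery(GLenum t, GLuint id) { (void)t; (void)id; }
static void glEndQuery(GLenum t) { (void)t; }
static void glGetQueryObjectiv(GLuint id, GLenum p, GLint *out)
{ (void)id; (void)p; *out = fake_result_ready; }
static void glGetQueryObjectui64v(GLuint id, GLenum p, GLuint64 *out)
{ (void)id; (void)p; *out = 156000000ull; }   /* 156 ms in nanoseconds */
void drawScene(void) { fake_clock_ms += 5; }  /* pretend submission takes 5 ms */
#endif

static GLuint queryID[2];      /* assumed to be created with glGenQueries */
static int backBuf = 0, frontBuf = 1;
static int frames = 0;         /* number of frames submitted so far */
static int cpu_ms = 0;         /* CPU time spent submitting the last frame */
static GLuint64 gpu_ns = 0;    /* last GPU time read back, in nanoseconds */

/* Returns 1 if a GPU result was read during this frame, 0 otherwise. */
int timedFrame(void)
{
    int start, tmp, gotResult = 0;
    GLint available = 0;

    assert(backBuf != frontBuf);
    start = glutGet(GLUT_ELAPSED_TIME);

    glBeginQuery(GL_TIME_ELAPSED, queryID[backBuf]);
    drawScene();
    glEndQuery(GL_TIME_ELAPSED);

    /* Stop the CPU clock before any call that could wait for the GPU. */
    cpu_ms = glutGet(GLUT_ELAPSED_TIME) - start;

    /* Read the PREVIOUS frame's query, and only if it is already done. */
    if (frames > 0) {
        glGetQueryObjectiv(queryID[frontBuf], GL_QUERY_RESULT_AVAILABLE,
                           &available);
        if (available) {
            glGetQueryObjectui64v(queryID[frontBuf], GL_QUERY_RESULT, &gpu_ns);
            gotResult = 1;
        }
    }

    /* Swap roles: the query just issued is read back next frame. */
    tmp = backBuf; backBuf = frontBuf; frontBuf = tmp;
    frames++;
    return gotResult;
}
```

With this ordering the reported GPU time lags one frame behind the CPU time, and a frame whose result is not ready yet simply keeps the previous value instead of stalling the CPU.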

If the bug above is fixed, there are different scenarios, depending on whether your code is CPU bound or GPU bound and on how the driver dispatches work to the GPU.
CPU bound: if the glBeginQuery command is sent to the GPU immediately, the CPU and GPU times are equal; otherwise the CPU time is higher.
GPU bound: the CPU time is lower than the GPU time.