Directing transform feedback into a uniform buffer? (invalid operation error)

I’m trying to capture transform feedback output into a pair of uniform buffers, and I keep getting
“The specified operation is invalid for the current OpenGL state”
after beginning transform feedback.
Basically, I’m trying to port the flocking example from the OpenGL SuperBible (5th edition) to OpenGL ES 3.

Here is the setup:


    GLuint positionBlockIndex = glGetUniformBlockIndex(flockingUpdateProgram, "PositionBlock");
    GLuint velocityBlockIndex = glGetUniformBlockIndex(flockingUpdateProgram, "VelocityBlock");
    GLint positionBlockSize,velocityBlockSize;
    glGetActiveUniformBlockiv( flockingUpdateProgram, positionBlockIndex, GL_UNIFORM_BLOCK_DATA_SIZE, &positionBlockSize );
    glGetActiveUniformBlockiv( flockingUpdateProgram, velocityBlockIndex, GL_UNIFORM_BLOCK_DATA_SIZE, &velocityBlockSize );
    // Set up UBOs
    for (i = 0; i < 2; i++)
    {
        glUniformBlockBinding(flockingUpdateProgram, positionBlockIndex, 0);
        glBindBufferBase(GL_UNIFORM_BUFFER,0,position_ubo[i]);
        glBufferData(GL_UNIFORM_BUFFER, positionBlockSize, NULL, GL_DYNAMIC_DRAW);
        glUniformBlockBinding(flockingUpdateProgram, velocityBlockIndex, 1);
        glBindBufferBase(GL_UNIFORM_BUFFER,1,velocity_ubo[i]);
        glBufferData(GL_UNIFORM_BUFFER, velocityBlockSize, NULL, GL_DYNAMIC_DRAW);
    }

Here is the render call:


    // Depending on whether we're rendering an even or odd frame...
    if (frame_index & 1)
    {
        // Read from second set of buffers (c)
        glBindBufferBase(GL_UNIFORM_BUFFER,1, velocity_ubo[1]);
        glBindBufferBase(GL_UNIFORM_BUFFER,0, position_ubo[1]);
        glBindVertexArray(update_vao[1]);
        
        // Write to first
        glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, flock_position[0]);
        glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 1, flock_velocity[0]);
    }
    else
    {
        // Read from first set of buffers (a)
        glBindBufferBase(GL_UNIFORM_BUFFER,1, velocity_ubo[0]);
        glBindBufferBase(GL_UNIFORM_BUFFER,0, position_ubo[0]);
        glBindVertexArray(update_vao[0]);
        
        // Write to second
        glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, flock_position[1]);
        glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 1, flock_velocity[1]);
    }
    
    // Turn off rasterization for the simulation pass (no fragment shader)
    glEnable(GL_RASTERIZER_DISCARD);
    
    // Press record...
    glBeginTransformFeedback(GL_POINTS);
    // Draw
    glDrawArrays(GL_POINTS, 0, flock_size);// **ERROR HERE**
    // Press stop...
    glEndTransformFeedback();
    
    // Turn rasterization back on
    glDisable(GL_RASTERIZER_DISCARD);

Debug tips:

Try to narrow down the interactions that could cause an error. First, just comment out Begin/EndTransformFeedback. Is there still an error? If so, look at all of your rendering state. And look at the glValidateProgram log just before glDrawArrays.
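A minimal sketch of that narrowing-down step, assuming you are free to sprinkle checks between calls. `checkpoint` and `glErrorName` are hypothetical helpers, not part of any GL API; the error getter is passed in as a callable so the helper does not depend on a particular GL header.

```cpp
#include <functional>
#include <string>
#include <vector>

// Symbolic names for the error codes shared by OpenGL and OpenGL ES.
// The numeric values match the definitions in the GL headers.
inline const char* glErrorName(unsigned code) {
    switch (code) {
        case 0x0000: return "GL_NO_ERROR";
        case 0x0500: return "GL_INVALID_ENUM";
        case 0x0501: return "GL_INVALID_VALUE";
        case 0x0502: return "GL_INVALID_OPERATION";
        case 0x0505: return "GL_OUT_OF_MEMORY";
        case 0x0506: return "GL_INVALID_FRAMEBUFFER_OPERATION";
        default:     return "unknown GL error";
    }
}

// Hypothetical helper: drains the error queue through `getError` and records
// one "label: NAME" line per pending error. Returns true if none were pending.
inline bool checkpoint(const std::string& label,
                       const std::function<unsigned()>& getError,
                       std::vector<std::string>& log) {
    bool clean = true;
    // Bounded loop so a misbehaving context cannot spin forever.
    for (int i = 0; i < 16; ++i) {
        const unsigned code = getError();
        if (code == 0) break;  // GL_NO_ERROR: queue is empty
        log.push_back(label + ": " + glErrorName(code));
        clean = false;
    }
    return clean;
}
```

In the app you would call something like `checkpoint("glBeginTransformFeedback", []{ return glGetError(); }, log);` after each suspicious call; the first label that shows up in the log is the call that raised the error.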

If the error only occurs with transform feedback, look at all of the TF state. In particular, make sure that your buffer objects are large enough to capture all of the transform feedback varyings (i.e. positionBlockSize must be at least flock_size times the size of one captured position). This error is specific to ES3:

The error INVALID_OPERATION is generated by DrawArrays and DrawArraysInstanced if recording the vertices of a primitive to the buffer objects being used for transform feedback purposes would result in either exceeding the limits of any buffer object’s size, or in exceeding the end position offset + size − 1, as set by BindBufferRange.

Desktop GL does not have this error, it just silently drops TF vertices beyond the end of the buffer.
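The size check can be done on paper before the draw call. Here is a rough sketch, assuming the varyings are captured in separate mode (one varying per buffer, which the thread does not actually show) and that each component is a 32-bit float. `tfBytesRequired` and `captureFits` are hypothetical names.

```cpp
#include <cstddef>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit floats, like GLfloat");

// Bytes that transform feedback writes for one captured varying when every
// vertex contributes `componentsPerVarying` tightly packed floats.
constexpr std::size_t tfBytesRequired(std::size_t vertexCount,
                                      std::size_t componentsPerVarying) {
    return vertexCount * componentsPerVarying * sizeof(float);
}

// True if a buffer of `bufferBytes` can hold the whole capture, i.e. the
// ES3 draw call would not trip the INVALID_OPERATION rule quoted above.
constexpr bool captureFits(std::size_t bufferBytes,
                           std::size_t vertexCount,
                           std::size_t componentsPerVarying) {
    return tfBytesRequired(vertexCount, componentsPerVarying) <= bufferBytes;
}
```

For example, a 512-byte buffer holds at most 42 captured vec3 values (504 bytes); a 43rd vertex would need 516 bytes and would not fit.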

[QUOTE=arekkusu;1257368]Debug tips:

Try to narrow down the interactions that could cause an error. First, just comment out Begin/EndTransformFeedback. Is there still an error? If so, look at all of your rendering state. And look at the glValidateProgram log just before glDrawArrays.
[/QUOTE]

Hmm, validating the program gives me these errors:
Validation Failed: Fragment program failed to compile with current context state.

Validation Failed: Vertex program failed to compile with current context state.

After combing through the debugger several times I changed some questionable-looking parts, but the error messages are unchanged. I get the errors with or without transform feedback, so at this point I’m pretty sure the problem is in how the shader data is being allocated, but I don’t see anything obviously wrong there…

[QUOTE=arekkusu;1257368]Desktop GL does not have this error, it just silently drops TF vertices beyond the end of the buffer.[/QUOTE]

As far as I can tell, I’m already accounting for this by calling

    glGetActiveUniformBlockiv(flockingUpdateProgram, positionBlockIndex, GL_UNIFORM_BLOCK_DATA_SIZE, &positionBlockSize);
    glGetActiveUniformBlockiv(flockingUpdateProgram, velocityBlockIndex, GL_UNIFORM_BLOCK_DATA_SIZE, &velocityBlockSize);

to get the size of each uniform block and allocate the buffers accordingly.
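One thing worth keeping in mind with that query: the reported block size follows the std140 layout, where every element of a vec3 array is padded out to 16 bytes, while the position data uploaded from the CPU is tightly packed at 12 bytes per element. The sketch below only illustrates that mismatch and is not a confirmed fix for the error; `std140Vec3ArrayBytes` and `packVec3ArrayStd140` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// In std140, each element of a vec3 array occupies the space of a vec4.
constexpr std::size_t kStd140Vec3Stride = 4;  // stride in floats (16 bytes)

// Size in bytes of `vec3 name[count]` inside a std140 block.
constexpr std::size_t std140Vec3ArrayBytes(std::size_t count) {
    return count * kStd140Vec3Stride * sizeof(float);
}

// Hypothetical helper: expands tightly packed xyz triples into the padded
// std140 array layout, leaving the fourth float of each element at zero.
inline std::vector<float> packVec3ArrayStd140(const std::vector<float>& packed) {
    assert(packed.size() % 3 == 0);
    const std::size_t count = packed.size() / 3;
    std::vector<float> out(count * kStd140Vec3Stride, 0.0f);
    for (std::size_t i = 0; i < count; ++i) {
        for (std::size_t c = 0; c < 3; ++c) {
            out[i * kStd140Vec3Stride + c] = packed[i * 3 + c];
        }
    }
    return out;
}
```

With MAX_INSTANCE_COUNT at 32, `std140Vec3ArrayBytes(32)` is 512 bytes, whereas 32 tightly packed positions take only 384. Declaring the arrays as vec4 on both sides is a common way to sidestep the padding question.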

If anyone wants to take one more look at it, here are the relevant(?) parts:

Set up:

    const int VEC3_SIZE = 3 * sizeof(float);
    
    GLuint positionBlockIndex = glGetUniformBlockIndex(flockingUpdateProgram, "PositionBlock");
    GLuint velocityBlockIndex = glGetUniformBlockIndex(flockingUpdateProgram, "VelocityBlock");
    GLint positionBlockSize,velocityBlockSize;
    glGetActiveUniformBlockiv( flockingUpdateProgram, positionBlockIndex, GL_UNIFORM_BLOCK_DATA_SIZE, &positionBlockSize );
    glGetActiveUniformBlockiv( flockingUpdateProgram, velocityBlockIndex, GL_UNIFORM_BLOCK_DATA_SIZE, &velocityBlockSize );
    
    glUniformBlockBinding(flockingUpdateProgram, positionBlockIndex, 0);
    glUniformBlockBinding(flockingUpdateProgram, velocityBlockIndex, 1);
    
    GenerateInitialPositions(position_data, flock_size, sqrt((float)flock_size) * 5.0f, flock_center);
    // Set up Update VAOs/UBOs
    for (i = 0; i < 2; i++)
    {
        glBindVertexArray(update_vao[i]);
        glBindBuffer(GL_ARRAY_BUFFER, flock_position[i]);
        glBufferData(GL_ARRAY_BUFFER, flock_size * VEC3_SIZE, position_data, GL_DYNAMIC_COPY);
        glEnableVertexAttribArray(0);
        glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, VEC3_SIZE, NULL);
        
        glBindBuffer(GL_UNIFORM_BUFFER,position_ubo[i]);
        glBufferData(GL_UNIFORM_BUFFER, positionBlockSize, position_data, GL_DYNAMIC_DRAW);
        
        // Now velocity (give each flock member a small initial velocity)
        if(i==0)
            GenerateInitialPositions(position_data, flock_size, 0.005f, NULL);
        glBindBuffer(GL_ARRAY_BUFFER, flock_velocity[i]);
        glBufferData(GL_ARRAY_BUFFER, flock_size * VEC3_SIZE, position_data, GL_DYNAMIC_COPY);
        glEnableVertexAttribArray(1);
        glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, VEC3_SIZE, NULL);
        
        glBindBuffer(GL_UNIFORM_BUFFER,velocity_ubo[i]);
        glBufferData(GL_UNIFORM_BUFFER, velocityBlockSize, position_data, GL_DYNAMIC_DRAW);
        
        glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, feedback[i]);
        glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, flock_position[i]);
        glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 1, flock_velocity[i]);
        glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 2, position_ubo[i]);
        glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 3, velocity_ubo[i]);
        
        // Done with data (everything's on the GPU from now on)
        if(position_data){
            delete [] position_data;
            position_data=NULL;
        }
    }
    
    glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, 0);
    
    // Setup Render VAOs
    for (i = 0; i < 2; i++)
    {
        glBindVertexArray(render_vao[i]);
        glBindBuffer(GL_ARRAY_BUFFER, flock_position[i]);
        glEnableVertexAttribArray(0);
        glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, VEC3_SIZE, NULL);
        glVertexAttribDivisor(0, 1);
        glBindBuffer(GL_ARRAY_BUFFER, flock_velocity[i]);
        glEnableVertexAttribArray(1);
        glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, VEC3_SIZE, NULL);
        glVertexAttribDivisor(1, 1);
        glBindBuffer(GL_ARRAY_BUFFER, flock_geometry);
        glEnableVertexAttribArray(2);
        glVertexAttribPointer(2, 3, GL_FLOAT, GL_FALSE, VEC3_SIZE, NULL);
        // Only allocate the static geometry buffer once
        if (i == 0)
            glBufferData(GL_ARRAY_BUFFER, sizeof(geometry), geometry, GL_STATIC_DRAW);
    }

Render:

    glUniform3fv(goal_location, 1, goal_position);
    glUniform1f(timestep_loc, t * 50.0f);
    
    // Depending on whether we're rendering an even or odd frame...
    if (frame_index & 1)
    {
        // Read from second set of buffers (c)
        glBindBufferBase(GL_UNIFORM_BUFFER,0, position_ubo[1]);
        glBindBufferBase(GL_UNIFORM_BUFFER,1, velocity_ubo[1]);

        glBindVertexArray(update_vao[1]);
        
        // Write to first
        glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, feedback[0]);
    }
    else
    {
        // Read from first set of buffers (a)
        glBindBufferBase(GL_UNIFORM_BUFFER,0, position_ubo[0]);
        glBindBufferBase(GL_UNIFORM_BUFFER,1, velocity_ubo[0]);
        
        glBindVertexArray(update_vao[0]);
        // Write to second
        glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, feedback[1]);
    }
    
    // Turn off rasterization for the simulation pass (no fragment shader)
    glEnable(GL_RASTERIZER_DISCARD);
    [self validateProgram:flockingUpdateProgram];
    // Press record...
    glBeginTransformFeedback(GL_POINTS);
    // Draw
    glDrawArrays(GL_POINTS, 0, flock_size);
    // Press stop...
    glEndTransformFeedback();
    
    // Turn rasterization back on
    glDisable(GL_RASTERIZER_DISCARD);
    
    glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, 0);
    glBindBufferBase(GL_UNIFORM_BUFFER,0, 0);
    glBindBufferBase(GL_UNIFORM_BUFFER,1, 0);

Update Vert Shader:


#version 300 es
#define MAX_INSTANCE_COUNT 32

// Position and velocity inputs
layout (location = 0) in vec3 flock_position;
layout (location = 1) in vec3 flock_velocity;

// Outputs (via transform feedback)
out vec3 position_vert_out;
out vec3 velocity_vert_out;
out vec3 position_uni_out;
out vec3 velocity_uni_out;

// UBOs containing the position and velocity of other flock members
layout( std140 ) uniform PositionBlock
{
    vec3 ubo_positions[MAX_INSTANCE_COUNT];
}positionBlock;

layout( std140 ) uniform VelocityBlock
{
    vec3 ubo_velocities[MAX_INSTANCE_COUNT];
}velocityBlock;

// Parameters...
// This has to match the app's view of the world - no default is given here.
uniform int flock_size;
// These all have defaults. In the example application, these aren't changed.
// Just edit these and rerun the application. It's certainly possible to change
// these parameters at run time by hooking the uniforms up in the application.
const float rule1_weight = 0.17;
const float rule2_weight = 0.01;
const float damping_coefficient = 0.99999;
const float closest_allowed_dist = 50.0;

// Time varying uniforms
uniform vec3 goal;
uniform float timestep;

// The two per-member rules
vec3 rule1(vec3 my_position, vec3 my_velocity, vec3 their_position, vec3 their_velocity)
{
    vec3 d = my_position - their_position;
    if (dot(d, d) < closest_allowed_dist)
        return d;
    return vec3(0.0);
}

vec3 rule2(vec3 my_position, vec3 my_velocity, vec3 their_position, vec3 their_velocity)
{
    vec3 d = their_position - my_position;
    vec3 dv = their_velocity - my_velocity;
    return dv / (dot(d, d) + 10.0);
}

void main(void)
{
    vec3 accelleration = vec3(0.0);
    vec3 center = vec3(0.0);
    vec3 new_velocity;
    int i;
    
    // Apply rules 1 and 2 for my member in the flock (based on all other
    // members)
    for (i = 0; i < flock_size; i++) {
        if (i != gl_VertexID) {
            // Index with the loop counter i so each iteration reads a different member
            vec3 their_position = positionBlock.ubo_positions[i];
            vec3 their_velocity = velocityBlock.ubo_velocities[i];
            accelleration += rule1(flock_position, flock_velocity, their_position, their_velocity) * rule1_weight;
            accelleration += rule2(flock_position, flock_velocity, their_position, their_velocity) * rule2_weight;
            center += their_position;
        }
    }
    // Also accelerate towards the goal (rule 3)
    accelleration += normalize(goal - flock_position) * 0.025;
    // Update position based on prior velocity and timestep
    position_vert_out = position_uni_out = flock_position + flock_velocity * timestep;
    // Update velocity based on calculated accelleration
    accelleration = normalize(accelleration) * min(length(accelleration), 10.0);
    new_velocity = flock_velocity * damping_coefficient + accelleration * timestep;
    // Hard clamp speed (mag(velocity)) to 10 to prevent insanity
    if (length(new_velocity) > 10.0)
        new_velocity = normalize(new_velocity) * 10.0;
    velocity_uni_out = velocity_vert_out = new_velocity;
    // Write position (not strictly necessary as we're capturing user defined
    // outputs using transform feedback)
    gl_Position = vec4(flock_position * 0.1, 1.0);
}

Update Frag Shader:


#version 300 es

out mediump vec4 fragColor;

void main()
{
    fragColor = vec4(1.0, 1.0, 0.0, 1.0);
}