OpenGL Colourspace Issue: The Shader Strikes Back

So I previously posted a question here (http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=271582#Post271582) asking for help with a colourspace transformation. I was told that using glMatrixMode(GL_COLOR) was generally not supported in hardware and that fragment shaders were the way forward.

So I have spent today reading up on GLSL, and specifically fragment shaders. I thought I had cracked it… but the resulting code runs really, really slowly, just like when I was using a colour transformation matrix. I have tried a variety of simple pixel shaders, from a ‘make everything red’ shader, to a red filter, to a YUV->RGB transform. It all works, EXCEPT that it’s really slow and thrashes the processor. I get about 0.2 FPS on a dual-core Core 2 with a Quadro FX 3500 card and the latest applicable NVIDIA drivers. I’m using X11 and GLX for the windowing.

I am now about to cry :cry:

So, what am I doing? Basically it breaks down as shown below; I have omitted the error checking and other unnecessary stuff for clarity:


const char * fragment_shader_text =
"varying vec4 gl_Color;\n"
"void main()\n"
"{\n"
"  //Simple red -> green filter\n"
"  gl_FragColor = vec4(0.0, gl_Color[0], 0.0, 1.0);\n"
"}\n";

void myclass::setup(int w, int h)
{
  // Create the window and the gl context in glx
  m_p_display = XOpenDisplay(NULL);
  m_screen_num = DefaultScreen(m_p_display);
  m_root_window = RootWindow(m_p_display, m_screen_num);
  m_video_window = XCreateSimpleWindow(m_p_display, m_root_window, 0, 0, w, h, 0, black_pixel, white_pixel);
  XFlush(m_p_display);
  XStoreName(m_p_display, m_video_window, "My Video Tester");
  XFlush(m_p_display);
  XMapWindow(m_p_display, m_video_window);
  XMapSubwindows(m_p_display, m_video_window);
  m_p_visual_info = glXChooseVisual(m_p_display, m_screen_num, (int[]) {GLX_RGBA, GLX_DOUBLEBUFFER, GLX_USE_GL, None});
  m_glx_context = glXCreateContext(m_p_display, m_p_visual_info, (GLXContext) None, GL_TRUE);
  glXMakeCurrent(m_p_display, m_video_window, m_glx_context);
  
  // Compile the shader
  m_shader_program = glCreateProgramObjectARB();
  m_fragment_shader = glCreateShaderObjectARB(GL_FRAGMENT_SHADER_ARB);
  glShaderSourceARB(m_fragment_shader, 1, &fragment_shader_text, NULL);
  glCompileShaderARB(m_fragment_shader);
  glGetObjectParameterivARB(m_fragment_shader, GL_OBJECT_COMPILE_STATUS_ARB, &compiled);
  // Checks for compilation state omitted - it compiles ok
  glAttachObjectARB(m_shader_program, m_fragment_shader);
  glLinkProgramARB(m_shader_program);

  // Set up gl projection matrix etc
  glMatrixMode(GL_PROJECTION);
  glLoadIdentity();
  gluOrtho2D(0.0, (GLfloat) w, 0.0, (GLfloat) h);
  glMatrixMode(GL_MODELVIEW);
  glLoadIdentity();
}  
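
For completeness, the checking I stripped out of setup() is just the usual compile/link status plus info-log queries, something like this (a sketch rather than my exact code):

  // Would go right after glCompileShaderARB / glLinkProgramARB in setup().
  GLint compiled = 0, linked = 0;
  char info_log[4096];

  glGetObjectParameterivARB(m_fragment_shader, GL_OBJECT_COMPILE_STATUS_ARB, &compiled);
  if (!compiled)
  {
    glGetInfoLogARB(m_fragment_shader, sizeof(info_log), NULL, info_log);
    fprintf(stderr, "Fragment shader compile failed:\n%s\n", info_log);
  }

  glGetObjectParameterivARB(m_shader_program, GL_OBJECT_LINK_STATUS_ARB, &linked);
  if (!linked)
  {
    glGetInfoLogARB(m_shader_program, sizeof(info_log), NULL, info_log);
    fprintf(stderr, "Program link failed:\n%s\n", info_log);
  }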

// The frame-drawing method
// Stripped out so that it's only rendering the Y channel of
// the source into the R channel of the output. The shader
// then correctly does the red->green filter and we see the
// expected, green output.
void myclass::drawFrame(Frame * frame)
{
  glXMakeCurrent(m_p_display, m_video_window, m_glx_context);
  glPixelZoom(1.0, -1.0);
  glRasterPos2i(0, frame->getH());
  glUseProgramObjectARB(m_shader_program);
  glDrawPixels(frame->getW(), frame->getH(), GL_RED, GL_UNSIGNED_BYTE, frame->getPlaneData(0));
  glUseProgramObjectARB((GLhandleARB)NULL);
  glXSwapBuffers(m_p_display, m_video_window);
}

So, what am I doing wrong? If I comment out the two glUseProgramObjectARB lines in the frame-drawing method, I get a red-channel-only display as expected and it runs at hundreds of FPS. With the shader enabled for the glDrawPixels, it runs at about 0.2 FPS.

I have tried all sorts of combinations of copying the frame data into the framebuffer, into an aux buffer, blending etc. Whatever I do, it’s fast (as expected) until I enable the fragment shader, at which point the performance goes to hell.

Also worth noting: in the fuller app I use a couple of blend modes to layer the U and V channels into the G and B channels of the buffer. This blend is fast and shows no CPU hit, so I’m pretty sure OpenGL is using the hardware for something. I don’t understand why (or even how it’s possible that) the shader is hammering the CPU.
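
For what it’s worth, the layering step in the fuller app looks roughly like this. It is a simplified sketch, not the exact code: it shows a single additive blend, ignores chroma subsampling (all planes treated as full size) and leaves out the raster-position/zoom setup:

  // Accumulate each plane into one colour channel of the (already
  // cleared) back buffer using additive blending.
  glEnable(GL_BLEND);
  glBlendFunc(GL_ONE, GL_ONE);

  glDrawPixels(frame->getW(), frame->getH(), GL_RED,   GL_UNSIGNED_BYTE, frame->getPlaneData(0)); // Y -> R
  glDrawPixels(frame->getW(), frame->getH(), GL_GREEN, GL_UNSIGNED_BYTE, frame->getPlaneData(1)); // U -> G
  glDrawPixels(frame->getW(), frame->getH(), GL_BLUE,  GL_UNSIGNED_BYTE, frame->getPlaneData(2)); // V -> B

  glDisable(GL_BLEND);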

Any and all help gratefully received!

Try with a textured quad instead of glDrawPixels.

Bruce

Yes, I am surprised you get a half-decent frame rate using glDrawPixels at all.

Yep, that seems to have cracked it! I guess I don’t know enough about the graphics pipeline to know why the drawPixels method falls back to software when combined with a shader, but copying the pixels into 3 textures, drawing a textured rectangle and using a frag shader to do the colour conversion works a treat. I’m getting about a 4x speedup in frame processing and display over the pure software conversion, which is brilliant.
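
In case it helps anyone else, the working version looks roughly like this. It is a simplified sketch rather than my exact code: the texture handles (m_textures), the sampler names (y_tex/u_tex/v_tex), the drawFrameTextured name and the BT.601-style coefficients are just illustrative, it treats all three planes as full resolution (no chroma subsampling), and it assumes m_shader_program was compiled/linked from yuv_shader_text in setup():

// Fragment shader: sample the three planes and convert to RGB.
const char * yuv_shader_text =
"uniform sampler2D y_tex;\n"
"uniform sampler2D u_tex;\n"
"uniform sampler2D v_tex;\n"
"void main()\n"
"{\n"
"  float y = texture2D(y_tex, gl_TexCoord[0].st).r;\n"
"  float u = texture2D(u_tex, gl_TexCoord[0].st).r - 0.5;\n"
"  float v = texture2D(v_tex, gl_TexCoord[0].st).r - 0.5;\n"
"  gl_FragColor = vec4(y + 1.402 * v,\n"
"                      y - 0.344 * u - 0.714 * v,\n"
"                      y + 1.772 * u,\n"
"                      1.0);\n"
"}\n";

void myclass::drawFrameTextured(Frame * frame)
{
  glXMakeCurrent(m_p_display, m_video_window, m_glx_context);

  // Upload each plane into its own texture on its own texture unit
  // (m_textures[3] created once with glGenTextures elsewhere).
  for (int i = 0; i < 3; ++i)
  {
    glActiveTexture(GL_TEXTURE0 + i);
    glEnable(GL_TEXTURE_2D);  // not strictly needed with a shader bound, but harmless
    glBindTexture(GL_TEXTURE_2D, m_textures[i]);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, frame->getW(), frame->getH(),
                 0, GL_LUMINANCE, GL_UNSIGNED_BYTE, frame->getPlaneData(i));
  }

  // Point the samplers at the right texture units.
  glUseProgramObjectARB(m_shader_program);
  glUniform1iARB(glGetUniformLocationARB(m_shader_program, "y_tex"), 0);
  glUniform1iARB(glGetUniformLocationARB(m_shader_program, "u_tex"), 1);
  glUniform1iARB(glGetUniformLocationARB(m_shader_program, "v_tex"), 2);

  // One textured quad covering the frame; texture coords are flipped
  // vertically to match the old glPixelZoom(1.0, -1.0) behaviour.
  glBegin(GL_QUADS);
  glTexCoord2f(0.0f, 1.0f); glVertex2i(0,             0);
  glTexCoord2f(1.0f, 1.0f); glVertex2i(frame->getW(), 0);
  glTexCoord2f(1.0f, 0.0f); glVertex2i(frame->getW(), frame->getH());
  glTexCoord2f(0.0f, 0.0f); glVertex2i(0,             frame->getH());
  glEnd();

  glUseProgramObjectARB(0);
  glXSwapBuffers(m_p_display, m_video_window);
}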

Thanks guys!