DShow video to OpenGL textures



game_cy
05-16-2007, 04:22 PM
Hi,
I am trying to implement video textures. I started to convert the texture3d sample from DShow (platformSDK) to OpenGL and I was wondering..

Is there a way I can get a pointer to the texture data in order to copy the video frames there directly?
(That's what the code in D3D seems to be doing.)

If there is a faster way than the one shown in the texture3D example then please do let me know! :)

santyhamer
05-16-2007, 07:53 PM
Have you considered using a 100% portable and open source alternative like Xiph Theora video?

http://www.theora.org/theorafaq.html

I think it is very good: free, C-based, portable and open source, with tons of encoding/decoding applications and codecs available.

To see a Video for Windows + OpenGL example, have a look at this link:

http://nehe.gamedev.net/data/lessons/lesson.asp?lesson=35

Hope it helps

game_cy
05-17-2007, 01:52 AM
Thanks santyhamer, but I have already seen the NeHe tutorial; it's obsolete and very slow. DirectShow is much faster.

game_cy
05-17-2007, 02:31 AM
Part of what I am (not) looking for, I found here:
http://www.opengl.org/cgi-bin/ubb/ultimatebb.cgi?ubb=get_topic;f=3;t=014297#000003

It seems there is no way to get a direct pointer to the texture data. That's a shame, because the video gets decoded somewhere (gets copied to system memory?), gets copied to a PBO, and then to the actual texture buffer (glTexSubImage2D).

That's blitting a buffer 4 times instead of 1 or 2.

V-man
05-17-2007, 03:46 AM
I think OpenML is supposed to be a DirectShow replacement. I've never used it.
http://khronos.org

Brolingstanz
05-17-2007, 05:32 AM
Mmmmm.... OpenML looks pretty sweet...

yooyo
05-17-2007, 05:42 AM
1. Create the decoding graph and insert your own DirectShow texture renderer filter, which accepts RGB or RGBA pixel formats.
2. In SetMediaType, inspect the VIDEOINFOHEADER to get the video resolution.
3. Create 2 PBOs, each sized for one video frame. Create the texture too.
4. Lock the 1st PBO (glMapBuffer) and pass the pointer to the DirectShow texture renderer. DO NOT UNLOCK NOW!!!
5. Run the graph (using IMediaControl::Run).
6. When DoRenderSample is called, copy the uncompressed frame into PBO memory using the pointer from step 4, and notify the main render thread that a new frame has arrived.
7. In the main render thread, during update, check for the new-frame notification and unlock the PBO buffer, then lock the 2nd buffer and pass it to the DirectShow texture renderer. Then you can upload the texture from the 1st PBO with a glTexSubImage2D call. This is simple double-buffering.
8. On a single-core CPU, add some Sleep in the main render thread to leave CPU time for the DirectShow decoding graph. Otherwise, playback might be jerky.
9. Add a critical section object to control access to the PBO pointer in the DirectShow texture renderer, to avoid a potential crash when the main thread changes the pointer in the middle of the copy in DoRenderSample.
10. Try to use a fast memcpy function to improve performance.

yooyo
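
The handoff in steps 4 to 9 can be sketched without any GL or DirectShow calls. What follows is only a minimal sketch under stated assumptions: two plain heap buffers stand in for the two mapped PBOs, std::mutex stands in for the Win32 critical section, and FrameHandoff, submit and acquire are hypothetical names that do not come from the thread. In real code the write pointer would come from glMapBuffer and the "filled" buffer would be unmapped and uploaded with glTexSubImage2D.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>
#include <vector>

// Hypothetical stand-in for two mapped PBOs plus the "sample ready" flag.
class FrameHandoff {
public:
    explicit FrameHandoff(std::size_t frameBytes) {
        for (auto& b : buffers_) b.resize(frameBytes);
    }

    // Decoder thread (the DoRenderSample role): copy one frame into the
    // buffer the decoder currently owns. Returns false when the previous
    // frame has not been consumed yet or the frame does not fit.
    bool submit(const std::uint8_t* src, std::size_t bytes) {
        assert(src != nullptr);
        std::lock_guard<std::mutex> lock(mutex_);
        if (ready_ || bytes > buffers_[writeIdx_].size()) return false;
        std::memcpy(buffers_[writeIdx_].data(), src, bytes);
        ready_ = true;
        return true;
    }

    // Render thread: if a frame is ready, hand the other buffer to the
    // decoder and return the buffer that was just filled; else nullptr.
    // The returned pointer stays untouched by the decoder until the
    // render thread calls acquire() successfully again.
    const std::uint8_t* acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!ready_) return nullptr;
        const std::uint8_t* filled = buffers_[writeIdx_].data();
        writeIdx_ ^= 1;   // swap roles of the two buffers
        ready_ = false;
        return filled;
    }

private:
    std::vector<std::uint8_t> buffers_[2];
    std::mutex mutex_;
    int writeIdx_ = 0;
    bool ready_ = false;
};
```

Because both the flag and the index are only touched under the lock, the render thread can never swap the write pointer in the middle of a copy, which is the crash that step 9 warns about.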

game_cy
05-17-2007, 07:15 AM
thanx yooyo,
That sounds really good. I'll see what I can do :]

game_cy
05-18-2007, 08:44 AM
The graph and filter are created successfully, and I can hear sound playing, but I don't see anything rendered. I don't even see misaligned video, just a black texture. I suspect my PBO code; can you see anything wrong?

PBO class


void CPBO::init( int sx, int sy, int pixbytes,
                 GLenum pixformat, GLenum pixtype,
                 GLenum usage )
{
    glGenBuffers(1, &m_pboID);
    m_sx = sx; m_sy = sy;
    m_pixbytes = pixbytes; m_pixtype = pixtype; m_pixformat = pixformat;
    m_usage = usage;

    glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, m_pboID);
    glBufferData(GL_PIXEL_UNPACK_BUFFER_ARB, sx*sy*pixbytes, NULL, usage);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, 0);
}


void* CPBO::LockGetPointer()
{
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, m_pboID);
    void* memio = glMapBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, GL_WRITE_ONLY);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, 0);
    return memio;
}


void CPBO::Unlock( )
{
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, m_pboID);
    glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER_ARB);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, 0);
}

void CPBO::Upload( )
{
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, m_pboID);
    glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER_ARB);
    glTexSubImage2D(GL_TEXTURE_RECTANGLE_ARB, 0, 0, 0, m_sx, m_sy,
                    m_pixformat, m_pixtype, BUFFER_OFFSET(0));
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, 0);
}

Render sample code

HRESULT CTextureRenderer::DoRenderSample( IMediaSample * pSample )
{
    // wait for main thread to consume previous sample
    while(m_bSampleReady) Sleep(10);
    // log before returning, otherwise the message is never written
    if (!m_pTxtBuffer) { AppendFile("debug.txt","no texture buffer for DoRenderSample()\n"); return S_OK; }

    BYTE *pBmpBuffer, *pTxtBuffer; // Bitmap buffer, texture buffer
    LONG lTxtPitch; // Pitch of bitmap, texture

    BYTE * pbS = NULL;
    DWORD * pdwS = NULL;
    DWORD * pdwD = NULL;
    UINT row, col, dwordWidth;

    CheckPointer(pSample,E_POINTER);
    // CheckPointer(m_pTexture,E_UNEXPECTED);

    // Get the video bitmap buffer
    pSample->GetPointer( &pBmpBuffer );

    pTxtBuffer = m_pTxtBuffer;
    // Copy the bits

    if (m_TextureFormat == GL_BGR)
    {
        memcpy(pBmpBuffer, pTxtBuffer, m_lVidWidth*m_lVidHeight*3);
    }

    if (m_TextureFormat == GL_BGRA)
    {
        memcpy(pBmpBuffer, pTxtBuffer, m_lVidWidth*m_lVidHeight*4);
    }

    Sleep(1);

    m_bSampleReady = true;

    return S_OK;
}

swap active PBO in main thread code


// swap buffers if needed
if (m_VideoRenderer->m_bSampleReady)
{
    // unlock previous pbo
    m_PBO[m_PBO_upload_idx].Unlock();
    // get pointer to empty buf and clear sample flag
    m_VideoRenderer->m_pTxtBuffer = (BYTE*) m_PBO[m_PBO_render_idx].LockGetPointer();
    m_VideoRenderer->m_bSampleReady = false;

    // upload texture to videocard
    glBindTexture(GL_TEXTURE_RECTANGLE_EXT, m_texid);
    m_PBO[m_PBO_upload_idx].Upload();


    // safe swap ?
    if (m_PBO_upload_idx) { m_PBO_upload_idx=0; m_PBO_render_idx=1; }
    else { m_PBO_upload_idx=1; m_PBO_render_idx=0; }
}
render quad

gVidTex.CheckMovieStatus();
// draw video on quad
CCamera::SetOrtho2D();
glEnable(GL_TEXTURE_RECTANGLE_EXT);
glBindTexture(GL_TEXTURE_RECTANGLE_EXT, gVidTex.m_texid);
DrawTexQuad(0,0,gVidTex.m_VideoRenderer->m_lVidWidth, gVidTex.m_VideoRenderer->m_lVidHeight,
0.3,0.3,1,1);

game_cy
05-18-2007, 10:58 AM
OK, I had a mistake here:
memcpy(pBmpBuffer, pTxtBuffer, m_lVidWidth*m_lVidHeight*4);
It should have been copied the other way around:
memcpy(pTxtBuffer, pBmpBuffer, m_lVidWidth*m_lVidHeight*4);

I also tried memset(pTxtBuffer, 128, m_lVidWidth*m_lVidHeight*4);

and no color is uploaded to the texture, so it must be my PBO uploading code.

yooyo
05-18-2007, 12:07 PM
The code looks fine (but it lacks safe access from different threads). Remove the Sleep calls from DoRenderSample and add a Sleep before SwapBuffers.

Check for GL errors. Check that the filter is connected. Check that you created a proper texture (glTexImage2D call), and use the GL_BGR or GL_BGRA texture format. In some cases the decoder delivers 0 in the alpha channel (in DoRenderSample), and such a texture becomes unusable for blending.

I'm using the same approach in my code and it works. Could you post a test app so I can take a look?
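
For the "check for GL errors" advice, a small helper that drains the error queue after each suspicious call makes a silent failure such as a rejected glTexSubImage2D visible. This is only a sketch: glErrorName and drainGlErrors are hypothetical helper names, the numeric codes are the standard OpenGL error values repeated so that the snippet builds without GL headers, and the error source is passed in as a callable so the logic can be exercised without a GL context. In the real application you would pass glGetError itself.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Standard OpenGL error codes, written as numbers only so the sketch
// compiles without <GL/gl.h>. With the GL headers, use the GL_* macros.
inline const char* glErrorName(unsigned code) {
    switch (code) {
        case 0x0500: return "GL_INVALID_ENUM";
        case 0x0501: return "GL_INVALID_VALUE";
        case 0x0502: return "GL_INVALID_OPERATION";
        case 0x0503: return "GL_STACK_OVERFLOW";
        case 0x0504: return "GL_STACK_UNDERFLOW";
        case 0x0505: return "GL_OUT_OF_MEMORY";
        default:     return "unknown GL error";
    }
}

// Reads errors until the queue reports 0 (GL_NO_ERROR). The cap guards
// against a source that keeps returning an error forever, which can
// happen when no GL context is current on the calling thread.
template <typename GetError>
std::vector<std::string> drainGlErrors(GetError getError,
                                       const std::string& where,
                                       int maxErrors = 16) {
    assert(maxErrors > 0);
    std::vector<std::string> found;
    for (int i = 0; i < maxErrors; ++i) {
        const unsigned code = getError();
        if (code == 0) break;  // GL_NO_ERROR
        found.push_back(where + ": " + glErrorName(code));
    }
    return found;
}
```

A call such as drainGlErrors(glGetError, "CPBO::Upload") placed right after the upload would show whether the texture update was rejected.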

game_cy
05-18-2007, 12:31 PM
I have a Sleep(0) in the main thread as well. Although I don't use a critical section, I have

HRESULT CTextureRenderer::DoRenderSample( IMediaSample * pSample )
{
    // wait for main thread to consume previous sample
    while(m_bSampleReady) Sleep(10);

and in the main thread:

if (m_VideoRenderer->m_bSampleReady)
{
    // unlock previous pbo
    ...........
    m_VideoRenderer->m_bSampleReady = false;
}

which works fine for the synchronization.
The format of the texture is the same as the PBO (BGRA with my sample .avi). I don't use blending; I don't care about blending for now.

I will check for errors and post a sample app.

thnx

yooyo
05-18-2007, 12:38 PM
I think I found the bug. You unlock the current PBO and lock it again immediately after that. The code should be:


// swap buffers if needed
if (m_VideoRenderer->m_bSampleReady)
{
    // unlock previous pbo
    m_PBO[m_PBO_upload_idx].Unlock();

    int prevPBO_idx = m_PBO_upload_idx;

    // safe swap ?
    if (m_PBO_upload_idx) { m_PBO_upload_idx=0; m_PBO_render_idx=1; }
    else { m_PBO_upload_idx=1; m_PBO_render_idx=0; }

    // get pointer to empty buf and clear sample flag
    m_VideoRenderer->m_pTxtBuffer = (BYTE*) m_PBO[m_PBO_render_idx].LockGetPointer();
    m_VideoRenderer->m_bSampleReady = false;

    // upload texture to videocard
    glBindTexture(GL_TEXTURE_RECTANGLE_EXT, m_texid);
    m_PBO[prevPBO_idx].Upload();
}

yooyo
05-18-2007, 12:42 PM
A Sleep call in DoRenderSample might affect DirectShow timing. The result is probably jerky playback, or video and audio out of sync.

game_cy
05-18-2007, 01:56 PM
That's not it.
I unlock upload_idx and I lock render_idx.

here is a sample application:
http://www.cs.ucy.ac.cy/~ssotos/shots/videosample1.zip

yooyo
05-18-2007, 03:03 PM
You forgot to enable the rectangle texture before drawing:
glEnable(GL_TEXTURE_RECTANGLE_EXT);

and change CPBO::Upload so that it no longer unmaps the buffer (Unlock already did that):


void CPBO::Upload( )
{
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, m_pboID);
    glTexSubImage2D(GL_TEXTURE_RECTANGLE_EXT, 0, 0, 0, m_sx, m_sy,
                    m_pixformat, m_pixtype, BUFFER_OFFSET(0));
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, 0);
}

game_cy
05-18-2007, 03:06 PM
OK, I got it working, but I don't know what I fixed...

probably something trivial like enabling a state or something...

It still plays a bit jerky, probably because of the Sleep() thingy, but other than that it seems to be working.

thanks yooyo,
excellent teaching :)

game_cy
05-18-2007, 03:07 PM
yes . . now I saw your answer .. thnx :]

yooyo
05-18-2007, 03:19 PM
You still need to add correct syncing code. The best way is to lock access to m_VideoRenderer->m_pTxtBuffer with a critical section (EnterCriticalSection at the beginning of DoRenderSample and LeaveCriticalSection at the end of the function).
Also, guard the pointer swap in the main thread with the same critical section object:

EnterCriticalSection(...)
m_VideoRenderer->m_pTxtBuffer = (BYTE*) m_PBO[m_PBO_render_idx].LockGetPointer();
m_VideoRenderer->m_bSampleReady = false;
LeaveCriticalSection(...)

About the jerky playback: I removed the Sleep calls from DoRenderSample and added Sleep(1) after the SwapBuffers call. If you turn on vertical sync you don't need the Sleep, because SwapBuffers will wait for vsync and release the CPU for other tasks (like decoding video).

game_cy
05-18-2007, 04:00 PM
Ok it works like a charm :)

Here are the changes:


// swap buffers if needed
if (m_VideoRenderer->m_bSampleReady)
{
    EnterCriticalSection(&m_VideoRenderer->m_CriticalSection);
    // unlock previous pbo
    m_PBO[m_PBO_upload_idx].Unlock();
    // get pointer to empty buf and clear sample flag
    m_VideoRenderer->m_pTxtBuffer = (BYTE*) m_PBO[m_PBO_render_idx].LockGetPointer();
    m_VideoRenderer->m_bSampleReady = false;
    LeaveCriticalSection(&m_VideoRenderer->m_CriticalSection);

    // upload texture to videocard
    glBindTexture(GL_TEXTURE_RECTANGLE_EXT, m_texid);
    m_PBO[m_PBO_upload_idx].Upload();

    // swapped pbo's so swap idxs too
    if (m_PBO_upload_idx) { m_PBO_upload_idx=0; m_PBO_render_idx=1; }
    else { m_PBO_upload_idx=1; m_PBO_render_idx=0; }
}

and


HRESULT CTextureRenderer::DoRenderSample( IMediaSample * pSample )
{
    // wait for main thread to consume previous sample
    while(m_bSampleReady) Sleep(0);
    EnterCriticalSection(&m_CriticalSection);

    ..........
    m_bSampleReady = true;
    LeaveCriticalSection(&m_CriticalSection);

    return S_OK;
}

Notice that it can also happen that a sample is not consumed by the main thread and DoRenderSample() has to wait for the PBOs to be swapped, e.g. when the window is moved or when the main thread is delayed.

tamlin
05-23-2007, 01:15 AM
I just wanted to comment on the use of Sleep(0).

Unless you know exactly what you are doing, don't do this. The reason is that Sleep(0) only gives up the remainder of the current thread's time-slice if there is a thread with the same or higher priority waiting to run.

This is something ATI at least used to do in their ICD, and it took me literally days to hunt down when my app locked up for several seconds at a time. Basically, they used a "poor man's critical section" in the hope that busy-waiting would gain a nanosecond here or there, while completely locking up a uniprocessor system whenever the window-managing thread was running at a different priority than a/the rendering thread.

Please don't make the same mistake. Busy-waiting is usually bad.

To show two alternatives (as this is platform dependent):

- If you need to get notified ASAP, use an event. Just before you wait for the event, raise the thread priority (to a priority higher than the thread you're waiting for). When wait returns, immediately lower your thread priority back. Using this, you will get notified immediately when the event is signalled (I'm fairly certain it will in fact, on a uni-CPU system, be faster than the busy-waiting approach).

- Use a critical section as both synchronizing primitive and the signal (using the same raise/revert thread priority idea), but prepend the call to enter the critical section with a call to TryEnterCriticalSection. On a good day that'll save the three kernel calls to raise/lower thread pri. and waiting for the mutex.
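
As an illustration of the first alternative, the busy-wait loop `while(m_bSampleReady) Sleep(0);` can be replaced by a blocking wait. The sketch below deliberately swaps in a standard C++ std::condition_variable for the Win32 event described above, so that it is self-contained; it also leaves out the thread-priority raise and revert. FrameSignal and its member names are hypothetical and not part of the code posted in this thread.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Replaces the polled m_bSampleReady flag with a flag the decoder
// thread can sleep on until the render thread has consumed the frame.
class FrameSignal {
public:
    // Decoder thread: a new frame has been written.
    void markPending() {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_ = true;
    }

    // Render thread: the frame was consumed; wake the waiting decoder.
    void markConsumed() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_ = false;
        }
        consumed_.notify_one();
    }

    // Decoder thread: block until the previous frame is consumed or the
    // timeout expires. Returns true if it is safe to write a new frame.
    bool waitConsumed(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return consumed_.wait_for(lock, timeout,
                                  [this] { return !pending_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable consumed_;
    bool pending_ = false;
};
```

The waiting thread consumes no CPU while blocked, so a lower-priority render thread is not starved on a single-core machine. The timeout gives the decoder a way to give up on a frame if the main thread stalls, for example while the window is being moved.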

AMD/ATI: Feel free to fix that crappy "home-made" (dead-)locking and use what I explained here.

trier
06-08-2007, 01:18 AM
Hi
I am working on a video decoding project, where
we use OpenGL textures to display the decoded video. I just wondered what performance you get with your implementation. Which formats are the fastest for DirectShow to decode, and what resolution can your system display?

Thanks in advance

Pierre Francais
08-30-2011, 03:36 PM
I implemented the techniques that yooyo described in his post, but found that I also had to call eglMakeCurrent() from the DirectShow thread, to activate the OpenGL context. Without that it just rendered a black screen.
I'm using OpenGL ES 1.1 on a Windows Mobile device, using the Vincent software implementation. I'm also using glTexSubImage2D() to update the texture directly from the DirectShow buffer, because 1.1 does not support PBOs.

Big thanks especially to yooyo for taking the time to explain the approach to solving this problem.