OpenCL interop performance



Thomas Gillen
06-21-2012, 08:42 AM
I've been looking into offloading some work in my renderer to CL kernels, but so far this has proven to be futile.

I've come across two problems. The first is that GL_ARB_sync and CL_KHR_gl_event don't appear to be implemented on any hardware, forcing me to synchronize via glFinish and clFinish. Causing a pipeline stall here is just about the worst thing you can possibly do for performance, and this alone kills any practical integration between OpenCL and OpenGL. Is there some secret handshake needed to get this working properly, or are people just conveniently forgetting this when they talk about CL<->GL interop?

The second, stranger, problem is that creating an OpenCL context (and then doing nothing with it) and sharing it with the GL context causes my frame rate to drop from 115 to about 30. This doesn't happen if the GL context is not shared.

Thomas Gillen
06-22-2012, 07:24 AM
After further investigation, it seems that creating a texture and attaching it to an FBO while the GL context is shared with CL causes apparently all operations on both the CPU and GPU to slow down until that texture is deleted.

Edit: Just updated to the latest AMD beta drivers (Catalyst 12.6 Beta), and the problem appears to have gone away. So it was a driver bug.

Dark Photon
06-22-2012, 09:01 AM
...two problems. The first is that GL_ARB_sync and CL_KHR_gl_event don't appear to be implemented on any hardware, forcing me to synchronize via glFinish and clFinish.

I hit this a few years back when I was stuck with OpenCL 1.0. Fortunately I didn't need to swap between GL and CL much so it wasn't completely prohibitive, just bothersome.

OpenCL 1.1 drivers are out now, and I would have thought they'd have it. However, checking NVidia's latest public beta drivers (302.07b) by running oclDeviceQuery, I see:



CL_PLATFORM_VERSION: OpenCL 1.1 CUDA 4.2.1
OpenCL SDK Revision: 7027912
...
CL_DEVICE_NAME: GeForce GTX 560 Ti
CL_DEVICE_VENDOR: NVIDIA Corporation
CL_DRIVER_VERSION: 302.07
...
CL_DEVICE_EXTENSIONS: cl_khr_byte_addressable_store
cl_khr_icd
cl_khr_gl_sharing
cl_nv_compiler_options
cl_nv_device_attribute_query
cl_nv_pragma_unroll
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_khr_fp64


Nope, still no cl_khr_gl_event on the CL side (and no ARB_cl_event on the GL side). Wonder what the hold-up on this is? The lack of these extensions discourages heavy use of OpenGL and OpenCL on the same GPU.
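
For anyone who would rather check this from code than read oclDeviceQuery output, here is a minimal sketch of the test. The helper name has_extension is hypothetical (it is not part of OpenCL); it assumes the extension list is a single whitespace-separated string, which is the form CL_DEVICE_EXTENSIONS is returned in. It matches whole tokens only, so a name that is merely a prefix of another extension is not reported as present.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not an OpenCL API call.
 * Returns true if `name` appears as a complete whitespace-delimited
 * token in `ext_list` (e.g. the CL_DEVICE_EXTENSIONS string). */
bool has_extension(const char *ext_list, const char *name)
{
    assert(ext_list != NULL && name != NULL);

    size_t name_len = strlen(name);
    if (name_len == 0)
        return false;

    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the beginning of the list or after whitespace. */
        bool starts_token = (p == ext_list) || isspace((unsigned char)p[-1]);
        /* The match must end at the end of the list or before whitespace. */
        unsigned char after = (unsigned char)p[name_len];
        bool ends_token = (after == '\0') || isspace(after);

        if (starts_token && ends_token)
            return true;

        p += 1; /* keep scanning past this partial match */
    }
    return false;
}
```

In a real program the string would come from clGetDeviceInfo with CL_DEVICE_EXTENSIONS. If has_extension(exts, "cl_khr_gl_event") is false, as it would be for the list above, the fallback is the explicit glFinish / clFinish synchronization discussed earlier in the thread.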

As some consolation, several nice features added to OpenGL in recent years have reduced the cases where you'd otherwise have needed to resort to OpenCL (that'd actually make an excellent SIGGRAPH/GDC course -- showing how a few classically GPGPU graphics-related techniques can be mapped to OpenCL and GLSL 4.2 and the pros/cons of each).