OpenCL spec out, thoughts about GL+CL.

http://www.khronos.org/registry/cl/specs/opencl-1.0.29.pdf

Just noticed the OpenCL spec is out. Wondering what other GLers think about future combined OpenCL+OpenGL usage?

I can tell you what I think about Khronos after reading the first page (after the table of contents). Khronos is unworthy of naming its APIs OpenXX, whatever XX may be.

I have to be a member of the Khronos Group to be allowed to print the pdf? I am not allowed to disclose its contents? I am not allowed to use something that may describe the specification’s contents? Even Khronos Group members are not allowed to give someone a copy of the specification with some sections highlighted? And they call this _Open_CL?

What’s next? The devil releasing the Lord’s Prayer 2.0 (including new features such as sacrifice of virgins and burning Christian martyrs)?

Don’t confuse it with open source.
It is an open standard in the sense that you don’t have to sign an NDA to merely look at it … wink iphone sdk wink wink …
hey wait, wasn’t Apple the original proponent of OpenCL ?

OpenCL looks interesting anyway.
Implementing RealOpenGL 3.X (with all deprecated features removed) on top of it seems possible, even if not very efficient yet.

Any news on actual implementations ?

I think it’ll be very useful for some applications, e.g. games will use OpenCL for AI and physics simulation. Instead of having to make the physics stuff fit into a framework created for graphics, it can now be done with an API created for more general-purpose stuff. The sharing of memory objects will allow for fast passing of information from the parts that do the simulation to the parts that do the visualization.

Somewhere in my mind there’s the idea of implementing some functionality of OpenGL drivers on top of OpenCL, e.g. executing an OpenCL program to compress a texture.

Philipp

OpenCL would be great for deferred rendering. Do display traversal in OpenCL, rasterize the G-buffer in OpenGL, finish up deferred rendering and post processing in OpenCL, then copy to the front buffer via OpenGL (if necessary).

I’d guess that all non-OpenCL features which are present in DX11 will be GL vendor extensions (perhaps NV after launch) and eventually get into the core at a much later date. Examples are tessellation, generic read/write from shaders, and shared registers. As for GPUs, it seems DX11 GPUs are due at the end of 2009 or in 2010, for the high end only (perhaps even later for mid-range). Meanwhile the first OpenCL implementations (i.e. with Apple’s next OS X) are to be out in early 2009 (from my understanding), with at least one vendor clearly treating this as important, given that all new Apple MacBooks now have NVidia GPUs! OpenCL is also to be supported on current DX10-level hardware.

So perhaps we get OpenGL+OpenCL on XP/Vista/Mac/Linux/Solaris/etc very soon and DX11 later (still only on Vista and later Windows). Might be that Open*L ends up ahead of DX.

Typo in appendix B.1

“if any only if” should probably be “if and only if”

A lot of what’s in OpenCL looks an awful lot like what people have been asking for in OpenGL 3.1, so if this spec is a guide to what’s going to be in the next OpenGL then I will be very, very happy.

Precompiled kernel binaries - It’s in OpenCL, so we should get precompiled shader binaries in OpenGL 3.1
Driver version query - We asked for this for OpenGL, OpenCL got it first.
Separate samplers/image buffers - Yes please.
Reference counted objects - Wonderful!
Multithreading support - The drivers will have to be thread-safe to support OpenCL, so this should rub off on OpenGL.
Synchronisation events - Just what we need, although this spec doesn’t mention how to synchronise the OpenGL command queue with the OpenCL queues.
Background buffer loading with a separate thread - Should be able to use OpenCL to load a large texture in the background without causing OpenGL to drop a frame.

There doesn’t seem to be any way to stream data between the two APIs, so I assume the intention is to synchronize the OpenGL command queue so it waits for OpenCL to finish writing the VBO before rendering the next frame, then have OpenCL wait for the next SwapBuffers call before taking back control of the buffer object to update it.
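To make the hand-off described above concrete, here is a toy model in plain Python. It does not call OpenGL or OpenCL at all; `SharedBuffer`, `handoff`, `use` and `run_frames` are made-up names for illustration, and the "finish" entries only stand in for whatever synchronisation mechanism the drivers end up providing. Treat it as a sketch of the assumed ping-pong ownership, not of any real API.

```python
class SharedBuffer:
    """Toy model of one buffer object shared by a GL queue and a CL queue."""

    def __init__(self):
        self.owner = "GL"   # assumption: GL owns the buffer initially
        self.log = []

    def handoff(self, new_owner):
        # The current owner has to drain its queued work before the other
        # API may touch the buffer; here that is only recorded in the log.
        if new_owner == self.owner:
            raise RuntimeError(f"{new_owner} already owns the buffer")
        self.log.append(f"{self.owner} finish")
        self.owner = new_owner

    def use(self, api, action):
        # Touching the buffer from the non-owning API is the bug to avoid.
        if api != self.owner:
            raise RuntimeError(f"{api} used the buffer while {self.owner} owns it")
        self.log.append(f"{api} {action}")


def run_frames(n):
    """Alternate ownership once per frame: CL updates, then GL draws."""
    buf = SharedBuffer()
    for _ in range(n):
        buf.handoff("CL")
        buf.use("CL", "write vertices")
        buf.handoff("GL")
        buf.use("GL", "draw")
    return buf.log
```

Calling `run_frames(1)` records the order GL finish, CL write, CL finish, GL draw, which is the per-frame sequence guessed at above. The obvious cost is that each API stalls while the other works, unless two buffers are alternated.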

We will need tessellation in 3.1 so we can take a low-resolution physics mesh that we use for collision detection with OpenCL, and expand it into a high-resolution mesh suitable for display.
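As a rough illustration of expanding a coarse mesh into a finer one, here is a minimal midpoint-subdivision sketch in plain Python. It is not how hardware tessellation works and uses no GL or CL calls; `midpoint` and `subdivide` are hypothetical helper names. Each triangle is split into four flat sub-triangles, so a real version would still have to displace or smooth the new vertices to add detail.

```python
def midpoint(a, b):
    """Midpoint of two vertices given as coordinate tuples."""
    return tuple((x + y) / 2.0 for x, y in zip(a, b))


def subdivide(triangles, levels=1):
    """Split every triangle into four, `levels` times.

    `triangles` is a list of (a, b, c) vertex tuples.
    """
    for _ in range(levels):
        finer = []
        for a, b, c in triangles:
            ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
            # Three corner triangles plus the centre one.
            finer += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
        triangles = finer
    return triangles
```

The triangle count grows by a factor of four per level, so a single collision triangle becomes 64 display triangles after three levels.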

I wonder if clUnloadCompiler also unloads the GLSL compiler; I would expect both to be using the same code base.

OpenCL in v1.0 is not a replacement for OpenGL. For example, OpenGL can talk to rasterizer hardware but CL at present cannot. Which is why you see effort to provide gateways from one world to the other - you can use CL to do geometry synthesis, potentially physics, collision detection etc, and have it generate buffers of data for GL to render from.

OpenGL 3.1 spec work is proceeding while OpenGL 3.0 driver implementations are firming up. I can’t speak to the feature set that it contains yet or how the implementation schedule for GL will play against implementation schedule for CL on each vendor’s devices.

Just a thought though, hold on to ideas like “CL got it first” until you see a running driver :slight_smile:

OpenCL+OpenGL could be used for hybrid rendering: generate textures with raytracing (OpenCL) with correct reflections/refractions, then render the scene in OpenGL.

OpenCL looks great, especially the integration into an OpenGL project. With CUDA I actually had no clue how to do it well, but the fact that with OpenCL a program is built by the C++ program makes everything nicer. This way it can easily be integrated into an FX file system.

From my first look I would say congratulations, this is about to be a really successful API that I will definitely use!

Just one thing:
CL Image Objects -> GL Textures
CL Image Objects -> GL Renderbuffers
Once again I feel that we need image objects in OpenGL :stuck_out_tongue:

Please please please, OpenCL must be a guideline for OpenGL :o)

Simon Arbon became a criminal by noticing a

Well this shows that getting this out fast was an important goal for Khronos (in that respect they did much better than at OpenGL 3). Let’s hope that it’s only typos, no important problems in there.

Philipp

Had more time to digest the OpenCL spec. I’m going to assume that the GPU production cycle is about 3 years and that OpenCL was also designed with future GPU hardware in mind. If so, there are a few interesting things which might be gathered from the OpenCL spec.

  1. OpenCL supports write_image*() from kernels, which takes an integer coordinate. The spec doesn’t stipulate that this integer coordinate has to be the work-item coordinate. So OpenCL supports the ability for kernels to write to anywhere in a framebuffer.

  2. OpenCL does not allow both read and write from the same image in the same kernel. Also there is no atomic operation set for images. This hints that CUDA PTX .surf isn’t going to ever be implemented, and that with OpenCL things like programmable blending, software Z buffers, and such must go through the standard atomic operations in global or local memory.

Given that Larrabee has to be set to have CL_DEVICE_GLOBAL_MEM_CACHE_TYPE as CL_READ_WRITE_CACHE, there is a clear way on future Intel hardware to start doing custom software rendering using atomic operations and global memory access. To compete, other vendors will perhaps need, at a minimum, a high-performance coherent cache for atomic operations that supports fast atomic vector scatter/gather, so that programmers can build the fast parallel binning and queuing systems which are core to many fast software rendering algorithms. Maybe the ROP cache is going to transform to support this functionality. Perhaps this is just a reflection of what I personally want, but it seems as if the OpenCL spec at least supports it.
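To show what a software Z buffer built on atomics in global memory might look like, here is a small simulation in plain Python. Nothing in it is OpenCL: the threads stand in for work-items, the list stands in for a global-memory buffer, the lock stands in for hardware atomicity, and `render`, `atomic_min` and `work_item` are made-up names. It is only a sketch of the idea of scattering fragments to arbitrary pixels and resolving depth with an atomic min.

```python
import threading


def render(batches, width, height):
    """Resolve depth for scattered fragments using a simulated atomic min.

    `batches` is a list of fragment lists, one per simulated work-item;
    each fragment is (x, y, z) with integer pixel coordinates.
    """
    depth = [float("inf")] * (width * height)   # stand-in for global memory
    lock = threading.Lock()                     # stand-in for atomicity

    def atomic_min(index, value):
        # Read-compare-write as one indivisible step.
        with lock:
            if value < depth[index]:
                depth[index] = value

    def work_item(fragments):
        # A work-item may write to any pixel, not just "its own" one.
        for x, y, z in fragments:
            atomic_min(y * width + x, z)

    threads = [threading.Thread(target=work_item, args=(b,)) for b in batches]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return depth
```

Because min is commutative, the final depth buffer is the same whatever order the work-items run in, which is why this kind of resolve only needs atomics and not ordered writes. Programmable blending would be harder, since blend results generally depend on order.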

OpenGL/CL presentations from Siggraph Asia '08 posted:
http://www.khronos.org/library/detail/siggraph_asia_2008/

What really struck me just skimming through the spec is the new vector literal syntax in CL-C. Cool! :sunglasses:

According to that link, 12 months to lots of vendor drivers!