The OpenGL Pipeline Newsletter - Volume 003
Table of Contents
Previous article: GLSL: Center or Centroid? (Or When Shaders Attack!)
Next article: Optimize Your Application Performance
OpenGL and Windows Vista™
So Windows Vista is here, but what does it mean for an OpenGL user and developer? In this article we will try to give OpenGL application developers a quick peek at what to expect and the current state of OpenGL on Windows Vista.
Windows Vista supports two primary OpenGL implementations:
- Hardware manufacturers provide OpenGL ICD (installable client driver) with variable renderer string. The OpenGL version supported depends on the hardware manufacturer.
- Microsoft's software OpenGL 1.1 implementation (renderer string is GDI Generic), is clustered in higher numbered pixel formats.
Just like Windows XP, Windows Vista does not contain an OpenGL ICD "in the box." End users will need to install drivers from OEMs or video hardware manufacturers in order to access native hardware-accelerated OpenGL. These drivers can be found on the Web sites of most hardware manufacturers.
The two biggest changes that Windows Vista brings to OpenGL are:
- The new driver model, Windows Display Driver Model (WDDM), formerly known as Longhorn Display Driver Model (LDDM).
- The new Desktop Window Manager with its Desktop Compositing Engine provides 3D accelerated window composition when Windows Aero is turned on.
OpenGL and Direct3D are treated the same by Windows Vista, resulting in full integration into the OS for both APIs. For example, both Direct3D and OpenGL will get transparency and dynamic thumbnails when Windows Aero is on, and all the WDDM features (video memory virtualization, etc.) will work in a similar fashion.
Changes Introduced by the New Windows Display Driver Model
Under WDDM, Microsoft takes ownership of the virtualization of video resources at the video memory level, but also at the graphics engine level. In short, this means that multiple simultaneous graphics applications can be running in round robin as scheduled by Windows Vista's Video Scheduler and their working sets (video resources) will be paged in, as needed, by Windows Vista's Video Memory Manager.
Being that the video hardware is virtualized, user-mode components (the OpenGL ICD is one of those) no longer have direct access to that hardware, and need a kernel transition in order to program registers, submit command buffers, or know the real addresses of the video resources in memory.
Because Windows Vista controls the submission of graphic command buffers to the hardware, detecting hangs of the graphics chip due to invalid programming is now possible across the operating system. This is achieved via Windows Vista's Timeout Detection and Recovery (TDR). When a command buffer spends too long in the graphics chip (more than two seconds), the operating system assumes the chip is hung, kills all the graphics contexts, resets the graphics chip and recovers the graphics driver, in order to keep the operating system responsive. The user will then see a popup bubble notifying that the "Display driver stopped responding and has recovered."
Changes Introduced by the Desktop Window Manager
Graphics applications now have to share resources with the 3D-accelerated window manager. Each OpenGL window now requires an offscreen frontbuffer, because there's no longer direct access to the surface being displayed: the desktop. This is also true when the Desktop Windows Manager (DWM) is off.
In order for Windows Vista to perform compositing, DWM allocates an extra window-sized compositing buffer for each top-level window in the system. All these resources add up and increase the video memory footprint.
GDI is no longer hardware-accelerated, but instead rendered to system memory using the CPU. That rendering is later composed on a 3D surface in order to be shown on the desktop. The graphics hardware video driver is no longer involved in GDI rendering, which means that mixing GDI and accelerated 3D rendering in the same window is likely to produce corruption like stale or blanked 3D rendering, trails, etc. Using Microsoft's OpenGL software rendering (the first item in the four OpenGL implementations) will achieve GDI compatibility, but at the expense of rendering speed and lack of modern features.
Windows Vista running with Aero disabled. View Closeup
Windows Vista running with Aero enabled. Note the semi-transparent window decorations and the dynamic thumbnails representing the running applications. View Closeup
Windows Vista running with Aero enabled. Note the semi-transparent windows and the dynamic thumbnails representing the running applications.
What All This Means for the OpenGL ICD User
Software application companies are preparing new versions of their OpenGL applications to take advantage of the new features and fix the possible incompatibilities that Windows Vista may have introduced.
- Current Windows XP full screen OpenGL applications are likely to work, although applications that use GDI under the covers (e.g. taking screenshots using Alt+Print Screen, or some enhanced GDI mouse pointers) may not work.
- Carefully written windowed applications should also work. For those which make use of GDI and OpenGL a developer may find that the Desktop Window Manager is disabled when they launch with the message "The color scheme has been changed to Windows Vista Basic." The DWM will be turned on again when the application exits.
- For other windowed applications, if developers observe graphics corruption or lack of rendering refresh, developers may need to disable the DWM manually by switching to the "Windows Vista Basic" theme before starting the application. This also applies to an application's third-party plugins which require GDI interoperability without the application's knowledge. It is possible that some of them will cause corrupted rendering and will require developers to switch off DWM manually.
- Windowed applications that use frontbuffer rendering without ever calling glFlush or glFinish (as they should) are likely to appear completely black, because the rendering will sit forever in the offscreen frontbuffer. Not even switching the DWM off is likely to fix these, given that the offscreen frontbuffer is a requirement of the driver model itself.
- Windowed stereo rendering will not work.
- Simultaneously using graphics cards from multiple vendors will not work, given that Windows Vista only allows one WDDM driver to be loaded at the same time. Note that multi-card solutions from the same vendor (NVIDIA® SLI™ or AMD™ CrossFire™) should work.
- Memory consumption reduction schemes like Unified Depth/Backbuffer are not possible under the DWM, which increases the memory footprint of the application.
Will My Applications Run Fast?
Performance-wise, developers can expect a decrease of around 10-15% on Windows as compared to Windows XP. Applications that use problematic cases (for example, excessive flushing, or rendering to the frontbuffer, as explained later) can see a larger performance degradation. However, expect this gap to become smaller over time while the graphics hardware vendors work on further optimizing their Windows Vista WDDM drivers.
WDDM's increased memory footprint and new video memory manager approach may worsen resource-hungry scenarios. Applications which were already pushing the limits of memory consumption on Windows XP, just barely fitting, may fall off a performance cliff on Windows Vista. This is due to excessive thrashing because available system and/or video memory is now exhausted.
What All This Means for the OpenGL Developer
GDI compatibility notes
GDI usage over 3D accelerated regions is incompatible with Windows Aero, so developers have two options:
- Disable Windows Aero
- Do not use GDI on top of OpenGL rendering.
Windows Vista introduces the new pixelformat flag PFD_SUPPORT_COMPOSITION (defined in the Driver Development Kit's wingdi.h as 0x00008000). Creating an OpenGL context for a pixelformat without this flag will disable composition for the duration of the process which created the context. The flag is mutually exclusive with PFD_SUPPORT_GDI.
If a developer must use GDI on top of an OpenGL context, use the following rules:
- Create an OpenGL context using a pixelformat with GDI support (PFD_SUPPORT_GDI flag set). As this flag is mutually exclusive with PFD_SUPPORT_COMPOSITION, this will disable Aero for the lifetime of the current process.
- Don't use BeginPaint/EndPaint outside the WM_PAINT message handling.
- As on Windows XP, use the API synchronization calls whenever necessary: GdiFlush to synchronize GDI with OpenGL rendering and glFinish for the converse.
On the other hand, if a developer wants to have Windows Aero enabled with a windowed OpenGL application, use the following rules to verify that you are not inadvertently trying to mix GDI over OpenGL:
- Create an OpenGL context using a pixelformat with compositing support (PFD_SUPPORT_COMPOSITION set).
- Handle the application window's WM_ERASEBKGND by returning non-zero in the message handler (this will avoid GDI clearing the OpenGL windows' background).
- Verify that the OpenGL window has the proper clipping styles WS_CLIPCHILDREN or WS_CLIPSIBLINGS, so GDI rendering of sibling windows in the layout hierarchy is not painted over and vice versa.
- Repaint the application's windows as they are being resized rather than when the final resize happens. This will avoid interacting with GDI's xor drawing of the window border. For example, if the application has splitter bars in a four-viewport application, resize the viewports as the splitter bar is being dragged, otherwise GDI xor rendering over the OpenGL viewport will leave trails.
- Do not use GDI for xor drawing of "rubberbands" or selection highlighting over the OpenGL rendering. Use OpenGL logical operations instead.
- Do not get the desktop's DC and try to paint over it with GDI, as it will corrupt the 3D-accelerated regions.
- Under the DWM's new architecture it is especially important that an application developer verify that the application pairs GetDC/ReleaseDC appropriately. The same goes for LockWindowUpdate and LockWindowUpdate(NULL).
Performance notes and other recommended practices
If an application renders to the frontbuffer, remember to call glFinish or glFlush whenever it needs the contents to be made visible on the screen. For the same reason, do not call those two functions too frequently, as they will incur the penalty of copying the contents of the offscreen frontbuffer to the desktop.
Calling SwapBuffers on windowed applications incurs two extra copies. One from the backbuffer to the composition surface, and then one from the composition surface to the final desktop.
Calling synchronization routines like glFlush, glFinish, SwapBuffers, or glReadPixels (or any command buffer submission in general) now incurs a kernel transition, so use them wisely and sparingly.
Given that under WDDM the OpenGL ICD relinquishes control over the desktop, fullscreen mode is now achieved by the driver in a similar way to Direct3D's exclusive mode. For that reason do not try to use GDI features on a fullscreen application (e.g. large GDI cursors, doing readbacks via GetDC/BitBlt), as they refer to the desktop which resides in a completely different piece of memory than the 3D rendering.
If the application performs extremely GPU intensive and lengthy operations, for example rendering hundreds of fullscreen quads using a complex pixel shader all in a single glDrawElements call, in order to avoid exceeding the 2 second timeout and having an application being killed by Windows Vista's Timeout Detection and Recovery, split the call into chunks and call glFlush/glFinish between them. The driver may be able to split long chunks of work for the application, but there will always be corner cases it cannot control, so don't rely solely on the driver to keep rendering from exceeding the two second limit. Instead, anticipate these cases in your application and consider throttling the most intense rendering loads yourself.
Under Windows Vista, the notion of "available video memory" has even less significance than under Windows XP, given that first it is hard for the application to account for the extra footprint needed by the new driver model, and second, the video memory manager may make more memory available to an application on an as-needed-basis.
If your application handles huge datasets, you may find it competing for virtual address space with the video memory manager. In those cases it is recommended that developers move an application to 64-bit or, if not possible, compile them with the /LARGEADDRESSAWARE flag and either use a 64-bit OS (which results in 4GB of user address space per process) or boot the 32-bit OS with the /3GB flag (which results in 3GB of user address space per process).
Neither of these two solutions is completely trouble-free:
- Compiling for 64-bit has several caveats (e.g. sign extension, extra memory consumption due to larger pointers).
- Compiling /LARGEADDRESSAWARE may break applications that assume the high bit of user space addresses will be clear.
- When using /3GB a developer may also need to tune the /userva boot parameter to prevent the kernel from running out of page table entries.
Fun with Windows. View Closeup
- DWM interaction with graphics APIs
- /3GB and /userva
- 64-bit migration tips
Antonio Tejada, NVIDIA