Unicode Text Rendering?

I’m sure you guys have heard this question a million times before, and for that I apologize, but countless Google searches have turned up very little that’s actually helpful.

Here’s my situation: I need a way to render text in my OpenGL application. It has to be cross-platform, it has to support Unicode, it has to be capable of rendering Arabic, Chinese, Japanese, or whatever else the user wants, and it has to be as fast as possible. I’m using OpenGL 3.3 with the core profile, so no display list nonsense.

I’d like to use FreeType 2 to handle the actual nitty-gritty of the glyph rendering, as it seems to be the most convenient library around for that. From there, I’ve seen a number of different implementation tricks and I’m not sure which one to use.

One very common solution is to make a really, really big texture that holds every possible glyph, and have each rendered glyph be a textured quad (or two triangles) whose texture co-ordinates point to the correct glyph in the cached texture. The problem with this quickly becomes apparent when you consider that the Japanese language alone has somewhere around 30,000 glyphs. Even if it were possible to get a texture big enough to store everything, that texture alone would take up half of the memory on the user’s video card (which is a bad thing).
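
For reference, here’s roughly what I mean by the atlas approach (just a sketch; the structure names are mine, and the metrics mirror what FreeType reports per glyph):

```cpp
#include <vector>

// One entry per glyph in the big cached texture.
struct AtlasGlyph {
    float u0, v0, u1, v1;     // the glyph's rectangle in the atlas, in [0..1]
    float width, height;      // glyph bitmap size in pixels
    float bearingX, bearingY; // offset from the pen position to the bitmap
    float advance;            // how far to move the pen afterwards
};

struct Vertex { float x, y, u, v; };

// Append two triangles (6 vertices) for one glyph at the current pen position.
void emitGlyphQuad(std::vector<Vertex>& out, const AtlasGlyph& g,
                   float penX, float penY)
{
    float x0 = penX + g.bearingX;
    float y0 = penY - g.bearingY;   // y grows downward here
    float x1 = x0 + g.width;
    float y1 = y0 + g.height;

    out.push_back({x0, y0, g.u0, g.v0});
    out.push_back({x1, y0, g.u1, g.v0});
    out.push_back({x1, y1, g.u1, g.v1});

    out.push_back({x0, y0, g.u0, g.v0});
    out.push_back({x1, y1, g.u1, g.v1});
    out.push_back({x0, y1, g.u0, g.v1});
}
```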

Another solution is to use FreeType to render each individual string from scratch once, the first time it appears, drawing the whole string as a single textured quad. This quad, and the texture created by FreeType, would be cached in memory for as long as the string exists on the screen. This works well enough for static strings (which, admittedly, is what 90% of the application’s text will end up being), but what about dynamic text - text input boxes, etc.? And what about memory? If there are many strings being rendered, for instance in an online chat, could memory usage become a concern?
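
Something like this is what I’m picturing (just a sketch; renderStringToTexture() is a placeholder for the FreeType rasterization plus GL upload, not a real library call):

```cpp
#include <string>
#include <unordered_map>
// assumes an OpenGL header is already included for GLuint

struct CachedString {
    GLuint texture;     // texture holding the whole rendered string
    int width, height;  // size in pixels, for building the quad
    int lastUsedFrame;  // so stale strings can be evicted
};

// Hypothetical: rasterize the whole string with FreeType and upload it.
CachedString renderStringToTexture(const std::string& utf8);

std::unordered_map<std::string, CachedString> g_stringCache;

const CachedString& getCachedString(const std::string& utf8, int frame)
{
    auto it = g_stringCache.find(utf8);
    if (it == g_stringCache.end())
        it = g_stringCache.emplace(utf8, renderStringToTexture(utf8)).first;
    it->second.lastUsedFrame = frame;  // touched this frame
    return it->second;
}
```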

How do operating systems and/or web browsers typically handle it?

What are some solutions that you have come up with? Are there some good (up to date) tutorials on the subject (trust me, I’ve looked)? Any advice that you could share to point me in the right direction would be much appreciated.

Using bitmap fonts is usually fine. Rarely is their size a concern. You can have different textures for each language and only load the appropriate texture on demand, and even then, textures aren’t that big anymore.

But if you want it to be very dynamic, you can always try drawing the glyphs as vector graphics, though I’d guess the performance would be far worse. It would also be a good deal more difficult to code.

I could see having separate bitmap textures for each language, but what happens when I have an online chat where people can input any character they like? Do I simply disallow some characters based on the language pack loaded?

Load the textures you need. If you need lots of them then you need lots of them. There is nothing you can do about it and no library out there can perform magic to make it work out of thin air.

There is “2D Shape Rendering by Distance Fields” by Stefan Gustavson, covered in “OpenGL Insights”. It allows a very compact bitmap to render well when scaled. You can also look at creating a 2D texture array with a character in each layer. You can create characters on demand and add them to the array, then use a simple mapping from each character to its array slot.
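
Off the top of my head, the texture-array version looks something like this (the 64x64 glyph size and 1,024-layer count are arbitrary choices):

```cpp
// assumes an OpenGL 3.x header is included

// Create the array once: one layer per cached glyph.
GLuint createGlyphArray()
{
    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D_ARRAY, tex);
    glTexImage3D(GL_TEXTURE_2D_ARRAY, 0, GL_R8,
                 64, 64, 1024,        // width, height, layer count
                 0, GL_RED, GL_UNSIGNED_BYTE, nullptr);
    return tex;
}

// When a new glyph is rasterized (e.g. by FreeType), copy its bitmap
// into the next free layer; keep a codepoint -> layer map elsewhere.
void uploadGlyph(GLuint tex, int layer, int w, int h, const unsigned char* pixels)
{
    glBindTexture(GL_TEXTURE_2D_ARRAY, tex);
    glPixelStorei(GL_UNPACK_ALIGNMENT, 1);  // FreeType bitmaps are tightly packed
    glTexSubImage3D(GL_TEXTURE_2D_ARRAY, 0,
                    0, 0, layer,            // x, y, layer offset
                    w, h, 1,                // one layer deep
                    GL_RED, GL_UNSIGNED_BYTE, pixels);
}
```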

I like the idea of distance fields, and I like how nicely they scale up, but do they scale down as nicely (rendering small text)?

Right now I’m considering pre-making a different texture for each language, with every possible glyph on the same texture. One for Latin-1, one for Japanese, one for Hebrew, etc., loading them only as they are needed (and hoping that they don’t all get loaded at once).

Small fonts only seem to render well if you generate the bitmap to exactly match the font size. This is because a font actually carries a large amount of hinting information for scaling, which is lost when you generate the bitmap. Windows TTF fonts, for example, have 500+ opcodes in their hinting language.
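
With FreeType that just means requesting the exact pixel size before rasterizing, so the hinting program runs for that size (a minimal sketch; error checks omitted, and the font path is a placeholder):

```cpp
#include <ft2build.h>
#include FT_FREETYPE_H

FT_Library lib;
FT_Face face;

void rasterizeAtExactSize()
{
    FT_Init_FreeType(&lib);                  // error checks omitted
    FT_New_Face(lib, "font.ttf", 0, &face);  // "font.ttf" is a placeholder
    FT_Set_Pixel_Sizes(face, 0, 14);         // hint and rasterize at exactly 14 px
    FT_Load_Char(face, 'A', FT_LOAD_RENDER); // face->glyph->bitmap is now hinted
}
```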

We spent a lot of time on a very similar problem - we have an app that runs on Windows and OS X, and need to draw Unicode characters from many different languages (including several Asian languages). Initially we used FTGL and attempted to load one of several fonts that had glyphs for all of the languages we used. Inevitably we ran into issues with the loaded font not having the glyphs we needed (or none of the fonts we tried to load being present on the user’s system). Both OS X and Windows use a glyph replacement system that will load glyphs from different fonts if the current font doesn’t provide them; FTGL doesn’t provide this.

In the end we wrote classes that used OS specific functionality to draw the text to an image, and used the image as a texture. This gave us the best of both worlds - very low memory usage (since the images were specific to the entire string we were displaying, and destroyed when we were done with them), and no worries about fonts and glyphs - at the expense of a fairly minor slowdown. We didn’t formally test the speed and see how exactly it compared to the previous FTGL solution, but informally we didn’t have any problems with it, even running on 10+ year old systems.
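
The GL side of it is nothing special, for what it’s worth - whatever OS text API you use fills a pixel buffer, which becomes a short-lived texture (a sketch, assuming an RGBA buffer):

```cpp
// assumes an OpenGL header is included

// Turn an OS-rendered RGBA image into a throwaway GL texture;
// delete it with glDeleteTextures once the string leaves the screen.
GLuint textureFromImage(int w, int h, const unsigned char* rgba)
{
    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, w, h, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, rgba);
    return tex;
}
```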