Terrain optimization



Demolit
08-03-2009, 08:04 AM
Currently, I've rendered some terrain generated with the diamond-square algorithm, using indexed triangle strips (including degenerate strips) stored in VBOs. Performance is approx. 60 fps for a 1024x1024 terrain. Nothing new, I guess. Now I'd like to implement an octree and frustum culling, but I just can't figure out how this would work. I can use glDrawRangeElements so I don't have to store everything into VBOs every frame. The trouble with this is: how would I then index the degenerate strips? And the index buffer would have to be updated every frame..
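For reference, here is a minimal sketch of how a single degenerate-stitched strip index list for a grid can be built. It assumes row-major vertex storage, and the function name is made up for illustration. If every culling chunk has the same dimensions, one such list could in principle be built once and reused per chunk with a different base vertex; that is a suggestion, not something stated in the thread.

```python
def grid_strip_indices(cols, rows):
    """Index list drawing a cols x rows vertex grid as ONE triangle strip.

    Assumes row-major vertices: index = row * cols + col.
    Consecutive row strips are joined by two repeated (degenerate) indices.
    """
    indices = []
    for r in range(rows - 1):
        if r > 0:
            # Repeat the first vertex of the new row strip: closes the degenerate bridge.
            indices.append(r * cols)
        for c in range(cols):
            indices.append(r * cols + c)        # vertex on the current row
            indices.append((r + 1) * cols + c)  # vertex on the row below
        if r < rows - 2:
            # Repeat the last vertex of this row strip: opens the degenerate bridge.
            indices.append((r + 1) * cols + cols - 1)
    return indices
```

Each bridge adds an even number of indices, so the winding order of the following row strip is unchanged.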

Aleksandar
08-03-2009, 11:28 AM
Diamond-square algorithm ???
60fps for 1024x1024 grid?

Where have you been for the last 5 years? ;o)

Sorry for such a beginning, but there are much better algorithms! Nowadays, GPUs are fast enough that optimizing the polygon count is not so important. Much more important is how to feed them efficiently.

TIN models are the distant past! I myself am working on an algorithm called RINGO that is based on Clipmaps. Although Clipmaps are very useful for textures, there are some issues when dealing with geometry, because it requires very frequent index updates and a flat terrain pattern (there is also a spherical clipmap algorithm, but it is very computationally intensive). My algorithm splits the terrain into fixed-size blocks, which keeps geometry updates minimal while flying at the same height. Of course, I still have work to do on it.

Currently, ray-casting is judged to be the state of the art in terrain rendering, but I really doubt that ray-casting can achieve the speed of clipmaps.

Demolit
08-04-2009, 02:02 AM
Well, I graduated last year and have just been wandering around researching, and haven't done much GPU programming, so I need people like you to update me ;) Then I take it computing terrain on the CPU is obsolete?

ZbuffeR
08-04-2009, 04:37 AM
Computing fine-grained LOD, for each triangle, on the CPU is obsolete indeed.
But of course coarse-grained LOD is still needed for very large terrain.

Just be sure that the LOD optimization does not take more time than brute-force rendering :)

code, benchmark, code, benchmark, compare, ...

"If brute force does not work, you are not using enough"

Demolit
08-04-2009, 05:50 AM
Since it's been brought up, any books or sites I can look at so I can attempt to implement geometry clipmaps and/or ray casting? I've been googling around, but a lot of the results were theses discussing the benefits, and didn't have any implementation details.

scratt
08-04-2009, 05:52 AM
The Virtual Terrain Project @ http://www.vterrain.org is very good, if you have not checked it out already. :)

EDIT : I was not looking at the methods you refer to at the time so am not sure how in depth it's covered, but I am sure if you branch out from there you'll find some cool stuff.

Aleksandar
08-04-2009, 06:23 AM
LOD is inevitably needed. Even if we can render all vertices in a single regular grid at a fine frame rate, there will be many unpleasant artifacts on distant mountains because of Z-buffer precision. Regular grid blocks organized into nested rings centered on the viewer are a very good solution.

This is an example of the pattern:

2 2 2 2 2 2 2
2 1 1 1 1 1 2
2 1 0 0 0 1 2
2 1 0 X 0 1 2
2 1 0 0 0 1 2
2 1 1 1 1 1 2
2 2 2 2 2 2 2

X - viewer
0 - block with highest level of detail
1 - block with lower level of detail
2 - block with even lower LOD, etc.
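The pattern above can be reproduced with a few lines. This is only a sketch that assumes one-block-wide rings, as in this diagram; the function names are illustrative and not taken from any implementation discussed here.

```python
def ring_lod(bx, bz, viewer_bx=0, viewer_bz=0):
    """LOD of block (bx, bz): Chebyshev distance to the viewer's block, minus one.

    0 is the highest level of detail; the viewer's own block also gets 0.
    """
    d = max(abs(bx - viewer_bx), abs(bz - viewer_bz))
    return max(d - 1, 0)


def pattern(radius):
    """Text picture of the ring layout, with the viewer's block marked X."""
    rows = []
    for bz in range(-radius, radius + 1):
        cells = []
        for bx in range(-radius, radius + 1):
            cells.append("X" if (bx, bz) == (0, 0) else str(ring_lod(bx, bz)))
        rows.append(" ".join(cells))
    return "\n".join(rows)
```

Calling `print(pattern(3))` prints the same 7x7 layout as shown above.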

Unfortunately, the hard disk on the computer that hosted my presentations about Ringo is dead, so presentations will be unavailable for the next two weeks (since I'm on vacation now :) ).

Sorry for bothering you with my view on "terrain algorithms". Terrain rendering is something that has occupied my life for many years, and I simply couldn't resist replying to this post.

P.S. Yes, http://www.vterrain.org/ is a very good starting point, especially http://www.vterrain.org/LOD/Papers/

Demolit
08-05-2009, 08:52 AM
Oh cool website ;)
Eh, I'm still a little curious: do terrain algorithms typically expect to store all vertex data on the CPU? I've been reading the stuff in Asirvatham & Hoppe's clipmap implementation and I'm not really sure where to begin. From what I've understood, the gist of it is that the shader language can be used to feed heights to a square area of already rendered vertices.

Aleksandar
08-06-2009, 06:52 AM
Of course terrain algorithms do not all need to store the whole data set in main memory! Data can easily outgrow RAM capacity. Algorithms that use the hard disk (and sometimes spatial databases) to store gigabytes or terabytes of data are called "out-of-core". I guess that is what you asked. Or did you mean that all calculation is done on the CPU?

In 2005 it was state of the art to use textures to update vertex positions. Nowadays we do not need textures for such updates, because arbitrary vertex attributes (input parameters, in the new GLSL notation) serve the purpose. That paper also uses a planar grid of vertices as the base structure, which is then displaced by the heights. It is an excellent solution for short-range terrain, for example 5 to 10 kilometers in diameter. But if you try to visualize a much larger area, the Earth's curvature is not negligible. For what purpose are you building the terrain?
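To put a rough number on the curvature remark, the following sketch computes how far the surface falls below a flat tangent plane at a given distance. It assumes a perfectly spherical Earth with a mean radius of 6371 km, which is a simplification, and the function name is illustrative.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean radius; a spherical Earth is an assumption


def curvature_drop(distance_m, radius_m=EARTH_RADIUS_M):
    """Vertical drop below the viewer's tangent plane at a given surface distance.

    For small distances this is approximately distance**2 / (2 * radius).
    """
    theta = distance_m / radius_m          # angle subtended at the Earth's centre
    return radius_m * (1.0 - math.cos(theta))
```

Under these assumptions a point 5 km away lies about 2 m below the flat plane, while a point 50 km away lies about 196 m below it.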

knackered
08-06-2009, 10:31 AM
nowadays we most definitely do need to use vertex textures.
maybe in 2 or 3 years this will be different, but at the moment all other options require brand new cards - not practical in any sense whatsoever.
My fallback CPU geoclipmapping (from the original paper) quite often stretches its legs due to some customers' ageing hardware.

Aleksandar
08-06-2009, 11:19 AM
Whom are you talking to? My previous terrain implementation runs on some platforms that don't even support VBOs (OpenGL 1.3 or 1.4 compliant). Many special-purpose tablets and laptops have very old graphics hardware. But I hope Demolit will have more luck. ;o)

He didn't say what his target platform is or how large a terrain he wants to implement, but I think it would help him to see some efficient implementations before he starts his own work.

Demolit
08-06-2009, 08:59 PM
Hey hey I'm still here :|
I'm actually looking to implement terrain large enough to suit a real-time free-roaming game, but doesn't have to be specifically for that purpose. Basically just really interested in implementing a large area of terrain using modern techniques, don't have a target audience in mind (what gfx card or ogl version etc.)

So..do I or do I not need to use a texture as the base of my height fields? As I said in my first post, I've simply used a procedural technique to create the height fields, then cached them with VBOs, which I assumed I could use as a base to implement geometry clipmaps.

Obviously I know that storing all the vertices can outgrow the RAM, so switching to a texture is no issue (use GLSL to load texture). What then I don't get is, if I'm just feeding heights to a planar base, where does the LOD+clipmapping come in? Do I just update heights incrementally, decreasing grid size each time, from the coarsest to finest level? (starting to make sense now that I've typed it out...)

I do understand the Losasso and Hoppe version of clipmapping, but I frowned upon having to recompute all the indices every time the viewer moved. I believe this is knackered's fallback.

Aleksandar
08-07-2009, 05:45 AM
Demolit wrote: "Hey hey I'm still here :|"
I'm glad you are. :o)


Demolit wrote: "I'm actually looking to implement terrain large enough to suit a real-time free-roaming game, but doesn't have to be specifically for that purpose."
Do you want to:
1. walk on the surface, or
2. fly at 20-100 m, or
3. fly above 1000 m?

If (1) or (2) is the case, regular Clipmaps can serve the purpose. If (3) is preferable, then the flat pattern for the base is an issue.


Demolit wrote: "So..do I or do I not need to use a texture as the base of my height fields?"
No! If your hardware and drivers support OpenGL 3.x, then I suggest you use an ordinary VBO to store heights.


Demolit wrote: "I've simply used a procedural technique to create the height fields, then cached them with VBOs, which I assumed I could use as a base to implement geometry clipmaps."
It's OK!


Demolit wrote: "Obviously I know that storing all the vertices can outgrow the RAM, so switching to a texture is no issue (use GLSL to load texture). What then I don't get is, if I'm just feeding heights to a planar base, where does the LOD+clipmapping come in? Do I just update heights incrementally, decreasing grid size each time, from the coarsest to finest level? (starting to make sense now that I've typed it out...)"
Uh, it is a long story, but we can save some space in this forum, and time (both mine and the people interested in reading this), if you read papers about Clipmaps again. :o)

And about out-of-core algorithms: I didn't mean that all vertices can or should be stored in RAM. (I was speaking about the data needed for terrain computation, because I work with real data, which includes the WGS84 ellipsoid, geoid corrections and heights measured from satellites or aircraft, and even involves geographic coordinate system transformations.) No matter how large your terrain is, it shouldn't have more than, for example, 2 million vertices (2 million is not a rule but an example, because it depends on screen size, caching techniques and many other things). The point is that while you are walking/flying you should generate heights (since you said you are using a procedural approach) and update the proper buffers.


Demolit wrote: "I do understand the Losasso and Hoppe version of clipmapping, but I frowned upon having to recompute all the indices every time the viewer moved. I believe this is knackered's fallback."
It is an issue, which I avoid by using fixed-size regular blocks. There is no index recomputation, just block movements (which are as fast as copying a single pointer per block) and loading data for blocks at the border between areas with different LODs. It is much faster, but it has an issue too: changing height by more than a block size is more expensive for me than it is with Clipmaps. But if you want to walk on the ground, or fly up to a few hundred meters, my algorithm serves the purpose. And it is extremely fast if you stay in the central block, because recomputation and data feeding occur only when the viewer crosses the border of the central block.
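The idea of moving blocks instead of recomputing indices can be illustrated for a single LOD level. This is a hypothetical sketch, not the RINGO implementation (which is not published in this thread): the `Block` class, the `loads` counter and the convention that row 0 is the northern edge are all assumptions.

```python
from collections import deque


class Block:
    """One fixed-size terrain block; `loads` counts how often its heights were rebuilt."""

    def __init__(self, row, col):
        self.loads = 0
        self.load(row, col)

    def load(self, row, col):
        # Placeholder for generating heights and uploading them to a buffer.
        self.row, self.col = row, col
        self.loads += 1


def make_grid(size=3):
    """size x size grid of blocks; row 0 is the northern edge (assumed convention)."""
    return deque(deque(Block(r, c) for c in range(size)) for r in range(size))


def move_north(grid):
    """Viewer crossed the northern border: recycle the southern row as the new northern row."""
    recycled = grid.pop()                 # the southern row drops out of range
    new_row = grid[0][0].row - 1
    for block in recycled:
        block.load(new_row, block.col)    # only these blocks receive new data
    grid.appendleft(recycled)             # all other blocks merely change position
```

After `move_north`, the blocks that stayed in range are the same objects as before and have not been reloaded; only the recycled row was touched.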

Demolit
08-07-2009, 08:40 AM
All fine and dandy...
Just to clarify, if I'm walking/flying and generating heights on-the-fly, then I wouldn't have to store anything in the VBO, would I?
Also, to use regular block sizes, would I be right to say the camera is fixed in the center of the center block?
And out of curiosity, do you use GLSL in your algorithm to update heights? Because I'm using an out-of-core technique, does that mean all my vertex data is on the CPU, and I won't be taking advantage of the GPU? Can I move everything over?

Sorry I've got so many questions, I want to be clear before I continue with implementation...

Aleksandar
08-07-2009, 11:26 AM
I'm sorry that I'm not in a position to publish some of my papers on a web server (WSEAS has published two of my papers, but they cannot be downloaded now. :o( ). Until August 17th I'm on vacation and do not have access to the computers in our lab. Until then, I'll try to answer all your questions as clearly as I can through this forum...


Demolit wrote: "Just to clarify, if I'm walking/flying and generating heights on-the-fly, then I wouldn't have to store anything in the VBO, would I?"

Of course you have to update your VBOs. Not a single VBO, because I suppose that you will have at least one VBO for each LOD level, or one VBO per block, depending on the strategy you choose. But as long as you are moving inside the central block, nothing changes. An update happens only when you cross the border of the block. Of course, you can add some threshold to avoid frequent updates if the walking line is parallel or almost parallel to the border, but I suggest skipping those details for now and staying with the main characteristics of the algorithm. (Just one more suggestion... Be very careful with floating point calculations, because rounding errors and finite precision can produce very unpleasant artifacts. Even gaps!)

To recapitulate briefly: yes, you have to update some of the VBOs that contain heights (not the base grid) whenever you cross the border of the block.
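The border-crossing test with a threshold might look like the sketch below, shown for one axis. The `margin` parameter and the function name are hypothetical; the thread only mentions "some threshold". Working with integer block indices is one way to keep floating point error out of the decision.

```python
import math


def update_center(center, pos, block_size, margin=0.0):
    """Return the central block index along one axis after the viewer moved to `pos`.

    center: current central block index (integer)
    margin: distance the viewer must travel past a border before the centre
            changes (hypothetical hysteresis parameter)
    """
    lo = center * block_size - margin
    hi = (center + 1) * block_size + margin
    if lo <= pos < hi:
        return center                     # inside the block or within the margin: no update
    return math.floor(pos / block_size)   # crossed for real: recentre on the viewer's block
```

With `block_size=100` and `margin=5`, a viewer oscillating between 97 and 103 never triggers an update, whereas without the margin every crossing of 100 would.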

If you are going north, for example, only blocks marked with R and D have to be changed. R blocks should reload heights, and D blocks just downsample data (which is much faster operation, because it does not require any recalculation, just partial data copying).

R R R R R R R R R R R
2 2 2 2 2 2 2 2 2 2 2
2 2 R R R R R R R 2 2
2 2 1 1 1 1 1 1 1 2 2
2 2 1 1 R R R 1 1 2 2
2 2 1 1 0 0 0 1 1 2 2
2 2 1 1 0 0 0 1 1 2 2
2 2 1 1 D D D 1 1 2 2
2 2 1 1 1 1 1 1 1 2 2
2 2 D D D D D D D 2 2
2 2 2 2 2 2 2 2 2 2 2

Be aware that the blocks are of different resolutions, so the whole update, depending on the LOD scheme, costs less than a full update of the central area (3x3 blocks of highest resolution). It will be much clearer when you see the papers. This ASCII art is pretty hard for me. :o)
(Copy/paste previous blocks organization into notepad to see the regular structure.)
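The downsampling done for the D blocks can be as simple as keeping every second sample, which is plain copying with no recalculation. A minimal sketch, assuming a block is stored as a list of rows; the exact scheme used in the algorithm described here is not given in the thread.

```python
def downsample(heights):
    """Halve the resolution of a height block by keeping every second sample.

    `heights` is a list of rows. A block with 2**n + 1 samples per side
    yields 2**(n - 1) + 1 samples per side, so the outer edges are preserved.
    """
    return [row[::2] for row in heights[::2]]
```

For example, a 5x5 block becomes a 3x3 block covering the same area at half the resolution.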


Demolit wrote: "Also, to use regular block sizes, would I be right to say the camera is fixed in the center of the center block?"
NO! Definitely NOT!


Demolit wrote: "And out of curiosity, do you use GLSL in your algorithm to update heights?"
There are many versions of this algorithm. Currently I'm trying to move completely to GL 3.2 core functionality. But that version is not for our projects and products, because it is hard to sell or develop software for industry, military or surveying systems if it requires the latest GPUs and drivers. The most widely deployed version does not use GLSL. Not even VBOs. The blocks are implemented as display lists. That is why I asked what the target audience of your program is. If you are developing a game engine or something like that, use the new functionality!


Demolit wrote: "Because I'm using an out-of-core technique, does that mean all my vertex data is on the CPU, and I won't be taking advantage of the GPU? Can I move everything over?"

I really do not understand that question. You have to use the CPU to feed the GPU with data (to be more precise, the GPU's memory). When the data are in the GPU's memory, rendering runs at the highest rate. As far as I know, a GPU cannot load data directly from the hard disk by itself. :o)
After all, if you are using some procedural method to generate heights (and perhaps textures), you do not need to implement an out-of-core algorithm. What data do you have to load from the hard disk? Obviously, we do not understand each other.

Demolit
08-10-2009, 12:25 AM
Eek sry, my reply didn't go through...

So..lets say I have 4 levels, finest detail goes into a square size array of data, the other 3 contain a sort of 'range' of vertices surrounding the last level.
I guess what I don't understand from this is, how does the GPU affect these values? I can't 'pass' a VBO into the shader.

Aleksandar
08-10-2009, 04:11 AM
I don't understand you. Can you reformulate the question?

The GPU has to sum the Z-values from the grid and the values stored in a different VBO, which serves as a height map. Everything is done in the vertex shader. But if your terrain is purely procedural, then you can also use geometry shaders to tessellate blocks.

Demolit
08-10-2009, 08:01 AM
Forgive me, I'm thinking of something else entirely...

So far everything's been very informative ;) Now if I can just find the time to code it all..

Gimme some time and I'll get back to you. I'd definitely want to see your papers :)

Aleksandar
08-10-2009, 12:43 PM
No problem! Just remind me (when the time comes) what I've promised! ;)

Demolit
08-11-2009, 09:00 PM
There's still one thing I could never wrap my head around, this is more a coding problem than a theoretical one.

Let's say I have an array of height values, generated procedurally. Naturally these will be stored on the CPU, unless someone tells me I can actually generate them using shaders and store them there.
So as I've said before, these are now in VBOs. When it comes to the vertex shader, how do I update the height values, without actually passing the entire array of heights to the shader? The only thing I can come up with is to store all these into a texture, then load it using GLSL.

scratt
08-11-2009, 09:54 PM
I am not quite sure what you are asking, but you can update portions of VBOs with glBufferSubData, much like glTexSubImagexx. Is that what you need / mean?
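As an illustration of a partial update, the sketch below computes the byte offset and size for replacing whole rows of a height buffer. It assumes 32-bit floats in row-major order, and both helper names are made up. The resulting values are what would be passed as the offset, size and data arguments of glBufferSubData; the GL call itself is left out because it needs a live context.

```python
import struct

FLOAT_SIZE = struct.calcsize("f")  # bytes per 32-bit float height


def row_update_range(first_row, row_count, cols):
    """Byte offset and size covering whole rows of a row-major float height buffer."""
    offset = first_row * cols * FLOAT_SIZE
    size = row_count * cols * FLOAT_SIZE
    return offset, size


def pack_rows(heights, first_row, row_count):
    """Pack the selected rows into bytes laid out exactly like the buffer."""
    flat = [h for row in heights[first_row:first_row + row_count] for h in row]
    return struct.pack(f"{len(flat)}f", *flat)
```

Updating one row of a 256-sample-wide block then transfers 1 KiB instead of the whole 256 KiB buffer.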

What you can and can't do to index VBOs and textures is generally dependent on what GL Version you are on. Accessing textures from vertex shaders can be slow on earlier GPUs, and likewise with VBOs and indexing.

And on later hardware you can do a lot more terrain generation on the GPU too. But that is on the most recent stuff.

So often solutions to these problems need to be couched in terms of your target HW.

Demolit
08-12-2009, 12:18 AM
You see, I was under the impression you can use vertex shaders to change height values of the terrain whenever the viewer moves. This assumes you already have an array of pre-generated vertices. That's all I'm curious about.

Just FYI, my card supports ogl v3.0.

scratt
08-12-2009, 05:19 AM
Do you mean something like this...
http://www.gamedev.net/reference/articles/article1936.asp

Demolit
08-12-2009, 10:03 AM
Funny, I was just reading the same article.
No not like that, the GPU is just used to assist calculation.

Not sure I can get much clearer, but I meant this:
Vertex Shader in GLSL:
// Illustration of the idea only; this is not valid GLSL as written.
uniform float heights[1024*1024];
void main()
{
    gl_Vertex.y = heights[gl_Vertex.x * 1024 + gl_Vertex.z];
}

Yes I know the max array size is 255. And it's probably a dynamic array. I'm not that experienced with GLSL, so am not too sure what it can do. Sure, I can store the VBOs in video memory, use glBufferSubData to edit VBO values using height values stored in N levels (large arrays), but what does that leave the GPU to do?

Aleksandar
08-12-2009, 10:38 AM
Sorry, but we still don't understand what you exactly want to do. :(

In your example, you just want to read a value from a matrix. That can be done more elegantly from a buffer. If you want to generate the height field on the GPU, then you could use some kind of noise() function. This is very commonly used to add extra detail to a height field. You probably have some terrain generation algorithm that can be adapted to run on the vertex unit (don't be confused: nowadays GPUs use a unified shader architecture, but while one is executing a vertex shader I consider it a vertex unit). You only have to be aware that no vertex knows about the others.
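The constraint that no vertex knows about the others means the height must be a pure function of position. The sketch below shows that property with simple value noise, written on the CPU for clarity. The hash constants and function names are arbitrary choices for illustration, and this is a swapped-in technique, not the diamond-square algorithm used earlier in the thread (which does depend on neighbouring samples).

```python
import math


def lattice_value(ix, iz, seed=0):
    """Pseudo-random value in [0, 1) for an integer lattice point (arbitrary hash)."""
    h = (ix * 374761393 + iz * 668265263 + seed * 1442695041) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    h ^= h >> 16
    return h / 2**32


def smoothstep(t):
    """Ease curve so the noise has no creases at lattice lines."""
    return t * t * (3.0 - 2.0 * t)


def height(x, z, seed=0):
    """Height at (x, z), computed from position alone, with no neighbour data."""
    ix, iz = math.floor(x), math.floor(z)
    tx, tz = smoothstep(x - ix), smoothstep(z - iz)
    a = lattice_value(ix, iz, seed)
    b = lattice_value(ix + 1, iz, seed)
    c = lattice_value(ix, iz + 1, seed)
    d = lattice_value(ix + 1, iz + 1, seed)
    top = a + (b - a) * tx
    bottom = c + (d - c) * tx
    return top + (bottom - top) * tz
```

Because every call is independent, the same function could be evaluated per vertex in any order, which is what a vertex shader requires.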

Might be a good idea to consider OpenCL or CUDA for complex and parallel terrain generation. I have some ideas but currently no time to try...

Also, I suggest you spend some time reading the Orange Book. Might be helpful. ;)

knackered
08-12-2009, 11:28 AM
use...a....texture.
sample it in your vertex shader.
like the paper says.
then, if and when Aleksandar releases details of something better than geoclipmapping, try and implement it. Until then, treat what Aleksandar says with a very big pinch of salt. It sounds like tommy rot to me.

Aleksandar
08-12-2009, 12:47 PM
knackered wrote: "Until then, treat what Aleksandar says with a very big pinch of salt. It sounds like tommy rot to me."
What did I say to deserve such a description? :(


knackered wrote: "use...a....texture.
sample it in your vertex shader."
Can you tell us how reading a texture is cheaper than reading an input attribute? When you store something in a VBO, the vertex shader reads it as an input attribute. A texture lookup, as far as I know, is far more expensive.

Nicolai de Haan
08-12-2009, 09:37 PM
Hmm, vertex texture fetch does work very well on newer architectures. The latency can often be hidden by ALU ops and because of the access pattern with these algorithms that are based on regular meshes, it achieves a very high cache-hit ratio. Being able to completely decouple the plane template/base mesh, from the heightmap data can help simplify the algorithm as well.

Demolit, if the clipmap approach is still unclear you can try reading about my variation on the clipmap technique for terrain rendering: http://www.gamedev.net/community/forums/topic.asp?topic_id=504549

Since I wrote the paper I've made experiments with using VBOs as input attributes instead of VTF, and I have not seen any evidence that VTF was slower than VBOs (NVIDIA 8800GTS). It complicated the algorithm somewhat, but in return it runs on older GL 2.0 hardware.

Demolit
08-13-2009, 12:42 AM
Woah getting heated...

I'll tell you what I've done thus far. Still using diamond-square, I've generated an array of vertices, normals and indices. All this into a VBO (still working on the different levels). Then, I've written all the vertices to a bitmap texture.
Now, whenever the viewer moves, I can pass the translation value to the vertex shader, add these to the vertex index, and grab the height using that index from the texture. No LOD yet, but sounds like a plan?
Just for clarification, when you say 'buffer', you mean an array stored on the CPU, right?

Demolit
08-13-2009, 03:40 AM
Though I haven't read it all, that looks like a great paper Nicolai :)
So eh..before I start implementing the clipmaps, I've read all the papers about them, including that section in your paper, but still not 100% sure. Is a clipmap level a square area with a square hole in the middle? Or without a hole? And I presume each level has half the resolution of the next level.

Brolingstanz
08-13-2009, 05:30 AM
Getting the hole right is the hardest part IMHO. You need to make room for a floating 'L'.

Nicolai de Haan
08-13-2009, 11:09 AM
Usually when people use the term "buffer" in a GL context they mean a chunk of memory (array) on the GPU. But the term is used loosely sometimes and a buffer can reside on the CPU or on the GPU (contrast Vertex Arrays and Vertex Buffer Object).

Each clipmap level is a rectangular array of texels. The texels may represent heights from a heightmap if the Clipmap is used for geometry, or they may represent colors if the Clipmap is used for texturing. Each level has exactly the same number of texels (width and height) but represents an increasingly(*) larger area of the original texture: level N+1 covers twice the extent of level N along each axis. A clipmap is a clipped mipmap, so if you're unsure of the terminology you can read about Mipmaps.
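A small sketch of the level arithmetic described above, counting levels upwards from 0 as the finest. The function names and the `spacing` parameter (world units per texel at level 0) are assumptions for illustration.

```python
def level_spacing(level, spacing=1.0):
    """World-space distance between neighbouring texels at a given level."""
    return spacing * 2 ** level


def level_extent(level, texels, spacing=1.0):
    """World-space side length covered by one level.

    Every level stores the same texels x texels samples, so the covered
    side length doubles with each coarser level.
    """
    return texels * level_spacing(level, spacing)
```

With 256 texels per side, level 0 covers 256 units, level 1 covers 512, and level 3 covers 2048, all using the same amount of memory per level.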

So there's not really a square hole in the clipmap stack. But when a clipmap is used for geometry purposes, you (probably, in most cases) do not want to render the same geometry multiple times. When the clip center is set such that each clipmap level is stacked on top of each other, you will be covering the same area multiple times. That is why a square hole (the size of the next/previous level depending on draw order) is "cut" into the mesh.
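The cut-out can be pictured as a boolean mask over one level's grid. This sketch assumes the next finer level is exactly centered and covers the middle half of the grid along each axis; in practice the hole shifts by a cell as the viewer moves, which is the tricky part mentioned above. The function name is illustrative.

```python
def ring_mask(n):
    """n x n grid of booleans for one level: True where this level is drawn.

    The False region in the middle (n/2 x n/2 cells) is the hole that the
    next finer level fills. Assumes n is a multiple of 4 and a centred hole.
    """
    lo, hi = n // 4, n - n // 4
    return [[not (lo <= r < hi and lo <= c < hi) for c in range(n)]
            for r in range(n)]
```

For `n = 8` this leaves a 4x4 hole, so 48 of the 64 cells are drawn by this level; the finest level would be drawn without a hole.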

As just noted, getting everything to fit nicely can be a bit tricky. If you want to work on this, my advice would be to write the clipmap algorithm for texturing first, then adapt it to geometry once you have all the details correct.

* If you count from level 0 upwards, as in the traditional Mipmap terminology (note that the papers by Hoppe et al. count downwards).