Terrain (advanced)

Graphics and GPU Programming

Started by gnmgrl July 17, 2012 01:50 PM

5 comments, last by L. Spiro 11 years, 9 months ago

699

Author

July 17, 2012 01:50 PM

Hello guys,
I'm still working on my terrain engine. What I am doing so far is:
Loading a heightmap and a shadowmap, then creating chunks when they come into vision range, passing them their parts of the two maps, and deleting them when they go out of range.

The problems I've got now are that
a) When chunks are loaded/unloaded, my screen freezes for a short time (~0.1 sec), because I'm calling new and delete on about 100 or more 65*65 arrays. I need to get rid of this, so I thought about multithreading. That appeared to be pretty nasty to implement, but if you say that it's the only/best way, I'll go for that.

b) Although I already split everything into chunks and load only what is necessary, it still feels like I'm using tons of memory. Is there a way not to hold the whole heightmap in memory (dynamically reading only the parts I need from the file or something)?

c) The FPS is around 45 at the moment and I'm only rendering terrain right now; that's much too slow. How can I get it faster? I haven't implemented LOD yet; will this boost the FPS significantly? I want to have a big vision range in the end.

Thanks for your help!

Hornsj3

195

July 17, 2012 02:50 PM

If you are using C++ you could use a memory pool. This would cut down on your allocation and deallocation on the heap.

Basically during program initialization you could allocate an amount of memory off the heap and store it in a pointer. Then, you can use the placement new (look this up) syntax during runtime to store your data in this pre-allocated area of memory.

Because heap allocations are expensive at runtime and can throw exceptions, this solution would increase your performance and make your code more exception safe.

You must look up how to use placement new because there are tricks to it, such as the need to explicitly call the destructor.

edit -

You don't have to store all of the tiles in memory either; you can place and destroy them at will.
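For anyone reading along, here is a minimal sketch of the placement new idea described above. The Chunk struct, its live counter, and placement_new_demo are hypothetical stand-ins made up for illustration, not the original poster's class. It assumes a type with ordinary alignment requirements.

```cpp
#include <cassert>
#include <cstddef>
#include <new>  // placement new, ::operator new / ::operator delete

// Hypothetical stand-in for the thread's chunk class.
struct Chunk {
    static int live;  // number of currently constructed Chunk objects
    int x, z;
    Chunk(int x_, int z_) : x(x_), z(z_) { ++live; }
    ~Chunk() { --live; }
};
int Chunk::live = 0;

// Constructs one Chunk in pre-allocated storage, destroys it again, and
// returns how many Chunk objects were alive while it existed.
int placement_new_demo() {
    // Reserve raw storage once; no Chunk constructor runs here.
    void* storage = ::operator new(sizeof(Chunk));

    // Placement new: construct a Chunk inside the storage we already own.
    Chunk* c = new (storage) Chunk(3, 7);
    int alive = Chunk::live;

    // There is no "placement delete" expression to pair with this, so the
    // destructor is called by hand. The storage itself stays reserved.
    c->~Chunk();

    // Release the storage only when it is no longer needed at all.
    ::operator delete(storage);
    return alive;
}
```

In a real pool the storage would be reserved at startup and released at shutdown, with only the placement new and the explicit destructor call happening at runtime.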

gnmgrl

699

Author

July 18, 2012 08:19 AM

So what I should do looks like the following?



// Program Start:
chunk *chunkList;
chunkList = new chunk[max_loaded_chunks];

// To allocate a new one
chunkList = new (chunkList) chunk(d3d11Dev, ...);

// To delete one
chunkList->~chunk();
delete (chunkList) chunkList;

Something like that?

Hornsj3

195

July 18, 2012 11:34 AM

That looks about right, yes.

Edit -

I spoke a bit too soon. You don't want to call delete on a single item. The whole idea is to have that memory reserved for your program throughout its lifetime. Just call the destructor on the object you want to get rid of.

You only have to call delete on the buffer when you are done with the entire thing.

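//To get rid of one item : pseudo code
(chunkList + someInteger)->~chunk(); // e.g. chunkList + 3 effectively puts you at chunkList[3]

//to get rid of the pool : pseudo code
1. for each constructed object in pool
2. call destructor
3. end of loop
4. free the buffer: operator delete(pool) for a raw buffer. (delete [] pool on a buffer that came from new chunk[n] runs every element's destructor itself, so don't also call them by hand in that case.)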

Herb Sutter has a great example of this in Exceptional C++.

gnmgrl

699

Author

July 18, 2012 12:06 PM

That seems about right, I'm gonna test it as soon as I can.
Ty

Hornsj3

195

July 18, 2012 12:52 PM

Another optimization.

I can't remember the syntax exactly, but you should probably grab raw memory instead of using new chunk[max_chunks]. The reason is that new chunk[max_chunks] will invoke chunk's constructor max_chunks times. There's no reason to do this and it will slow you down.

I think the solution is this: chunk* variableName = static_cast<chunk*>(operator new(sizeof(chunk) * max_chunks)); Memory obtained this way is released with operator delete(variableName), not delete [].

The difference is when you say "operator new" and specify an amount of memory it will just reserve raw memory and therefore not construct your objects.

All other aforementioned rules apply.

edit -

When doing this please note you have to keep track of how many constructed objects you have and where they are in the buffer because as I mentioned those objects will not exist until placed. Using raw memory when you expect to have a constructed object would not have the effect you intend =D.
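A rough sketch of how the raw-memory pool and the bookkeeping could fit together is below. ChunkPool, its member names, and the simplified Chunk struct are all assumptions made for illustration; the first-free-slot search is just the simplest possible policy, and the raw buffer is assumed to be suitably aligned for Chunk (true for ordinary types, not for over-aligned ones).

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical, simplified stand-in for the thread's chunk class.
struct Chunk {
    int id;
    explicit Chunk(int id_) : id(id_) {}
};

// Fixed-capacity pool: raw memory is reserved once, objects are placed into
// it on demand, and a flag per slot records which slots hold a live object.
class ChunkPool {
public:
    explicit ChunkPool(std::size_t capacity)
        : live_(capacity, false),
          slots_(static_cast<Chunk*>(::operator new(sizeof(Chunk) * capacity))) {}

    ~ChunkPool() {
        // Destroy only the slots that were actually constructed...
        for (std::size_t i = 0; i < live_.size(); ++i)
            if (live_[i]) slots_[i].~Chunk();
        // ...then release the raw buffer (operator delete, not delete[]).
        ::operator delete(slots_);
    }

    ChunkPool(const ChunkPool&) = delete;
    ChunkPool& operator=(const ChunkPool&) = delete;

    // Constructs a Chunk in the first free slot; returns nullptr when full.
    Chunk* create(int id) {
        for (std::size_t i = 0; i < live_.size(); ++i) {
            if (!live_[i]) {
                Chunk* c = new (slots_ + i) Chunk(id);
                live_[i] = true;  // mark only after construction succeeded
                return c;
            }
        }
        return nullptr;
    }

    // Runs the destructor and marks the slot as reusable; memory is kept.
    void destroy(Chunk* c) {
        std::size_t i = static_cast<std::size_t>(c - slots_);
        c->~Chunk();
        live_[i] = false;
    }

    std::size_t live_count() const {
        std::size_t n = 0;
        for (std::size_t i = 0; i < live_.size(); ++i)
            if (live_[i]) ++n;
        return n;
    }

private:
    std::vector<bool> live_;  // which slots currently hold a constructed Chunk
    Chunk* slots_;            // raw storage for `capacity` Chunk objects
};
```

Creating and destroying chunks through such a pool touches the general-purpose heap only at startup and shutdown, which is the point of the suggestion above.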

L. Spiro

25,818

July 18, 2012 01:35 PM

If you plan on having a large visual range, you are basically stuck with 2 options:
#1: Geo Clipmaps
#2: Geo Mipmaps

Geo Clipmaps provides the best performance but Geo Mipmaps is easier to implement. Plenty of information is available online about these systems.

My own engine is intended to be easy to use by a large audience, so my memory managers have to take a performance hit in order to serve all usage cases.
On the other hand, our work engine memory manager is about 6 times faster than my personal memory manager because it is used only in-house, and thus we can be more strict on how it is used.
Our in-house memory manager forces people to be more conscientious about how they allocate memory, and the blatant fact is that this is the way it should be. A huge amount of performance can be gained by simply planning your allocations better.

Your major hiccup here is apparently in your allocation system. You didn’t explain why you need to allocate in this manner, but generally a bit of planning with a mixture of allocation options helps to eliminate this type of overhead.

I will assume that you do not want to remake an entire allocation system that fully replaces malloc() and free(), so my following points are about alternative allocation methods that are easy to implement and yet heavily out-perform malloc() and free().
#1: Stack allocators. These have no ability to “realloc()”, so if you want to use these you have to know how much memory you need before you allocate. Luckily this is often easy to discover. Many routines can be rewritten as a 2-pass routine: 1 pass to determine how much memory to allocate and a 2nd pass to actually fill in the values for that memory. 2 passes may seem like overhead, but in practice it is usually over 50 times faster than using standard realloc() and malloc(), partially because the cost of free() is entirely eliminated. When objects created on a stack allocator disappear, their destructors are called but any memory they allocated on that same stack allocator is not freed.
To put this into practical-use perspective, this reduced model loading times in some situations from 13 seconds to 0 seconds in my own engine.
#2: Trashable heaps. These are the same as above except that only free() is optimized away. malloc() and realloc() are valid options and the heap is resizable (unlike with stack allocators), but all of the allocations made on that heap are freed in one call. Again, destructors of objects are called, but nothing is actually freed from the heap. After all the objects are destructed, the memory for the heap is cleared in an instant, and this again can save you literally seconds of free() calls in places where you called malloc() many times. Note that this can also apply to new and delete.
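As an illustration of #1, a stack allocator and the 2-pass pattern might look roughly like the sketch below. This is not the engine code described above; StackAllocator, copy_positive, and all member names are made up for the example. It stores only trivially destructible data, so no destructor handling is shown, and it assumes requested alignments are powers of two no larger than the alignment of the underlying buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Minimal stack ("bump") allocator: one buffer, a moving top offset,
// no per-allocation free. Everything is released at once with reset().
class StackAllocator {
public:
    explicit StackAllocator(std::size_t bytes)
        : base_(static_cast<unsigned char*>(::operator new(bytes))),
          size_(bytes), top_(0) {}
    ~StackAllocator() { ::operator delete(base_); }

    StackAllocator(const StackAllocator&) = delete;
    StackAllocator& operator=(const StackAllocator&) = delete;

    // Returns `bytes` of storage aligned to `align`, or nullptr when the
    // buffer is exhausted. `align` must be a power of two.
    void* allocate(std::size_t bytes,
                   std::size_t align = alignof(std::max_align_t)) {
        std::size_t start = (top_ + align - 1) & ~(align - 1);
        if (start > size_ || bytes > size_ - start) return nullptr;
        top_ = start + bytes;
        return base_ + start;
    }

    // "Frees" every allocation in one step by rewinding the top offset.
    void reset() { top_ = 0; }

    std::size_t used() const { return top_; }

private:
    unsigned char* base_;
    std::size_t size_;
    std::size_t top_;
};

// 2-pass routine: pass 1 counts the positive values, pass 2 copies them
// into exactly-sized storage. Returns the number of values stored.
std::size_t copy_positive(StackAllocator& arena, const float* src,
                          std::size_t n, float** out) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)          // pass 1: measure
        if (src[i] > 0.0f) ++count;

    float* dst = static_cast<float*>(
        arena.allocate(count * sizeof(float), alignof(float)));
    if (dst == nullptr) { *out = nullptr; return 0; }

    std::size_t k = 0;
    for (std::size_t i = 0; i < n; ++i)          // pass 2: fill
        if (src[i] > 0.0f) dst[k++] = src[i];

    *out = dst;
    return count;
}
```

The reset() call is what replaces all the individual free() calls: the caller rewinds the allocator once when the whole batch of data is no longer needed.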

With more allocation systems in place you have more options when allocating, and as my office engine proves, this can be a significant asset in your allocation speed even if you are otherwise just using standard malloc() and free(). Used with proper planning and care, each of these options can gain you huge savings in performance.

L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

This topic is closed to new replies.
