Rendering many trees

Clb is right: you must measure as much as you can, and no, you don't really need training for that.
You will simply want to check the methods with the highest profile (the ones at the top of the list) and concentrate on optimizing them.

You can also evaluate the time spent between statements more directly with very simple timing functions.
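For example, a minimal timer built on std::chrono is enough for this (a sketch; elapsedMs is just a made-up helper name):

[code]
#include <chrono>
#include <cstdio>

// Wall-clock milliseconds since the first call.
static double elapsedMs()
{
    using namespace std::chrono;
    static const auto start = steady_clock::now();
    return duration<double, std::milli>(steady_clock::now() - start).count();
}

int main()
{
    double t0 = elapsedMs();
    // ... code under test, e.g. building the tree vertex buffers ...
    std::printf("section took %.3f ms\n", elapsedMs() - t0);
}
[/code]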

Just google for C++ profiler/profiling tools.

Benchmarking your code is crucial for evaluating whether you have enough budget for a specific job before writing it.
It also helps a lot to get extra clues about what might be wrong.
I have used the GNU profiler before (I use g++), but usually the data seems useless, as it tends to list functions that just don't seem to be related to the problem at hand. I do realize that it takes some skill to dig out the meaningful data.

I'm not sure if this is related, but I have also found that if I have, say, just one mesh that represents e.g. a hedge, consisting of maybe twenty crossing planes with an alpha-tested semi-transparent texture (opaque and transparent parts alternating at a small scale), then a close-up view makes the rendering crawl. Why would this happen? I mean, I understand that a large alpha-blended polygon would have a large impact, but why alpha-tested?

[quote]but usually the data seems useless[/quote]
Sometimes it is hard to figure out the cause of your stalls from the profiler alone, but if a module function comes out on top at your app's run time, at least you know your app is using it intensively. You should also use timing functions directly in your code to locate bottlenecks.

For your second question, it often depends on your transparency approach. If you use raw blending, you need a clever primitive ordering to get the most out of it, but sometimes screen-door transparency is more appropriate (no back-to-front ordering needed).
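For raw blending, the classic OpenGL setup is roughly this (a sketch; the actual draw calls are elided):

[code]
// Draw opaque geometry first, then transparent primitives back to front.
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
glDepthMask(GL_FALSE);   // keep depth testing, but stop depth writes
// ... draw the sorted transparent geometry here ...
glDepthMask(GL_TRUE);
glDisable(GL_BLEND);
[/code]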

Check these:
http://www.gamedev.net/topic/599103-issues-with-blending-transparency/

http://www.opengl.org/archives/resources/faq/technical/transparency.htm
I think I made some progress here. I reduced the rendering code temporarily to a simple form, where I was able to see that textures and vertex buffers are only bound once for the whole tree system and nothing is clearly done in excess. Still no improvement.

Then I realized that I was still drawing the trees to shadow maps. This explains why looking at the ground did not affect the framerate. I disabled this feature, and the framerate increased when looking away from the trees. I placed three times more trees, and facing the ground still yielded a good framerate. So I guess one could think that it is fillrate related.

I then tried sorting the trees again, checking that the sorting actually happens, but this did not improve the situation. It is strange that even when I went really close to a tree trunk so that it occluded the whole screen, there was no improvement. I would have guessed that sorting from front to back would have shown an improvement in this case at least.
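For clarity, the sort is essentially this (simplified; Tree, camPos and lengthSq are placeholder names, not my exact code):

[code]
#include <algorithm>

// Nearest trees first, so closer geometry can occlude the rest.
std::sort(trees.begin(), trees.end(),
    [&](const Tree& a, const Tree& b) {
        return lengthSq(a.pos - camPos) < lengthSq(b.pos - camPos);
    });
[/code]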
Do you suppose it would help significantly if the leaves were rendered to a low-resolution off-screen target? There was an article about this sort of thing in GPU Gems 3 for particle effects. The problem would be downsampling and obtaining the z-buffer. Does ArmA 2 do something like this? I remember that the trees looked a bit blurry.
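Roughly what I have in mind (a sketch; the FBO setup and all helper names are made up):

[code]
// Render the foliage at half resolution, then composite over the full-res frame.
glBindFramebuffer(GL_FRAMEBUFFER, halfResFbo);   // assumed created elsewhere at w/2 x h/2
glViewport(0, 0, w / 2, h / 2);
drawLeaves();                                    // the expensive foliage pass
glBindFramebuffer(GL_FRAMEBUFFER, 0);
glViewport(0, 0, w, h);
drawFullscreenQuad(halfResColorTex);             // upsample and blend over the main scene
[/code]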
You could try a depth-only z-prepass without texture reads or color writes. Or you could try occlusion culling, which might be very effective for culling away large numbers of distant trees.
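Something like this (a sketch; drawTrees is a placeholder):

[code]
// Pass 1: depth only - no color writes, simplest possible shader.
// For alpha-tested leaves this pass must still apply the alpha test,
// or the depth buffer won't match the second pass.
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
glDepthFunc(GL_LESS);
drawTrees();

// Pass 2: full shading; only the front-most fragments pass GL_LEQUAL,
// so the expensive shader runs roughly once per pixel.
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glDepthFunc(GL_LEQUAL);
glDepthMask(GL_FALSE);   // depth is already laid down, no need to rewrite it
drawTrees();
glDepthMask(GL_TRUE);
[/code]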
I did a quick tryout with the z-prepass. I need to enable alpha writing for the leaves for this. I got some strange flickering artifacts, so maybe I need to use some kind of depth offset. Anyway, there was no improvement. Rendering the leaves without any shaders or texturing did have a huge impact, though. So the shaders must be somewhat demanding, but the z-prepass doesn't seem to save it, or else I'm just doing it wrong.
Try alpha to coverage; it's really fast and could get you some extra fps.
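In OpenGL that is just the following, though it only does anything on a multisampled framebuffer (drawLeaves is a placeholder):

[code]
// The fragment's alpha is turned into a per-sample coverage mask,
// so no sorting is needed and edges get antialiased for free.
glEnable(GL_SAMPLE_ALPHA_TO_COVERAGE);
drawLeaves();
glDisable(GL_SAMPLE_ALPHA_TO_COVERAGE);
[/code]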
"There will be major features. none to be thought of yet"
The only other thing I can think of is impostors, which can look very good if done right - Oblivion (SpeedTree) faded the tree model out to a flat billboard as close as 50 m from the camera, and the treescapes always looked great (imo).
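The switch is usually just distance based, something like this (a sketch; every name here is made up):

[code]
const float impostorDist = 50.0f;   // distance at which the billboard takes over
for (const Tree& t : trees) {
    if (length(t.pos - camPos) < impostorDist)
        drawTreeMesh(t);            // full geometry up close
    else
        drawBillboard(t);           // camera-facing quad with a baked tree image
}
[/code]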

[quote]Try alpha to coverage; it's really fast and could get you some extra fps.[/quote]


What do you mean by using alpha? I use alpha testing, which I believe is cheap, but it still yields low framerates.

I also just realized that there is the "discard" function in fragment shaders. I think that this improved the situation slightly, but it's not doing wonders.
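For reference, the leaf fragment shader is basically doing this (a simplified sketch, not the exact code):

[code]
// Hypothetical leaf fragment shader (GLSL), stored as a C++ string.
const char* leafFragmentSrc = R"(
    uniform sampler2D leafTex;
    varying vec2 texCoord;
    void main()
    {
        vec4 c = texture2D(leafTex, texCoord);
        if (c.a < 0.5)
            discard;          // kill transparent fragments early
        gl_FragColor = c;
    }
)";
[/code]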


[quote]The only other thing I can think of is impostors, which can look very good if done right - Oblivion (SpeedTree) faded the tree model out to a flat billboard as close as 50 m from the camera, and the treescapes always looked great (imo).[/quote]


Perhaps this is one way, but I bet it is a fair amount of work to make it fast and still look good. It probably needs some fancy texture atlasing and whatnot.
Occlusion queries might help to reduce the overdraw, but I'm suspicious, as the meshes are quite low-poly and there are many of them.
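As I understand it, a per-tree (or per-cluster) query would look something like this (a sketch; drawBoundingBox and drawTreeCluster are made-up helpers, and real code would batch the queries across frames to avoid the readback stall):

[code]
GLuint query;
glGenQueries(1, &query);

// Draw cheap proxy geometry with the depth test on but no writes.
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
glDepthMask(GL_FALSE);
glBeginQuery(GL_SAMPLES_PASSED, query);
drawBoundingBox(cluster);
glEndQuery(GL_SAMPLES_PASSED);
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glDepthMask(GL_TRUE);

// Only shade the trees whose bounds produced visible samples.
GLuint samples = 0;
glGetQueryObjectuiv(query, GL_QUERY_RESULT, &samples);
if (samples > 0)
    drawTreeCluster(cluster);
glDeleteQueries(1, &query);
[/code]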

OK, so it is established that the low fps occurs when the trees are in the view frustum and fill most of the screen. Can one deduce from this that the program is limited by fillrate, or can there be some more complicated caching between vertex and pixel processing, so that the number of polygons could still matter?

How do games generally cope with the problem where you have, e.g., a hedge or a bush consisting of a few hundred overlapping alpha-tested polygons and the player goes very close to it? This really seems to crush the framerate even without alpha blending. Is the only way to improve the framerate to do a precise depth sort from front to back?

