What are good estimates of renderer (UE3, CryEngine1/2, ...) overhead?

Started by
14 comments, last by wolf 15 years, 11 months ago
Hello folks, do you have any ballpark estimates of the overhead of the renderer architecture in a typical state-of-the-art engine (such as the ones mentioned)? I'm thinking of shader-heavy scenes with many different materials and objects. Also, what level of overhead would be acceptable to compete with such a renderer? 20% more than the driver time? 50%? Thanks in advance, Christian
There is no way to answer your question. There is no single overhead figure for a renderer architecture, because every renderer is unique. It all depends on how you build it; you balance everything so that it runs the way you want. So there is nothing fixed.
It also heavily depends on the underlying target platform. If you target the Xbox 360 and PS3, at least the hardware is a constant; otherwise it is not.
Even if you remove all the variables of the underlying hardware, it still depends on what you render. Let's say you want to measure what it takes to render into a shadow map with double-speed depth writes. There is not much you can do differently, so it all comes down to how much geometry you render, how much foliage goes in there, how big the foliage is ...

So no there is no general answer possible ... it all depends on how long you want it to take :-).
Okay, so I'll try to bring in a few more details.

I wrote a little hobby renderer which, among other things, uses an abstraction layer on top of Cg.
This provides a convenient language for the upper layers to parameterize shaders, instantiates & caches shaders on the fly, makes API calls only when necessary, and is less than 1500 LoC. It combines a few tricks like copy-on-write, cached hashes and union-find to stay efficient. But still, I'm paying approx. 15-30% of the frame time for this convenience, depending on how complex & dynamic the scene is. So what I'd like to decide is whether I can roll with it and build a full-blown renderer on top of it, or whether that's just too much overhead to be acceptable. The thing is, I don't know if other (successful) engines are designed this way, or whether there's some kind of mantra to do this kind of tracking at the upper layers, where more information is available and the overhead can be mitigated.

Here are a few excerpts from a demo app (the "backend" doesn't know anything about the layout of shaders, parameters, passes, materials etc.)

// done by the shadow mapping framework
cg_value lookup = backend->create_instance("SingleMapLookup");
lookup["viewproj"] = viewproj;
lookup["map"] = tex;

// done by the variance shadow mapper
shadowing = backend->create_instance("VarianceShadowing");
shadowing["lookup"] = lookup;
shadowing["depthScale"] = 1.0f/target.range;

// done by some light sources
model = renderer->create_instance("PointLight");
model["shadowing"] = backend->create_instance("NullShadowing");
model["color"] = node->color();

// when position changes
model["position_w"] = node->global()[3];
if (shadows) {
	mapper->setup_point(node->global()[3],100);
	mapper->calculate_shadowing(node->sector());
	model["shadowing"] = mapper->get_shadowing();
}

// done by the render loop ...
...
backend["brightness"] = cBrightness;
...
vector<cg_value> sources;
sources.push_back(ambient.value());
foreach(spatial_node *cur, visibles) {
	...
	sources.push_back(light->model());
}
backend["lights"] = sources;
...
backend.render("ZPrepass",visibles);
backend.render("AmbientIllumination",visibles);
...


As you see, many distant parts of the engine are nicely decoupled by this use of a common parameter type, so it obviously has some merits...
In the end, I would do one or more graphics demos, in the style of Crysis (although not so foliage centric), running at not less than 20 fps on a modern PC (GF8800GT-ish).

[Edited by - pro_optimizer on May 16, 2008 6:13:01 PM]
Profile your code; you look heavily dependent on string compares (`lookup["map"]`), which can be non-cheap. But honestly, from my experience this should be more like 5%. Without seeing the code behind it, though, it's impossible to say.

"Those who would give up essential liberty to purchase a little temporary safety deserve neither liberty nor safety." --Benjamin Franklin

As mike2343 says, profile your code. What you also can do is remove all the speed bumps like STL (can't use it on consoles anyway) and BOOST and all the other things you do not want to use in game development. If you use C++ - it looks like it - reduce the number of virtual functions, use as much C as possible, and don't forget that C is the language you will use most to program your renderer and especially SPUs. And try to keep your code fast.
Most modern engines are dealing mostly with data management. Depending on your underlying platform and the number of cores, you will want to structure your C code following the data and not the other way around. So in essence, do not think about designing classes; think about what you want to do with the data and see how you can stream data to the numerous stream processors on your hardware platform (GPU, SPU or multi-core CPUs). This is easier than one would think. If you know how to program a GPU, you just build the same kind of model for some of the cores of a CPU: you define a data input structure, a data output structure, and you write small C functions that manipulate the data in-between. Then one of your cores works on this data ... I think you get the picture.
So the idea is to distribute the work that the renderer does over several processors.
When you program the GPU you will have to be careful not to re-build what the driver on the PC is already doing for you. Shadowing constant data might be a good thing to do. Managing render states so that you can cache them too. Other than this, I am not quite sure you want to do anything special here.
Because the GPU is the most powerful chip in your console or PC, you want to spend lots of time figuring out how to squeeze the last cycle out of it. On the C level, the challenge is to feed the GPU as fast as possible with data without involving shader switches, shader constant waterfalling, render state switches, texture switches and all the other terrible things that can happen here. Whatever you design on the C level is built in a way that lets you feed the GPU in the best possible way. So your design is driven by the way you balance shader switching and all this other stuff while rendering.
Quote:Original post by wolf
As mike2343 says, profile your code. What you also can do is remove all the speed bumps like STL (can't use it on consoles anyway) and BOOST and all the other things you do not want to use in game development. If you use C++ - it looks like it - reduce the number of virtual functions, use as much C as possible, and don't forget that C is the language you will use most to program your renderer and especially SPUs. And try to keep your code fast.

I must have stumbled into a time machine because I thought it was 2008 and not 1988.

Quote:Original post by wolf
As mike2343 says, profile your code. What you also can do is remove all the speed bumps like STL (can't use it on consoles anyway) and BOOST and all the other things you do not want to use in game development. If you use C++ - it looks like it - reduce the number of virtual functions, use as much C as possible, and don't forget that C is the language you will use most to program your renderer and especially SPUs. And try to keep your code fast.


Uh, what? I've used both STL and boost libraries on SPUs; there's nothing inherently slow about the code at all, and I doubt you could rewrite their algorithms to be faster until you vectorize them. Templates (or rather 'generic programming') in general reduce the use of virtual functions, increase type safety, and make it much easier to write tight, fast code. It sounds like you're just dismissing tools without actually understanding what's slow about them. I can write slow code in any language, with any library.

Plus, almost every studio nowadays uses C++ competently to write most of their code.
Quote:Original post by wolf
As mike2343 says, profile your code. What you also can do is remove all the speed bumps like STL (can't use it on consoles anyway) and BOOST and all the other things you do not want to use in game development. If you use C++ - it looks like it - reduce the number of virtual functions, use as much C as possible, and don't forget that C is the language you will use most to program your renderer and especially SPUs. And try to keep your code fast.


Is this really worth the extra development time you'll put into developing home-brewed containers if, say, your title is PC only*? OK, maybe yes, but consider just how much time and man-power you'll have to put into debugging a full-blown set of containers. More importantly, you'll lose most of the benefits of a newer object-oriented language - modularity, code reuse, encapsulation and all the other stuff the world has been shouting about for the last decade or two. You'd basically have to start all over again for the next title.

Hell, even John Carmack, who's known for his notorious use of C, moved to C++ when working on Doom 3 - and considering that the game's development started 4 years prior to its release, that must have been around 2000.

Let's bury C deep down in the jungle with a stake through its heart! It's really time to move on.

EDIT:
* : The above poster noted that it can also be used on consoles.
Quote:Original post by AnAss
Quote:Original post by wolf
As mike2343 says, profile your code. What you also can do is remove all the speed bumps like STL (can't use it on consoles anyway) and BOOST and all the other things you do not want to use in game development. If you use C++ - it looks like it - reduce the number of virtual functions, use as much C as possible, and don't forget that C is the language you will use most to program your renderer and especially SPUs. And try to keep your code fast.

I must have stumbled into a time machine because I thought it was 2008 and not 1988.


Much as I fear agreeing with AnAss might make me one I have to say that declaring the STL, Boost and C++ (by virtue of only addressing C) to be "all the other things you do not want to use in game development" is a level of madness I hear far too often in game coding circles.

Far too often it leads to the idiocy of writing every single thing yourself, only to find that what has been implemented is... the STL, or Boost::AutoPtr, but with none of the benefits of using the STL or Boost!

Sure, there are plenty of times you don't want to be using the STL or many other libraries with wild abandon, but they're a damned useful tool and nothing LESS.

Weird to still see these opinions going around. For reference, I have used the STL (or STLport when needed) on every platform I've got listed in my profile. It's a useful tool. When your profiling shows that it's the slow thing, you rework the code; but to date I've never found it to even show up in most of our profiling as a cause of poor performance, and when it HAS shown up in profiling, it's because the algorithm calling it all of the time is wrong.

Andy

"Ars longa, vita brevis, occasio praeceps, experimentum periculosum, iudicium difficile"

"Life is short, [the] craft long, opportunity fleeting, experiment treacherous, judgement difficult."

Quote:Original post by wolf
Because the GPU is the most powerful chip in your console or PC, you want to spend lots of time figuring out how to squeeze the last cycle out of it. On the C level, the challenge is to feed the GPU as fast as possible with data without involving shader switches, shader constant waterfalling, render state switches, texture switches and all the other terrible things that can happen here.


That's funny, I thought many more games were simply GPU bound than CPU bound. The GPU does pack a lot of power, but it's still often the bottleneck.

