Renderer - Is this a good design?

Started by
6 comments, last by Harry Hunt 16 years, 2 months ago
Hello, I am planning to make my engine multithreaded, and I came up with the following idea for structuring the renderer/scene.

First, I have a class for the scene which holds a list of all objects in the scene. In one thread this list is traversed, and after an object is "updated" it is pushed into a queue of renderable objects (in the renderer class). In a second thread, the renderer just goes through the queue of renderables and renders the objects. When the scene has finished updating all its objects, it sends a signal to the renderer to swap the buffers. Then the scene waits until the renderer has done its job and starts again with updating the scene objects.

So would this be a good design, or are there maybe some traps to fall into when using this approach? Thanks in advance.
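The two-thread handoff described above could be sketched with a mutex-guarded queue and a condition variable standing in for the "scene finished, swap buffers" signal. All the names here are hypothetical, and real update/draw work is elided:

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct Renderable { int id; };   // stand-in for an updated scene object

std::queue<Renderable> render_queue;
std::mutex queue_mutex;
std::condition_variable queue_cv;
bool frame_done = false;         // set by the scene thread after the last object

// Scene thread: update each object, then push it for the renderer.
void update_scene(const std::vector<Renderable>& objects) {
    for (const Renderable& obj : objects) {
        // ... update obj ...
        std::lock_guard<std::mutex> lock(queue_mutex);
        render_queue.push(obj);
        queue_cv.notify_one();
    }
    std::lock_guard<std::mutex> lock(queue_mutex);
    frame_done = true;           // the "swap buffers" signal
    queue_cv.notify_one();
}

// Renderer thread: drain the queue; return once the frame is complete.
int render_frame() {
    int rendered = 0;
    std::unique_lock<std::mutex> lock(queue_mutex);
    while (!frame_done || !render_queue.empty()) {
        queue_cv.wait(lock, [] { return frame_done || !render_queue.empty(); });
        while (!render_queue.empty()) {
            Renderable obj = render_queue.front();
            render_queue.pop();
            lock.unlock();
            // ... issue draw calls for obj ...
            ++rendered;
            lock.lock();
        }
    }
    return rendered;             // caller would now swap buffers
}
```

Note that the renderer blocks whenever the queue is momentarily empty, which is exactly the lockstep behavior the replies below warn about.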
This is very similar to how my renderer works, and so far I haven't had any trouble with it.

My approach is different, though, in that the render queue is really a ring buffer of render queues, and my engine will do all the CPU processing for up to 5 frames in advance. The reason for this is that most games are CPU-bound, so most of the time the renderer will wait for the scene and not the other way round.

Right now, the scene-processing system runs in its own thread. It visits each node in the scene graph, determines whether that node is visible (this system is backed by an octree for static geometry and a quadtree for terrain), and then processes the node according to its type (particle simulation, bone transforms, etc.). It does so by spawning a bunch of job objects which are completely isolated (meaning they operate on their own copy of the data) and which are then processed by a thread pool (if you have N cores, the thread pool will have N threads). Once they are processed, they are pushed onto the render queue for the frame they belong to. This generally happens while the renderer is busy with an entirely different frame, meaning the renderer doesn't have to wait for the scene processing to complete. Once the render queue for one frame is complete, the ring buffer moves on to the next frame.
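The frame ring described above might look something like this. This is a mutex-based sketch for clarity, not the engine's actual code; the scene thread blocks only when it gets the full 5 frames ahead, and the renderer blocks only when no complete frame is available:

```cpp
#include <array>
#include <condition_variable>
#include <mutex>
#include <vector>

constexpr size_t kFramesInFlight = 5;    // CPU may run this many frames ahead

struct FrameQueue { std::vector<int> renderables; };  // one frame's worth of work

class FrameRing {
    std::array<FrameQueue, kFramesInFlight> frames_;
    size_t write_frame_ = 0;   // frame the scene is currently filling
    size_t read_frame_  = 0;   // frame the renderer is currently drawing
    std::mutex m_;
    std::condition_variable cv_;
public:
    // Scene side: block only if the renderer is kFramesInFlight frames behind.
    FrameQueue& begin_frame() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return write_frame_ - read_frame_ < kFramesInFlight; });
        return frames_[write_frame_ % kFramesInFlight];
    }
    void end_frame() {
        std::lock_guard<std::mutex> lock(m_);
        ++write_frame_;          // publish the completed frame
        cv_.notify_all();
    }
    // Renderer side: block until a complete frame is available.
    FrameQueue& acquire_frame() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return read_frame_ < write_frame_; });
        return frames_[read_frame_ % kFramesInFlight];
    }
    void release_frame() {
        std::lock_guard<std::mutex> lock(m_);
        frames_[read_frame_ % kFramesInFlight].renderables.clear();
        ++read_frame_;           // free the slot for the scene thread
        cv_.notify_all();
    }
};
```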
The whole system is relatively complicated, but it has the big advantage that it scales with almost linear performance gains from a single-core to a quad-core machine, and it would probably also scale to 8 or more cores without any changes to the code. Also, there's almost no locking involved (both the ring buffer and the render queues are implemented in a lock-free fashion).
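A lock-free handoff like the one described can be built on a classic single-producer/single-consumer ring buffer using only atomics. This is a generic sketch of the technique, not the engine's actual implementation; it holds at most N-1 elements and is safe only with exactly one producer thread and one consumer thread:

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Lock-free single-producer/single-consumer queue of fixed capacity N-1.
template <typename T, size_t N>
class SpscQueue {
    std::array<T, N> buf_;
    std::atomic<size_t> head_{0};  // consumer index
    std::atomic<size_t> tail_{0};  // producer index
public:
    bool push(const T& v) {        // call from the producer thread only
        size_t t = tail_.load(std::memory_order_relaxed);
        size_t next = (t + 1) % N;
        if (next == head_.load(std::memory_order_acquire))
            return false;          // full
        buf_[t] = v;
        tail_.store(next, std::memory_order_release);  // publish the element
        return true;
    }
    bool pop(T& out) {             // call from the consumer thread only
        size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire))
            return false;          // empty
        out = buf_[h];
        head_.store((h + 1) % N, std::memory_order_release);  // free the slot
        return true;
    }
};
```

The acquire/release pairing is what makes this work without a mutex: the consumer's acquire-load of `tail_` synchronizes with the producer's release-store, so the element write is visible before the index moves.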

The problem I see with your approach is that the threads will spend a lot of time waiting for each other, and that your code probably won't scale too well. Of course, you'd need to do some profiling to see whether that's really the case.
Quote:Original post by madRenEGadE
Then the scene waits until the renderer has done its job and starts again with updating the scene objects.

Then what's the point of making it multithreaded?

To be honest, if you don't know how to use threads to speed things up then using them is more likely to slow things down; so stick with a single thread.
The point of making it multithreaded is that the renderer can render something BEFORE ALL objects in the scene have been updated...
Here is my renderer's header file:


class renderer
{
private:
	SDL_Surface * main_screen;
	material_db * materials;

public:
	int x, y, b, f;
	fog_control * fog;
	light_control * lighting;
	background_control * skybox;

	renderer();
	~renderer();

	int initialize(int X, int Y, int B, int F);
	bool initialize();
	void new_frame();
	void flip_buffers();

	void render_static_mesh(static_mesh * m, g_vector t);
	void render_md2_mesh(MD2_manager * m, g_vector t);

	void set_material_db(material_db * m) { materials = m; }

	// Take vector pos and matrix rot and set the OpenGL view matrix.
	void set_transform(const float pos[3], const float rot[12]);
	// Transform the OpenGL view matrix by rot followed by pos.
	void apply_transform(const float pos[3], const float rot[12]);
	// Get the OpenGL view matrix in a format similar to that used by ODE.
	void get_transform(float *pos, float *rot);
	// Apply that camera to the OpenGL view matrix.
	void apply_camera(camera * c);
};


The renderer is able to deal with models at an abstract level, and each system goes through its list of entities and calls either render_static_mesh() or render_md2_mesh(). It is, therefore, unaware of anything at a higher level than these two kinds of mesh. I find this is a good way to stay in tune with the single-responsibility principle. My main loop sends the signal to swap the buffers.
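The dispatch described above, where each subsystem walks its own entity list and calls the matching mesh-level render method, might look roughly like this. The types here are hypothetical stubs standing in for the real static_mesh/MD2 classes, with a draw-call counter added purely for illustration:

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-ins for the engine's real types.
struct g_vector { float x, y, z; };
struct static_mesh {};
struct MD2_manager {};

struct renderer {
    int draw_calls = 0;  // instrumentation for this sketch only
    void render_static_mesh(static_mesh*, g_vector) { ++draw_calls; }
    void render_md2_mesh(MD2_manager*, g_vector) { ++draw_calls; }
};

// A subsystem owning static geometry; an animated-model system would look
// the same but call render_md2_mesh instead.
struct static_geometry_system {
    std::vector<std::pair<static_mesh*, g_vector>> entities;
    void draw(renderer& r) {
        for (auto& e : entities)
            r.render_static_mesh(e.first, e.second);  // renderer stays mesh-level
    }
};
```

The renderer never sees "entities" or "systems", only meshes and positions, which is what keeps the responsibilities separated.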
Don't thank me, thank the moon's gravitation pull! Post in My Journal and help me to not procrastinate!
Quote:Original post by madRenEGadE
The point in making it multithreaded is because the renderer can render something BEFORE ALL objects in the scene are updated...


So you're feeding objects to the renderer one at a time. This means the renderer is potentially waiting on your scene processor to feed it something. That means you have two threads potentially waiting on each other. That is your pitfall. Harry's approach sounds ideal.

GDNet+. It's only $5 a month. You know you want it.

I have done similar things, but problem #1 is that Microsoft and threads are really a joke, because all multithreading does on most people's 32-bit PCs is process one thread, then the other, over and over again; it's not true multitasking. #2: that aside, I have had quite a lot of unexpected results due to the way the OS handles the threads, so it is hard to get the same results across multiple users' machines. #3: there are a lot of times when multiple threads will end up costing you more than you will gain speed-wise.

But if you can make it past all that, then great; I will be interested to see how it turns out.
- Owen.Hindman();
Quote:Original post by hindmano
I have done similar things, but problem #1 is that Microsoft and threads are really a joke, because all multithreading does on most people's 32-bit PCs is process one thread, then the other, over and over again; it's not true multitasking. #2: that aside, I have had quite a lot of unexpected results due to the way the OS handles the threads, so it is hard to get the same results across multiple users' machines. #3: there are a lot of times when multiple threads will end up costing you more than you will gain speed-wise.

But if you can make it past all that, then great; I will be interested to see how it turns out.


o rly?

First off, if the OS only has one core/CPU/hardware thread to work with, what else would it do? You can't run threads in parallel if the underlying hardware doesn't support it. This isn't a Microsoft problem at all, and it especially doesn't have anything to do with whether your PC is 32- or 64-bit.

Secondly, the "unexpected results" you've been getting aren't due to the OS either. They're called "race conditions", and they mean you have a locking problem.

Your last point does make sense, though: a well-written single-threaded engine will outperform a multithreaded engine when the hardware it's running on only supports one hardware thread. That's of course due to the context-switching overhead. A thread-pool type of approach (where you spawn as many threads as you have hardware threads available), however, will help address this problem.

