Hi everyone,
I have been thinking about how I could speed up rendering. Can multi-threading speed up rendering?
What other techniques can I use to speed up rendering?
Multi-threaded Rendering
Your question is a bit generic...
Yes, multi-threading can speed up rendering, but perhaps not how you're expecting...
One of the most common ways of speeding up rendering is to reduce state changes and render calls... For both, batch up similar meshes and render them in fewer calls... This is especially useful with many small objects, which seem to be pretty popular lately.
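A rough sketch of the state-change half of this, assuming a made-up DrawItem struct whose ids stand in for whatever your renderer actually binds (none of these names come from a real API). It sorts the draw list so items sharing a shader and texture sit next to each other, and counts how many rebinds a given order would cost:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical draw item: the ids stand in for the state your renderer binds.
struct DrawItem {
    int shaderId;
    int textureId;
    int meshId;
};

// Sort so items sharing a shader, then a texture, end up adjacent.
void sortByState(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(), [](const DrawItem& a, const DrawItem& b) {
        if (a.shaderId != b.shaderId) return a.shaderId < b.shaderId;
        if (a.textureId != b.textureId) return a.textureId < b.textureId;
        return a.meshId < b.meshId;
    });
}

// Count how many shader or texture rebinds drawing in the current order needs.
std::size_t countStateChanges(const std::vector<DrawItem>& items) {
    std::size_t changes = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i == 0 || items[i].shaderId != items[i - 1].shaderId) ++changes;
        if (i == 0 || items[i].textureId != items[i - 1].textureId) ++changes;
    }
    return changes;
}
```

Calling countStateChanges before and after sortByState gives a quick feel for how much a sort would save on your own draw list. Cutting the number of draw calls themselves (merging meshes, instancing) is a separate step that this sketch doesn't cover.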
-Alamar
I'd like to know more about your program. What does it do? How does it perform now (frame rates)? Does it suffer from performance issues in certain areas of your code? How many draw calls are you making, and how many objects are you rendering?
I've never used multithreading; I've never had to. I might be able to give some advice on 'the other techniques', but you're going to have to provide more information.
Multi-threading can be used to speed up almost anything that's bottlenecked by computation, assuming you've got the extra CPU cores to run those extra threads.
Take note though, D3D9 is a single-threaded API; you should always make all of your D3D9 calls from a single thread only.
This doesn't mean that you can't write a threaded D3D9 renderer though -- it just means that the part of your renderer that is responsible for "submission" (calling D3D draw functions, setting states, etc) has to belong to a particular thread.
You can use threads to accelerate all the other responsibilities of a renderer -- e.g. traversing a scene to collect renderable objects, culling objects that aren't visible, sorting objects into an optimal order, determining which states will need to be set for each object, generating queues of commands for the "submission" thread to process, etc...
The only thing I worry about is loading resources while playing a game, which can lag your fps. You don't want to take more than 10 milliseconds copying data around. The way I solve this is to make my main loop my rendering thread. I have a list of things to render, and if I need to change something in that list (which requires D3D9 calls) I queue those changes up in a work package list. Then each frame I perform one work package, and if less than 5 milliseconds have passed I can do another.
Typical work packages include copying texture-, vertex- and index data to buffers.
Any other non-D3D9 processing can be done on another thread.
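The budget loop could look something like this. It's only a sketch: WorkPackage and runWorkPackages are invented names, and the packages here are plain callables rather than real buffer uploads:

```cpp
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical work package: any callable that performs one upload step.
using WorkPackage = std::function<void()>;

// Runs queued packages on the render thread until the time budget is spent.
// At least one package runs per call, so the queue always makes progress.
// Returns how many packages were executed.
std::size_t runWorkPackages(std::deque<WorkPackage>& pending,
                            std::chrono::microseconds budget) {
    using Clock = std::chrono::steady_clock;
    const Clock::time_point start = Clock::now();
    std::size_t executed = 0;
    while (!pending.empty()) {
        WorkPackage package = std::move(pending.front());
        pending.pop_front();
        package();
        ++executed;
        // Stop as soon as this frame's budget has been used up.
        if (Clock::now() - start >= budget) break;
    }
    return executed;
}
```

You would call it once per frame from the render loop, e.g. runWorkPackages(pending, std::chrono::milliseconds(5)). The budget is only checked between packages, so each package still has to be small enough on its own.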
Hi Medo337
I've tried this in XNA and quickly found that the following are good candidates for a non-UI thread:
- world management (asset loading etc)
- managing the render queue
- game mechanics (movement, collision, etc)
However, broadly speaking, the graphics object is tied to the UI thread. You can pass the graphics object to another thread but it cannot do any useful work there. You can't generate textures or prepare VBs or IBs on another thread since they cannot then be marshalled back to the UI thread where the rendering must take place. It follows that you cannot render on any thread other than the UI thread.
I don't think this restriction matters much anyway, because calls to the graphics object are by and large requests to queue a particular operation, not the operation itself (which tripped me up a few times when profiling). I perceive, perhaps wrongly, that the graphics rendering pipeline, seen from joe-average-programmer's point of view, is a queue onto which you push requests, which are then actioned in a Present() (or similar) call. The Present() call blocks your UI thread as it actions all the queued work, writing to the screen and doing heavy maths (sometimes on the GPU, sometimes on the CPU, depending on your hardware capabilities).
Hope that helps.
Phillip
Didn't anyone mention deferred rendering contexts in D3D11 yet?
They were built for exactly this purpose, although they don't speed up the actual rendering part. Instead, they make sure that you can safely build separate command buffers on different threads, which can later be executed in one go by the main device.
This will only help you if there's actually a bottleneck in your application caused by working with the rendering pipeline of course, so make sure you profile first.
I don't know which version of D3D you're using, but in the case that you're using D3D11 this might be a viable option.
@Gavin Williams: I notice that rendering is slow at the start of the program (the first 2-3 seconds), then the FPS climbs until it reaches 61. I also notice that the player moves more slowly at the start, even though I'm using frame-independent movement.
I'm using something similar to the following code:
void render()
{
    elapsedTime = timeGetTime() - LastTime;
    // Move
    x += speed * elapsedTime;
    LastTime = timeGetTime();
}
Don't worry about the first 2 to 3 seconds. Maybe it's just your average frame time building up, I don't know, but I think that's unimportant. If your rendering reaches 60 fps and sits there consistently, then that's all you need to worry about at this stage, I would say. Depending on your program, there could be a few things going on at start-up that result in shuddering or unstable frames. If you are really worried about it, you might have to profile your application using a profiling tool, or simply time your method calls and start getting some specific information about how long everything is taking. Personally, timing my function calls is the approach I take when my programs start getting bigger and I run into frame-rate issues. I have a clock class (using inter-op to access the high precision timer) which I can use to mark the start and stop of a function, and then spit out the durations to an onscreen display or a text file for later inspection.
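In C++ the same idea can be sketched with std::chrono rather than inter-op. This is not the clock class described above, just an illustration of the approach; ScopedTimer and the timings map are made-up names:

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Adds the time between construction and destruction to sink[label],
// in microseconds. The sink must outlive the timer.
class ScopedTimer {
public:
    ScopedTimer(std::map<std::string, long long>& sink, std::string label)
        : sink_(sink),
          label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto elapsed = std::chrono::steady_clock::now() - start_;
        sink_[label_] +=
            std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    }

private:
    std::map<std::string, long long>& sink_;
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};
```

Put a ScopedTimer at the top of a function (for example `ScopedTimer t(timings, "update");`) and the total lands in the map when the function returns. You can then print the map to the screen or a file every so often.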
Just in regard to the above code... does timeGetTime() retrieve fresh information from the clock, or does it just return an already retrieved value? I'm guessing that it fetches a fresh measurement, in which case you shouldn't call it twice per frame. Even though the time between your two calls to timeGetTime() might be trivial here, as your physics (x += speed * elapsedTime, etc.) gets more complicated, the gap between the two calls will become substantial. Whatever time passes between them is never counted in any frame's elapsedTime, so you start losing time, which results in incorrect physics. You should have something like this:
long timeNow = QueryClock();
elapsedTime = timeNow-timePrev;
timePrev = timeNow;
See the slight difference? You can even separate your physics from your timing. You can have an update clock stage in your main loop and then an update physics stage, which simply reads the elapsed time.
Edit: If you start recording your frame times and the times of your function calls, you'll start seeing abnormalities such as functions performing on distinct tiers or spikes in function times. These are to be expected for a number of reasons, often to do with the operating system and your hardware, but they are not a problem. The first frame or two may often result in a time spike. It could be a cache miss, or maybe even .NET or its GC (but I don't know much about the GC and .NET mechanics). Generally I would say that these things can be ignored, especially if your program settles into a regular 60fps after just a few seconds.
A question to ask about your program is: does the character move correctly given the particular frame times? You can get those numbers (distance, time, speed) and work that out. That way at least you can make sure your timing and physics are correct.
@Gavin Williams: I figured out that the problem was with updating 'LastTime'. I had to set it with LastTime = timeNow; instead of LastTime = timeGetTime();
Now it's running smoothly. I want to create two methods, one for rendering and one for updating the scene, but I'm concerned that it will affect the FPS negatively, since I will have to go through the models array twice:
Let's say entity.size() = 4000
void update()
{
    for(UINT i = 0; i < entity.size(); i++)
    {
        entity[i]->update(dt);
    }
}
void render()
{
    for(UINT i = 0; i < entity.size(); i++)
    {
        entity[i]->render();
    }
}
void updateAndRender()
{
    for(UINT i = 0; i < entity.size(); i++)
    {
        entity[i]->update(dt);
        entity[i]->render();
    }
}
I think calling updateAndRender(); is faster than calling update(); render(); since you only go through the entities once instead of twice.
However, I want to keep the update method separate. Is there a way to do that without affecting the FPS?