supagu

multithreading software renderer


Currently I have a software renderer that is single-threaded. I want to make it multithreaded, so I whipped up some test code that draws a single triangle covering half the screen. I divide the triangle into as many blocks as I have threads, and each thread renders one block of the triangle. This gives me a more than 50% improvement. However, I'm not sure how this would work in the real world, especially if there are lots of small triangles. I guess I could dynamically adjust the number of divisions based on the triangle's size, but then I would have idle threads doing nothing. I've read that Larrabee puts triangles into bins based on screen location so it can draw multiple triangles at once. I'm not sure whether I should try this or some other method. Ideas, thoughts?
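For reference, the binning idea mentioned above can be sketched roughly as follows. This is only an illustration of sorting triangles into screen tiles by bounding box, not Larrabee's actual implementation; `Tri`, `binTriangles` and the tile size are made-up names and parameters.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical screen-space triangle: three vertices in pixel coordinates.
struct Tri { float x[3]; float y[3]; };

// Returns one list of triangle indices per tile, tiles in row-major order.
// A triangle goes into every tile its bounding box overlaps. This is
// conservative: a tile may receive a triangle that does not actually cover it.
std::vector<std::vector<std::size_t>> binTriangles(
    const std::vector<Tri>& tris, int width, int height, int tileSize)
{
    assert(width > 0 && height > 0 && tileSize > 0);
    const int tilesX = (width + tileSize - 1) / tileSize;
    const int tilesY = (height + tileSize - 1) / tileSize;
    std::vector<std::vector<std::size_t>> bins(
        static_cast<std::size_t>(tilesX) * static_cast<std::size_t>(tilesY));

    for (std::size_t i = 0; i < tris.size(); ++i) {
        const Tri& t = tris[i];
        float minX = std::min({t.x[0], t.x[1], t.x[2]});
        float maxX = std::max({t.x[0], t.x[1], t.x[2]});
        float minY = std::min({t.y[0], t.y[1], t.y[2]});
        float maxY = std::max({t.y[0], t.y[1], t.y[2]});

        // Skip triangles whose bounding box is entirely off screen.
        if (maxX < 0.0f || maxY < 0.0f || minX >= width || minY >= height)
            continue;

        // Clamp the box to the screen, then convert it to tile coordinates.
        const int tx0 = static_cast<int>(std::max(minX, 0.0f)) / tileSize;
        const int ty0 = static_cast<int>(std::max(minY, 0.0f)) / tileSize;
        const int tx1 = static_cast<int>(std::min(maxX, width - 1.0f)) / tileSize;
        const int ty1 = static_cast<int>(std::min(maxY, height - 1.0f)) / tileSize;

        for (int ty = ty0; ty <= ty1; ++ty)
            for (int tx = tx0; tx <= tx1; ++tx)
                bins[static_cast<std::size_t>(ty) * tilesX + tx].push_back(i);
    }
    return bins;
}
```

Once every triangle is binned, each thread can take whole tiles, so many small triangles in different parts of the screen get drawn at the same time and no two threads write the same pixels.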

I've implemented dual-CPU software rendering before.

I did it by splitting the screen down the middle and having two clipped scenes.
Then I rendered half the screen on the first CPU and half on the other.

I did get framerate improvements, but it still wasn't as fast as rendering on the GPU, so I abandoned the project.

It was the first time I had ever handled threads. The approach you've proposed is something I could probably attempt now, but back then I probably couldn't, so the way I did it was probably the easier one in terms of thread programming.

But I think what you've described should work too.

[EDIT] it gives you a gnarly clipping line down the centre of your screen :) [/EDIT]

(I am in no way a software rendering pro, just someone who had the same idea during a school project.)

There can be multiple problems if you split the screen in half:
- Lots of small triangles mean a lot of wasted clipping tests. (If a triangle isn't in the frustum, you end up testing it twice.)
- Lots of small triangles make the code hard to thread well (a lot of memory reads for small areas).
- Lots of big triangles give you clipping problems: you may have to clip them twice, possibly doubling the amount of output vertices.
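One possible way around the double-clipping problem, which nobody in this thread has described and which is only a sketch: leave the triangle unclipped and restrict each thread's pixel loop to its own screen region (a scissor rectangle). The code below uses edge functions for the inside test; `Vec2`, `Rect`, `edgeFn` and `fillTriangleInRegion` are hypothetical names, and there is no top-left fill rule, so pixels on an edge shared by two triangles may be drawn twice.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec2 { float x, y; };
struct Rect { int x0, y0, x1, y1; };  // half-open: [x0, x1) x [y0, y1)

// Tells which side of the edge a->b the point (px, py) lies on.
inline float edgeFn(const Vec2& a, const Vec2& b, float px, float py) {
    return (b.x - a.x) * (py - a.y) - (b.y - a.y) * (px - a.x);
}

// Fills the pixels of triangle (a, b, c) that lie inside `region` and returns
// how many were written. Assumes `region` lies inside the framebuffer.
int fillTriangleInRegion(const Vec2& a, const Vec2& b, const Vec2& c,
                         const Rect& region, int stride,
                         std::vector<std::uint32_t>& fb, std::uint32_t color)
{
    assert(stride > 0);
    // Intersect the triangle's bounding box with the region. No vertices are
    // generated, so the triangle is never geometrically clipped.
    const int minX = std::max(region.x0, static_cast<int>(std::floor(std::min({a.x, b.x, c.x}))));
    const int maxX = std::min(region.x1 - 1, static_cast<int>(std::ceil(std::max({a.x, b.x, c.x}))));
    const int minY = std::max(region.y0, static_cast<int>(std::floor(std::min({a.y, b.y, c.y}))));
    const int maxY = std::min(region.y1 - 1, static_cast<int>(std::ceil(std::max({a.y, b.y, c.y}))));

    int written = 0;
    for (int y = minY; y <= maxY; ++y) {
        for (int x = minX; x <= maxX; ++x) {
            const float px = x + 0.5f, py = y + 0.5f;  // sample at pixel centre
            const float w0 = edgeFn(b, c, px, py);
            const float w1 = edgeFn(c, a, px, py);
            const float w2 = edgeFn(a, b, px, py);
            // Accept either winding order.
            const bool inside = (w0 >= 0 && w1 >= 0 && w2 >= 0) ||
                                (w0 <= 0 && w1 <= 0 && w2 <= 0);
            if (!inside) continue;
            const std::size_t idx = static_cast<std::size_t>(y) * stride + x;
            assert(idx < fb.size());
            fb[idx] = color;
            ++written;
        }
    }
    return written;
}
```

Each thread would call this with the same triangle but its own half of the screen. The regions are disjoint, so no pixel is written by both threads, and a triangle outside a region costs only the bounding-box intersection.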

You could consider giving each thread its own render buffer and merging the buffers afterwards. You would divide the triangles between the two threads, have each thread render into its own buffer, then merge the buffers (which could also be done multithreaded). It would be almost lockless, with just two sync points: both threads have to finish rendering, then both threads have to finish merging. I don't know what the performance gain would be in that case; it was just something I came up with but never had the time to implement.
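A minimal sketch of the merge step, assuming each per-thread buffer stores a depth value next to every colour and that smaller depth means nearer. `RenderBuffer` and `mergeRange` are made-up names, and this only handles opaque geometry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical per-thread render target: colour plus depth per pixel.
struct RenderBuffer {
    std::vector<std::uint32_t> color;
    std::vector<float> depth;
    explicit RenderBuffer(std::size_t pixels)
        : color(pixels, 0),
          depth(pixels, std::numeric_limits<float>::infinity()) {}
};

// Merges src into dst over the pixel range [begin, end), keeping whichever
// fragment is nearer. Calls on disjoint ranges touch disjoint memory in dst,
// so different threads can merge different ranges without locking.
void mergeRange(RenderBuffer& dst, const RenderBuffer& src,
                std::size_t begin, std::size_t end)
{
    assert(dst.depth.size() == src.depth.size());
    assert(begin <= end && end <= dst.depth.size());
    for (std::size_t i = begin; i < end; ++i) {
        if (src.depth[i] < dst.depth[i]) {
            dst.depth[i] = src.depth[i];
            dst.color[i] = src.color[i];
        }
    }
}
```

With two threads, one would merge the top half of the pixel range and the other the bottom half, after both have finished rendering. The merge reads every pixel of every buffer each frame, so its cost grows with resolution and thread count.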

--edit--

also worth a read:
http://www.devmaster.net/forums/showthread.php?t=1884
could be used for implementing it with threads.

I've sort of thought about doing a multi-buffer thing where the buffers get merged, but I'm not sure whether it would cause issues with sorting/blending.
I suspect it would cause problems.

You would have to do the transparency single-threaded at the end. For the rest it wouldn't matter much, I think, unless you are going to do anti-aliasing.

Have you considered using threads for different purposes instead of multiple similar threads? For example, one thread could do the transformation, one could rasterize, one could do texture lookups, etc. You could also dynamically choose each thread's current job.

I tried something similar, but started to run into memory bandwidth bottlenecks and so suspended the project. Even so, if you could balance things correctly it could pay off.

I'm thinking of combining my method of splitting the poly into multiple divisions with the multi-buffering.

The problem is how to do this in a lock-less fashion, if possible.

Say I have 2 threads.

Then I have 2 render buffers, so I can render up to 2 triangles at once.
But each triangle gets chopped into multiple blocks.

So a thread gets to render part of a triangle to a certain buffer.

I need to work out how to decide which buffer to render to, i.e. some way to say that triangle index 3 goes to buffer 1 and triangle index 4 goes to buffer 0.

This needs to be decided when I start rendering the first block of a new poly, so as to automatically balance rendering between the 2 buffers.
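One lock-less possibility, sketched here as an assumption rather than a tested design: number every block of every triangle, let threads claim the next block from a shared atomic counter, and have each thread always render into its own buffer. The buffer choice then needs no decision per triangle, and the load balances itself because a faster thread simply claims more blocks. `BlockDispenser`, `BlockId` and `decode` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hands out block indices 0..count-1, each exactly once, without a lock.
class BlockDispenser {
public:
    explicit BlockDispenser(std::size_t count) : next_(0), count_(count) {}

    // Claims the next unrendered block. Returns false (and leaves `block`
    // untouched) once every block has been handed out.
    bool claim(std::size_t& block) {
        const std::size_t i = next_.fetch_add(1, std::memory_order_relaxed);
        if (i >= count_) return false;
        block = i;
        return true;
    }

private:
    std::atomic<std::size_t> next_;
    std::size_t count_;
};

// Which triangle a flat block index belongs to, and which part of it.
struct BlockId { std::size_t triangle; std::size_t part; };

// Assumes every triangle is split into the same number of parts.
inline BlockId decode(std::size_t block, std::size_t partsPerTriangle) {
    assert(partsPerTriangle > 0);
    return BlockId{block / partsPerTriangle, block % partsPerTriangle};
}
```

Each worker would loop on `claim`, decode the index, and draw that part into the buffer it owns. Parts of one triangle can land in different buffers, so this relies on the buffers being combined afterwards.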

Hmm, it seems multi-buffering might be more hassle than it's worth. Any savings you make could be offset by having to combine the buffers each frame.

Multi-threading in general is a complex beast; multi-threading a software renderer is even more complex.

One thing I'm noticing in this thread is that people are only talking about 2 threads. Most PCs these days have 4-8 HW threads, and consoles similarly have many threads.

Let's assume a conservative 4 HW thread model. There are many processes that need to go on to render a single mesh:

1 Transform verts into screen space
2 Clip verts to "Tile" extents
3 Interpolate vertex attributes across the triangle and generate raster "quads", one per "Tile" touched
4 Run the <shader> portion per pixel

If I were writing a software renderer, I would do the following:

1 Set up thread(s) to do <1 + 2 + 3> in one pass, pushing a number of "quads" into a circular thread-safe buffer (one per Tile), with each quad fully describing what to rasterize.
2 Set up a second set of threads to read from the circular buffers and perform the rasterisation. Each thread here would own exactly one "Tile" of the final buffer, so there would be zero contention on the final buffer itself. Transparency would be handled by normal render-order methods.

This method would allow expansion to many threads depending entirely on how many tiles you wanted to split rendering into.
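The circular buffer per tile could look something like the sketch below. It assumes exactly one setup thread pushes into a given ring and exactly one tile thread pops from it (single producer, single consumer); with several setup threads you would need one ring per producer or a different queue. `Quad` and `SpscRing` are made-up names, and the fields of `Quad` are placeholders for whatever a quad needs to describe.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical work item: a rectangle of pixels to shade within one tile.
struct Quad { int x0, y0, x1, y1; std::uint32_t color; };

// Fixed-size ring for one producer thread and one consumer thread.
// One slot is kept empty to tell "full" from "empty", so it holds at most
// Capacity - 1 items.
template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity >= 2, "ring needs at least two slots");
public:
    // Producer side. Returns false if the ring is full.
    bool push(const T& item) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % Capacity;
        if (next == tail_.load(std::memory_order_acquire)) return false;
        slots_[head] = item;
        // Publish the slot only after it has been written.
        head_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer side. Returns false if the ring is empty.
    bool pop(T& item) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) return false;
        item = slots_[tail];
        // Free the slot only after it has been read.
        tail_.store((tail + 1) % Capacity, std::memory_order_release);
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};  // next slot to write
    std::atomic<std::size_t> tail_{0};  // next slot to read
};
```

Because items come out in the order they went in, each tile still sees its quads in submission order, which is what render-order transparency depends on. When `push` returns false the setup thread has to wait or do something else, so the ring size limits how far setup can run ahead of rasterisation.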

Note - this is off the cuff, I haven't implemented a software renderer in almost 15 years.

Can you fill me in with more info on the quad/tile idea? Are you talking about dividing up the screen?

The problem I see with this idea is that most polys will end up in the same tiles on the screen when you render a model anyway. If you deferred all rasterisation until every mesh had been processed, then it might be a different story.
