reznov81

Making an engine work in jobs

reznov81    100
Recently I read a paper from id Software about the new id Tech 5 engine, in which they said that the PS3's Cell processor forced them to "re-factor the engine into jobs", I suppose to make the most of parallelism. I wonder what a job-based engine design would look like in general, so that it also makes the most of current multicore PC processors and makes a possible port to PS3 easier. Does anyone have links on this topic, or some pseudocode?

Thanks

Hodgman    51221
DICE has some good [url="http://publications.dice.se/"]presentations[/url] on the topic, specifically: [url="http://www.slideshare.net/repii/parallel-futures-of-a-game-engine-v20"]Parallel Futures of a Game Engine[/url].
Maybe also: [url="http://bitsquid.blogspot.com/2010/03/task-management-practical-example.html"]Task Management -- A Practical Example[/url]

Every engine I know of uses jobs to some degree these days: the PS3 has the Cell, the Xbox 360 has 3 cores (each with 2 hardware threads), and quad-core PCs are becoming standard. Job graphs are a great way to support all of these architectures.
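
Since pseudocode was asked for, here is a rough idea of what a job graph can look like on a PC. It is only a minimal sketch using standard C++ threads, not how id Tech 5 or any particular engine does it. The names [font="Courier New"]Job[/font], [font="Courier New"]JobGraph[/font], [font="Courier New"]depend[/font] and [font="Courier New"]exampleFrame[/font] are made up for illustration. It assumes the graph has no cycles and is run only once.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical types; a sketch, not any real engine's API.
struct Job {
    std::function<void()> work;
    std::vector<std::size_t> dependents; // jobs that wait on this one
    int unfinishedDeps = 0;              // number of jobs this one waits on
};

class JobGraph {
public:
    std::size_t add(std::function<void()> work) {
        Job job;
        job.work = std::move(work);
        jobs_.push_back(std::move(job));
        return jobs_.size() - 1;
    }

    // 'after' may only start once 'before' has finished.
    void depend(std::size_t after, std::size_t before) {
        assert(after < jobs_.size() && before < jobs_.size());
        jobs_[before].dependents.push_back(after);
        ++jobs_[after].unfinishedDeps;
    }

    // Runs every job once on 'workerCount' threads (assumes an acyclic graph).
    void run(unsigned workerCount) {
        assert(workerCount > 0);
        remaining_ = jobs_.size();
        for (std::size_t i = 0; i < jobs_.size(); ++i)
            if (jobs_[i].unfinishedDeps == 0) ready_.push(i);
        std::vector<std::thread> workers;
        for (unsigned i = 0; i < workerCount; ++i)
            workers.emplace_back([this] { workerLoop(); });
        for (std::thread& w : workers) w.join();
    }

private:
    void workerLoop() {
        for (;;) {
            std::size_t id;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return !ready_.empty() || remaining_ == 0; });
                if (ready_.empty()) return; // nothing left to run
                id = ready_.front();
                ready_.pop();
            }
            jobs_[id].work(); // run outside the lock so jobs overlap
            {
                std::lock_guard<std::mutex> lock(mutex_);
                --remaining_;
                for (std::size_t d : jobs_[id].dependents)
                    if (--jobs_[d].unfinishedDeps == 0) ready_.push(d);
            }
            cv_.notify_all();
        }
    }

    std::vector<Job> jobs_;
    std::queue<std::size_t> ready_;
    std::size_t remaining_ = 0;
    std::mutex mutex_;
    std::condition_variable cv_;
};

// Two independent jobs feed a third one that needs both results.
int exampleFrame() {
    int physics = 0, animation = 0, render = 0;
    JobGraph graph;
    const std::size_t physicsJob = graph.add([&] { physics = 2; });
    const std::size_t animationJob = graph.add([&] { animation = 3; });
    const std::size_t renderJob = graph.add([&] { render = physics + animation; });
    graph.depend(renderJob, physicsJob);
    graph.depend(renderJob, animationJob);
    graph.run(4);
    return render;
}
```

The same graph runs unchanged on 1 worker or 8, which is the point: the engine describes work and dependencies, and the platform layer decides how many cores to throw at it. A single locked queue is the simplest possible scheduler; it is here for clarity, not speed.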


You can also download all the documentation on the Cell CPU from IBM if you're interested. A short overview: the Cell has a whole bunch of extra cores, called SPUs, which are [b][i]extremely[/i][/b] fast. The catch is that they can't directly access RAM like you're used to. Instead, each SPU has its own little 256 KB of local RAM, which is its own private operating area. You can issue DMA commands to move data from main RAM into the SPU's "local store", do some processing on that data, and then DMA the results back into main RAM.
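
To make the DMA-in / process / DMA-out pattern concrete, here is a PC-side imitation of it. This is not Cell SDK code: [font="Courier New"]dmaGet[/font] and [font="Courier New"]dmaPut[/font] are hypothetical stand-ins implemented with [font="Courier New"]memcpy[/font], and the 16 KB chunk budget is an arbitrary assumption (the real local store also has to hold code and stack).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed data budget per transfer; chosen arbitrarily for this sketch.
constexpr std::size_t kChunkBytes = 16 * 1024;
constexpr std::size_t kChunkFloats = kChunkBytes / sizeof(float);

// Hypothetical stand-ins for DMA transfers. On a PC they are plain copies.
inline void dmaGet(void* local, const void* mainRam, std::size_t bytes) {
    std::memcpy(local, mainRam, bytes);
}
inline void dmaPut(void* mainRam, const void* local, std::size_t bytes) {
    std::memcpy(mainRam, local, bytes);
}

// Scales 'count' floats in place. Main RAM is only touched through
// dmaGet/dmaPut; all arithmetic happens in the small local buffer.
void scaleInChunks(float* data, std::size_t count, float factor) {
    assert(data != nullptr || count == 0);
    float local[kChunkFloats]; // plays the role of the local store
    for (std::size_t begin = 0; begin < count; begin += kChunkFloats) {
        const std::size_t n = std::min(kChunkFloats, count - begin);
        dmaGet(local, data + begin, n * sizeof(float));
        for (std::size_t i = 0; i < n; ++i) local[i] *= factor;
        dmaPut(data + begin, local, n * sizeof(float));
    }
}
```

Code written this way also tends to behave well on ordinary CPUs, because the working set stays small and is walked linearly.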

This means that "good code":
* knows exactly which areas of RAM it needs to read from and write to.
* operates on contiguous areas of RAM where possible.
* can be broken up into chunks small enough that each one needs less than a few hundred KB of data at once.
* can break its workload up into several invocations of a function (so multiple cores can share the workload).
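
As a rough illustration of those points, the sketch below describes each piece of work as a small struct that spells out exactly what is read and what is written, and then splits one big sum into several invocations of the same function. [font="Courier New"]SumJob[/font], [font="Courier New"]runSumJob[/font] and [font="Courier New"]parallelSum[/font] are hypothetical names. Spawning one thread per job is only for brevity; a real engine would more likely feed the jobs to a fixed pool of workers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Everything the job touches is listed up front.
struct SumJob {
    const float* input;  // read-only, contiguous range
    std::size_t  count;  // number of floats to read
    float*       output; // exactly one float is written
};

// One invocation: no globals, no hidden memory accesses.
void runSumJob(const SumJob& job) {
    float total = 0.0f;
    for (std::size_t i = 0; i < job.count; ++i) total += job.input[i];
    *job.output = total;
}

// Splits the range into 'jobCount' contiguous slices and sums them in parallel.
float parallelSum(const float* data, std::size_t count, std::size_t jobCount) {
    assert(jobCount > 0);
    std::vector<float> partial(jobCount, 0.0f);
    std::vector<SumJob> jobs(jobCount);
    const std::size_t perJob = (count + jobCount - 1) / jobCount; // round up
    for (std::size_t j = 0; j < jobCount; ++j) {
        const std::size_t begin = std::min(j * perJob, count);
        const std::size_t end = std::min(begin + perJob, count);
        jobs[j] = SumJob{data + begin, end - begin, &partial[j]};
    }
    std::vector<std::thread> workers;
    for (const SumJob& job : jobs)
        workers.emplace_back([&job] { runSumJob(job); });
    for (std::thread& w : workers) w.join();

    float total = 0.0f;
    for (float p : partial) total += p; // cheap serial combine step
    return total;
}
```

Because each job writes only to its own output slot, no locking is needed while the jobs run.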

And "good" data structures:
* are compatible with [font="Courier New"]memcpy[/font] ([i]POD types rule, modern C++ not so much[/i]).
* do not require global access.
* are contiguous and linear.
* make use of SIMD where possible.
* support the use of offsets instead of pointers.
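
One possible shape for such a data structure is a single contiguous blob with a POD header that stores byte offsets instead of pointers. The layout and the names ([font="Courier New"]BlobHeader[/font], [font="Courier New"]buildBlob[/font], [font="Courier New"]readValue[/font], [font="Courier New"]readIndex[/font]) are invented for this sketch. Because nothing inside the blob holds an absolute address, it stays valid after being copied to a different address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical layout: [BlobHeader][float values...][uint16 indices...]
struct BlobHeader {
    std::uint32_t valueCount;
    std::uint32_t indexCount;
    std::uint32_t valuesOffset;  // bytes from the start of the blob
    std::uint32_t indicesOffset; // bytes from the start of the blob
};
static_assert(std::is_trivially_copyable<BlobHeader>::value,
              "header must be safe to memcpy");

// Packs both arrays behind a header into one contiguous buffer.
std::vector<unsigned char> buildBlob(const std::vector<float>& values,
                                     const std::vector<std::uint16_t>& indices) {
    BlobHeader h;
    h.valueCount = static_cast<std::uint32_t>(values.size());
    h.indexCount = static_cast<std::uint32_t>(indices.size());
    h.valuesOffset = static_cast<std::uint32_t>(sizeof(BlobHeader));
    h.indicesOffset = static_cast<std::uint32_t>(
        h.valuesOffset + values.size() * sizeof(float));

    std::vector<unsigned char> blob(h.indicesOffset +
                                    indices.size() * sizeof(std::uint16_t));
    std::memcpy(blob.data(), &h, sizeof h);
    if (!values.empty())
        std::memcpy(blob.data() + h.valuesOffset, values.data(),
                    values.size() * sizeof(float));
    if (!indices.empty())
        std::memcpy(blob.data() + h.indicesOffset, indices.data(),
                    indices.size() * sizeof(std::uint16_t));
    return blob;
}

// Readers resolve offsets against whatever address the blob lives at now.
BlobHeader readHeader(const unsigned char* blob) {
    BlobHeader h;
    std::memcpy(&h, blob, sizeof h);
    return h;
}

float readValue(const unsigned char* blob, const BlobHeader& h, std::uint32_t i) {
    assert(i < h.valueCount);
    float v;
    std::memcpy(&v, blob + h.valuesOffset + i * sizeof(float), sizeof v);
    return v;
}

std::uint16_t readIndex(const unsigned char* blob, const BlobHeader& h, std::uint32_t i) {
    assert(i < h.indexCount);
    std::uint16_t v;
    std::memcpy(&v, blob + h.indicesOffset + i * sizeof(std::uint16_t), sizeof v);
    return v;
}
```

Copying the whole blob with one [font="Courier New"]memcpy[/font] (or one DMA transfer) and then calling the readers on the copy works without any pointer fix-up, which is what a structure full of raw pointers or [font="Courier New"]std::vector[/font] members cannot offer.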
