Who moved the goalposts?
Posted April 12 10:59 PM by Mike Lewis
Developing a modern game is often a multi-year process, involving vast amounts of programming, art and content creation, and so on. As if this weren't challenge enough by itself, hardware continues to advance during development, so that by the time a title ships, commonly available commodity hardware is significantly more advanced than it was when development began.

Nowhere is this change as dramatic or rapid as with CPU advancement. CPU cores and architectures are changing radically, especially with new multi-core designs taking root in the market. An average video game may actually see the rise and fall of several different CPU architectures during its development lifetime. In this lightning-paced world, it can be extremely difficult to take advantage of all the hardware that is available, if for no other reason than that you may be designing a game around hardware that will be obsolete by the time the title ships.


Tips and tricks for dealing with the pace of CPU change
  • CPUs are no longer fully interchangeable, especially with the rise of 64-bit processors. Therefore it is vital to test code on as wide a variety of CPUs as possible.

  • As core counts continue to increase, knowing how to design and implement solid multithreaded code will become increasingly vital

  • SIMD is universally available; there's no longer any reason not to take advantage of it. SSE2 is essentially available on all modern CPUs, and processors from the dual-core era onward support at least SSE3.

  • Knowing how to optimize cache usage is crucial; be careful not to stomp on other threads' cache setups

  • Beware of "false sharing" and related cache issues

  • Communication that has to cross the front-side bus will be severely slow; this includes talking between CPUs (on certain architectures), talking to hardware devices, and even certain memory access patterns

  • Avoid using the heap when possible; prefer to use special memory pools (for cache locality and cheap allocations) or the stack (but avoid bad hacks like alloca)

  • In some situations where data must be queued, consider LIFO stacks instead of more traditional FIFO/queue structures, in order to improve cache locality

  • Employ data parallelism as much as possible; future CPU architectures will continue to improve the benefits of such approaches

  • Fine-grained tasks are often better than large, monolithic tasks, because they can be subdivided and spread across more CPU cores. However, beware of creating a scenario that involves huge amounts of locking to communicate between tasks, as this can completely destroy performance

  • Avoid sleeping a thread as much as possible, especially if doing so leaves one or more cores idle; this can create a situation where the core never actually sleeps (since there is nothing to yield to) and the cache can be thoroughly trashed due to context switching

  • Always profile your code, and in particular run regression tests after each optimization to ensure that it is not in fact having a negative impact

  • Detecting CPU topology is very difficult, but there are some useful techniques here

  • Setting core/CPU affinities is dangerous and tricky; avoid doing so wherever possible, or at the very least allow users to disable affinitization. Chances are the OS and CPU architecture will function better when left to distribute code across cores as they see fit.
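
To make the "false sharing" tip concrete, here is a minimal sketch of one common remedy: giving each thread's data its own cache line. The 64-byte line size, the PaddedCounter type and the count_in_parallel helper are all assumptions made for illustration and do not come from the talk. The sketch also assumes a C++17 compiler, so that the vector honors the over-aligned type.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Assumed cache-line size; 64 bytes is common on x86 but is not guaranteed.
constexpr std::size_t kAssumedCacheLine = 64;

// Each counter is padded and aligned to a full (assumed) cache line, so
// threads updating neighbouring counters do not invalidate each other's lines.
struct alignas(kAssumedCacheLine) PaddedCounter {
    long value = 0;
};

static_assert(sizeof(PaddedCounter) == kAssumedCacheLine,
              "counter should occupy exactly one assumed cache line");

// Hypothetical helper: every thread increments only its own counter, and the
// per-thread results are combined once, after all threads have joined.
long count_in_parallel(unsigned num_threads, long iterations_per_thread) {
    assert(num_threads > 0);
    std::vector<PaddedCounter> counters(num_threads);
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counters, t, iterations_per_thread] {
            for (long i = 0; i < iterations_per_thread; ++i) {
                counters[t].value += 1;  // touches only this thread's line
            }
        });
    }
    for (auto& w : workers) w.join();

    long total = 0;
    for (const auto& c : counters) total += c.value;
    return total;
}
```

Removing the alignas and packing the counters into a plain array of long would leave the result unchanged but would let several threads fight over the same cache line. Whether that costs anything measurable on a given CPU is exactly the kind of question profiling should answer.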

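The advice about memory pools and LIFO reuse can be sketched as a small fixed-capacity pool. FixedPool and its member names are hypothetical, and the design (one contiguous buffer plus a stack of free slots) is just one simple way to get cheap allocations and to hand back the most recently freed, and therefore most likely still cached, slot first. It assumes T is default-constructible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-capacity pool: all slots live in one contiguous buffer
// allocated up front, and freed slots are reused most-recently-freed first.
template <typename T>
class FixedPool {
public:
    explicit FixedPool(std::size_t capacity) : slots_(capacity) {
        free_list_.reserve(capacity);
        // Push in reverse so the first acquire() returns slot 0.
        for (std::size_t i = capacity; i > 0; --i) {
            free_list_.push_back(&slots_[i - 1]);
        }
    }

    // Returns nullptr when the pool is exhausted rather than falling back
    // to the general-purpose heap.
    T* acquire() {
        if (free_list_.empty()) return nullptr;
        T* slot = free_list_.back();
        free_list_.pop_back();
        return slot;
    }

    // Hands a slot back; it becomes the next one acquire() will return.
    void release(T* slot) {
        assert(slot >= slots_.data() && slot < slots_.data() + slots_.size());
        free_list_.push_back(slot);
    }

    std::size_t available() const { return free_list_.size(); }

private:
    std::vector<T> slots_;       // contiguous storage, allocated once
    std::vector<T*> free_list_;  // LIFO stack of free slots
};
```

A pool like this is not thread-safe as written. Sharing it between threads would need locking or per-thread pools, which brings the earlier warnings about lock contention back into play.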

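As an illustration of the SIMD tip, the following sketch adds two integer arrays four lanes at a time using SSE2 intrinsics, with a scalar loop for the leftover elements and for builds where SSE2 is not enabled. The add_int32 name and the SKETCH_HAS_SSE2 macro are made up for this example. The intrinsics themselves (_mm_loadu_si128, _mm_add_epi32, _mm_storeu_si128) are the standard ones from emmintrin.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics
#define SKETCH_HAS_SSE2 1
#endif

// Hypothetical helper: out[i] = a[i] + b[i] for n 32-bit integers.
// Unaligned loads and stores are used, so the arrays need no special alignment.
void add_int32(const std::int32_t* a, const std::int32_t* b,
               std::int32_t* out, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr && out != nullptr));
    std::size_t i = 0;
#ifdef SKETCH_HAS_SSE2
    // Process four 32-bit lanes per iteration.
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(out + i),
                         _mm_add_epi32(va, vb));
    }
#endif
    // Scalar tail (and the whole array when SSE2 is unavailable).
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Modern compilers will often auto-vectorize a loop this simple on their own, so hand-written intrinsics tend to pay off mainly in more irregular kernels. As with every tip above, measure before and after.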

Additional Resources
Slides from the talk are available here.

Intel offers a comprehensive set of manuals for their processors here. Their Threading Building Blocks library also demonstrates some highly effective techniques for working with threaded code.
 
 