And yet, when it comes to utilizing multi- and many-core hardware, games are years behind. I've spoken to a senior developer at a very well-known AAA MMO studio, who told me they don't allow their developers to write multi-threaded code. Another architect at a different MMO studio told me that only now are they realizing that they're going to have to abandon the serial game-loop way of thinking, and this is very hard for them to accept.
Again, this very much depends on who you talk to. You'll get a lot of the same responses in other segments of the software industry too.
In most places that I've worked, yes, people are discouraged/banned from writing multi-threaded code based on shared-state concurrency, because it is notoriously difficult. Even veteran engineers make mistakes in this paradigm from time to time, as it's extremely hard to prove that your code is free of race conditions or locking/scheduling problems.
In many courses, this particular paradigm is the only method of multi-threading that's taught -- so when you interview a graduate and ask how they'd take advantage of a multi-core games console, they almost always talk about mutexes!
Often, shared-state code like this seems to be correct, and may even pass its unit tests successfully for months or years, before failing randomly and unreproducibly. Many old games have multi-threading bugs that don't manifest on single-core CPUs, but do manifest on true SMP hardware -- e.g. to play Sim City 4 without random crashes, I have to go into the Windows task manager and change its thread affinity to restrict it to one core! To avoid this, it's better to simply avoid this unreliable paradigm as much as possible.
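To make the failure mode concrete, here's a minimal Python sketch (not from any real game) of the classic lost-update race: two threads each do a non-atomic read-modify-write on a shared counter. The `sleep` artificially widens the race window so the bug reproduces every time here; in real code the same bug can hide for years before the scheduler interleaves things the wrong way.

```python
import threading
import time

counter = 0

def unsafe_increment():
    """Non-atomic read-modify-write: the bug shared-state code invites."""
    global counter
    value = counter       # read
    time.sleep(0.05)      # widen the race window: both threads read before either writes
    counter = value + 1   # write -- one thread's update is silently lost

threads = [threading.Thread(target=unsafe_increment) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 1, not 2 -- an update was lost
```

Wrapping the read-modify-write in a lock fixes this particular case, but proving that *every* such access in a large codebase is correctly locked is the hard part.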
Other paradigms, like the one you've presented, aren't as easy to totally screw up. Commonly used alternatives include message passing, the Actor model, functional-style programming, and automatic decomposition of serial code into a directed acyclic graph (DAG) of task nodes.
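As a sketch of the message-passing style (an illustrative toy, not any particular engine's API): each worker owns its state privately and communicates only through queues, so there's no shared mutable data to protect with locks in the first place.

```python
import threading
import queue

inbox = queue.Queue()    # messages in
results = queue.Queue()  # messages out

def worker():
    """Owns its state; talks to the rest of the world only via queues."""
    while True:
        msg = inbox.get()
        if msg is None:           # sentinel message: shut down
            break
        results.put(msg * msg)    # all computation on thread-local data

t = threading.Thread(target=worker)
t.start()
for n in range(5):
    inbox.put(n)
inbox.put(None)
t.join()

out = sorted(results.get() for _ in range(5))
print(out)  # [0, 1, 4, 9, 16]
```

Because every piece of data is owned by exactly one thread at a time, the whole class of lost-update and locking bugs simply can't occur.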
For example, the "serial game loop" can be written as high-level Lua code. Whenever it makes a function call into the underlying engine, the call isn't executed immediately; instead, it's queued and a future is returned. At the end of the Lua code, all these queued functions and futures can have their dependencies determined (by which functions have which futures passed in as arguments) and can be converted into a DAG. The DAG can then be linearized into a schedule of individual tasks and fences, spread across many cores.
This lets your high-level programmers continue to write seemingly serial code, which is automatically parallelized, while a few veterans of shared-state concurrency maintain the scheduling logic.
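The queued-call/future scheme above can be sketched in a few dozen lines. This is a toy model in Python rather than the real Lua/engine machinery, and all the names (`Scheduler`, `Future`, the task strings) are invented for illustration: calls record a task and return a `Future`, dependencies are inferred from which futures appear as arguments, and the resulting DAG is linearized into stages separated by fences.

```python
from collections import defaultdict

class Future:
    """Handle returned by a deferred engine call."""
    def __init__(self, task_id):
        self.task_id = task_id

class Scheduler:
    def __init__(self):
        self.tasks = []               # task_id -> (name, args)
        self.deps = defaultdict(set)  # task_id -> set of prerequisite task_ids

    def call(self, name, *args):
        """Queue a call instead of executing it; return a Future."""
        tid = len(self.tasks)
        self.tasks.append((name, args))
        for a in args:
            if isinstance(a, Future):   # dependency inferred from arguments
                self.deps[tid].add(a.task_id)
        return Future(tid)

    def stages(self):
        """Linearize the DAG: tasks within a stage are independent and can
        run on different cores; a fence sits between consecutive stages."""
        done, order = set(), []
        remaining = set(range(len(self.tasks)))
        while remaining:
            ready = {t for t in remaining if self.deps[t] <= done}
            order.append(sorted(ready))
            done |= ready
            remaining -= ready
        return order

s = Scheduler()
a = s.call("update_physics")
b = s.call("update_ai")
c = s.call("render", a, b)   # consumes both futures, so it lands in a later stage
print(s.stages())  # [[0, 1], [2]]
```

Here physics and AI end up in the same stage (they share no futures), while rendering is fenced behind both, which is exactly the schedule a hand-written parallel version would use.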
As mentioned before, the PS3 has a slow GPU and a slow/simple (in-order, etc) single-core CPU (overview). All of its computing power is in the many-core NUMA co-CPU. This means that every PS3 game of any complexity at all is pretty much guaranteed to contain a damn lot of parallel code (and not the shared-state kind, as that does not perform well at all under NUMA).
The PS3 with its many-core hybrid CPU, and the 360 with its hyperthreaded tri-core CPU, have been out for over half a decade, so any game developer working on console games should definitely have had the multi-core lesson sink in by now! Failing to write multi-core-capable code reduces the effective power of these consoles by a huge factor (~3-6x), so it's safe to assume that people who are generalized as being obsessed with squeezing out every last drop of performance from a bit of hardware would be investigating their options here...
Cool. What do you use to do it?
P.S. All our GPU code is JIT compiled
Sorry, I meant that everyone's GPU code is JIT compiled.
You control the GPU via an API like D3D, and shader code (e.g. HLSL) which is compiled into an intermediate bytecode format.
The driver then takes the command queue that's produced from D3D, along with your shader bytecode, and JIT compiles these into actual proprietary GPU instructions as necessary.