From a hardware point of view imo an ideal setup would be:
- x64/ARM cores for normal processing
- SPU/ALU array on the same die as the above for stream style processing
- DX11+ class GPU for graphics processing
That way you get a decent mix of the various processing requirements you have in a game; I'm kinda hoping the next consoles pick up this sort of mix of hardware tbh.
I agree entirely. We're going to need traditional CPUs for branchy stuff and scalar processing. Pretty much everything we do on SIMD units today could be mapped to something like an SPU. And a discrete GPU can be optimized around graphics problems specifically, while still being general enough (as it already is) to be called in as reinforcements on some parallel problems.
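To make the split concrete, here's a rough C++ sketch of the two kinds of work I mean. The function names and data layouts are made up for illustration, not taken from any real engine: `chain_length` is the branchy, dependent kind of code that wants a conventional core, and `integrate_positions` is the uniform, independent kind that maps naturally onto an SPU/ALU array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Branchy, scalar work (hypothetical example): follow a chain of indices
// until the -1 sentinel. Each step depends on the previous load, so there
// is nothing to vectorize; this suits a conventional CPU core.
int chain_length(const std::vector<int>& next, int start) {
    int steps = 0;
    for (int i = start; i != -1; i = next[static_cast<std::size_t>(i)]) {
        ++steps;
    }
    return steps;
}

// Stream-style work (hypothetical example): the same operation applied to
// every element with no dependencies between iterations. This is the kind
// of loop that could be handed to SIMD units or an SPU-like array.
void integrate_positions(std::vector<float>& pos,
                         const std::vector<float>& vel,
                         float dt) {
    assert(pos.size() == vel.size());
    for (std::size_t i = 0; i < pos.size(); ++i) {
        pos[i] += vel[i] * dt;
    }
}
```

The second loop is the one you could farm out wholesale; the first one gains nothing from wide ALUs no matter how many you have.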
I think the question in this setup is then, how many floating-point resources do the CPUs have? Do they still have SIMD units? How about an FPU? If so, does each integer core get its own SIMD/FPU, or do we share between 2 or more integer cores like the newer SPARC cores and AMD's upcoming Bulldozer?
In my mind, you probably still need floating point on a per-core basis, but SIMD can probably be shared among 2-4 cores, if not eschewed entirely in favor of those SPU-like elements.
Give each CPU and SPU core its own cache, but put them all on a shared cache at the highest level -- maybe let them DMA between lower-level caches -- and give them all explicit cache-control instructions. I think that would make for an awfully interesting architecture. Sans SIMD, you could probably double the number of CPU cores for a given area of die space.
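For anyone who hasn't programmed with explicit DMA, here's a minimal sketch of the double-buffered streaming pattern that kind of hardware encourages. Everything in it is an assumption for illustration: `dma_get` is a made-up stand-in that just does a `memcpy` (a real transfer would be asynchronous and overlap with the compute), and `kChunk` is an arbitrary, tiny local-buffer size.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical local-buffer size in elements; real local stores are far larger.
constexpr std::size_t kChunk = 4;

// Stand-in for a DMA "get" from main memory into a local buffer.
// Assumption: here it is a plain synchronous memcpy, purely to show the
// control flow. Returns the number of elements copied.
std::size_t dma_get(float* local, const float* main_mem,
                    std::size_t offset, std::size_t total) {
    const std::size_t n = std::min(kChunk, total - offset);
    std::memcpy(local, main_mem + offset, n * sizeof(float));
    return n;
}

// Sum a large array by streaming it through two small local buffers:
// while one buffer is being processed, the next chunk is fetched into the other.
float sum_streamed(const std::vector<float>& data) {
    std::array<std::array<float, kChunk>, 2> local{};
    const std::size_t total = data.size();
    if (total == 0) return 0.0f;

    float sum = 0.0f;
    std::size_t cur = 0;
    std::size_t offset = 0;
    std::size_t n = dma_get(local[cur].data(), data.data(), offset, total);

    while (n > 0) {
        offset += n;
        // Kick off the fetch of the next chunk into the other buffer.
        std::size_t next_n = 0;
        if (offset < total) {
            next_n = dma_get(local[1 - cur].data(), data.data(), offset, total);
        }
        // Process the chunk that is already resident locally.
        for (std::size_t i = 0; i < n; ++i) {
            sum += local[cur][i];
        }
        cur = 1 - cur;  // swap buffers
        n = next_n;
    }
    return sum;
}
```

On real hardware the point of the two buffers is that the fetch and the inner loop run at the same time, so the core never stalls waiting on memory. The sketch only shows the buffer-swapping logic, not that overlap.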
8-16 ARM/x64/PPC cores, 8-32 "SPU" cores, plus a DX11-class GPU with 800 or so shader elements. Unified memory. Sign me up. I don't think it's that much of a stretch to imagine something like that coming out of one of the console vendors next generation -- heck, it's not much more than an updated mash-up of the PS3/360 anyhow.