

MJP

Posted 04 February 2014 - 02:55 AM


Imagine you have just spent eight years and many millions of dollars developing a library of code centered on the Cell architecture. Would you be very happy to hear you need to throw it away?

Why would you throw it away? Sure, some parts of it may be SPU-specific, but parallelizable code that works on small chunks of contiguous data is great on almost any architecture. Most studios aren't going to be doing it Naughty Dog style, with piles of hand-written pipelined SPU assembly. Which of course explains why Mark said it was first parties that were skeptical of x86.
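To put it concretely, here's a minimal C sketch of the kind of chunked kernel I mean; the struct, chunk size, and names are illustrative assumptions, not anyone's actual engine code. The inner loop is exactly the shape that got DMA'd into SPU local store on PS3, and it runs just as happily out of cache on x86:

```c
#include <stddef.h>

/* Hypothetical particle data for illustration. */
typedef struct { float x, y, z; } Vec3;

/* Process one contiguous chunk -- the kind of kernel that was DMA'd into
 * SPU local store on PS3, and that stays cache-resident on an x86 core. */
static void integrate_chunk(Vec3 *pos, const Vec3 *vel, size_t count, float dt)
{
    for (size_t i = 0; i < count; ++i) {
        pos[i].x += vel[i].x * dt;
        pos[i].y += vel[i].y * dt;
        pos[i].z += vel[i].z * dt;
    }
}

/* Walk the whole array in chunks sized to fit fast memory (256 KB of SPU
 * local store then, L1/L2 cache now); only this driver loop is
 * platform-flavored, the kernel above ports as-is. */
void integrate_all(Vec3 *pos, const Vec3 *vel, size_t total, float dt)
{
    const size_t CHUNK = 1024;  /* assumption: tune per target */
    for (size_t base = 0; base < total; base += CHUNK) {
        size_t n = (total - base < CHUNK) ? (total - base) : CHUNK;
        integrate_chunk(pos + base, vel + base, n, dt);
    }
}
```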


Imagine you have spent eight years hiring people, and your focus has been to include people who deeply understand the "supercomputer on a chip" design that Cell offered, which is most powerful when developers treat the chip as a master processor with a collection of slave processors, and now find that all those employees must go back to the x86 model. Would you be happy to hear that those employees will no longer be necessary?



So what, these mighty geniuses of programming are suddenly useless when given 8 x86 cores and a GPU? GPU compute in games is a largely unexplored field, and it's going to require smart people to figure out the best way to make use of it. And engines always need people who can take a pile of spaghetti gameplay code and turn it into something that runs across multiple cores without a billion cache misses.
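The cache-miss part is mostly about data layout, and that skill transfers directly from the SPU world. A rough sketch of the idea in C (the entity fields are made up for illustration): split the hot fields out of the catch-all struct so an update pass streams linearly through memory and splits cleanly across cores.

```c
#include <stddef.h>

/* Typical "spaghetti" layout: each entity drags all of its cold data
 * through the cache even when an update only touches position. */
typedef struct {
    float pos[3];
    float vel[3];
    int   health;
    char  name[64];   /* cold data polluting every cache line */
    /* ...dozens more fields in a real game... */
} EntityAoS;

/* Data-oriented layout: hot fields in parallel arrays, so a position pass
 * reads memory linearly and thread t can own the contiguous slice
 * [t*count/T, (t+1)*count/T) with no sharing. */
typedef struct {
    float (*pos)[3];
    float (*vel)[3];
    int   *health;
    size_t count;
} EntitySoA;

void advance_positions(EntitySoA *e, float dt)
{
    for (size_t i = 0; i < e->count; ++i) {
        e->pos[i][0] += e->vel[i][0] * dt;
        e->pos[i][1] += e->vel[i][1] * dt;
        e->pos[i][2] += e->vel[i][2] * dt;
    }
}
```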


The parallel architecture is a little more tricky to develop for, but considering all the supercomputers built out of old PS3s, it should make you think. Building parallel algorithms where the work is partitioned across processors takes a lot more of the science side of programming, but the result is that you trade serial time for parallel time and can potentially do significantly more work in the same wall-clock time. While many companies that focused on cross-platform designs took minimal advantage of the hardware, the companies that focused specifically on that system could do far more. Consider raw floating-point operations per second: the X360's PowerPC-based Xenon could perform 77 GFLOPS, while the PS3 could perform 230 GFLOPS. It takes a bit more computer science and system-specific coding to take advantage of it, but offering roughly three times the raw processing power is a notable thing.

And yet for all of that processing power, X360 games regularly outperformed their PS3 versions. What good are oodles of FLOPS if they're not accessible to the average dev team? I for one am glad that Sony decided to take their head out of the sand on this issue, and instead doubled down on making a system whose power is available to developers rather than hidden away from them.
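For what it's worth, the fork-join partitioning the quote describes isn't the hard part. Here's a minimal pthreads sketch of it; the thread count and the toy reduction are illustrative assumptions, not console-specific code. Each worker owns a contiguous slice, and the only serial cost left is the merge at the end:

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 6   /* assumption: pick per target hardware */

typedef struct {
    const float *data;
    size_t begin, end;
    double partial;     /* per-thread result; no sharing until the join */
} Slice;

/* Each worker reduces its own contiguous slice, so serial O(n) becomes
 * roughly O(n / NUM_THREADS) plus a tiny serial merge. */
static void *sum_slice(void *arg)
{
    Slice *s = arg;
    double acc = 0.0;
    for (size_t i = s->begin; i < s->end; ++i)
        acc += s->data[i];
    s->partial = acc;
    return NULL;
}

double parallel_sum(const float *data, size_t n)
{
    pthread_t tid[NUM_THREADS];
    Slice slices[NUM_THREADS];

    for (int t = 0; t < NUM_THREADS; ++t) {
        slices[t].data  = data;
        slices[t].begin = n * t / NUM_THREADS;
        slices[t].end   = n * (t + 1) / NUM_THREADS;
        slices[t].partial = 0.0;
        pthread_create(&tid[t], NULL, sum_slice, &slices[t]);
    }

    double total = 0.0;
    for (int t = 0; t < NUM_THREADS; ++t) {   /* serial merge step */
        pthread_join(tid[t], NULL);
        total += slices[t].partial;
    }
    return total;
}
```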


People who have spent their whole programming lives on the x86 platform don't really notice, and those who stick to single-threaded high-level languages without relying on chip-specific functionality don't really notice, but the x86 family of processors really is awful compared to what else is out there. Yes, it is general purpose and can do a lot, but other designs have a lot to offer.

Consider how x86 does memory access: you request a single byte. That byte might or might not be in the cache, and might take a long time to load. There is no good way to request that a block be fetched, or kept resident for frequent access. On many other processors you can map a block of memory for fast access and continuous cache use, then swap it out and replace it whenever you want. The x86 family originated in the days of much slower memory, when other systems were not powerful. On x86, if you want to copy memory you might have a way to do a DMA transfer (telling the memory system to copy directly from one location to another), but in practice that rarely happens; everything goes through the CPU. Compare this with many other systems, where you can copy and transfer memory blocks in the background without the data traveling across the entire motherboard.

x86 was also long derided for its paltry 8 general-purpose registers while competitors often had 32, until the 64-bit extensions brought it up to a barely respectable 16 64-bit registers and 16 128-bit SIMD registers; by then competitors were offering 32 64-bit and 32 128-bit registers or more, in some cases 128 128-bit registers for your processing enjoyment (the Cell's SPUs among them). The 64-bit extensions eased a lot of stresses, but at the assembly level the x86 family absolutely shows its age: the core instructions are still built around hardware concepts from the 1970s rather than the physical realities of today's hardware. And on and on and on.
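There's something to the block-fetch complaint, but it's not quite hopeless: the closest portable x86 gets to those explicit transfers is a best-effort prefetch hint via _mm_prefetch. A minimal sketch (the prefetch distance of 64 elements is an illustrative assumption you'd tune per target):

```c
#include <stddef.h>
#include <xmmintrin.h>   /* _mm_prefetch; SSE, so available on any x86-64 */

/* On the SPUs you would DMA a whole block into local store and *know* it
 * is resident; on x86 the best you can do is hint the cache and hope. */
float sum_with_prefetch(const float *data, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        if (i + 64 < n)  /* assumption: distance tuned per target */
            _mm_prefetch((const char *)&data[i + 64], _MM_HINT_T0);
        acc += data[i];  /* the hint may be ignored; it's not a guarantee */
    }
    return acc;
}
```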


Sure, x86 is old and crusty, but that doesn't mean AMD's x86 chips are necessarily bad because of it. In the context of the PS4's design constraints I can hardly see how it was a bad choice: the box would have been more expensive and would have used more power if they'd stuck another Cell-like monster on a separate die instead of going with their SoC solution.

