DX may be dead before long...


it should now evolve to more flexible, efficient, and powerful programmable shading, which, if I'm not mistaken, is a synonym for lower level.


This "lower level" is something I totally agree with. DirectX will probably evolve in this direction as the hardware does, or it will give way to an API that does.

I think the article was championing access to "the metal" and a console development paradigm as the "lower level", and I couldn't disagree with that more.

I can see now how we've had confusion. Both are "lower" than the current part-fixed-part-programmable pipeline. Your version of lower is far more rational and tractable in my opinion. In many ways, it is a natural extension of the current piecewise programmability of GPUs. But it has nothing to do with what the current generation of consoles does to compete with the PC market.

Programmable shading was a step in the right direction. Now we need full programming access to the GPU in a way that facilitates parallel programming. I think we need to get rid of the idea of treating the GPU as a way to process individual triangles through the vertex and pixel shaders.


Right now, on a currently-available version of DirectX (DX11 compute shaders), you can author completely generic shader programs that are dispatched in thread groups whose size you control and that are mapped to the hardware's separate processors in a way you directly control. So I think it's entirely reasonable to say that DX already lets you bypass the whole triangle-rasterization pipeline and treat the GPU as a generic parallel processing unit. It doesn't get much more generic than "create resources, point shaders at them, and let them rip".
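
To make that concrete, here's a minimal sketch of that path in D3D11, assuming you already have a device, an immediate context, and a compiled cs_5_0 bytecode blob; the function, buffer layout, and group size here are just illustrative, not a complete program:

```cpp
// Minimal D3D11 compute dispatch sketch -- assumes `device`, `context`, and
// `csBlob` (compiled cs_5_0 bytecode) already exist; error handling omitted.
#include <d3d11.h>

void DispatchExample(ID3D11Device* device, ID3D11DeviceContext* context,
                     ID3DBlob* csBlob, UINT elementCount)
{
    // A structured buffer the shader can read and write through a UAV.
    D3D11_BUFFER_DESC bd = {};
    bd.ByteWidth           = elementCount * sizeof(float);
    bd.Usage               = D3D11_USAGE_DEFAULT;
    bd.BindFlags           = D3D11_BIND_UNORDERED_ACCESS;
    bd.MiscFlags           = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;
    bd.StructureByteStride = sizeof(float);

    ID3D11Buffer* buffer = nullptr;
    device->CreateBuffer(&bd, nullptr, &buffer);

    D3D11_UNORDERED_ACCESS_VIEW_DESC uavDesc = {};
    uavDesc.Format             = DXGI_FORMAT_UNKNOWN;        // structured buffers use UNKNOWN
    uavDesc.ViewDimension      = D3D11_UAV_DIMENSION_BUFFER;
    uavDesc.Buffer.NumElements = elementCount;

    ID3D11UnorderedAccessView* uav = nullptr;
    device->CreateUnorderedAccessView(buffer, &uavDesc, &uav);

    ID3D11ComputeShader* cs = nullptr;
    device->CreateComputeShader(csBlob->GetBufferPointer(),
                                csBlob->GetBufferSize(), nullptr, &cs);

    // No triangles, no rasterizer: point the shader at the resource and launch
    // thread groups whose size the HLSL declares via [numthreads(64, 1, 1)].
    context->CSSetShader(cs, nullptr, 0);
    context->CSSetUnorderedAccessViews(0, 1, &uav, nullptr);
    context->Dispatch((elementCount + 63) / 64, 1, 1);

    cs->Release();
    uav->Release();
    buffer->Release();
}
```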

So the question is, how much further can we take it? Let's look at what we can do on consoles right now, and whether we could do it on PC:

  • Direct access to the command buffer - can't do this, because the low-level command buffer format is proprietary and changes from GPU to GPU. There may also be device memory access concerns, as someone else already pointed out
  • Statically creating command buffers, either at runtime or at tool time - this is kind of interesting... maybe you link against an Nvidia GTX480 library and it lets you spit out GPU-readable command buffers directly, rather than an intermediate DX format. The problems are that this puts the burden of validation and debug layers on the IHV, who has to make it work across many library/hardware variations. And I think we all know better than to rely on IHVs for that sort of thing. Plus, your app wouldn't be forward-compatible with new GPUs unless the IHVs built translation into the driver layer, at which point you have the same situation as DirectX. It could also be built into the GPU itself, but then you have an x86-esque situation where the GPU pretends to have some ISA while in reality it runs something else under the hood.
  • Compiling shaders directly to GPU-readable microcode, and potentially allowing inline microcode or microcode-only authoring - this is really the same situation as above: now the IHVs are in charge of the shader compiler, and you have the same backwards-compatibility issue. (For contrast, the current two-stage HLSL -> DX bytecode -> driver microcode path is sketched below.)
  • Direct access to device memory allocation - I don't think this really buys you much, and would make life very difficult due to the peculiarities of different hardware. I mean does anybody not get annoyed when they have to deal with tiled memory regions on the PS3, and its quirky memory alignment requirements?
Really, when I think of the big advantages of console development with regard to graphics performance, most of them come from the simple fact of knowing what hardware your code is going to run on. You can profile your shaders, make micro-optimizations, and know that they'll still work for the end user. Or you can exploit certain quirks of the hardware, like people do for a certain popular RSX synchronization method. But that stuff would almost never be worth it on PC, because the number of hardware configurations is just too vast.
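
For contrast with the "compile straight to microcode" idea above, here's a rough sketch of the two-stage path we have today: D3DCompile turns HLSL into the hardware-independent DX bytecode, and the IHV's driver translates that bytecode into real microcode behind the scenes. The entry-point name and file name here are placeholders:

```cpp
// Sketch of today's two-stage shader path: HLSL -> DX bytecode (offline or at
// runtime), then the driver translates the bytecode to the GPU's own microcode.
// Link against d3dcompiler.lib; error handling trimmed for brevity.
#include <d3dcompiler.h>
#include <cstdio>

ID3DBlob* CompileToBytecode(const char* hlslSource, size_t sourceLen)
{
    ID3DBlob* bytecode = nullptr;
    ID3DBlob* errors   = nullptr;

    HRESULT hr = D3DCompile(hlslSource, sourceLen,
                            "example.hlsl",   // name used in error messages (placeholder)
                            nullptr, nullptr, // no macros, no include handler
                            "CSMain",         // entry point (placeholder)
                            "cs_5_0",         // target profile: DX11 compute
                            0, 0,
                            &bytecode, &errors);

    if (FAILED(hr))
    {
        if (errors)
        {
            std::printf("%s\n", static_cast<const char*>(errors->GetBufferPointer()));
            errors->Release();
        }
        return nullptr;
    }

    // `bytecode` is the hardware-independent blob you hand to
    // CreateComputeShader(); the IHV's driver owns the final translation to its
    // own instruction set, which is exactly the layer that "compile straight to
    // microcode" would expose.
    if (errors) errors->Release();
    return bytecode;
}
```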

From a hardware point of view, IMO an ideal setup would be:
  • x64/ARM cores for normal processing
  • SPU/ALU array on the same die as the above for stream-style processing
  • DX11+ class GPU for graphics processing
That way you get a decent mix of the various processing requirements you have in a game; I'm kinda hoping the next consoles pick up this sort of mix of hardware tbh.


I agree entirely. We're going to need traditional CPUs for branchy stuff and scalar processing; pretty much everything we do on SIMD units today could be mapped to something like an SPU; and a discrete GPU can be optimized around graphics problems specifically, while still being general enough (as it already is) to be called in as reinforcements for some parallel problems.

I think the question with this setup is then: how many floating-point resources do the CPUs have? Do they still have SIMD units? How about an FPU? If so, does each integer core get its own SIMD/FPU, or do we share between two or more integer cores, like the newer SPARC cores and AMD's upcoming Bulldozer?

In my mind, you probably still need floating point on a per-core basis, but SIMD can probably be shared among 2-4 cores, if not eschewed entirely in favor of those SPU-like elements.

Give each CPU and SPU core its own cache, but put them on a shared cache at the highest level -- maybe let them DMA between lower-level caches -- and give them all explicit cache-control instructions. I think that would make for an awfully interesting architecture. Sans SIMD, you could probably double the number of CPU cores for a given area of die space.
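
As a rough illustration of what explicit DMA and cache control buy you, here's the classic double-buffered streaming loop that kind of architecture encourages. The dma_get/dma_wait intrinsics are hypothetical stand-ins (loosely modeled on the Cell SPU's mfc_get and tag-wait calls), not a real API:

```cpp
// Hypothetical double-buffered streaming loop for an SPU-like core with a
// private local store and explicit DMA. dma_get()/dma_wait() are made-up
// stand-ins for real DMA intrinsics, declared here only for illustration.
#include <cstddef>

constexpr std::size_t CHUNK = 4096;               // bytes per DMA transfer

extern void dma_get(void* localDst, const void* remoteSrc,
                    std::size_t bytes, int tag);  // hypothetical: start async copy-in
extern void dma_wait(int tag);                    // hypothetical: block until that tag's DMA lands
extern void process(float* data, std::size_t count);

// Assumes totalFloats is a multiple of CHUNK / sizeof(float) for brevity.
void StreamProcess(const float* remoteData, std::size_t totalFloats)
{
    alignas(128) static float buffer[2][CHUNK / sizeof(float)];  // two local-store buffers

    const std::size_t floatsPerChunk = CHUNK / sizeof(float);
    int cur = 0;

    // Prime the pipeline: start fetching chunk 0 into buffer 0.
    dma_get(buffer[cur], remoteData, CHUNK, /*tag=*/cur);

    for (std::size_t i = 0; i < totalFloats; i += floatsPerChunk)
    {
        int next = cur ^ 1;
        std::size_t nextOffset = i + floatsPerChunk;

        // Kick off the next transfer while we work on the current chunk.
        if (nextOffset < totalFloats)
            dma_get(buffer[next], remoteData + nextOffset, CHUNK, /*tag=*/next);

        dma_wait(cur);                         // make sure our chunk has arrived
        process(buffer[cur], floatsPerChunk);  // compute overlaps the in-flight DMA

        cur = next;
    }
}
```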

8-16 ARM/x64/PPC cores, 8-32 "SPU" cores, plus a DX11-class GPU with 800 or so shader elements. Unified memory. Sign me up. I don't think it's that far of a stretch to imagine something like that coming out of one of the console vendors next generation -- heck, it's not much more than an updated mash-up of the PS3/360 anyhow.

