Direct3D10

posted in Cypher's Journal
Published December 16, 2005
I've decided to pause my game for a while (which is about the same as putting the brakes on a car moving at 1 km/h...) due to finals and something better than finding the Holy Grail inside the Ark of the Covenant: Direct3D 10.

Since the docs were released Tuesday, I've been looking them up and down, asking around for clarification on certain things, and so on. Plain and simple, I think D3D10 is absolutely fantastic, and I ask for only one more bit of functionality, which I've mentioned before: source pixel access in the pixel shader, so that we can do custom blending as opposed to using the still-fixed-function OMSetBlendState. Aside from that, I think it's great: extremely flexible, slim and powerful. I know that MS will be releasing subversions every now and then after the final release, but I think that even without those, devs will still be finding new things to do with D3D10 five years from now.

Anyways, after studying the docs and samples for the last couple of days I've got a pretty good handle on D3D10, but there are a couple of unadvertised features that really surprised me: namely, multiple viewports and scissors. Right now I'm getting mixed readings on those, as Redbeard (who is a tester for the Direct3D 10 team) says that only one viewport or scissor can be bound to a single render target, but nothing in the docs suggests such a restriction. That is something I'll want to experiment with to figure out what's-what.
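For what it's worth, the mechanism that is documented is a geometry shader writing SV_ViewportArrayIndex to pick which of the bound viewports a primitive goes to. Here's a quick sketch of what I mean (the struct names and the four-viewport split are just mine):

// Sketch: a GS routing each triangle to one of the viewports bound
// with RSSetViewports, via the SV_ViewportArrayIndex system value.
struct GSIn  { float4 pos : SV_Position; };
struct GSOut { float4 pos : SV_Position; uint vp : SV_ViewportArrayIndex; };

[maxvertexcount(3)]
void ViewportGS( triangle GSIn input[3], uint primID : SV_PrimitiveID,
                 inout TriangleStream<GSOut> stream )
{
    GSOut v;
    v.vp = primID % 4;   // arbitrary choice: spread primitives over 4 viewports
    for( uint i = 0; i < 3; ++i )
    {
        v.pos = input[i].pos;
        stream.Append( v );
    }
    stream.RestartStrip();
}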

Also, one other thing I'd like to see expanded on in the docs (even though a good chunk of it is used in the shaders) is the SetRasterizerState/SetBlendState, etc. properties in FX/HLSL. One could probably guess what each possible state for the FX properties is, but it'd still be nice to have that in the docs.
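For example, here's roughly what those FX-file states look like, going by the SDK samples (the state block names and the trivial VS/PS are placeholders of mine):

// Sketch of FX-file state setting, based on the SDK sample style.
float4 VS( float4 pos : POSITION ) : SV_Position { return pos; }
float4 PS() : SV_Target { return float4( 1, 1, 1, 0.5f ); }

BlendState AdditiveBlend
{
    BlendEnable[0] = TRUE;
    SrcBlend       = ONE;
    DestBlend      = ONE;
    BlendOp        = ADD;
};

RasterizerState NoCulling
{
    CullMode = None;
};

technique10 Render
{
    pass P0
    {
        SetVertexShader( CompileShader( vs_4_0, VS() ) );
        SetGeometryShader( NULL );
        SetPixelShader( CompileShader( ps_4_0, PS() ) );
        SetBlendState( AdditiveBlend, float4( 0, 0, 0, 0 ), 0xFFFFFFFF );
        SetRasterizerState( NoCulling );
    }
}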


Since I'm going to be working on a not-as-high-end-as-my-current-computer laptop for the next 4 months while on my co-op work term, one thing I want to do in my spare time is a lot of D3D10 work using the REF rasterizer. I've got a couple of tech demos that I want to try out, and was wondering if anyone has any input on them (suggestions, changes, stuff like that).

-A demo doing really souped-up shadow mapping using D3D10 features such as a single-pass cubemap and depth buffer lookup in a pixel shader. Also, if I have time, work with some dynamic scaling of the shadow map.

-A demo showcasing Bezier surfaces, hopefully with a virtually infinite level of detail when needed. I know some of you will mention that hardware vendors only want us spitting out a max of 20 tri-er..primitives in the GS, but that'll be part of the challenge behind the demo.

-A demo showing a game where all of the logic is calculated in the shaders (i.e. only inputs and time increments are sent into the technique's logic), possibly like the Geometry Wars clone I wanted to do. The guys on #graphicsdev kind of pooh-poohed the initial idea (I just said "a game" and didn't really specify much about it), but I think I might still give it a try, since I think it'll be a fun experiment. Plus, it could be a good demo showing off the variety of buffer accesses and the unlimited shader length in D3D10. If not that, maybe I'll give GPU physics a whirl. At the very least, something that has typically been reserved for the CPU only. A rough sketch of the update idea follows below.
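To give an idea of what I mean by keeping everything on the GPU: per-entity state sits in a vertex buffer, and a pass-through vertex shader with stream output advances it each frame, ping-ponging between two buffers. This is only a sketch, and all of the names here are made up:

// Sketch: per-entity state advanced by a VS + stream output
// (no pixel shader bound; the CPU supplies only the time step).
cbuffer cbFrame
{
    float g_timeStep;   // the only per-frame input from the CPU
};

struct Entity
{
    float3 pos : POSITION;
    float3 vel : VELOCITY;
};

Entity UpdateVS( Entity input )
{
    Entity output;
    output.vel = input.vel;                          // game rules would go here
    output.pos = input.pos + input.vel * g_timeStep; // integrate position
    return output;
}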

Comments

jollyjeffers
I say you should write more journal entries on D3D10 [wink]

Quote:I ask for only one more bit of functionality that I've mentioned a bit: source pixel access in the pixel shader

You aren't the first (presumably not the last either) person to want this. However, I wonder if it's actually feasible with current IHV implementations. Even with D3D10, things like the core memory controllers and so on are probably going to be similar to D3D9 parts (reading/writing/blending is essentially the domain of the memory controller). Just wondering if, with the massively parallel and block architectures, it's actually possible to know the source pixel at the time of processing. I'm sure they could, but whether it'd be even remotely efficient is another question [oh]

Quote:Anyways, after studying the docs and samples for the last couple days I've got a pretty good handle on D3D10

You must have more spare time than me [headshake] I've been reading bits-n-pieces for ages now and I still keep finding bits I'm sure weren't there yesterday [lol]

Quote:-A demo doing really souped up shadow mapping using D3D10 features such as a single pass cubemap and depth buffer lookup in a pixel shader. Also, if I have time, work with some dynamic scaling of the shadow map.

I'd essentially call that an optimization of current technology. Nothing wrong with that mind [smile]

Quote:-A demo showcasing Bezier surfaces, hopefully with, when needed, a virtually infinite level of detail. I know some of you will mention that hardware vendors only want us spitting out a max of 20 tri-er..primitives in the GS, but that'll be part of the challenge behind the demo.

Most of what I've heard about the GS, at least in early IHV implementations, seems to indicate it'll be more of a "discovery" feature - allowing the pipeline to be more autonomous and expressive - rather than a programmable tessellator. I wonder if you could dynamically alter the LOD based on distance though... seams would be a bugger to solve, but with 1-ring adjacency it must be possible.
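For reference, the GS does see the 1-ring directly if the input primitive is declared with adjacency - a rough sketch (struct names are mine):

struct GSIn  { float4 pos : SV_Position; };
struct GSOut { float4 pos : SV_Position; };

// Sketch: with triangleadj input the GS sees the triangle's own vertices
// at indices 0, 2, 4 and the 1-ring neighbours at indices 1, 3, 5.
[maxvertexcount(3)]
void AdjacencyGS( triangleadj GSIn input[6], inout TriangleStream<GSOut> stream )
{
    GSOut v;
    for( uint i = 0; i < 6; i += 2 )   // pass through the central triangle
    {
        v.pos = input[i].pos;
        stream.Append( v );
    }
}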


I have a different suggestion [grin]

It's one I was hoping to look into, but knowing my luck I won't have the time ([sad]). With the system-generated PrimitiveID value being accessible in the GS, I wonder if it's possible to offload a lot of the material system to the GPU. At the very least it could greatly simplify the need to bucket-sort by shader params and textures...

Use the value in the GS/PS along with integer instructions to look up into an attached buffer - the buffer is just a lookup/reference indicating what textures/values/properties should then be used for final rasterization.
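Something along these lines, perhaps (the buffer/texture names are invented):

// Sketch: per-primitive material lookup in the PS via SV_PrimitiveID.
Buffer<uint>   g_MaterialIds;    // one material index per primitive
Texture2DArray g_DiffuseMaps;    // one array slice per material
SamplerState   g_LinearSampler;

struct PSIn
{
    float4 pos : SV_Position;
    float2 uv  : TEXCOORD0;
};

float4 MaterialPS( PSIn input, uint primID : SV_PrimitiveID ) : SV_Target
{
    uint mat = g_MaterialIds.Load( primID );
    return g_DiffuseMaps.Sample( g_LinearSampler, float3( input.uv, (float)mat ) );
}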

Jack
December 16, 2005 04:22 PM
Cypher19
Quote:I'd essentially call that an optimization of current technology. Nothing wrong with that mind


Well so is the Shadow Volume example in the SDK, right?

Quote:Most of what I've heard about the GS, at least early IHV implementations, seem to indicate it'll be more of a "discovery" feature - allowing the pipeline to be more autonomous and expressive, rather than as a programmable tesselator. I wonder if you could dynamically alter the LOD based on distance though... seams would be a bugger to solve, but with 1-ring adjacency it must be possible.


Unless there's some aspect of stream out that I'm missing, I see no reason why I couldn't make it a programmable tessellator. Heck, circlesoft even made a little GS that does tessellation of a mesh. The resulting image is the same, but it still works just fine.
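For instance, a simple 1-to-4 midpoint split fits comfortably in a GS - something like this sketch (the helper and struct names are mine):

struct V { float4 pos : SV_Position; };

V Mid( V a, V b )
{
    V m;
    m.pos = 0.5f * ( a.pos + b.pos );
    return m;
}

// Sketch: classic 1-to-4 midpoint subdivision of each input triangle.
[maxvertexcount(12)]
void SubdivideGS( triangle V t[3], inout TriangleStream<V> stream )
{
    V m01 = Mid( t[0], t[1] );
    V m12 = Mid( t[1], t[2] );
    V m20 = Mid( t[2], t[0] );

    // three corner triangles
    stream.Append( t[0] ); stream.Append( m01 ); stream.Append( m20 ); stream.RestartStrip();
    stream.Append( m01 ); stream.Append( t[1] ); stream.Append( m12 ); stream.RestartStrip();
    stream.Append( m20 ); stream.Append( m12 ); stream.Append( t[2] ); stream.RestartStrip();
    // centre triangle
    stream.Append( m01 ); stream.Append( m12 ); stream.Append( m20 ); stream.RestartStrip();
}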


Quote:With the system generated PrimitiveID value being accesible in the GS, I wonder if it's possible to offload a lot of the material system to the GPU. At the very least it could greatly simplify the need to bucket sort by shader params and textures...

Use the value in the GS/PS along with integer instruction to look up into an attached buffer - the buffer is just a lookup/reference indicating what textures/values/properties should then be used for final rasterization.


Well, if I do make that game (oo, just thought of a name: "Shader Wars") and put a uint ID in the vertex struct linking it to some index in a texture array, then yeah, a material system like that would be relatively easy to implement, and it would greatly augment any kind of instancing system.
December 16, 2005 04:37 PM
jollyjeffers
Quote:
Quote:I'd essentially call that an optimization of current technology. Nothing wrong with that mind
Well so is the Shadow Volume example in the SDK, right?

Yup [smile]

My general point was that, at least for the first generation of D3D10 hardware, brute force processing won't be such a great idea. It'll use a bit of brute force, but the genius will be in leveraging the extra knowledge that the GS allows.

Quote:Unless there's some aspect of stream out that I'm missing, I see no reason why I couldn't make it a programmable tesselator.

Don't get me wrong - I'm not saying it's impossible to make a tessellator, rather that it might not be practical to do so.

Given that the VS is executed before the GS, you'll have to make sure that you don't get any seams forming between triangles. A few trivial Sub-D algorithms I've seen would map nicely to the GS - but would require a whole lot of extra work to make sure that you didn't end up with awful creases and/or seams at boundaries.

Jack
December 16, 2005 06:10 PM
Monder
Quote:A demo showcasing Bezier surfaces, hopefully with, when needed, a virtually infinite level of detail.


Just remember there's a hard limit on how much geometry a geometry shader can emit, IIRC it's 1024 verts.

Quote:A demo doing really souped up shadow mapping using D3D10 features such as a single pass cubemap


One of the samples demonstrates a way to do this. It basically just makes six copies of all the tris in the geometry shader, transforms each of them using one of six camera matrices, and then sets the SV_RenderTargetArrayIndex semantic appropriately so each copy gets rendered to the right cube-map face. I wonder if you'd get a performance increase if you did some frustum culling in the GS and only rendered triangles to the cube-map faces they'll actually appear on. I would guess not, as the hardware can probably clip and cull quicker than code in a GS, but you never know.
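The core of it is roughly like this (a sketch from memory, not the sample verbatim; the names are mine):

cbuffer cbCube
{
    float4x4 g_mCubeViewProj[6];   // one view-projection matrix per cube face
};

struct GSIn  { float4 posW : POSITION; };
struct GSOut { float4 posH : SV_Position; uint face : SV_RenderTargetArrayIndex; };

// Sketch: replicate each triangle six times, once per cube-map face.
[maxvertexcount(18)]
void CubeMapGS( triangle GSIn input[3], inout TriangleStream<GSOut> stream )
{
    for( uint f = 0; f < 6; ++f )
    {
        GSOut v;
        v.face = f;
        for( uint i = 0; i < 3; ++i )
        {
            v.posH = mul( input[i].posW, g_mCubeViewProj[f] );
            stream.Append( v );
        }
        stream.RestartStrip();
    }
}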

Quote:-A demo showing a game where all of the logic is calculated in the shaders (i.e. only inputs and time increments are sent into the tech logic)


There's a sample where they do a particle system entirely on the GPU. The features of D3D10 certainly allow more GPGPU stuff. You could actually do a fully dynamic particle system on the GPU using SM3: you had textures which held the state of the system and a pixel shader which updated them, and you then did a texture read in a vertex shader to get position info and transform vertices accordingly.
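Roughly like this on the SM3 side (sketch only; the texture and constant names are made up):

// Sketch of the SM3 approach: position/velocity textures, one texel per
// particle, updated in a pixel shader each frame.
sampler2D g_PosTex;   // xyz = position
sampler2D g_VelTex;   // xyz = velocity
float     g_dt;
float4x4  g_mWorldViewProj;

float4 UpdatePositionPS( float2 uv : TEXCOORD0 ) : COLOR
{
    float3 pos = tex2D( g_PosTex, uv ).xyz;
    float3 vel = tex2D( g_VelTex, uv ).xyz;
    return float4( pos + vel * g_dt, 1.0f );
}

// The render VS then fetches the updated position via vertex texture fetch.
float4 RenderVS( float2 uv : TEXCOORD0 ) : POSITION
{
    float3 pos = tex2Dlod( g_PosTex, float4( uv, 0, 0 ) ).xyz;
    return mul( float4( pos, 1.0f ), g_mWorldViewProj );
}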
December 17, 2005 08:59 AM
Cypher19
Quote:Just remember there's a hard limit on how much geometry a geometry shader can emit, IIRC it's 1024 verts.


There's actually another limit, which I mentioned in the journal entry: ATi and NV are recommending not to spit out more than a couple dozen triangles in one shot, well under the 1k verts you mention. I can definitely work within the hardware limit, though.

Quote:One of the samples demonstrates a way to do this. It basically just makes six copies of all the tris in the geometry shader transforms each of them using one of six camera matrices and then sets the SV_RenderTargetArrayIndex semantic appropiately so it gets rendered to the right cube-map face. I wonder if you'd get a performance increase if you did some frustum culling in the GS and just render triangles to cube-map faces they'll actually appear on. I would guess not as the hardware can probably clip and cull quicker than code in a GS but you never know.


I know, I've looked at the CubemapGS sample up and down, and I have no doubt the shadow mapping will be loosely based on that. I say 'loosely' because I think I may instead take advantage of spitting out triangles to separate viewports instead of separate render targets. Plus, it'll also be able to do depth-buffer lookup.

Quote:The features of D3D10 certainly allow more GPGPU stuff.


Why do you think I want to try it out? :wink:
December 17, 2005 11:03 AM