Programmable depth buffer/rasteriser?

Started by Leo_E_49; 18 comments, last by Leo_E_49 17 years, 11 months ago
Is there any development towards making a programmable depth buffer (and presumably a programmable rasteriser) on graphics cards? I would like to be able to program my own depth functions into the depth buffer, instead of the standard painter's algorithm, in the relatively near future. Imho, this would help incredibly with calculating volumetric lighting and soft shadows. Or am I going to have to wait a while for this technology to become available? :p [Edited by - Leo_E_49 on May 7, 2006 8:47:21 PM]
You mean utilizing the oDepth D3D9 ASM register/DEPTH0 HLSL fragment semantic/whatever the heck GLSL, Cg or GLASM (?) have?

Well, you can, but it shuts off early z-cull resulting in a fairly big perf drop.
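For reference, a minimal D3D9 HLSL sketch of what I mean; sceneTex, projPos and ComputeMyDepth are made-up names, so treat it as illustrative rather than a drop-in:

// Writing anything to the DEPTH0 output is what turns off early z-cull:
// the hardware can no longer test/reject the pixel before the shader runs.
sampler2D sceneTex;

float ComputeMyDepth(float4 projPos)
{
    // Any function you like, as long as the result lands in [0,1].
    return saturate(projPos.z / projPos.w);
}

struct PSOutput
{
    float4 color : COLOR0;
    float  depth : DEPTH0;
};

PSOutput main(float2 uv : TEXCOORD0, float4 projPos : TEXCOORD1)
{
    PSOutput o;
    o.color = tex2D(sceneTex, uv);
    o.depth = ComputeMyDepth(projPos);
    return o;
}

(GLSL's rough equivalent is writing gl_FragDepth, with the same early-z cost.)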
No, I'm fairly sure this is outside of a pixel/vertex shader. It's controlled by the rasteriser in the graphics pipeline. I want to eliminate the painter's algorithm altogether :)

In other words, from what I understand:

Vertex shader -> Tessellator -> Rasteriser -> Pixel shader

I want to program the data that the rasteriser puts into the depth buffer, where it would normally apply the painter's algorithm to set up the depth buffer. Instead of writing the standard depth value, z_c = -z_e*(f+n)/(f-n) - w_e*2*f*n/(f-n), I want the ability to write whatever values into the buffer I choose (using a shader or whatever ends up being made for it). It would also be nice to be able to write arbitrary values into graphics buffers in general: instead of writing normal colours, vertices could carry fluid dynamics or other mathematical data to be written by the rasteriser into a screen buffer for parallel processing on the GPU.
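For concreteness, here is a rough vertex shader sketch of the fixed path I'm describing; matWorldViewProj is just an illustrative name, and the projection convention is the OpenGL-style one behind the formula above:

float4x4 matWorldViewProj;

float4 main(float3 posObject : POSITION) : POSITION
{
    // The vertex shader only outputs the clip-space position...
    float4 clipPos = mul(float4(posObject, 1.0), matWorldViewProj);
    // ...where, with an OpenGL-style projection matrix,
    //   clipPos.z = -z_e*(f+n)/(f-n) - w_e*2*f*n/(f-n)
    // The rasteriser then interpolates clipPos.z / clipPos.w per pixel as the
    // depth value, and that step is the part I have no control over.
    return clipPos;
}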

To clarify even further, the following is normally the case:

Vertex -> transformed to screen space by a vertex shader or the ffp -> rendered to a buffer with depths calculated using the standard formula written above and colours interpolated linearly (perspective-correct) across the triangles, all of this done by the rasteriser -> pixel shader applied to the buffer created by the rasteriser and passed on to the frame buffer

I want to be able to:

Vertex -> transformed to screen space -> rendered to a buffer with depths that I define using my own depth function, and with arbitrary data interpolated the way I program it to be (not necessarily perspective-correct linear interpolation) -> pixel shader applied to the buffer created by the rasteriser and passed on to the frame buffer

I know that the tessellator will be programmable in the near future under the unified shading model, but what about the rasteriser? Or is that considered part of the tessellator? (I didn't get that impression from what I read about it.)

This would allow me to program deep shadow mapping on a per-pixel basis. The depth function could be calculated in graphics hardware instead of being ray-traced per-vertex on the CPU as described in this document:
http://graphics.stanford.edu/papers/deepshadows/

I hope someone can tell me whether this functionality is planned in the near future, or even whether it is possible to do. :) I'm sure it is, because John Carmack (theoretically) had the ability to do this when he was programming Doom and Quake in the pre-hardware-acceleration days, when he had to write his own rasteriser. (Abrash 1997, LaMothe 2003)

P.S. Having the ability to program the depth buffer functions could mean that volumetric rendering would be trivial, as would per-pixel-correct alpha blending without having to sort objects at all. This kind of control over the graphics hardware would be great for graphics which could normally only be programmed using voxels.

P.P.S. I heard someone talking about DirectX 10 as if it were going to be the last major upgrade in graphics technology, after which the only changes in GPUs would be performance increases. That will not be the case until all of the GPU's abilities are available as part of a programmable pipeline, and I do not think that will happen for a long time to come.

BTW: If I'm the first person to come up with this idea (seriously doubt it :p) and someone ends up making this into actual hardware for some strange reason, good on you! XD

In fact, I'm going to push this idea right to its limits. We're truly in the realm of science fiction here:

If it were possible to define the formula used to interpolate values between vertices, using splines for example, then why not allow for a programmable triangle silhouette? Judging from the rasteriser code I've encountered (LaMothe, 2003), this is very much within the realm of possibility. In that case we would no longer be dealing with triangles; we could potentially be dealing with per-pixel-accurate three-sided NURBS surfaces in their place. We would no longer need displacement or normal mapping (although they could be used to add even more detail): real-time NURBS surfaces programmed on the rasteriser would have per-pixel-accurate silhouettes at a fraction of the cost of displacement mapping for the same level of detail. This would spell the end of the polygon; instead, all surfaces in computer graphics would be polyNURBS surfaces. If such technology were available in real time, I would say that real-time graphics would have a good shot at approaching photo-realism.

Of course, retraining for this kind of graphics programming would hearken back to the days of DOOM and QUAKE programming, and also have close ties to programming for non-real-time graphics. Tools would have to be created to produce vertex data whose silhouettes tessellate well. But, in future, perhaps this will be possible. :p

References:
LaMothe, A. (2003). Tricks of the 3D Game Programming Gurus: Advanced 3D Graphics and Rasterization. Sams Publishing.
Abrash, M. (1997). Michael Abrash's Graphics Programming Black Book (Special Edition). Coriolis Group Books.

[Edited by - Leo_E_49 on May 7, 2006 10:32:40 PM]
The rasterizer doesn't put the depth data into the depth buffer; it merely passes it to the pixel processing pipeline (pixel shader). So, Cypher19's approach is what you want.

Also, generating things like splines on the GPU is exactly what the geometry shaders in D3D10 can be used for. You'd pass in a single vertex or pair of vertices (depending on how much data you have per-primitive) and then emit a list of lines (or a line-strip) that are set up with the positions/data of the spline as it goes along.
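As a rough D3D10-style sketch of the idea (the Catmull-Rom spline, the struct layout and the segment count are just illustrative, not from any particular sample):

struct SplineVertex
{
    float4 pos : SV_Position;
};

float4 CatmullRom(float4 p0, float4 p1, float4 p2, float4 p3, float t)
{
    // Standard Catmull-Rom basis; interpolates between p1 and p2.
    float t2 = t * t;
    float t3 = t2 * t;
    return 0.5 * (2.0 * p1 +
                  (-p0 + p2) * t +
                  (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t2 +
                  (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t3);
}

[maxvertexcount(17)]
void GS(lineadj SplineVertex input[4], inout LineStream<SplineVertex> output)
{
    // Expand one line-with-adjacency primitive into a 16-segment line strip
    // that follows the spline between the two interior control points.
    for (int i = 0; i <= 16; ++i)
    {
        SplineVertex v;
        float t = i / 16.0;
        v.pos = CatmullRom(input[0].pos, input[1].pos, input[2].pos, input[3].pos, t);
        output.Append(v);
    }
    output.RestartStrip();
}

The line-with-adjacency input gives the shader the neighbouring control points it needs on either side; the strip it emits then goes to the rasterizer as ordinary line segments.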

(I wouldn't recommend using a 9-year-old book and a book on software rendering as resources on the operation of modern hardware, btw.)

Richard "Superpig" Fine - saving pigs from untimely fates - Microsoft DirectX MVP 2006/2007/2008/2009
"Shaders are not meant to do everything. Of course you can try to use it for everything, but it's like playing football using cabbage." - MickeyMouse

That was just wild speculation; it's good to hear that this sort of thing will be available with geometry shading, however. What puts the depth values in the depth buffer instead of the rasteriser these days then? Also, what does depth occlusion these days instead of it?

I'm still pretty sure that I can't build depth functions using oDepth on the GPU. From what I understand, the depth data passed into the pixel shader already has the painter's algorithm applied to it, so all the depth data of hidden surfaces (depth occluded) is lost. If this were not the case, I'm sure the article I saw on deep shadow mapping would have calculated the depth function in a pixel shader. Without that information, deep shadow mapping must indeed be ray traced per polygon. Also, if this information were in fact available, true penumbral soft shadowing methods would be trivial too, and could be done in the rasteriser, in the pipeline stage between the geometry and pixel shaders. Furthermore, such algorithms would be pixel perfect, without aliasing artifacts, because they would not depend on scaling the depth buffer and masking it on top of distant objects. I see this as a problem relating to the ffp, which could be solved if per-pixel depth calculations did not use a depth-buffer-replace method (the painter's algorithm). In other words, deep shadow mapping requires you to see depth-occluded data, which would normally be culled out by having only a single depth value per pixel.

Furthermore, I'm pretty sure that 9 year old book still describes the fundamentals of rasterisation as used today. I've not read any documentation contradicting it. I hope that the graphics card manufacturers haven't changed the basis of their graphics rasterisation without telling anyone. Any rasteriser has to convert screen space triangles to pixels on screen, I presumed that doing so included the process of division by w and subsequent application of the painter's algorithm to receive the final depth value at each pixel before painting it onto the buffer which is passed, pixel by pixel, to the pixel shader. I'm well aware of how modern graphics hardware works barring the rasterisation phase. I hope someone will publish a more detailed article about the rasterisation phase on modern graphics hardware if there have been significant advances in algorithms used.

There's not much documentation on this subject out there (modern rasterisers are a bit of a mystery, with pixel and vertex shaders, and now geometry shaders, stealing the limelight), so forgive my ignorance if rasterisation is now done somehow differently. But if it's still a simple plotting of pixels in rows forming the outline of a triangle onto a buffer, then it can be made programmable, and the benefits of doing so are enormous.

Edit: According to wikipedia, the sweep-line algorithm described in the 1997 and 2003 books is still used on modern graphics hardware and the rasteriser does allocate values to the depth buffer.

http://en.wikipedia.org/wiki/Rasterisation

If this source is wrong, I will request that it be modified to describe modern rendering standards.

If it isn't, would anyone care to tell me whether it is going to be made programmable?

Edit: Further confirmation that rasterisation still works the same way it did in 1997.
http://herakles.zcu.cz/local/manuals/OpenGl/glspec1.1/node41.html
http://www.cs.fit.edu/~wds/classes/graphics/Rasterize/rasterize/rasterize.html#SECTION00070000000000000000
http://online.cs.nps.navy.mil/DistanceEducation/online.siggraph.org/2002/Panels/01_WhenWillRayTracingReplaceRasterization/cdrom.pdf


Also described here:
http://www.opengl.org/documentation/specs/version1.1/glspec1.1/node54.html#SECTION00651000000000000000

I quote: "A scan-line rasterizer that linearly interpolates data along each edge and then linearly interpolates data across each horizontal span from edge to edge". This functionality could be made programmable. I do not want to have to use the existing linear interpolation of scanlines; I would like the ability to interpolate these by spline.

Now I'm sure we're still using the same algorithm, if it's on the OpenGL website. Why are we still using an outdated, inflexible rasterisation algorithm from 1997 in our graphics hardware? Why layer programmability on top of such a clearly inflexible basis of rendering?
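The closest I can get today is to let the rasteriser interpolate a parameter linearly across the span and then reshape it in the pixel shader, which is a very limited stand-in for what I'm asking for. A rough HLSL sketch, with made-up names:

float4 colourA;
float4 colourB;

float4 main(float t : TEXCOORD0) : COLOR0
{
    // 't' arrives linearly interpolated by the rasteriser; the cubic remap
    // below is applied per pixel, after the fact.
    float s = smoothstep(0.0, 1.0, t);
    return lerp(colourA, colourB, s);
}

That works for colours or texture coordinates, but crucially not for the depth value the z-buffer uses, nor for the triangle's silhouette, which is exactly the part I want opened up.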


[Edited by - Leo_E_49 on May 8, 2006 4:48:42 AM]
The rasterizer will not be programmable with D3D10, and probably not for the foreseeable future.

I've heard talk about programmable interpolators, though, which, if they could be applied to the depth value, would achieve the same effect. By applied to the depth value, I mean applied to the same value used by the hardware depth buffer and early z rejection, not just to a user-defined parameter. I don't have any sources for this available, so I'm sorry I can't back up the information; I might be wrong, if my memory doesn't serve me right, but it does seem like a natural step.

Personally, I'd like to move towards a general-purpose, fully programmable, vector processing monster in there, instead of this special-purpose hardware. It might set us back a couple of years in terms of graphical fidelity, but the possibilities from there are so much less limited. It could also be used for other types of processing, such as physics, and would keep all this data in the same place. Naturally you'd want it to be virtualized, so you can have a larger dataset than what fits on the actual hardware. It seems this is the direction graphics hardware is heading, just in a roundabout way [smile]
It's a pity that no development is being done towards making a flexible (non-linear-interpolation) rasteriser. I would have liked to program real-time fog and variable-density objects with deep shadows in real time :( Vector graphics, the bane of spline surfaces. :p

To represent a programmable rasteriser ASCII-graphically:

Vector rasterisation (fixed-function rasteriser, produces triangles):

[ASCII sketch: a triangle edge traced with straight line segments]

Here, interpolation between vertices is done using straight lines.

Non-vector rasterisation (programmable rasteriser, produces NURBS surfaces):

[ASCII sketch: the same edge traced as a spline curve]

Here, interpolation between vertices is done using splines.

Graphics cards of today are unfortunately running on an outdated paradigm. :(

[Edited by - Leo_E_49 on May 8, 2006 9:14:02 AM]
Quote:Original post by Leo_E_49
What puts the depth values in the depth buffer instead of the rasteriser these days then?


All that the rasterizer does, all that it ever did, was convert polygons to fragments/pixels. You may be used to a rendering architecture where the depth-test-and-write happened immediately after the rasterizer, but the two operations are by no means linked.

Quote:Also, what does depth occlusion these days instead of it?
The "z-buffer hardware," comprising units that perform the z-test, and write new z-values to the buffer, updating things like hierarchical z structures in the process.

Quote:From what I understand, the depth data passed into the pixel shader already has the painter's algorithm applied to it, so all the depth data of hidden surfaces (depth occluded) is lost.
Not if you're writing oDepth in the pixel shader. Sure, if you're not then modern cards will perform the depth-test beforehand (and also the write, though the write could happen in parallel) because it saves you from having to run shaders on occluded pixels (handling the Z values at this point in time is known as 'early Z'). However, if you are modifying the depth in the pixel shader, then the hardware cannot perform the test-and-write until after the shader is done because it doesn't know what the final depth value is. This is why modifying the depth values in the pixel shader causes you to lose early Z for the rest of the frame; the hardware's not equipped to feed your newly calculated depth value back into the hierarchical processor and must go direct to the buffer, so the hierarchy gets out of sync with what's actually in the buffer, making it useless.

Quote:If this were not the case, I'm sure the article I saw on deep shadow mapping would have calculated the depth function in a pixel shader.
If you'd care to point us to the article in question, I'm sure we can try and figure out for you why they do it the way they do.

Quote:Furthermore, I'm pretty sure that 9 year old book still describes the fundamentals of rasterisation as used today.
Fundamentals, sure. Details and deep performance characteristics, probably not.

Quote:I've not read any documentation contradicting it.


Take a look through this lot. In particular, this chapter from GPU Gems 2 describes the architecture of the GeForce 6, including full coverage of where Z operations are performed in the pipeline.

Quote:I hope that the graphics card manufacturers haven't changed the basis of their graphics rasterisation without telling anyone.
Don't be ridiculous.

Quote:Any rasteriser has to convert screen space triangles to pixels on screen, I presumed that doing so included the process of division by w and subsequent application of the painter's algorithm to receive the final depth value at each pixel before painting it onto the buffer which is passed, pixel by pixel, to the pixel shader.
Your presumption is ill-advised. The rasteriser calculates the depth value that will later be used by the z-test hardware, but it does not perform the test itself. (As noted, early Z means that the test might happen immediately afterwards, but only in some situations).

Quote:Edit: According to wikipedia, the sweep-line algorithm described in the 1997 and 2003 books is still used on modern graphics hardware and the rasteriser does allocate values to the depth buffer.

http://en.wikipedia.org/wiki/Rasterisation

If this source is wrong, I will request that it be modified to describe modern rendering standards.
It's misleading at best. The rasteriser's responsibility is purely to break triangles into fragments, with appropriately interpolated values as per-fragment data. Nothing else.

Richard "Superpig" Fine - saving pigs from untimely fates - Microsoft DirectX MVP 2006/2007/2008/2009
"Shaders are not meant to do everything. Of course you can try to use it for everything, but it's like playing football using cabbage." - MickeyMouse

Quote:Original post by superpig
All that the rasterizer does, all that it ever did, was convert polygons to fragments/pixels. You may be used to a rendering architecture where the depth-test-and-write happened immediately after the rasterizer, but the two operations are by no means linked.


Thanks for that, I wasn't aware they were separate operations. :) However, it would be nice if this were programmable too.

Quote:The "z-buffer hardware," comprising units that perform the z-test, and write new z-values to the buffer, updating things like hierarchical z structures in the process.


This is what I'd really like access to. The z-cull part of the graphics pipeline.

Quote:Not if you're writing oDepth in the pixel shader. Sure, if you're not then modern cards will perform the depth-test beforehand (and also the write, though the write could happen in parallel) because it saves you from having to run shaders on occluded pixels (handling the Z values at this point in time is known as 'early Z'). However, if you are modifying the depth in the pixel shader, then the hardware cannot perform the test-and-write until after the shader is done because it doesn't know what the final depth value is. This is why modifying the depth values in the pixel shader causes you to lose early Z for the rest of the frame; the hardware's not equipped to feed your newly calculated depth value back into the hierarchical processor and must go direct to the buffer, so the hierarchy gets out of sync with what's actually in the buffer, making it useless.


Let me make sure that I've got the concept behind this correct (I usually use GLSL so I'm not that familiar with oDepth): I write a modified value to oDepth for each pixel, and then when the shader is complete, the hardware applies the painter's algorithm to those values? I'm afraid that will not suffice for what I'm talking about. The depth functions I mean depend on the layering of depth values from more than one fragment. It would almost certainly have to be done in the z-cull process: instead of applying the painter's algorithm, the hardware might, for example, average all the depths it encounters at a single fragment.

Quote:If you'd care to point us to the article in question, I'm sure we can try and figure out for you why they do it the way they do.


http://graphics.stanford.edu/papers/deepshadows/deepshad.pdf

Quote:Fundamentals, sure. Details and deep performance characteristics, probably not.


What book can I read on the topic that will be relevant to modern hardware architectures? I checked in Real-Time Rendering; it's got the same algorithm listed on page 18 too. Has there been a replacement for the painter's algorithm and scan-line rasterisation?

Quote:Take a look through this lot. In particular, this chapter from GPU Gems 2 describes the architecture of the GeForce 6, including full coverage of where Z operations are performed in the pipeline.


Thanks, that explains a great deal. Things haven't changed that much from what I can see here, though (relative to what I was talking about, that is).

Quote:Don't be ridiculous.


I was being a bit ridiculous, wasn't I? :p

Quote:Your presumption is ill-advised. The rasteriser calculates the depth value that will later be used by the z-test hardware, but it does not perform the test itself. (As noted, early Z means that the test might happen immediately afterwards, but only in some situations).


So the rasteriser generates buffers containing the z values of each polygon independently, stores them all, and then later the z-cull is applied to those fragments after a fragment shader has been applied to them?

In any case, this is not the issue in question. The ability to program the value which ends up stored in the depth buffer is the issue; I'm a bit tired of using the painter's algorithm when other algorithms would be so much more flexible.

Quote:It's misleading at best. The rasteriser's responsibility is purely to break triangles into fragments, with appropriately interpolated values as per-fragment data. Nothing else.


Thanks for clarifying.

Perhaps I should break my proposition up into two sections:

1. Programmable z-cull unit (or whatever it's called).

Are there plans to make the z-cull unit programmable, so that I could apply whatever depth function I need to post-pixel-shader data?

2. Programmable rasteriser.

By the same token, are there plans to make the rasteriser programmable, so that linear interpolation is not the only interpolation method available to graphics programmers? Spline interpolation of vertex data is very appealing, as is the prospect of spline "triangle" outlines, which would let polyNURBS surfaces be used in real time. (Actually, is this even possible?)

[Edited by - Leo_E_49 on May 8, 2006 11:36:47 AM]
There is an article in ShaderX4 on how to use deep shadow maps on modern graphics hardware. It is called "Real-Time Soft Shadows Using the PDSM Technique"; PDSM stands for Penumbra Deep Shadow Maps, which is very close to deep shadow maps :-)
