Programmable depth buffer/rasteriser?

Thanks, I'll see if I can get a copy of ShaderX4 to read about it.
Quote:Original post by Leo_E_49
Let me make sure that I've got the concept behind this correct (I usually use GLSL so I'm not that familiar with oDepth). I write a modified value to oDepth for each pixel and then when the shader is complete, it does the painter's algorithm on these values?
Yes, that's correct; it passes the value that you wrote to oDepth to the z-cull hardware. (I'd probably recommend moving away from using 'painter's algorithm' to talk about that operation, because you can of course change the comparison function it uses, allowing you to only write fragments where the depth value is greater than the existing value, or equal to it, etc.)

Quote:I'm afraid that will not suffice for what I'm talking about. The depth functions used depend on the layering of depth values of more than one fragment. It would almost certainly have to be done in the z-cull process; instead of applying the painter's algorithm, for example, the hardware might average all the depths it encounters at a single fragment.
Problem is, how/where do you intend to store these extra depth values? The z-buffer only has room for one value per fragment. The only technique I know that stores multiple depth values per fragment is called Depth Peeling and works by using a separate z-buffer for each encountered depth value.

Quote:http://graphics.stanford.edu/papers/deepshadows/deepshad.pdf

Ahhh, ok.

Firstly, be aware that the technique described in the paper is not realtime; they cite render times in the order of seconds, not milliseconds.

Secondly, where does it say that they didn't compute the depth functions in a pixel shader? Personally, I suspect that they didn't, because these guys are at Pixar and thus were probably using Renderman for much of their work; Renderman is quite different to the graphics hardware inside your average PC. So chances are they didn't use pixel shaders simply because the concept does not exist in Renderman. (It uses microfragments instead, IIRC).

Quote:What book can I read on the topic which will be relevant to modern hardware architectures? I checked in Real-Time Rendering, it's got the same algorithm listed on page 18 too. Has there been a replacement for the painter's algorithm and scan-line rasterisation?
The basic approach is still the same - hence my comment that those books are probably fine for giving you the fundamentals - but papers like that GeForce 6 one are what you should be reading if you want to understand exactly how the approach is being implemented and the resulting idiosyncrasies. ATI have similar papers on their developer site, I believe.

Quote:
Quote:Don't be ridiculous.
I was being a bit, wasn't I? :p
Yes, a little :P

Quote:So, the rasteriser generates buffers containing the z values of each polygon independently, storing them all and then later, the z-cull is applied to these fragments after a fragment shader is applied to them?
The rasterizer generates fragments. Each fragment contains its position, z-value, texture coordinates, diffuse colour, etc, calculated by interpolation. If the current fragment shader does not modify the depth of the fragments (i.e. does not write to oDepth), then the hierarchical Z system will process the fragments, culling those that it can quickly see are occluded, and leaving those that are not. If the current fragment shader does modify the depth of the fragments, then the hierarchical Z system is skipped over (it's actually disabled until the Z buffer is next cleared).

After possibly being early-Z-culled, the fragments are passed to the pixel pipeline, which - on modern cards - means that they go into a queue to be picked up by one of a number of pixel shader 'threads'. After the pixel shader has finished processing the fragment - which at this point is just a position, a depth, and a colour per render target - it is passed to hardware which does the proper, fully-precise depth test (and possibly other things like the scissor test), and then, if it passes that, on to the alpha blending and depth writing hardware.
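As a rough illustration of the "modifies the depth" case, here's a minimal HLSL sketch of a pixel shader that writes to the DEPTH output semantic (which is what we've been calling oDepth). The struct names and the viewZ interpolant are invented for the example, not part of any particular engine:

// Minimal sketch (HLSL, SM 2.0+): a pixel shader that overrides the
// fragment's interpolated depth. Writing to the DEPTH output is what
// disables the hierarchical Z optimisation described above.
struct PS_INPUT
{
    float4 colour : COLOR0;
    float  viewZ  : TEXCOORD0;   // hypothetical per-vertex value interpolated by the rasterizer
};

struct PS_OUTPUT
{
    float4 colour : COLOR0;
    float  depth  : DEPTH;       // replaces the interpolated z before the fully-precise depth test
};

PS_OUTPUT DepthWritePS(PS_INPUT input)
{
    PS_OUTPUT output;
    output.colour = input.colour;

    // Any function of per-fragment data can go here; the fixed-function
    // depth test then compares this value against the buffer as usual.
    output.depth = saturate(input.viewZ * 0.5f);
    return output;
}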

Quote:
In any case, this is not the issue in question. The ability to program the value which ends up stored in the depth buffer is the issue, I'm a bit tired of using the painter's algorithm, when other algorithms would be so much more flexible.
So you want to use separate values for the depth test and the depth write?

Quote:Perhaps I should break my proposition up into two sections:
I think that's sensible...

Quote:
1. Programmable z-cull unit (or whatever it's called).

Are there plans to make the z-cull unit programmable? So that I could write any depth function on post-pixel-shader data that I need?
Not to my knowledge. The fact that the depth test can only be one of a small number of functions (NEVER, LESS, LEQ, EQ, NEQ, GEQ, GREATER, ALWAYS) allows the chip designers to make some significant optimizations (like the addition of early-Z). Making it programmable would be a major speed hit.

Quote:
2. Programmable rasteriser.

By the same token, are there plans to make the rasteriser programmable? So that linear interpolation is not the only interpolation method available to graphics programmers? Spline interpolation of vertex data is very appealing, as is the prospect of spline "triangle" outlines, so that polyNURBS surfaces could be used in real time. (Actually, is this even possible?)


Again, I know of no plans to make the rasterizer programmable; the speed hit would be immense.

However, we already have the ability to do Bezier patches in realtime, and have done for several generations of graphics hardware - it's simply handled by tessellating the surface into triangles pre-rasterizer. If you tessellate it into small enough pieces (like, say, microfragments), then you'll get the exact same results as if you used a 'programmable tessellator.' This is the sort of thing you could use D3D10's geometry shader for; I believe there may even already be a demo around of a renderer that passes control point geometry to the GS, which synthesises the actual patch geometry from it.
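For a flavour of what that pre-rasterizer approach can look like on current hardware, here's a rough HLSL sketch (every name in it is invented for the example): render a flat grid of (u,v) parameters and let the vertex shader place each grid vertex on a bicubic Bezier patch whose sixteen control points are supplied as shader constants. The denser the grid, the closer you get to the microfragment-sized pieces mentioned above.

float4x4 gWorldViewProj;       // assumed combined transform, set by the application
float3   gControlPoints[16];   // assumed 4x4 control net, stored row-major

// Cubic Bernstein basis weights for a parameter t in [0,1].
float4 Bernstein(float t)
{
    float it = 1.0f - t;
    return float4(it * it * it,
                  3.0f * t * it * it,
                  3.0f * t * t * it,
                  t * t * t);
}

// Each vertex of the flat grid carries only its (u,v) parameter; the shader
// evaluates P(u,v) as the double sum of Bu[i] * Bv[j] * C[j*4 + i] over the net.
float4 BezierPatchVS(float2 uv : TEXCOORD0) : POSITION
{
    float4 bu = Bernstein(uv.x);
    float4 bv = Bernstein(uv.y);

    float3 p = 0;
    for (int j = 0; j < 4; ++j)
        for (int i = 0; i < 4; ++i)
            p += bu[i] * bv[j] * gControlPoints[j * 4 + i];

    return mul(float4(p, 1.0f), gWorldViewProj);
}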

Richard "Superpig" Fine - saving pigs from untimely fates - Microsoft DirectX MVP 2006/2007/2008/2009
"Shaders are not meant to do everything. Of course you can try to use it for everything, but it's like playing football using cabbage." - MickeyMouse

Quote:
Quote: 1. Programmable z-cull unit (or whatever it's called).

Are there plans to make the z-cull unit programmable? So that I could write any depth function on post-pixel-shader data that I need?


Not to my knowledge. The fact that the depth test can only be one of a small number of functions (NEVER, LESS, LEQ, EQ, NEQ, GEQ, GREATER, ALWAYS) allows the chip designers to make some significant optimizations (like the addition of early-Z). Making it programmable would be a major speed hit.


In addition to the speed hit, WHY would you need the z-cull unit to be programmable? Culling is a boolean operation and hence you can only have boolean ops like never, less, etc. as pig outlined above. The inputs can be changed though, and that's where the oDepth register can be utilized to meet your ends.

If you're feeling REALLY ballsy though, you can utilize the clip() (HLSL) / texkill (D3DASM) instructions, in conjunction with depth buffer lookups in the pixel shader, to perform the operations that you want to do, since that'd let you cull pixels based on whatever criteria you want, including z-vals.
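To make that concrete, here's a rough sketch of such a shader, assuming an earlier pass has written scene depth into a texture and that the vertex shader supplies a screen-space UV plus the fragment's own depth (the sampler and interpolant names are invented for the example):

sampler gSceneDepth : register(s0);   // assumed: scene depth written out by a previous pass

float4 CustomDepthCullPS(float2 screenUV  : TEXCOORD0,
                         float  fragDepth : TEXCOORD1) : COLOR0
{
    float storedDepth = tex2D(gSceneDepth, screenUV).r;

    // clip() discards the fragment when its argument is negative, so this
    // kills anything that lies behind the stored value; swap the operands
    // (or use any other expression) to implement whatever rule you want.
    clip(storedDepth - fragDepth);

    return float4(1.0f, 1.0f, 1.0f, 1.0f);
}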
Quote:Original post by superpig
(I'd probably recommend moving away from using 'painter's algorithm' to talk about that operation


What are these called collectively these days then?

Quote:Original post by superpig
Problem is, how/where do you intend to store these extra depth values? The z-buffer only has room for one value per fragment. The only technique I know that stores multiple depth values per fragment is called Depth Peeling and works by using a separate z-buffer for each encountered depth value.


Quote:Original post by superpig
So you want to use separate values for the depth test and the depth write?


I don't want to store multiple values in the depth buffer at all. I want to store a density value in the depth buffer instead. If I just wanted to add more depth values, I'd have asked whether there would be more depth buffers being added to the graphics pipeline.

This is similar to the depth functions which are used with each fragment in the paper which I linked to above. Basically, if a depth is written into a position on the depth buffer, instead of overwriting the existing value, it could be combined with it. For example, taking the average of the existing value and the new value and storing that in the buffer instead (you could think of it as depth-blending perhaps?). This could mean that the depth buffer could be used to easily render volumetric data. Of course, you'd cap the depth value at 1.0 (or whatever the current maximum value in the buffer is), in order to render solid (opaque) objects. (Presumably, you would only want to add to the "density buffer" as it were, but perhaps subtraction would provide an interesting effect.)

Perhaps instead of making this completely programmable, there could be a number of blend flags (add, subtract, modulate, etc), similar to alpha blending back in the "old days".
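To illustrate what I mean with the hardware we have today: the nearest equivalent I can see is to write a per-fragment density into a colour render target and let the existing blend unit do the add/subtract (D3DBLENDOP_ADD or D3DBLENDOP_SUBTRACT with ONE/ONE factors). A rough HLSL sketch, where the thickness input is purely illustrative:

// Each fragment outputs its contribution; with additive ONE/ONE blending the
// blend unit sums the contributions of every fragment landing on a pixel,
// which approximates the "density buffer" idea in a colour target rather
// than in the depth buffer itself.
float4 DensityPS(float viewSpaceThickness : TEXCOORD0) : COLOR0
{
    float density = saturate(viewSpaceThickness);
    return float4(density, density, density, density);
}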

Quote:Original post by superpig
Again, I know of no plans to make the rasterizer programmable; the speed hit would be immense.


Well, I would argue that it may eventually get to the point where it isn't, perhaps when the distinction between triangles and pixels begins to decrease. All the new data you'd need per vertex is a pair of control vectors to handle the splines for the edges of the 2D NURBS patch. The interpolation would use a spline tracing algorithm to create a modified form of scan-line rasterisation, instead of a Bresenham-derived scan-line rasterisation. I'm sure there are plenty of people who can do the maths; I might even give it a shot myself just out of interest. Whether this will be more or less efficient than rendering scenes polygonally in the future is the question. I'll warrant that just one of these NURBS patches would save a couple of hundred triangles on the path to photo-realism.

Quote:Original post by Cypher19
In addition to the speed hit, WHY would you need the z-cull unit to be programmable? Culling is a boolean operation and hence you can only have boolean ops like never, less, etc. as pig outlined above. The inputs can be changed though, and that's where the oDepth register can be utilized to meet your ends.

If you're feeling REALLY ballsy though, you can utilize the clip() (HLSL) / texkill (D3DASM) instructions, in conjunction with depth buffer lookups in the pixel shader, to perform the operations that you want to do, since that'd let you cull pixels based on whatever criteria you want, including z-vals.


This is purely a volumetric rendering issue (i.e. not using the depth buffer for culling) and I was just being hopefully optimistic that this technology may someday be available. Unfortunately, it doesn't look like this will happen in the foreseeable future. :(

Being able to store density values in the depth buffer would probably reduce the processing required to apply the above-described algorithm. Although I will give your method a try, thanks for the suggestion. :)

Quote:Original post by Leo_E_49
This is similar to the depth functions which are used with each fragment in the paper which I linked to above. Basically, if a depth is written into a position on the depth buffer, instead of overwriting the existing value, it could be combined with it. For example, taking the average of the existing value and the new value and storing that in the buffer instead (you could think of it as depth-blending perhaps?). This could mean that the depth buffer could be used to easily render volumetric data. Of course, you'd cap the depth value at 1.0 (or whatever the current maximum value in the buffer is), in order to render solid objects.
So, you want to test against the current value, and if the test passes, blend the new value with the current value. Are you sure that the alpha channel can't be made to do this?

Quote:
Well, I would argue that it may eventually get to the point where it isn't, perhaps when the distinction between triangles and pixels begins to decrease. All the new data you'd need per vertex is a pair of control vectors to handle the splines for the edges of the 2D NURBS patch. The interpolation would use a spline tracing algorithm to create a modified form of scan-line rasterisation, instead of a Bresenham-derived scan-line rasterisation. I'm sure there are plenty of people who can do the maths; I might even give it a shot myself just out of interest. Whether this will be more or less efficient than rendering scenes polygonally in the future is the question. I'll warrant that just one of these NURBS patches would save a couple of hundred triangles on the path to photo-realism.
The chap who gave the 'Future of Direct3D' talk at the Microsoft Developer Day this past GDC agrees with you; he was talking about moving away from triangles as rendering primitives, and using patches instead. Personally, I'm not so sure. Photorealism doesn't really have anything to do with our ability (or inability) to render curved surfaces; it's much more about detail, about high-frequency geometry.

Richard "Superpig" Fine - saving pigs from untimely fates - Microsoft DirectX MVP 2006/2007/2008/2009
"Shaders are not meant to do everything. Of course you can try to use it for everything, but it's like playing football using cabbage." - MickeyMouse

Quote:Original post by superpig
So, you want to test against the current value, and if the test passes, blend the new value with the current value. Are you sure that the alpha channel can't be made to do this?


Well, if I could test the alpha channels of previously processed fragments, sure it would be possible. However, I can't do a screen render for each triangle on the screen, that would be far too slow and would defeat the aim of my dissertation (real-time graphics). I'm pretty sure I can't access the current frame-buffer in a pixel shader either, or I'd have heard about it. Is there any other buffer I have constant access to between rendering individual fragments other than the depth buffer?

Next best thing is to lock a screen texture and per-pixel process that myself, which would also not be real-time, I've tried that one before.

Quote:Original post by superpig
The chap who gave the 'Future of Direct3D' talk at the Microsoft Developer Day this past GDC agrees with you; he was talking about moving away from triangles as rendering primitives, and using patches instead. Personally, I'm not so sure. Photorealism doesn't really have anything to do with our ability (or inability) to render curved surfaces; it's much more about detail, about high-frequency geometry.


There's only so far you can subdivide a triangle before you reach a pixel (lim -> infinity), so why not start with pixel-perfect curved surfaces in the first place? This would reduce the need for displacement mapping to a certain extent as silhouettes would already be non-polygonal. Also, normal mapping could be applied to NURBS surfaces just the same way they are applied to polygonal surfaces. I'd argue that it would also be easier to render organic objects with this kind of hardware, and it's these kinds of organic objects that really kill a nice looking scene if they look blocky around the edges.

It's good to hear that someone (in a position of slightly higher regard than myself) thinks the same way about the issue. :)
Quote:Original post by Leo_E_49
Well, if I could test the alpha channels of previously processed fragments, sure it would be possible. However, I can't do a screen render for each triangle on the screen, that would be far too slow and would defeat the aim of my dissertation (real-time graphics). I'm pretty sure I can't access the current frame-buffer in a pixel shader either, or I'd have heard about it. Is there any other buffer I have constant access to between rendering individual fragments other than the depth buffer?
Indeed, you can't access a buffer while you're writing to it; most people who need to do that kind of thing exploit the alpha blend readback to do what they want (i.e. convert their desired expression into destValue * factor + otherValue).

What you could try doing is to use the technique I referenced earlier, depth peeling. Your first stage would do a number of depth-only passes, each one progressively building depths 'into the scene.' The first pass is standard depth-only; the second one is similar, but you texkill any pixels with a depth value closer than in the first buffer (via a texture read and comparison). Repeat for as many layers as you want.

Then you can 'collapse' layers together, sampling each and doing whatever you need to with them - should be able to sample at least four layers per pass. Repeat until you've only got one layer left, and that's your result.
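A rough sketch of what one of those later peel passes could look like in HLSL - the previous layer's depth is assumed to have been copied into a texture, the interpolants come from your vertex shader, and the bias is just an illustrative guess to stop a surface peeling against itself:

sampler gPreviousLayer : register(s0);     // assumed: depth of the layer captured last pass

static const float PEEL_BIAS = 0.00001f;   // assumed epsilon; tune for your depth precision

float4 DepthPeelPS(float2 screenUV  : TEXCOORD0,
                   float  fragDepth : TEXCOORD1) : COLOR0
{
    float previousDepth = tex2D(gPreviousLayer, screenUV).r;

    // Discard anything at or in front of the layer already captured, so the
    // ordinary LESS depth test then keeps the next-nearest surface.
    clip(fragDepth - (previousDepth + PEEL_BIAS));

    // Write this layer's depth out so the next peel (or the collapse passes
    // described above) can read it back.
    return float4(fragDepth, fragDepth, fragDepth, 1.0f);
}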

Quote:Next best thing is to lock a screen texture and per-pixel process that myself, which would also not be real-time, I've tried that one before.
Aye, you certainly don't want to do that.

Quote:There's only so far you can subdivide a triangle before you reach a pixel (lim -> infinity),
That's basically what microfragments are - pixel-sized triangles (or more usually, sub-pixel-sized ones).

Quote:This would reduce the need for displacement mapping to a certain extent as silhouettes would already be non-polygonal.
No, it wouldn't - the reason displacement maps get used is because they're an efficient way of storing high-frequency geometry information relative to a base plane. It's like the reason why we use texture maps for colour instead of just splitting our triangles up into small enough pieces and using vertex colours for everything.

Quote:Also, normal mapping could be applied to NURBS surfaces just the same way they are applied to polygonal surfaces.
Get displacement mapping working and you won't need normal mapping.

Quote:I'd argue that it would also be easier to render organic objects with this kind of hardware, and it's these kinds of organic objects that really kill a nice looking scene if they look blocky around the edges.
I'll agree that our flat-surface-oriented hardware tends to fail the most on totally-not-flat surfaces; that's pretty much an inherent property. But I don't agree that programmable rasterizers are by any means necessary to get pixel-perfect silhouettes, nor do I agree that they are the best investment given the various other graphics techniques that the hardware is trying to support us in.

Richard "Superpig" Fine - saving pigs from untimely fates - Microsoft DirectX MVP 2006/2007/2008/2009
"Shaders are not meant to do everything. Of course you can try to use it for everything, but it's like playing football using cabbage." - MickeyMouse

Well, whatever, I guess it doesn't matter what path the industry ends up taking towards this goal; it's the end product that matters (and the speed at which the graphics can be rendered). I still think that blended high-frequency data is better than discrete high-frequency data. For example, Perlin noise vs regular noise.

Just out of curiosity, what information will microfragments be constructed from? The usual 3 vertices, etc? They sound suspiciously like voxels... Also, the lines between texturing and vertex colouring will quickly become blurred using these discrete microfragments; you would only need one colour loaded from file per microfragment. How would texture filtering work on such a small scale?

Thanks for the answers by the way, they are much appreciated. :)
Quote:Original post by Leo_E_49
Just out of curiosity, what information will microfragments be constructed from? The usual 3 vertices, etc? They sound suspiciously like voxels...
Nah, there's no volume to them. They're just really small polygons. IIRC they're created by subdividing geometry primitives, including regular polygons, NURBS surfaces, whatever the Renderman renderer happens to support, I guess.

Quote:Also, the lines between texturing and vertex colouring will quickly become blurred using these discrete microfragments; you would only need one colour loaded from file per microfragment. How would texture filtering work on such a small scale?
You're right, they certainly do blur that line - though you can have more than one colour per microfragment, because it's made up from multiple vertices, so you could have one colour per vertex. I'm not entirely sure how texturing is applied to them, but I do know that Renderman is pretty big on procedural textures - and I don't mean noise-based ones. This book has some good material on how to go about writing procedural textures for, say, stripes, or star patterns, and explains all the principles you'd use. If Renderman does use textures in a traditional way, then I'd imagine it simply works by having the microfragments be smaller than the individual texels.
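As a tiny illustration of the sort of procedural texture being described - no image lookups, just maths over the surface parameterisation - a stripe shader can be as small as this (the frequency and colours are arbitrary values for the example):

float4 StripePS(float2 uv : TEXCOORD0) : COLOR0
{
    // frac() wraps the scaled u coordinate into [0,1); step() then picks one
    // of two colours per band, giving eight stripes across the surface.
    float band = step(0.5f, frac(uv.x * 8.0f));
    return lerp(float4(0.9f, 0.9f, 0.9f, 1.0f),   // light stripe
                float4(0.1f, 0.1f, 0.6f, 1.0f),   // dark stripe
                band);
}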

If you're interested in microfragments and the Renderman rendering pipeline, I'd recommend reading up on it. The Renderman shader repository may shed some light on how it all works.

Richard "Superpig" Fine - saving pigs from untimely fates - Microsoft DirectX MVP 2006/2007/2008/2009
"Shaders are not meant to do everything. Of course you can try to use it for everything, but it's like playing football using cabbage." - MickeyMouse

Thanks for that. :) I'll try to get ahold of a copy of that book and have a read of it.
