# Programmable depth buffer/rasteriser?



Is there any development towards making a programmable depth buffer (and presumably a programmable rasteriser) on graphics cards? I would like to be able to program my own depth functions into depth buffers, instead of the standard painter's algorithm in the relatively near future. Imho, this would help incredibly with calculating volumetric lighting and soft shadows. Or am I going to have to wait a while for this technology to become available? :p [Edited by - Leo_E_49 on May 7, 2006 8:47:21 PM]

##### Share on other sites
You mean utilizing the oDepth D3D9 ASM register/DEPTH0 HLSL fragment semantic/whatever the heck GLSL, Cg or GLASM (?) have?

Well, you can, but it shuts off early z-cull resulting in a fairly big perf drop.

##### Share on other sites
No, I'm fairly sure this is outside of a pixel/vertex shader. It's controlled by the rasteriser in the graphics pipeline. I want to eliminate the painter's algorithm altogether :)

In other words, from what I understand:

I want to program the data which the rasteriser puts into the depth buffer when it would usually be performing the painter's algorithm to set up the depth buffer. Instead of writing normal depth values using zc = -ze * (f+n)/(f-n) - we * 2*f*n/(f-n), I want the ability to write whatever values I want into the buffer (using a shader or whatever ends up being made for it). It would also be nice to be able to write arbitrary values into graphics buffers: instead of writing normal colours, vertices could carry fluid dynamics or other mathematical data, which the rasteriser would write to a screen buffer for parallel processing on the GPU.
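For reference, the standard depth formula quoted above can be written out as a short Python sketch. It assumes an OpenGL-style perspective projection (eye looking down -z, clip-space w equal to -ze) and the default depth range of 0 to 1; the function names are made up for illustration.

```python
def clip_depth(z_eye, w_eye, near, far):
    """Clip-space z for an OpenGL-style perspective projection (assumed convention)."""
    a = (far + near) / (far - near)
    b = 2.0 * far * near / (far - near)
    return -z_eye * a - w_eye * b


def window_depth(z_eye, near, far, w_eye=1.0):
    """Depth in [0, 1], as a default depth buffer would store it."""
    z_clip = clip_depth(z_eye, w_eye, near, far)
    w_clip = -z_eye            # the projection matrix copies -z_eye into w
    z_ndc = z_clip / w_clip    # perspective divide, result lies in [-1, 1]
    return 0.5 * z_ndc + 0.5   # default depth-range mapping to [0, 1]
```

A point on the near plane maps to 0 and a point on the far plane maps to 1; values in between are distributed non-linearly, with most of the range spent close to the near plane.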

To clarify even further. Normally the following is the case:

Vertex -> transformed to screen space by a vertex shader or ffp -> rendered to buffer with depths calculated using the standard formula written above, colours are interpolated perspective-correct linearly across the triangles, all this is done in the rasteriser -> pixel shader applied to the buffer created by the rasteriser and passed on to the frame buffer

I want to be able to:

Vertex -> transformed to screen space -> rendered to buffer with depths which I define using my own depth function and also arbitrary data which is interpolated the way I program it to be (not necessarily perspective correct linear interpolation) -> pixel shader applied to the buffer created by the rasteriser and passed on to the frame buffer
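To make the "perspective-correct linear interpolation" step in the two pipelines above concrete, here is a minimal Python sketch comparing it with plain (affine) screen-space interpolation along one edge. The function names are hypothetical, and w0 and w1 stand for the clip-space w of the two endpoints.

```python
def lerp(a, b, t):
    """Plain linear interpolation between a and b."""
    return a + (b - a) * t


def affine_interp(attr0, attr1, t):
    """Screen-space linear interpolation, ignoring perspective."""
    return lerp(attr0, attr1, t)


def perspective_correct_interp(attr0, w0, attr1, w1, t):
    """Interpolate attr/w and 1/w linearly in screen space, then divide."""
    numerator = lerp(attr0 / w0, attr1 / w1, t)
    denominator = lerp(1.0 / w0, 1.0 / w1, t)
    return numerator / denominator
```

Halfway along an edge whose endpoints have w = 1 and w = 3, the affine version returns the midpoint of the attribute, while the perspective-correct version is pulled toward the nearer endpoint. A programmable interpolator would amount to swapping out functions like these.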

I know that the tessellator will be programmable in the near future under the unified shading model, but what about the rasteriser? Or is that considered part of the tessellator? (I didn't get this impression from what I read about it)

This would allow me to program deep shadow mapping on a per-pixel basis. The depth function could be calculated in graphics hardware instead of being ray-traced per-vertex on the CPU as described in this document:

I hope someone can tell me whether this functionality is planned in the near future, or even whether it is possible to do. :) Which I'm sure it is, because John Carmack (theoretically) had the ability to do this when he was programming Doom and Quake in pre-hardware acceleration days, when he had to write his own rasteriser. (Abrash 1997, LaMothe 2003)

P.S. Having the ability to program the depth buffer functions could mean that volumetric rendering would be trivial, as would per pixel correct alpha blending without having to sort objects at all. This kind of control over the graphics hardware would be great for graphics which could normally only be programmed using voxels.

P.P.S. I heard someone talking about DirectX 10 as though it were going to be the last major upgrade in graphics technology, after which the only changes in GPUs would be performance increases. I don't think that will be the case until all of the abilities of the GPU are available as part of a programmable pipeline, and that will not be for a long time to come.

BTW: If I'm the first person to come up with this idea (seriously doubt it :p) and someone ends up making this into actual hardware for some strange reason, good on you! XD

In fact, I'm going to push this idea right to its limits. We're truly in the realm of science fiction here:

If it were possible to determine the formula for interpolation of values between vertices, using splines for example, then why not allow for a programmable triangle silhouette? Judging from the rasteriser code I've encountered (LaMothe, 2003), this is very much within the realm of possibility. In that case we would no longer be dealing with triangles; we could potentially be dealing with per-pixel accurate three-sided NURBS surfaces in their place. We would no longer need displacement or normal mapping (although they could be used to add even more detail), because real-time NURBS surfaces programmed on the rasteriser would have per-pixel accurate silhouettes at a fraction of the cost of displacement mapping for the same level of detail. This would spell the end of the polygon; instead, all surfaces in computer graphics would be polyNURBS surfaces. If such technology were available in real time, I would say that real-time graphics would have a good shot at approaching photo-realism. Of course, retraining for this kind of graphics programming would hearken back to the days of DOOM and QUAKE programming, and would also have close ties to programming for non-real-time graphics. Tools would have to be created to provide vertex data which would calculate silhouettes which tessellate well. But, in future, perhaps this will be possible. :p

References:
LaMothe, A. (2003). Tricks of the 3D Game Programming Gurus-Advanced 3D Graphics and Rasterization. Sams Publishing.
Abrash, M. (1997) Michael Abrash's Graphics Programming Black Book (Special Edition). Coriolis Group Books.

[Edited by - Leo_E_49 on May 7, 2006 10:32:40 PM]

##### Share on other sites
The rasterizer doesn't put the depth data into the depth buffer; it merely passes it to the pixel processing pipeline (pixel shader). So, Cypher19's approach is what you want.

Also, generating things like splines on the GPU is exactly what the geometry shaders in D3D10 can be used for. You'd pass in a single vertex or pair of vertices (depending on how much data you have per-primitive) and then emit a list of lines (or a line-strip) that are set up with the positions/data of the spline as it goes along.

(I wouldn't recommend using a 9-year-old book and a book on software rendering as resources on the operation of modern hardware, btw.)

##### Share on other sites
Just wild speculation, it's good to hear that this sort of thing will be available with geometry shading, however. What puts the depth values in the depth buffer instead of the rasteriser these days then? Also, what does depth occlusion these days instead of it?

I'm still pretty sure that I can't make depth functions using oDepth on the GPU. From what I understand, the depth data passed into the pixel shader already has the painter's algorithm applied to it, so all the depth data of hidden surfaces (depth occluded) is lost. If this were not the case, I'm sure the article I saw on deep shadow mapping would have calculated the depth function in a pixel shader. Without that information, deep shadow mapping must indeed be ray traced per polygon. Also, if this information were in fact available, true penumbral soft shadowing methods would be trivial too, and could be done in the rasteriser in the pipeline stage between the geometry and pixel shaders. Furthermore such algorithms would be pixel perfect, without aliasing artifacts, because they do not depend on scaling the depth buffer and masking it on top of distant objects. I see this as a problem relating to ffp, which could be solved if per pixel depth calculations did not have a depth-buffer replace method (painter's algorithm). In other words, deep shadow mapping requires you to see depth-occluded data, which would normally be culled out by having only a single depth buffer value.

Furthermore, I'm pretty sure that 9 year old book still describes the fundamentals of rasterisation as used today. I've not read any documentation contradicting it. I hope that the graphics card manufacturers haven't changed the basis of their graphics rasterisation without telling anyone. Any rasteriser has to convert screen space triangles to pixels on screen, I presumed that doing so included the process of division by w and subsequent application of the painter's algorithm to receive the final depth value at each pixel before painting it onto the buffer which is passed, pixel by pixel, to the pixel shader. I'm well aware of how modern graphics hardware works barring the rasterisation phase. I hope someone will publish a more detailed article about the rasterisation phase on modern graphics hardware if there have been significant advances in algorithms used.

There's not much documentation on this subject out there (modern rasterisers are a bit of a mystery, with pixel and vertex shaders, and now geometry shaders, stealing the limelight), so forgive my ignorance if rasterisation is now done somehow differently. But if it's still simple plotting of pixels in rows forming the outline of a triangle on a buffer, then it can be made programmable, and the benefits of doing so are enormous.

Edit: According to wikipedia, the sweep-line algorithm described in the 1997 and 2003 books is still used on modern graphics hardware and the rasteriser does allocate values to the depth buffer.

http://en.wikipedia.org/wiki/Rasterisation

If this source is wrong, I will request that it be modified to describe modern rendering standards.

If it isn't, would anyone care to tell me whether it is going to be made programmable?

Edit: Further confirmation that rasterisation still works the same way it did in 1997.
http://herakles.zcu.cz/local/manuals/OpenGl/glspec1.1/node41.html
http://www.cs.fit.edu/~wds/classes/graphics/Rasterize/rasterize/rasterize.html#SECTION00070000000000000000
http://online.cs.nps.navy.mil/DistanceEducation/online.siggraph.org/2002/Panels/01_WhenWillRayTracingReplaceRasterization/cdrom.pdf

Also described
http://www.opengl.org/documentation/specs/version1.1/glspec1.1/node54.html#SECTION00651000000000000000

I quote: "A scan-line rasterizer that linearly interpolates data along each edge and then linearly interpolates data across each horizontal span from edge to edge". This functionality can be made programmable. I do not want to have to use the existing linear interpolation of scanlines, I would like the ability to interpolate these by spline.
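The scan-line scheme in that quote can be sketched in a few lines of Python. This is only an illustrative software model, not how any particular GPU does it: vertices are (x, y, value) tuples, fragments are generated at pixel centres, and all function names are invented for the example.

```python
import math


def lerp(a, b, t):
    """Plain linear interpolation between a and b."""
    return a + (b - a) * t


def edge_crossings(verts, y):
    """Return sorted (x, value) pairs where the scanline at height y crosses the edges."""
    hits = []
    for i in range(3):
        (x0, y0, v0), (x1, y1, v1) = verts[i], verts[(i + 1) % 3]
        if y0 == y1:
            continue  # horizontal edge: the other two edges bound the span
        if min(y0, y1) <= y < max(y0, y1):  # half-open, so a vertex counts once
            t = (y - y0) / (y1 - y0)        # linear interpolation along the edge
            hits.append((lerp(x0, x1, t), lerp(v0, v1, t)))
    return sorted(hits)


def rasterize(verts):
    """Yield (px, py, value) for every pixel centre inside the triangle."""
    ys = [v[1] for v in verts]
    for py in range(math.floor(min(ys)), math.ceil(max(ys))):
        hits = edge_crossings(verts, py + 0.5)
        if len(hits) < 2:
            continue
        (xl, vl), (xr, vr) = hits[0], hits[-1]
        for px in range(math.ceil(xl - 0.5), math.ceil(xr - 0.5)):
            t = (px + 0.5 - xl) / (xr - xl)  # linear interpolation across the span
            yield px, py, lerp(vl, vr, t)
```

Both interpolation steps go through `lerp`, so the request for spline interpolation boils down to replacing that one function (and, for curved silhouettes, the edge-crossing computation as well).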

Now I'm sure we're still using the same algorithm, if it's on the OpenGL website. Why are we still using an outdated, inflexible rasterisation algorithm from 1997 in our graphics hardware? Why layer programmability on top of such a clearly inflexible basis of rendering?

[Edited by - Leo_E_49 on May 8, 2006 4:48:42 AM]

##### Share on other sites
The rasterizer will not be programmable with D3D10, and probably not for the foreseeable future.

I've heard talks about programmable interpolators though, which if these could be applied to the depth value, would be able to achieve the same effect. By applied to the depth value, I mean applied to the same value as used by the hardware depth buffer and early z rejection, not any user defined parameter. I don't have any sources for this available, so I'm sorry I can't back up my information.. I might be wrong as well, if my memory doesn't serve me right, but it does seem like a natural step.

Personally, I'd like to move towards getting a general purpose, fully programmable, vector processing monster in there, instead of this special purpose hardware. It might set us back a couple of years in terms of graphic fidelity, but the possibilities from there are so much less limited. It could also be used for other types of processing, such as physics, and would keep all this data in the same place. Naturally you'd want it to be virtualized, so you can have a larger dataset than what can fit on the actual hardware. It seems this is the direction graphics hardware is heading, just in a roundabout way [smile]

##### Share on other sites
It's a pity that no development is being done towards making a flexible (non-linear interpolation) rasteriser. I would have liked to program real-time fog and variable density objects with deep shadows in real time :( Vector graphics, the bane of spline surfaces. :p

To represent a programmable rasteriser ASCII-graphically:

Vector rasterisation: (produces triangles) Fixed function rasteriser

[ASCII diagram, layout lost: three vertices (x) joined by straight dotted edges]

Here, interpolation between vertices is done using straight lines.

Non-vector rasterisation: (produces NURBS surfaces) Programmable rasteriser

[ASCII diagram, layout lost: three vertices (x) joined by curved dotted edges]

Here, interpolation between vertices is done using splines.

Graphics cards of today are unfortunately running on an outdated paradigm. :(

[Edited by - Leo_E_49 on May 8, 2006 9:14:02 AM]

##### Share on other sites
Quote:
 Original post by Leo_E_49
 What puts the depth values in the depth buffer instead of the rasteriser these days then?

All that the rasterizer does, all that it ever did, was convert polygons to fragments/pixels. You may be used to a rendering architecture where the depth-test-and-write happened immediately after the rasterizer, but the two operations are by no means linked.

Quote:
 Also, what does depth occlusion these days instead of it?
The "z-buffer hardware," comprising units that perform the z-test, and write new z-values to the buffer, updating things like hierarchical z structures in the process.

Quote:
 From what I understand, the depth data passed into the pixel shader already has the painter's algorithm applied to it, so all the depth data of hidden surfaces (depth occluded) is lost.
Not if you're writing oDepth in the pixel shader. Sure, if you're not then modern cards will perform the depth-test beforehand (and also the write, though the write could happen in parallel) because it saves you from having to run shaders on occluded pixels (handling the Z values at this point in time is known as 'early Z'). However, if you are modifying the depth in the pixel shader, then the hardware cannot perform the test-and-write until after the shader is done because it doesn't know what the final depth value is. This is why modifying the depth values in the pixel shader causes you to lose early Z for the rest of the frame; the hardware's not equipped to feed your newly calculated depth value back into the hierarchical processor and must go direct to the buffer, so the hierarchy gets out of sync with what's actually in the buffer, making it useless.

Quote:
 If this were not the case, I'm sure the article I saw on deep shadow mapping would have calculated the depth function in a pixel shader.
If you'd care to point us to the article in question, I'm sure we can try and figure out for you why they do it the way they do.

Quote:
 Furthermore, I'm pretty sure that 9 year old book still describes the fundamentals of rasterisation as used today.
Fundamentals, sure. Details and deep performance characteristics, probably not.

Quote:

Take a look through this lot. In particular, this chapter from GPU Gems 2 describes the architecture of the GeForce 6, including full coverage of where Z operations are performed in the pipeline.

Quote:
 I hope that the graphics card manufacturers haven't changed the basis of their graphics rasterisation without telling anyone.
Don't be ridiculous.

Quote:
 Any rasteriser has to convert screen space triangles to pixels on screen, I presumed that doing so included the process of division by w and subsequent application of the painter's algorithm to receive the final depth value at each pixel before painting it onto the buffer which is passed, pixel by pixel, to the pixel shader.
Your presumption is ill-advised. The rasteriser calculates the depth value that will later be used by the z-test hardware, but it does not perform the test itself. (As noted, early Z means that the test might happen immediately afterwards, but only in some situations).

Quote:
 Edit: According to wikipedia, the sweep-line algorithm described in the 1997 and 2003 books is still used on modern graphics hardware and the rasteriser does allocate values to the depth buffer. http://en.wikipedia.org/wiki/Rasterisation If this source is wrong, I will request that it be modified to describe modern rendering standards.
It's misleading at best. The rasteriser's responsibility is purely to break triangles into fragments, with appropriately interpolated values as per-fragment data. Nothing else.

##### Share on other sites
Quote:
 Original post by superpig
 All that the rasterizer does, all that it ever did, was convert polygons to fragments/pixels. You may be used to a rendering architecture where the depth-test-and-write happened immediately after the rasterizer, but the two operations are by no means linked.

Thanks for that, I wasn't aware they were separate operations. :) However, it would be nice if this were programmable too.

Quote:
 The "z-buffer hardware," comprising units that perform the z-test, and write new z-values to the buffer, updating things like hierarchical z structures in the process.

This is what I'd really like access to. The z-cull part of the graphics pipeline.

Quote:
 Not if you're writing oDepth in the pixel shader. Sure, if you're not then modern cards will perform the depth-test beforehand (and also the write, though the write could happen in parallel) because it saves you from having to run shaders on occluded pixels (handling the Z values at this point in time is known as 'early Z'). However, if you are modifying the depth in the pixel shader, then the hardware cannot perform the test-and-write until after the shader is done because it doesn't know what the final depth value is. This is why modifying the depth values in the pixel shader causes you to lose early Z for the rest of the frame; the hardware's not equipped to feed your newly calculated depth value back into the hierarchical processor and must go direct to the buffer, so the hierarchy gets out of sync with what's actually in the buffer, making it useless.

Let me make sure that I've got the concept behind this correct (I usually use GLSL so I'm not that familiar with oDepth). I write a modified value to oDepth for each pixel and then, when the shader is complete, it does the painter's algorithm on these values? I'm afraid that will not suffice for what I'm talking about. The depth functions used depend on the layering of depth values of more than one fragment. It would almost certainly have to be done in the z-cull process instead of applying painter's; for example, the hardware might average all the depths it encounters at a single fragment.

Quote:
 If you'd care to point us to the article in question, I'm sure we can try and figure out for you why they do it the way they do.

Quote:
 Fundamentals, sure. Details and deep performance characteristics, probably not.

What book can I read on the topic which will be relevant to modern hardware architectures? I checked in Real-Time Rendering, it's got the same algorithm listed on page 18 too. Has there been a replacement for the painter's algorithm and scan-line rasterisation?

Quote:
 Take a look through this lot. In particular, this chapter from GPU Gems 2 describes the architecture of the GeForce 6, including full coverage of where Z operations are performed in the pipeline.

Thanks, that explains a great deal. Things haven't changed that much though from what I can see here. (Relative to what I was talking about that is)

Quote:
 Don't be ridiculous.

I was being a bit wasn't I? :p

Quote:
 Your presumption is ill-advised. The rasteriser calculates the depth value that will later be used by the z-test hardware, but it does not perform the test itself. (As noted, early Z means that the test might happen immediately afterwards, but only in some situations).

So, the rasteriser generates buffers containing the z values of each polygon independently, storing them all and then later, the z-cull is applied to these fragments after a fragment shader is applied to them?

In any case, this is not the issue in question. The ability to program the value which ends up stored in the depth buffer is the issue, I'm a bit tired of using the painter's algorithm, when other algorithms would be so much more flexible.

Quote:
 It's misleading at best. The rasteriser's responsibility is purely to break triangles into fragments, with appropriately interpolated values as per-fragment data. Nothing else.

Thanks for clarifying.

Perhaps I should break my proposition up into two sections:

1. Programmable z-cull unit (or whatever it's called).

Are there plans to make the z-cull unit programmable? So that I could write any depth function on post-pixel-shader data that I need?

2. Programmable rasteriser.

By the same token, are there plans to make the rasteriser programmable? So that linear interpolation is not the only interpolation method available to graphics programmers? Spline interpolation of vertex data is very appealing as is the prospect of spline "triangle" outlines, so that polyNURBS surfaces could be used in real time? (Actually, is this even possible?)

[Edited by - Leo_E_49 on May 8, 2006 11:36:47 AM]

##### Share on other sites
There is an article in ShaderX4 on how to use deep shadow maps on modern graphics hardware. It is called "Real-Time Soft Shadows Using the PDSM Technique". PDSM stands for Penumbra Deep Shadow Maps. This is very close to Deep Shadow Maps :-)

##### Share on other sites
Quote:
 Original post by Leo_E_49
 Let me make sure that I've got the concept behind this correct (I usually use GLSL so I'm not that familiar with oDepth). I write a modified value to oDepth for each pixel and then when the shader is complete, it does the painter's algorithm on these values?
Yes, that's correct; it passes the value that you wrote to oDepth to the z-cull hardware. (I'd probably recommend moving away from using 'painter's algorithm' to talk about that operation, because you can of course change the comparison function it uses, allowing you to only write fragments where the depth value is greater than the existing value, or equal to it, etc.)

Quote:
 I'm afraid that will not suffice for what I'm talking about. The depth functions used depend on the layering of depth values of more than one fragment. It would almost certainly have to be done in the z-cull process instead of applying painter's; for example, the hardware might average all the depths it encounters at a single fragment.
Problem is, how/where do you intend to store these extra depth values? The z-buffer only has room for one value per fragment. The only technique I know that stores multiple depth values per fragment is called Depth Peeling and works by using a separate z-buffer for each encountered depth value.
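The idea behind depth peeling can be modelled on the CPU in a few lines of Python. This is a rough sketch under simplifying assumptions: fragments are plain (pixel, depth) pairs, each pass is a loop over all of them (standing in for re-rendering the scene), and `depth_peel` is a made-up name rather than any API.

```python
def depth_peel(fragments, max_layers):
    """Split fragments into depth layers, nearest first.

    fragments: list of (pixel, depth) pairs.
    Returns a list of {pixel: depth} dicts, one per peeled layer
    (one "z-buffer" per layer). Fragments with equal depth at the
    same pixel collapse into a single layer.
    """
    layers = []
    peeled = {}  # deepest depth already peeled at each pixel
    for _ in range(max_layers):
        layer = {}
        for pixel, depth in fragments:
            if pixel in peeled and depth <= peeled[pixel]:
                continue  # belongs to an earlier layer
            if pixel not in layer or depth < layer[pixel]:
                layer[pixel] = depth  # ordinary LESS test within this pass
        if not layer:
            break  # nothing left to peel
        layers.append(layer)
        peeled.update(layer)
    return layers
```

Each pass needs its own buffer and a full trip through the geometry, which is why the technique gets expensive quickly as the number of layers grows.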

Quote:

Ahhh, ok.

Firstly, be aware that the technique described in the paper is not realtime; they cite render times in the order of seconds, not milliseconds.

Secondly, where does it say that they didn't compute the depth functions in a pixel shader? Personally, I suspect that they didn't, because these guys are at Pixar and thus were probably using Renderman for much of their work; Renderman is quite different to the graphics hardware inside your average PC. So chances are they didn't use pixel shaders simply because the concept does not exist in Renderman. (It uses microfragments instead, IIRC).

Quote:
 What book can I read on the topic which will be relevant to modern hardware architectures? I checked in Real-Time Rendering, it's got the same algorithm listed on page 18 too. Has there been a replacement for the painter's algorithm and scan-line rasterisation?
The basic approach is still the same - hence my comment that those books are probably fine for giving you the fundamentals - but papers like that GeForce 6 one are what you should be reading if you want to understand exactly how the approach is being implemented and the resulting idiosyncrasies. ATI have similar papers on their developer site, I believe.

Quote:

Quote:
 Don't be ridiculous.
I was being a bit wasn't I? :p
Yes, a little :P

Quote:
 So, the rasteriser generates buffers containing the z values of each polygon independently, storing them all and then later, the z-cull is applied to these fragments after a fragment shader is applied to them?
The rasterizer generates fragments. Each fragment contains its position, z-value, texture coordinates, diffuse colour, etc, calculated by interpolation. If the current fragment shader does not modify the depth of the fragments (i.e. does not write to oDepth), then the hierarchical Z system will process the fragments, culling those that it can quickly see are occluded, and leaving those that are not. If the current fragment shader does modify the depth of the fragments, then the hierarchical Z system is skipped over (it's actually disabled until the Z buffer is next cleared). After possibly being early-Z-culled, the fragments are passed to the pixel pipeline, which - on modern cards - means that they go into a queue to be picked up by one of a number of pixel shader 'threads.' After the pixel shader has finished processing the fragment - which at this point is just a position, depth, and colour-per-rendertarget - it is passed to hardware which does the proper, fully-precise depth test (and possibly other things like the scissor test), and then if it passes that, on to the alpha blending and depth writing hardware.
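The ordering described above can be mimicked with a toy Python model. It is heavily simplified and makes several assumptions: the early test here is exact rather than hierarchical and conservative, the depth function is fixed to LESS, the buffer is cleared to 1.0, and `draw` and its parameters are invented names.

```python
def draw(fragments, depth_buffer, shader, writes_depth):
    """Run fragments through a toy early-Z / shader / late-Z pipeline.

    fragments: list of (pixel, depth, colour) tuples.
    shader: function (colour, depth) -> (colour, depth).
    writes_depth: True if the shader may change depth (i.e. writes oDepth).
    Returns (colour_buffer, number_of_shader_invocations).
    """
    colour_buffer = {}
    shader_runs = 0
    for pixel, depth, colour in fragments:
        stored = depth_buffer.get(pixel, 1.0)  # 1.0 = cleared depth
        if not writes_depth and not depth < stored:
            continue  # early Z: occluded fragment never reaches the shader
        shader_runs += 1
        colour, depth = shader(colour, depth)
        if depth < stored:  # late, full-precision depth test
            depth_buffer[pixel] = depth
            colour_buffer[pixel] = colour
    return colour_buffer, shader_runs
```

With a pass-through shader and two fragments on the same pixel, nearest first, the model runs the shader once when `writes_depth` is false and twice when it is true, which is the performance cost mentioned earlier in the thread.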

Quote:
 In any case, this is not the issue in question. The ability to program the value which ends up stored in the depth buffer is the issue, I'm a bit tired of using the painter's algorithm, when other algorithms would be so much more flexible.
So you want to use separate values for the depth test and the depth write?

Quote:
 Perhaps I should break my proposition up into two sections:
I think that's sensible...

Quote:
 1. Programmable z-cull unit (or whatever it's called). Are there plans to make the z-cull unit programmable? So that I could write any depth function on post-pixel-shader data that I need?
Not to my knowledge. The fact that the depth test can only be one of a small number of functions (NEVER, LESS, LEQ, EQ, NEQ, GEQ, GREATER, ALWAYS) allows the chip designers to make some significant optimizations (like the addition of early-Z). Making it programmable would be a major speed hit.
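To show how small that fixed set of depth functions is, here it is as a Python lookup table. The keys reuse the abbreviations from the list above; the table and the `depth_test` helper are illustrative names only, not part of any graphics API.

```python
import operator

# Each entry compares the incoming fragment depth with the stored depth.
DEPTH_FUNCS = {
    "NEVER": lambda new, old: False,
    "LESS": operator.lt,
    "LEQ": operator.le,
    "EQ": operator.eq,
    "NEQ": operator.ne,
    "GEQ": operator.ge,
    "GREATER": operator.gt,
    "ALWAYS": lambda new, old: True,
}


def depth_test(func, new_depth, stored_depth):
    """Return True if the fragment passes the selected depth function."""
    return DEPTH_FUNCS[func](new_depth, stored_depth)
```

Every entry is a single comparison returning a boolean, which is the property that lets hardware evaluate the test conservatively before the shader runs.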

Quote:
 2. Programmable rasteriser. By the same token, are there plans to make the rasteriser programmable? So that linear interpolation is not the only interpolation method available to graphics programmers? Spline interpolation of vertex data is very appealing as is the prospect of spline "triangle" outlines, so that polyNURBS surfaces could be used in real time? (Actually, is this even possible?)

Again, I know of no plans to make the rasterizer programmable; the speed hit would be immense.

However, we already have the ability to do Bezier patches in realtime, and have done for several generations of graphics hardware - it's simply handled by tessellating the surface into triangles pre-rasterizer. If you tessellate it into small enough pieces (like, say, microfragments), then you'll get the exact same results as if you used a 'programmable tessellator.' This is the sort of thing you could use D3D10's geometry shader for; I believe there may even already be a demo around of a renderer that passes control point geometry to the GS, which synthesises the actual patch geometry from it.
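As an illustration of tessellating a patch before rasterisation, here is a small Python sketch that evaluates a bicubic Bezier patch on a regular grid and emits triangles. It runs on the CPU and assumes a 4x4 grid of (x, y, z) control points; it is not geometry shader code, and the function names are made up.

```python
from math import comb


def bernstein(i, t):
    """Cubic Bernstein basis polynomial B_i(t) for i in 0..3."""
    return comb(3, i) * t**i * (1 - t) ** (3 - i)


def eval_patch(ctrl, u, v):
    """Evaluate a bicubic Bezier patch; ctrl is a 4x4 grid of (x, y, z) points."""
    point = [0.0, 0.0, 0.0]
    for i in range(4):
        for j in range(4):
            weight = bernstein(i, u) * bernstein(j, v)
            for k in range(3):
                point[k] += weight * ctrl[i][j][k]
    return tuple(point)


def tessellate(ctrl, n):
    """Return (vertices, triangles) for an n x n grid of quads over the patch."""
    verts = [eval_patch(ctrl, i / n, j / n)
             for i in range(n + 1) for j in range(n + 1)]
    tris = []
    for i in range(n):
        for j in range(n):
            a = i * (n + 1) + j  # corner indices of one grid cell
            b = a + 1
            c = a + (n + 1)
            d = c + 1
            tris += [(a, b, c), (b, d, c)]  # two triangles per cell
    return verts, tris
```

Raising `n` shrinks the triangles, so the silhouette approaches the true curve; the trade-off is vertex and triangle count, which grows with the square of `n`.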

##### Share on other sites
Quote:
Quote:
 1. Programmable z-cull unit (or whatever it's called). Are there plans to make the z-cull unit programmable? So that I could write any depth function on post-pixel-shader data that I need?

Not to my knowledge. The fact that the depth test can only be one of a small number of functions (NEVER, LESS, LEQ, EQ, NEQ, GEQ, GREATER, ALWAYS) allows the chip designers to make some significant optimizations (like the addition of early-Z). Making it programmable would be a major speed hit.

In addition to the speed hit, WHY would you need the z-cull unit to be programmable? Culling is a boolean operation and hence you can only have boolean ops like never, less, etc. as pig outlined above. The inputs can be changed though, and that's where the oDepth register can be utilized to meet your ends.

If you're feeling REALLY ballsy though, you can utilize the clip()(HLSL)/texkill(D3DASM) instructions, in conjunction with depth buffer lookups in the pixel shader to perform the operations that you want to do, since that'd let you cull pixels based on whatever criteria you want, including z-vals.

##### Share on other sites
Quote:
 Original post by superpig
 (I'd probably recommend moving away from using 'painter's algorithm' to talk about that operation

What are these called collectively these days then?

Quote:
 Original post by superpig
 Problem is, how/where do you intend to store these extra depth values? The z-buffer only has room for one value per fragment. The only technique I know that stores multiple depth values per fragment is called Depth Peeling and works by using a separate z-buffer for each encountered depth value.

Quote:
 Original post by superpig
 So you want to use separate values for the depth test and the depth write?

I don't want to store multiple values in the depth buffer at all. I want to store a density value in the depth buffer instead. If I just wanted to add more depth values, I'd have asked whether there would be more depth buffers being added to the graphics pipeline.

This is similar to the depth functions which are used with each fragment in the paper which I linked to above. Basically, if a depth is written into a position on the depth buffer, instead of overwriting the existing value, it could be combined with it. For example, taking the average of the existing value and the new value and storing that in the buffer instead (you could think of it as depth-blending perhaps?). This could mean that the depth buffer could be used to easily render volumetric data. Of course, you'd cap the depth value at 1.0 (or whatever the current maximum value in the buffer is), in order to render solid (opaque) objects. (Presumably, you would only want to add to the "density buffer" as it were, but perhaps subtraction would provide an interesting effect)

Perhaps instead of making this completely programmable, there could be a number of blend flags (add, subtract, modulate, etc), similar to alpha blending back in the "old days".
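The "depth-blending" idea with a handful of blend flags can be sketched in Python as follows. This models the proposal, not any existing hardware feature: the buffer is a dict keyed by pixel, values are clamped to the range 0 to 1 so that 1.0 means fully opaque, and all names are hypothetical.

```python
# Hypothetical blend flags for a "density buffer".
BLEND_OPS = {
    "add": lambda old, new: old + new,
    "subtract": lambda old, new: old - new,
    "modulate": lambda old, new: old * new,
    "average": lambda old, new: 0.5 * (old + new),
}


def blend_density(buffer, pixel, value, op="add"):
    """Combine value with what is stored at pixel instead of overwriting it."""
    old = buffer.get(pixel, 0.0)  # an empty buffer means zero density
    blended = BLEND_OPS[op](old, value)
    buffer[pixel] = min(1.0, max(0.0, blended))  # cap at 1.0 for solid objects
    return buffer[pixel]
```

Adding 0.4 three times to the same pixel gives 0.4, then 0.8, then 1.0 after clamping, so a solid object is just one that saturates the buffer in a single write.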

Quote:
 Original post by superpig
 Again, I know of no plans to make the rasterizer programmable; the speed hit would be immense.

Well, I would argue that it may eventually get to the point where it isn't, perhaps when the distinction between triangles and pixels begins to decrease. All the new data you'd need per vertex is a pair of control vectors to handle the splines for the edges of the 2D NURBS patch. The interpolation would use a spline tracing algorithm to create a modified form of scan-line rasterisation, instead of a Bresenham's derivative scan-line rasterisation. I'm sure there are plenty of people who can do the maths, I might even give it a shot myself just out of interest. Whether this will be more or less efficient than rendering scenes polygonally in the future is the question. I'll warrant that just one of these NURBS patches would save a couple of hundred triangles on the path to photo-realism.

Quote:
 Original post by Cypher19
 In addition to the speed hit, WHY would you need the z-cull unit to be programmable? Culling is a boolean operation and hence you can only have boolean ops like never, less, etc. as pig outlined above. The inputs can be changed though, and that's where the oDepth register can be utilized to meet your ends. If you're feeling REALLY ballsy though, you can utilize the clip()(HLSL)/texkill(D3DASM) instructions, in conjunction with depth buffer lookups in the pixel shader to perform the operations that you want to do, since that'd let you cull pixels based on whatever criteria you want, including z-vals.

This is purely a volumetric rendering issue (i.e. not using the depth buffer for culling) and I was just being hopefully optimistic that this technology may someday be available. Unfortunately, it doesn't look like this will happen in the foreseeable future. :(

Being able to store density values in the depth buffer would probably reduce the processing required to apply the above-described algorithm. Although I will give your method a try, thanks for the suggestion. :)

[Edited by - Leo_E_49 on May 8, 2006 4:58:11 PM]

##### Share on other sites
Quote:
 Original post by Leo_E_49: This is similar to the depth functions which are used with each fragment in the paper which I linked to above. Basically, if a depth is written into a position on the depth buffer, instead of overwriting the existing value, it could be combined with it. For example, taking the average of the existing value and the new value and storing that in the buffer instead (you could think of it as depth-blending perhaps?). This could mean that the depth buffer could be used to easily render volumetric data. Of course, you'd cap the depth value at 1.0 (or whatever the current maximum value in the buffer is), in order to render solid objects.
So, you want to test against the current value, and if the test passes, blend the new value with the current value. Are you sure that the alpha channel can't be made to do this?

Quote:
 Well, I would argue that it may eventually get to the point where it isn't, perhaps when the distinction between triangles and pixels begins to decrease. All the new data you'd need per vertex is a pair of control vectors to handle the splines for the edges of the 2D NURBS patch. The interpolation would use a spline tracing algorithm to create a modified form of scan-line rasterisation, instead of a Bresenham's derivative scan-line rasterisation. I'm sure there are plenty of people who can do the maths, I might even give it a shot myself just out of interest. Whether this will be more or less efficient than rendering scenes polygonally in the future is the question. I'll warrant that just one of these NURBS patches would save a couple of hundred triangles on the path to photo-realism.
The chap who gave the 'Future of Direct3D' talk at the Microsoft Developer Day this past GDC agrees with you; he was talking about moving away from triangles as rendering primitives, and using patches instead. Personally, I'm not so sure. Photorealism doesn't really have anything to do with our ability (or inability) to render curved surfaces; it's much more about detail, about high-frequency geometry.

##### Share on other sites
Quote:
 Original post by superpig: So, you want to test against the current value, and if the test passes, blend the new value with the current value. Are you sure that the alpha channel can't be made to do this?

Well, if I could test the alpha channels of previously processed fragments, sure it would be possible. However, I can't do a screen render for each triangle on the screen, that would be far too slow and would defeat the aim of my dissertation (real-time graphics). I'm pretty sure I can't access the current frame-buffer in a pixel shader either, or I'd have heard about it. Is there any other buffer I have constant access to between rendering individual fragments other than the depth buffer?

Next best thing is to lock a screen texture and per-pixel process that myself, which would also not be real-time, I've tried that one before.

Quote:
 Original post by superpig: The chap who gave the 'Future of Direct3D' talk at the Microsoft Developer Day this past GDC agrees with you; he was talking about moving away from triangles as rendering primitives, and using patches instead. Personally, I'm not so sure. Photorealism doesn't really have anything to do with our ability (or inability) to render curved surfaces; it's much more about detail, about high-frequency geometry.

There's only so far you can subdivide a triangle before you reach a pixel (lim -> infinity), why not start with pixel-perfect curved surfaces in the first place? This would reduce the need for displacement mapping to a certain extent as silhouettes would already be non-polygonal. Also, normal mapping could be applied to NURBS surfaces just the same way they are applied to polygonal surfaces. I'd argue that it would also be easier to render organic objects with this kind of hardware, and it's these kinds of organic objects that really kill a nice looking scene if they look blocky around the edges.

It's good to hear that someone (in a position of slightly higher regard than myself) thinks the same way about the issue. :)

##### Share on other sites
Quote:
 Original post by Leo_E_49: Well, if I could test the alpha channels of previously processed fragments, sure it would be possible. However, I can't do a screen render for each triangle on the screen, that would be far too slow and would defeat the aim of my dissertation (real-time graphics). I'm pretty sure I can't access the current frame-buffer in a pixel shader either, or I'd have heard about it. Is there any other buffer I have constant access to between rendering individual fragments other than the depth buffer?
Indeed, you can't access a buffer while you're writing to it; most people who need to do that kind of thing exploit the alpha blend readback to do what they want (i.e. convert their desired expression into destValue * factor + otherValue).
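As a rough illustration of rewriting an operation into the `destValue * factor + otherValue` form, the sketch below emulates a fixed-function blend on a single pixel. The names `blend` and `render`, and the clamp to 1.0 (assumed here to mimic a fixed-point render target), are hypothetical and not part of any graphics API.

```python
def blend(dest, src, src_factor, dest_factor):
    """Fixed-function style blend: result = src * src_factor + dest * dest_factor."""
    return src * src_factor + dest * dest_factor


def render(fragments, src_factor, dest_factor, clear=0.0):
    """Blend fragments into one pixel in submission order."""
    dest = clear
    for src in fragments:
        # Assumption: the target saturates at 1.0, like a fixed-point buffer.
        dest = min(blend(dest, src, src_factor, dest_factor), 1.0)
    return dest


# Additive accumulation of "density": factors (1, 1).
additive = render([0.25, 0.5], src_factor=1.0, dest_factor=1.0)

# Averaging each new value with the stored one: factors (0.5, 0.5).
averaged = render([0.5, 1.0], src_factor=0.5, dest_factor=0.5)
```

The point is that the shader never reads the destination itself; the blend stage does the read-modify-write, so any operation that fits this linear form can be expressed through blend factors alone.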

What you could try doing is to use the technique I referenced earlier, depth peeling. Your first stage would do a number of depth-only passes, each one progressively building depths 'into the scene.' The first pass is standard depth-only; the second one is similar, but you texkill any pixels with a depth value at or closer than the one in the first buffer (via a texture read and comparison), so that the nearest surviving fragment forms the second layer. Repeat for as many layers as you want.

Then you can 'collapse' layers together, sampling each and doing whatever you need to with them - should be able to sample at least four layers per pass. Repeat until you've only got one layer left, and that's your result.
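The peel-then-collapse procedure described above can be sketched on the CPU as follows. This is a simplified model, not GPU code: each pixel is represented by a plain list of fragment depths, and `peel_layers` and `collapse` are hypothetical helper names chosen for the example.

```python
INF = float("inf")


def peel_layers(fragments_per_pixel, num_layers):
    """Return num_layers depth buffers, nearest layer first.

    fragments_per_pixel: one list of fragment depths per pixel.
    """
    layers = []
    previous = [-INF] * len(fragments_per_pixel)   # nothing peeled yet
    for _ in range(num_layers):
        current = []
        for pixel, depths in enumerate(fragments_per_pixel):
            nearest = INF                          # cleared depth buffer
            for d in depths:
                if d <= previous[pixel]:           # "texkill": already peeled
                    continue
                if d < nearest:                    # ordinary less-than depth test
                    nearest = d
            current.append(nearest)
        layers.append(current)
        previous = current
    return layers


def collapse(layers, combine):
    """Combine the layers per pixel, ignoring empty (INF) entries."""
    result = []
    for pixel_depths in zip(*layers):
        hits = [d for d in pixel_depths if d != INF]
        result.append(combine(hits))
    return result


pixels = [[0.5, 0.25, 0.75], [0.5]]
layers = peel_layers(pixels, num_layers=3)
# Example "collapse": distance between the nearest and farthest surface.
thickness = collapse(layers, lambda hits: hits[-1] - hits[0] if hits else 0.0)
```

On real hardware each iteration of the outer loop would be a full depth-only render pass, with the previous layer bound as a texture, so the cost grows with the number of layers peeled.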

Quote:
 Next best thing is to lock a screen texture and per-pixel process that myself, which would also not be real-time, I've tried that one before.
Aye, you certainly don't want to do that.

Quote:
 There's only so far you can subdivide a triangle before you reach a pixel (lim -> infinity),
That's basically what microfragments are - pixel-sized triangles (or more usually, sub-pixel-sized ones).

Quote:
 This would reduce the need for displacement mapping to a certain extent as silhouettes would already be non-polygonal.
No, it wouldn't - the reason displacement maps get used is because they're an efficient way of storing high-frequency geometry information relative to a base plane. It's like the reason why we use texture maps for colour instead of just splitting our triangles up into small enough pieces and using vertex colours for everything.

Quote:
 Also, normal mapping could be applied to NURBS surfaces just the same way they are applied to polygonal surfaces.
Get displacement mapping working and you won't need normal mapping.

Quote:
 I'd argue that it would also be easier to render organic objects with this kind of hardware, and it's these kinds of organic objects that really kill a nice looking scene if they look blocky around the edges.
I'll agree that our flat-surface-oriented hardware tends to fail the most on totally-not-flat surfaces, that's pretty much an inherent property. But I don't agree that programmable rasterizers are by any means necessary to get pixel-perfect silhouettes, nor do I agree that they are the best investment given the various other graphics techniques that the hardware is trying to support us in.

##### Share on other sites
Well, whatever, I guess it doesn't matter what path the industry ends up taking towards this goal, it's the end product that matters (and the speed at which the graphics can be rendered). I still think that blended high frequency data is better than discrete high frequency data. For example, Perlin noise vs regular noise.

Just out of curiosity, what information will microfragments be constructed from? The usual 3 vertices, etc? They sound suspiciously like voxels... Also, the lines between texturing and vertex colouring will quickly become blurred using these discrete microfragments, you would only need one colour loaded from file per microfragment. How would texture filtering work on such a small scale?

Thanks for the answers by the way, they are much appreciated. :)

##### Share on other sites
Quote:
 Original post by Leo_E_49: Just out of curiosity, what information will microfragments be constructed from? The usual 3 vertices, etc? They sound suspiciously like voxels...
Nah, there's no volume to them. They're just really small polygons. IIRC they're created by subdividing geometry primitives, including regular polygons, NURBS surfaces, whatever the RenderMan renderer happens to support I guess.

Quote:
 Also, the lines between texturing and vertex colouring will quickly become blurred using these discrete microfragments, you would only need one colour loaded from file per microfragment. How would texture filtering work on such a small scale?
You're right, they certainly do blur that line - though you can have more than one colour per microfragment, because it's made up from multiple vertices, so you could have one colour per vertex. I'm not entirely sure how texturing is applied to them, but I do know that RenderMan is pretty big on procedural textures - and I don't mean noise-based ones. This book has some good material on how to go about writing procedural textures for, say, stripes, or star patterns, and explains all the principles you'd use. If RenderMan does use textures in a traditional way, then I'd imagine it simply works by having the microfragments be smaller than the individual texels.
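To give a feel for what a non-noise procedural texture looks like, here is a minimal stripe pattern. It is written in Python rather than a shading language, and the function name `stripes`, its parameters, and the default colours are all assumptions made for illustration; it is not taken from the book or from RenderMan.

```python
import math


def stripes(u, v, frequency=8.0,
            color_a=(1.0, 1.0, 1.0), color_b=(0.0, 0.0, 0.0)):
    """Return a colour for surface coordinates (u, v).

    Bands alternate along u; v is unused, so the stripes run along v.
    """
    band = math.floor(u * frequency)   # index of the band this point falls in
    return color_a if band % 2 == 0 else color_b


# Sample the pattern at a few u positions along one row.
row = [stripes(u, 0.0) for u in (0.0, 0.125, 0.3)]
```

Because the colour is computed from the coordinates instead of being fetched from an image, it can be evaluated at any resolution, which suits very small shading samples.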

If you're interested in microfragments and the RenderMan rendering pipeline, I'd recommend reading up on it. The RenderMan shader repository may shed some light on how it all works.

##### Share on other sites
Thanks for that. :) I'll try to get ahold of a copy of that book and have a read of it.