quaker

Re-Starting TriStrips

When drawing triangle strips, there's a situation where the strip has to be stopped and restarted within a single draw command. I remember that a dummy vertex is introduced that tells the driver to restart the primitive. Which vertex is that? The last vertex used, or the one before it? Thanks.

I'm not aware of there being a standard code for this in Direct3D - I've got a feeling that the XBox (or maybe another console) supported it though.

It won't always work, but you could try degenerate triangles (where 2 vertices are the same position/properties). These can be useful if you need to go around sharp corners etc...

hth
Jack

Quote:
Original post by quaker
I'm pretty sure it's mentioned in the Real-Time Rendering book
I just had a flick through my copy of Real-Time Rendering (2nd Ed) and I can't find any obvious mention of this.

Got a page, chapter or section reference?

Jack

I had another quick look - admittedly I don't have the time to actually re-read every word...

Even Google doesn't pull up anything D3D related.

To be completely honest, I don't think it exists. As I previously stated, I have heard of such a feature - but on one (or more) of the consoles; not regular desktop parts.

Jack

Quote:
Original post by quaker
When drawing triangle strips, there's a situation where the strip is stopped and needs to be restarted within the single draw command.


I assume you mean that the situation is when you want to start a new strip in the same draw command. This can be done by adding degenerate triangles, which is done by duplicating vertices.

Don't do that. Just use indexed vertices. They're simpler to understand and more efficient.

Well, this feature does exist in low-end gaming hardware, and not only in consoles. As proof, please check out the NV_primitive_restart extension, which works for all primitive types.
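As a rough illustration of what a restart index means, here is a minimal Python sketch that models it on the CPU: a reserved index value ends the current strip and the next index begins a new one. The sentinel value `0xFFFF` and the helper name `split_on_restart` are assumptions made for this example; with NV_primitive_restart the application chooses the restart index itself.

```python
RESTART = 0xFFFF  # assumed sentinel; the real restart index is application-chosen


def split_on_restart(indices, restart=RESTART):
    """Split one index stream into separate strips at each restart index."""
    strips, current = [], []
    for i in indices:
        if i == restart:
            # The sentinel is never drawn; it only closes the current strip.
            if current:
                strips.append(current)
            current = []
        else:
            current.append(i)
    if current:
        strips.append(current)
    return strips


stream = [0, 1, 2, 3, RESTART, 4, 5, 6, 7]
print(split_on_restart(stream))
```

The point of the hardware feature is that this split happens in the index fetch stage, so no extra vertices or zero-area triangles are needed.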

I guess I could figure out how to do that using standard OpenGL/Direct3D drawing commands: just repeat the last vertex used as the next tri-strip vertex. The driver will then draw a non-visible triangle (a line) at that point and treat the following vertices as a new tri-strip.

Thanks.

Thanks for the NV_primitive_restart reference. There's nothing like this in Direct3D that I know of (although it's possible that there's an NVIDIA hack). As I said before, I'd suggest using an indexed triangle list.

Unfortunately I tried it with DX and it did not work. But how come that book, RTR, describes something that does not really exist? Could it be that NVIDIA hardware supports such a feature through this extension, in either DX or OGL?

Thanks.

Like ET3D said, there's nothing like this in D3D that I know of. I suspect it's implemented internally as a degenerate triangle.
What do you mean, you tried it and it didn't work? What did you try: creating a degenerate triangle? If so, what code did you use to do that? I've done it before when I was making a terrain renderer, and it works absolutely fine if you do it correctly.
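For reference, the degenerate-triangle approach discussed in this thread can be sketched in a few lines of Python. This is only an illustration working on index lists, not anyone's actual renderer code; `stitch_strips` and `strip_to_triangles` are hypothetical helper names. The one subtle part is winding: if the next strip would start at an odd position in the combined list, one extra duplicate is added so its triangles keep their orientation.

```python
def stitch_strips(strips):
    """Join several index strips into one using degenerate triangles."""
    out = []
    for strip in strips:
        if not strip:
            continue
        if out:
            out.append(out[-1])    # repeat the last index of the previous strip
            out.append(strip[0])   # repeat the first index of the next strip
            if len(out) % 2 == 1:  # keep the winding parity of the next strip
                out.append(strip[0])
        out.extend(strip)
    return out


def strip_to_triangles(indices):
    """Expand a strip to triangles, skipping zero-area (degenerate) ones."""
    tris = []
    for i in range(len(indices) - 2):
        a, b, c = indices[i:i + 3]
        if a == b or b == c or a == c:
            continue  # degenerate triangle: rasterises to nothing
        # Every odd triangle in a strip has its first two vertices swapped.
        tris.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return tris


joined = stitch_strips([[0, 1, 2, 3], [4, 5, 6, 7]])
print(joined)
print(strip_to_triangles(joined))
```

Expanding the joined strip yields exactly the triangles of the two original strips; the duplicated indices only produce triangles with two identical corners, which is why nothing visible is drawn between the strips.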

Don't worry about strips. The best you can do with a triangle strip is one triangle drawn per vertex transformed. If you instead create a triangle list that's sorted for the vertex cache, you can get one triangle drawn per 0.7 vertices transformed, or better. I.e., you'll actually get better throughput with a list than with a strip.
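One way to sanity-check a claim like this is to simulate the post-transform cache. The Python sketch below is a simplified model under stated assumptions: the cache is a plain 24-entry FIFO (real hardware may behave differently), the mesh is a regular 96x100 quad grid, and the "column" order (walking the grid in bands 6 quads wide) is just one assumed cache-friendly ordering chosen for illustration. All function names are hypothetical.

```python
from collections import deque


def count_transforms(indices, cache_size):
    """Count vertex-cache misses for an index stream, assuming a FIFO cache."""
    cache = deque(maxlen=cache_size)  # oldest entry is dropped automatically
    misses = 0
    for idx in indices:
        if idx not in cache:
            misses += 1
            cache.append(idx)
    return misses


def quad_indices(x, y, w):
    """Two triangles (six indices) for grid quad (x, y); w = quads per row."""
    v0 = y * (w + 1) + x
    v2 = v0 + (w + 1)
    return [v0, v2, v0 + 1, v0 + 1, v2, v2 + 1]


def row_major_list(w, h):
    """Triangle list in plain row-by-row order."""
    return [i for y in range(h) for x in range(w) for i in quad_indices(x, y, w)]


def column_list(w, h, col_w):
    """Triangle list that walks the grid in vertical bands col_w quads wide."""
    out = []
    for x0 in range(0, w, col_w):
        for y in range(h):
            for x in range(x0, min(x0 + col_w, w)):
                out.extend(quad_indices(x, y, w))
    return out


W, H, CACHE = 96, 100, 24
triangles = W * H * 2
for name, idx in (("row-major", row_major_list(W, H)),
                  ("6-wide columns", column_list(W, H, 6))):
    t = count_transforms(idx, CACHE)
    print(f"{name}: {t} transforms, {t / triangles:.2f} per triangle")
```

In this model the row-by-row list costs roughly one transform per triangle, because a whole row no longer fits in the cache by the time the next row starts, while the banded order reuses the previous row of each band and drops well below that. The exact figures depend on the cache model, so treat them as indicative only.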

I do believe the feature is pretty logical and that it exists in at least some drivers/hardware. I did not invent the information; it's mentioned in one of the greatest textbooks in CG, Real-Time Rendering.
Degenerate triangles are something else, and they're obviously a tricky way to connect more than one strip in a single drawing call. This proves that the need for a restart feature exists, as NV_primitive_restart does. And if there's an extension in OGL which can do that, then it should be in D3D somewhere.

Quote:
Original post by quaker
And if there's an extension in OGL which can do that, then it should be in D3D somewhere.

Doesn't work that way. OpenGL has extensions, so the chip companies can provide access to specific features of their hardware. Direct3D doesn't, and while it's possible to trick it, it's less straightforward.

It's possible that this functionality is available for NVIDIA hardware. However, as has been said a few times, this functionality is from a time when people were obsessed with strips. Since the invention of the vertex cache, indexed triangle lists have become the way to go, so it's just a good idea to forget about strips. You could continue to dig for a hack to do what you want, but why?

Quote:
Original post by hplus0603
Don't worry about strips. The best you can do with a triangle strip is one triangle drawn per vertex transformed. If you instead create a triangle list that's sorted for the vertex cache, you can get one triangle drawn per 0.7 vertices transformed, or better. I.e., you'll actually get better throughput with a list than with a strip.

But strips consume just about 33% of the memory consumed by lists (provided there aren't many degenerate vertices). This can be a major issue with some terrain renderers.
As for throughput, why would you get better throughput with lists than with strips? Provided they're sorted the same way, the vertex cache serves a transformed vertex from the last 16 transformed vertices whether it's a strip or a list.
Leaving the first 16 vertices aside, every new triangle requires one new transformed vertex, be it with lists or strips.
EDIT: I just checked with ATI papers, and it seems that post-VS cache has 16 entries on GF2&GF4MX; then there are 24 entries on all NV-shader HW, and just 14 entries on ATI HW.

If your lists are really long, then the higher bandwidth requirements of lists (a 3 times larger IB) can have a negative impact on your performance. It's just a few percent with large lists, but it's nonetheless a slowdown (on top of the higher memory requirements).

Now, wait a second! I may just have discovered what you mean while writing this reply. With a list ordered regularly by rows (as with heightmaps), you always have to transform one new vertex to complete a triangle (assuming the 2 other vertices are in the post-VS cache), thus 1 vertex transformed per triangle. But if I changed the ordering so that it forms a sort of mosaic shape (first just 1 quad, then a 2x2 quad, then a 3x3 quad, ...), I would be able to reuse all of the previous vertices, with only 1 transformed vertex per 2 new triangles (except at turn-arounds). Is that what you had in mind? I'm gonna draw it on paper to see how much can actually be gained.

Quote:
Original post by VladR
But strips consume just about 33% of the memory consumed by lists (provided there aren't many degenerate vertices). This can be a major issue with some terrain renderers.
They use 33% of the indices of a triangle list. An index is 2 bytes (usually), so it's not really a problem. The vertex data is going to be far larger (32 bytes for XYZ, normal and 1 set of texture coordinates), meaning the amount of memory used by the index data doesn't matter much.

I suppose it's one of these speed vs memory tradeoffs...

Quote:
Original post by ET3D
Direct3D doesn't, and while it's possible to trick it, it's less straightforward.
For an example of such a trick, take a look at ATI's R2VB demo.

Quote:
Original post by Evil Steve
Quote:
Original post by VladR
But strips consume just about 33% of the memory consumed by lists (provided there aren't many degenerate vertices). This can be a major issue with some terrain renderers.
They use 33% of the indices of a triangle list. An index is 2 bytes (usually), so it's not really a problem. The vertex data is going to be far larger (32 bytes for XYZ, normal and 1 set of texture coordinates), meaning the amount of memory used by the index data doesn't matter much.

I suppose it's one of these speed vs memory tradeoffs...


I'd have to second what Steve said. It's been my experience that the index overhead is dwarfed by the vertex size. And considering that some people bake other info into their vertex data, the index overhead ratio only becomes smaller.

Cheers
Chris

Quote:
Original post by Evil Steve
They use 33% of the indices of a triangle list. An index is 2 bytes (usually), so it's not really a problem.
Well, it depends on the terrain resolution and your HW requirements. Recently I've been doing a terrain renderer where it was necessary to fit a 2048x2048 heightmap into a few MB of VBs and IBs. Consider the size of the IB in the case of a tri-list (with 32-bit indices): 96 MB. A tri-strip would take just 32 MB. Of course, even this would be ridiculous, so I switched to 16-bit indices and VB switching (one VB per chunk). Now there's just one IB needed per LOD, and this IB has to index just one chunk of 65k vertices, so the IB for a tri-list takes just 0.75 MB at the highest detail. Here it doesn't matter whether we have 0.75 MB for a tri-list or 0.25 MB for a tri-strip, that's for sure.
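The figures above can be reproduced with a little arithmetic. The Python sketch below assumes a regular n x n vertex grid, a list with two triangles per quad, one strip per row of quads joined with two degenerate indices per join, and 32-bit indices for the full heightmap versus 16-bit indices for a 256x256 chunk. The helper names are made up for this example.

```python
def list_index_count(n):
    """Indices in a triangle list covering an n x n vertex grid."""
    quads = (n - 1) * (n - 1)
    return quads * 6  # 2 triangles per quad, 3 indices per triangle


def strip_index_count(n):
    """Indices in one stitched strip: 2n per row strip, 2 extra per join."""
    rows = n - 1
    return rows * 2 * n + 2 * (rows - 1)


def mib(index_count, bytes_per_index):
    """Index buffer size in MiB."""
    return index_count * bytes_per_index / 2**20


for n, size in ((2048, 4), (256, 2)):
    print(f"{n}x{n}, {size * 8}-bit indices: "
          f"list {mib(list_index_count(n), size):.2f} MiB, "
          f"strip {mib(strip_index_count(n), size):.2f} MiB")
```

Under these assumptions the strip needs almost exactly a third of the list's indices, which matches the 96 MB versus 32 MB and 0.75 MB versus 0.25 MB figures quoted in the post.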

Quote:
Original post by Evil Steve
The vertex data is going to be far larger (32 bytes for XYZ, normal and 1 set of texture coordinates), meaning the amount of memory used by the index data doesn't matter much.

According to my latest comparison, it's just a few percent (8% at most) difference, but you must have very large index buffers for this difference (strips vs lists, same cache-friendly order) to appear.

Besides, a naive vertex layout will surely take more than the ideal 32 bytes.
But since current game applications are never transform-limited anyway, why not use this fact and compress the vertex data? I currently have a 14-byte vertex with position, texture coords, baked lighting information and a texture splatting alpha value inside it. And since we have a regular heightmap, we could use this fact and, instead of the X,Z position, just store the index of the vertex and calculate X,Z and the tex coords inside the shader. Then the vertex would consume just 8 bytes, at the cost of a few more vertex shader instructions.
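The index-to-position idea can be sketched as follows. This is plain Python standing in for what the vertex shader would compute, and it rests on assumptions not stated in the post: a square chunk stored row by row, uniform grid spacing, and texture coordinates that span the chunk from 0 to 1. `unpack_vertex` is a hypothetical name.

```python
def unpack_vertex(index, verts_per_row, spacing=1.0):
    """Rebuild X, Z and texture coords of a regular-grid vertex from its index."""
    col = index % verts_per_row    # position along the row
    row = index // verts_per_row   # which row the vertex sits in
    x = col * spacing
    z = row * spacing
    u = col / (verts_per_row - 1)  # assumes UVs cover the chunk exactly once
    v = row / (verts_per_row - 1)
    return x, z, u, v


print(unpack_vertex(258, 257))  # second vertex of the second row
```

Only the height and the per-vertex attributes that cannot be derived (lighting, splatting alpha) then need to be stored, which is where the saving comes from.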

Quote:
Original post by Evil Steve
I suppose it's one of these speed vs memory tradeoffs...

Exactly, as it always is ...


I just finished my drawing of ideal order of Indices for a Tri-list. I found out that a 6x6 quad of 72 triangles can be rendered with just 48 transforms (assuming a post-VS cache of 24 entries on NV-shader HW). Such a 6x6 quad can be easily continued throughout whole heightmap. Thus, the ratio of 0.67 between transformed vertices and triangles rendered.

I would like to thank hplus0603 for sparking this idea, because if he hadn't said that you can do better than 1 transformed vertex per rendered triangle (as is the case with the regular implementation of indexed tri-lists/strips), I wouldn't have started thinking about it and wouldn't have come up with this 6x6 quad method. Ratings up, man, for sparking my interest in new methods!

Now I only wonder why on earth this hasn't been said specifically in all those nVidia/ATI papers. They always just say the same thing and thus indirectly lead you to the classical solution of indices by rows (thus 1 transform per new triangle), whereas there is a much better alternative (this 6x6 quad).

And when I consider how easy it actually is: just draw it on paper and it's obvious!

Quote:
Original post by VladR
I just finished my drawing of ideal order of Indices for a Tri-list. I found out that a 6x6 quad of 72 triangles can be rendered with just 48 transforms (assuming a post-VS cache of 24 entries on NV-shader HW). Such a 6x6 quad can be easily continued throughout whole heightmap. Thus, the ratio of 0.67 between transformed vertices and triangles rendered.
You'll actually probably get better results using vertex cache priming.

Quote:
Summarised from Real-Time Rendering, 2nd edition, page 456
To "generalize" sequential strips you introduce a swap operation which swaps the order of the two latest vertices. This is implemented in Iris GL as an actual command ...in OGL and D3D however you have to resend the last vertex. This results in a triangle with no area. Restarting a triangle strip would require 2 vertices so a single vertex is a lesser penalty.


This to me sounds like a degenerate triangle [wink]

Quote:
Original post by Promit
You'll actually probably get better results using vertex cache priming.
Thanks for the link, I'll study it.

I've done some more calculations:
For a 252x252 heightmap (must be divisible by 6), there are 63504 unique vertices, and with the above method of 6x6 quads we need just 77826 transforms (for 127008 triangles, i.e. a ratio of just 0.61!), which means there are just 22% duplicate transforms.
I'll look into that thread to see if and how he manages to get fewer duplicate transforms.

Hugues Hoppe did work on optimising for the vertex cache. His work is part of D3DX, in the form of the Optimize method for meshes and the D3DXOptimizeFaces and D3DXOptimizeVertices functions. Using these functions should result in a pretty good ordering of vertices.

BTW, for regular grids I see no reason to use huge index buffers. It's possible to use a large vertex buffer and then draw it in parts using a smaller index buffer (say, 64x64). Sure it's more draw calls, but the overhead should be small enough, and breaking it down this way is likely to be useful for culling, anyway. So index memory shouldn't be a real issue.
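A rough Python sketch of that scheme: build the index buffer for one tile once, using the row pitch of the full vertex grid, then draw each tile by adding a per-tile base vertex (in Direct3D 9 this is what the `BaseVertexIndex` parameter of `DrawIndexedPrimitive` is for). The grid dimensions and helper names below are assumptions for illustration only.

```python
def tile_indices(tile, pitch):
    """Triangle-list indices for one tile x tile block of quads.

    pitch is the number of vertices per row in the full vertex buffer, so
    the same indices are valid for every tile once a base vertex is added.
    """
    idx = []
    for y in range(tile):
        for x in range(tile):
            v0 = y * pitch + x
            v2 = v0 + pitch
            idx += [v0, v2, v0 + 1, v0 + 1, v2, v2 + 1]
    return idx


def base_vertex(tx, ty, tile, pitch):
    """Offset of tile (tx, ty)'s top-left vertex in the full vertex buffer."""
    return ty * tile * pitch + tx * tile


PITCH, TILE = 129, 64            # assumed: 128x128 quads drawn as 2x2 tiles
shared_ib = tile_indices(TILE, PITCH)
for ty in range(2):
    for tx in range(2):
        print((tx, ty), "base vertex", base_vertex(tx, ty, TILE, PITCH))
```

One thing to watch with this layout: the largest index in the shared buffer is tile * pitch + tile, so with 16-bit indices a very wide vertex buffer can push it past 65535. In that case the tile height or the row pitch has to shrink, or 32-bit indices are needed.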

[Edited by - ET3D on April 24, 2006 6:36:27 PM]
