
## DirectX: What are the (dis)advantages of multi-stream multi-index rendering?


### #1Xcrypt  Members

Posted 20 November 2012 - 10:25 AM

What are the advantages and disadvantages of multi-stream multi-index rendering (as is done in a D3D10 sample)?
Is it often used in graphics engines? When should I bother to implement it?

Edited by Xcrypt, 20 November 2012 - 11:29 AM.

### #2Hodgman  Moderators

Posted 21 November 2012 - 01:16 AM

It should probably only be used in cases where optimising for memory usage is your highest priority.

The default method of using indexed rendering only supports a single index stream, which is used to fetch all attributes. This method definitely has specialized hardware designed for it in the GPU, so that indexing is fast.

The multi-index method is implemented by the user on top of the general-purpose shading hardware, so it will have greater overheads and will likely perform worse. The advantage of indexing each attribute separately is that you might end up with less data.
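To make that concrete, here's a CPU-side C++ sketch (names are illustrative, not from the D3D10 sample) of what the shader-side work amounts to: each attribute stream has its own index stream, and the vertex is assembled by looking up each attribute through its own index, mirroring what a DX10 vertex shader would do with `SV_VertexID` plus `Buffer<>.Load()` calls.

```cpp
#include <cstddef>
#include <vector>

// Minimal attribute types (stand-ins for HLSL float3/float2).
struct Float3 { float x, y, z; };
struct Float2 { float u, v; };

struct AssembledVertex { Float3 position; Float2 texcoord; };

// Multi-index fetch: one index stream per attribute stream.
// The attribute arrays only need to hold unique values; the cost is
// extra loads per vertex compared to the single-index path.
AssembledVertex fetchVertex(std::size_t vertexId,
                            const std::vector<Float3>& positions,
                            const std::vector<Float2>& texcoords,
                            const std::vector<unsigned>& posIndices,
                            const std::vector<unsigned>& uvIndices)
{
    return { positions[posIndices[vertexId]],
             texcoords[uvIndices[vertexId]] };
}
```

With single-index rendering both lookups would share one index, so any corner needing a different (position, UV) pairing forces a duplicated vertex instead.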

e.g. a cube, with a square texture applied to each face, requires 8 unique positions and 4 unique UV coordinates.
However, in the default method, because a single index value is used to address all attributes, you need at least 8 UV coordinates (one for each position). Also, although each face shares positions with its neighbours, it likely doesn't use the same tex-coords. As a worst case, you end up with 24 (4 verts * 6 faces) unique vertices (position+UV combinations).

There might also be a motivation to use it if you're writing visualisation software for formats that use multiple index streams natively, such as OBJ and COLLADA files, and you don't want to bother converting the data to single-index format.
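That conversion to single-index format can be sketched in a few lines of C++ (the names are illustrative, not from any particular loader): each unique (position index, UV index) combination becomes one output vertex, and corners that repeat a combination just reuse its output index.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// One OBJ-style corner: separate indices into the position and UV streams.
struct Corner { unsigned posIdx; unsigned uvIdx; };

struct SingleIndexMesh {
    std::vector<Corner> vertices;   // unique (pos, uv) combinations
    std::vector<unsigned> indices;  // one index per input corner
};

// Deduplicate multi-index corners into a single-index vertex/index pair.
SingleIndexMesh toSingleIndex(const std::vector<Corner>& corners)
{
    SingleIndexMesh mesh;
    std::map<std::pair<unsigned, unsigned>, unsigned> seen;
    for (const Corner& c : corners) {
        auto key = std::make_pair(c.posIdx, c.uvIdx);
        auto it = seen.find(key);
        if (it == seen.end()) {
            // First time this combination appears: emit a new vertex.
            it = seen.emplace(key,
                    static_cast<unsigned>(mesh.vertices.size())).first;
            mesh.vertices.push_back(c);
        }
        mesh.indices.push_back(it->second);
    }
    return mesh;
}
```

For two triangles sharing an edge with matching UVs this emits 4 vertices and 6 indices; a UV seam along the shared edge would instead force duplicated positions, which is exactly the data growth described above.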

Edited by Hodgman, 21 November 2012 - 01:20 AM.

### #3Xcrypt  Members

Posted 21 November 2012 - 05:10 AM

So would an AAA console game, where memory (and/or bandwidth) is tight, typically use separate index streams, or the default single stream? Or would it be decided on a game-by-game basis?

### #4hupsilardee  Members

Posted 21 November 2012 - 06:25 PM

IIRC from something I read elsewhere, when the hardware reads a vertex, the read is done in 32-byte chunks from the vertex buffer, every time. So let's say your vertex shader input looks like this:

```hlsl
struct Vertex
{
    float3 Position;
    float3 Normal;
    float2 TexCoord;
};
```


and your vertex buffers are set up like this:

```
// Vertex buffer 1 - Positions
pos1 | pos2 | pos3 | pos4 | ...
// Vertex buffer 2 - Normals
norm1 | norm2 | norm3 | norm4 | ...
// Vertex buffer 3 - TexCoords
tex1 | tex2 | tex3 | tex4 | ...
```


Then the device has to do three 32-byte reads per vertex, for a total of 96 bytes, 64 of which are useless and will be discarded. However, if I packed everything into one vertex buffer:

```
// Vertex buffer 1 - Positions, normals and texcoords all interleaved
pos1 | norm1 | tex1 | pos2 | norm2 | tex2 | pos3 | norm3 | tex3 | pos4 | norm4 | tex4 | ...
```

Then the device only reads one 32-byte chunk per vertex, which is a 3x bandwidth saving.
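Assuming float3 and float2 map to three and two 32-bit floats, the interleaved layout above happens to fill one 32-byte chunk exactly; a quick CPU-side C++ sanity check of that arithmetic (the struct name is just for illustration):

```cpp
#include <cstddef>

// CPU-side mirror of the interleaved vertex layout:
// 12 + 12 + 8 = 32 bytes, exactly one 32-byte fetch.
struct InterleavedVertex {
    float position[3];  // 12 bytes
    float normal[3];    // 12 bytes
    float texcoord[2];  //  8 bytes
};

static_assert(sizeof(InterleavedVertex) == 32,
              "interleaved vertex fits one 32-byte chunk");

// Separate streams: each attribute costs its own 32-byte read,
// so 3 reads * 32 bytes = 96 bytes moved for 32 bytes of payload.
constexpr std::size_t kChunkBytes          = 32;
constexpr std::size_t kSeparateStreamBytes = 3 * kChunkBytes;  // 96
constexpr std::size_t kInterleavedBytes    = 1 * kChunkBytes;  // 32
```

Note this 32-byte granularity is as reported in the post ("IIRC"); the exact fetch size varies by GPU, but the interleaving argument holds either way.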

If you're complicating things further by using different indices, that's another lot of Buffer&lt;ushort&gt;::Load() calls slowing down the shader. There's no point sacrificing texture samples to save a little GPU memory; I bet your entire game's vertex content weighs less than your 2048x2048 shadow map anyway.

(Doesn't everybody have a 2048x2048 shadow map these days? Or several?)

Edited by hupsilardee, 21 November 2012 - 06:30 PM.

### #5Hodgman  Moderators

Posted 21 November 2012 - 07:33 PM

> So would an AAA console game, where memory (and/or bandwidth) is tight, typically use separate index streams, or the default single stream? Or would it be decided on a game-by-game basis?

When making modern games on 6-year-old hardware, everything is done on a game-by-game basis.

Going by publicly accessible information (to respect NDAs): Wikipedia says the PS3's GPU uses the G70 architecture, which is DX9-level, while this multiple-index-stream technique requires a DX10-level GPU that can perform manual fetching from buffers in the vertex shader. However, the PS3 has the SPUs, which are fully programmable (and much more powerful than its GPU), so in theory you could use the SPUs to do your vertex shading, but this would require careful synchronisation between the SPUs and the GPU. Options to be evaluated game-by-game.