
DirectX: What are the (dis)advantages of multi-stream multi-index rendering?


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

4 replies to this topic

#1 Xcrypt   Members   -  Reputation: 154


Posted 20 November 2012 - 10:25 AM

What are the advantages and disadvantages of multi-stream multi-index rendering? (as is done in a d3d10 sample)
Is it often used in graphics engines? When should I bother to implement it?

Edited by Xcrypt, 20 November 2012 - 11:29 AM.



#2 Hodgman   Moderators   -  Reputation: 24060


Posted 21 November 2012 - 01:16 AM

It should probably only be used in cases where optimising for memory usage is your highest priority.

The default method of indexed rendering supports only a single index stream, which is used to fetch all attributes. This method definitely has specialized hardware designed for it in the GPU, so indexing is fast.

The multi-index method is implemented by the user making use of the general-purpose shading hardware, so it will have greater overheads, and likely perform worse. The advantage in indexing each attribute separately is that you might end up with less data.

e.g. a cube, with a square texture applied to each face, requires 8 unique positions and 4 unique UV coordinates.
However, in the default method a single index value addresses all attributes, so you need at least 8 UV coordinates (1 for each position). Also, although each face shares positions with its neighbours, it likely won't use the same tex-coords at those positions. As a worst case, you end up with 24 (4 verts * 6 faces) unique vertices (position+UV combinations).
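To make the cube example concrete, here is a minimal C++ sketch of converting multi-index data to the single-index layout by collapsing identical (position, UV) index pairs. The names `Corner`, `WeldedMesh`, `weld` and `makeCubeCorners` are hypothetical, and the UV assignment (every face uses UVs 0..3 in order) is an assumption picked for illustration, not something taken from the D3D10 sample.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// One corner of a face, addressed the multi-index way:
// a separate index per attribute stream.
struct Corner {
    std::uint16_t position; // index into the 8 cube positions
    std::uint16_t uv;       // index into the 4 shared UVs
};

// Result of converting to the single-index layout.
struct WeldedMesh {
    std::vector<Corner> vertices;       // one entry per unique (position, uv) pair
    std::vector<std::uint16_t> indices; // single index stream, one per input corner
};

// Hypothetical helper: collapse identical (position, uv) pairs into one vertex.
WeldedMesh weld(const std::vector<Corner>& corners) {
    WeldedMesh out;
    std::map<std::pair<std::uint16_t, std::uint16_t>, std::uint16_t> seen;
    for (const Corner& c : corners) {
        const auto key = std::make_pair(c.position, c.uv);
        auto it = seen.find(key);
        if (it == seen.end()) {
            const auto newIndex = static_cast<std::uint16_t>(out.vertices.size());
            it = seen.emplace(key, newIndex).first;
            out.vertices.push_back(c);
        }
        out.indices.push_back(it->second);
    }
    return out;
}

// Cube corners as quads; position index = x + 2*y + 4*z, and every face
// uses UVs 0..3 in order (an assumed mapping, chosen for illustration).
std::vector<Corner> makeCubeCorners() {
    static const std::uint16_t faces[6][4] = {
        {0, 4, 6, 2}, {1, 3, 7, 5}, // -X, +X
        {0, 1, 5, 4}, {2, 6, 7, 3}, // -Y, +Y
        {0, 2, 3, 1}, {4, 5, 7, 6}, // -Z, +Z
    };
    std::vector<Corner> corners;
    for (const auto& face : faces)
        for (std::uint16_t i = 0; i < 4; ++i)
            corners.push_back({face[i], i});
    return corners;
}
```

Calling `weld(makeCubeCorners())` with this particular mapping gives 20 unique vertices: two cube corners happen to get the same UV on all three faces that touch them, and the other six get three different UVs each. A different UV layout can push that up to the worst case of 24.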

There might also be a motivation to use it if you're writing visualisation software for formats that use multiple index streams natively, such as OBJ and COLLADA files, and you don't want to bother converting the data to single-index format.

Edited by Hodgman, 21 November 2012 - 01:20 AM.


#3 RobMaddison   Members   -  Reputation: 608


Posted 21 November 2012 - 05:10 AM

So would a AAA console game where memory (and/or bandwidth) is, or is preferably, tight typically use separate streams or would they use the default stream? Or would it be on a game-by-game basis?

#4 hupsilardee   Members   -  Reputation: 485


Posted 21 November 2012 - 06:25 PM

IIRC from something I read elsewhere, when the hardware reads in a vertex the read is done in 32 byte chunks from the vertex buffer, every time. So let's say your vertex shader input looks like this:

struct Vertex
{
	 float3 Position : POSITION;  // 12 bytes
	 float3 Normal   : NORMAL;    // 12 bytes
	 float2 TexCoord : TEXCOORD0; // 8 bytes, 32 bytes per vertex in total
};

and your vertex buffers are set up like this:
// Vertex buffer 1 - Positions
pos1 | pos2 | pos3 | pos4 | ...
// Vertex buffer 2 - Normals
norm1 | norm2 | norm3 | norm4 | ...
// Vertex buffer 3 - TexCoords
tex1 | tex2 | tex3 | tex4 | ...

Then the device has to do three 32-byte reads per vertex, for a total of 96 bytes, 64 of which are useless and will be discarded. However, if I packed everything into one vertex buffer:
// Vertex buffer 1 - Positions, normals and texcoords all interleaved
pos1 | norm1 | tex1 | pos2 | norm2 | tex2 | pos3 | norm3 | tex3 | pos4 | norm4 | tex4 | ...
then the device only reads one 32-byte chunk per vertex, which is a 3x bandwidth saving.
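The arithmetic above can be written down as a small C++ sketch. It only models the assumption stated in this post, that every read from a stream is rounded up to a whole 32-byte chunk; it ignores caching, so treat the numbers as an illustration of the argument rather than a measurement. `kChunkBytes`, `fetchedBytes` and `fetchedPerVertex` are hypothetical names.

```cpp
#include <cstddef>
#include <initializer_list>

// Assumed fetch granularity, taken from the post above (not a measured value).
constexpr std::size_t kChunkBytes = 32;

// Bytes fetched for one element of a stream when reads are rounded up
// to whole chunks.
constexpr std::size_t fetchedBytes(std::size_t strideBytes) {
    return (strideBytes + kChunkBytes - 1) / kChunkBytes * kChunkBytes;
}

// Total bytes fetched per vertex across all bound streams,
// given the stride of each stream in bytes.
inline std::size_t fetchedPerVertex(std::initializer_list<std::size_t> strides) {
    std::size_t total = 0;
    for (std::size_t stride : strides)
        total += fetchedBytes(stride);
    return total;
}
```

With three separate streams, `fetchedPerVertex({12, 12, 8})` gives 96 bytes, while the interleaved layout, `fetchedPerVertex({32})`, gives 32 bytes.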

If you're complicating things further by using different indices, that's another lot of Buffer<ushort>::Load() calls slowing down the shader. There's no point sacrificing texture samples to save a little GPU memory; I bet your entire game's vertex content weighs less than your 2048x2048 shadow map anyway.
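To picture those extra loads, here is a CPU-side C++ sketch of what a manual multi-index fetch has to do for each vertex. It is not shader code: in a real DX10-level vertex shader the vectors would be buffer resources read with `Load()` and keyed by the vertex ID. `MultiIndexMesh`, `FetchedVertex`, `load` and `fetchVertex` are hypothetical names used only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Float3 { float x, y, z; };
struct Float2 { float u, v; };

// CPU stand-ins for the buffers the vertex shader would bind.
struct MultiIndexMesh {
    std::vector<Float3> positions;
    std::vector<Float3> normals;
    std::vector<Float2> texCoords;
    std::vector<std::uint16_t> positionIndices; // one entry per corner
    std::vector<std::uint16_t> normalIndices;
    std::vector<std::uint16_t> texCoordIndices;
};

struct FetchedVertex {
    Float3 position;
    Float3 normal;
    Float2 texCoord;
    int loads; // number of buffer loads performed for this vertex
};

// Bounds-checked read that also counts how many loads were issued.
template <typename T>
T load(const std::vector<T>& buffer, std::size_t index, int& loadCount) {
    ++loadCount;
    return buffer.at(index);
}

// Emulates the manual fetch: each attribute costs one index load
// plus one dependent attribute load.
FetchedVertex fetchVertex(const MultiIndexMesh& mesh, std::size_t vertexId) {
    FetchedVertex v{};
    const std::uint16_t p = load(mesh.positionIndices, vertexId, v.loads);
    const std::uint16_t n = load(mesh.normalIndices, vertexId, v.loads);
    const std::uint16_t t = load(mesh.texCoordIndices, vertexId, v.loads);
    v.position = load(mesh.positions, p, v.loads);
    v.normal = load(mesh.normals, n, v.loads);
    v.texCoord = load(mesh.texCoords, t, v.loads);
    return v;
}
```

For this three-attribute vertex the sketch issues six loads per vertex, three of which depend on the result of an earlier load, whereas the single-index path leaves the one index fetch to the fixed-function hardware.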

(Doesn't everybody have a 2048x2048 shadow map these days? Or several :))

Edited by hupsilardee, 21 November 2012 - 06:30 PM.


#5 Hodgman   Moderators   -  Reputation: 24060


Posted 21 November 2012 - 07:33 PM

RobMaddison said: "So would a AAA console game where memory (and/or bandwidth) is, or is preferably, tight typically use separate streams or would they use the default stream? Or would it be on a game-by-game basis?"

When making modern games on 6 year old hardware, everything is done on a game-by-game basis.

Going by publicly accessible information (to respect NDAs), Wikipedia says the PS3's GPU uses the G70 architecture, which is DX9-level. This multiple-index-stream technique requires a DX10-level GPU that can perform manual fetching from buffers in the vertex shader. However, the PS3 has the SPUs, which are fully programmable (and much more powerful than its GPU), so in theory you could use the SPUs to do your vertex shading, but this would require careful synchronisation between the SPUs and GPU... Options to be evaluated game-by-game.

hupsilardee said: "I bet your entire game's vertex content weighs less than your 2048x2048 shadow map anyway"

That's a good point -- vertices are cheap. You can fit half a million of your hypothetical vertex structure into the same space as that single texture.
If you did want to save vertex space, you might be better off storing the normal and tex-coord in 16 bits per component instead of full 32-bit floats.
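Both points can be checked with a short C++ sketch. It assumes a shadow map with 4 bytes per texel, and it picks normalised 16-bit integers as the compact encoding; half floats would be another option, and would be needed for tex-coords that tile outside [0, 1]. `FatVertex`, `PackedVertex`, `packSnorm16`, `packUnorm16` and `verticesPerTexture` are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Full-precision layout from the thread: 12 + 12 + 8 = 32 bytes.
struct FatVertex {
    float position[3];
    float normal[3];
    float texCoord[2];
};

// Assumed compact layout: normal as signed-normalised 16-bit (padded to
// four components), tex-coord as unsigned-normalised 16-bit.
struct PackedVertex {
    float position[3];
    std::int16_t normal[4];
    std::uint16_t texCoord[2];
};

static_assert(sizeof(FatVertex) == 32, "expected a 32-byte vertex");
static_assert(sizeof(PackedVertex) == 24, "expected a 24-byte vertex");

// Clamp to [-1, 1] and map to [-32767, 32767].
inline std::int16_t packSnorm16(float value) {
    const float clamped = std::min(1.0f, std::max(-1.0f, value));
    return static_cast<std::int16_t>(std::lround(clamped * 32767.0f));
}

// Clamp to [0, 1] and map to [0, 65535].
inline std::uint16_t packUnorm16(float value) {
    const float clamped = std::min(1.0f, std::max(0.0f, value));
    return static_cast<std::uint16_t>(std::lround(clamped * 65535.0f));
}

// How many whole vertices fit in the memory footprint of a square texture.
constexpr std::size_t verticesPerTexture(std::size_t side,
                                         std::size_t bytesPerTexel,
                                         std::size_t vertexBytes) {
    return side * side * bytesPerTexel / vertexBytes;
}
```

Under these assumptions, `verticesPerTexture(2048, 4, sizeof(FatVertex))` is 524288, which matches the "half a million" figure, and the 24-byte packed layout fits 699050 vertices in the same space.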



