multiple streamsources

Started by
9 comments, last by mattnewport 19 years, 5 months ago
We all love IDirect3DVertexDeclaration, but I've found out that I don't love it anymore :( Using vertex declarations you can send geometry data down one stream source, vertex normals through another stream source, and texture coordinates through a third stream - as long as your device is capable of multiple streams. I tried to send vertex data over several stream sources (one face over stream source 0 and another face over stream source 1), but this seems to result in unpredictable behaviour, so I'll ask here: is there anyone who has successfully "parallelized" geometry data over several stream sources using IDirect3DVertexDeclaration?
Ethereal
What unpredictable behavior have you experienced? You *should not* be using a separate stream source for every single vertex element (ie a position stream, normal stream, tex coord stream, tangent stream, etc...), for several reasons:

(1) You allocate one additional vertex buffer for each stream. The memory overhead of each additional buffer isn't that high, but it can degrade performance in areas such as device resetting.

(2) You have to call SetStreamSource() for each stream. SetStreamSource() is one of the most expensive functions in Direct3D, so calling it should be kept to a minimum.

(3) What if you need access to all elements of all vertices in software? You'll have to perform a Lock() on each buffer and combine the results. And I bet you guessed this: locking vertex or index buffers is even worse than SetStreamSource(), because it forces a change from user mode to kernel mode (and back again).

If you have a small vertex format, this may work fine for you. It's when you start using vertices that hold a lot of data (ie skinning info, etc...) that you run into problems. Using a few streams (2-3 seems common) can be effective in many situations. I've used it in many projects where I'm rendering models with different vertex types, or where the model data is segmented when it's loaded. Just don't go overboard with it, and it can be really useful.
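To make the 2-3 stream idea concrete, here's a minimal sketch of a two-stream vertex declaration (position + normal interleaved in stream 0, texcoords alone in stream 1). On a real build you'd include <d3d9.h> and use D3DVERTEXELEMENT9 with the D3DDECLTYPE_*/D3DDECLUSAGE_* constants; the stand-in types and enum values below are placeholders so the layout arithmetic can be checked without the SDK.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins mirroring the shape of D3DVERTEXELEMENT9 (values are placeholders).
typedef uint16_t WORD;
typedef uint8_t  BYTE;
enum DeclType  { TYPE_FLOAT2, TYPE_FLOAT3, TYPE_END };
enum DeclUsage { USAGE_POSITION, USAGE_NORMAL, USAGE_TEXCOORD };

struct VertexElement {
    WORD Stream;      // which stream source feeds this element
    WORD Offset;      // byte offset within that stream's vertex
    BYTE Type;        // data type (FLOAT3 = 12 bytes, FLOAT2 = 8 bytes)
    BYTE Method;      // tessellation method, 0 = default
    BYTE Usage;       // semantic (position/normal/texcoord)
    BYTE UsageIndex;
};

// Stream 0 carries interleaved position + normal; stream 1 carries texcoords.
static const VertexElement decl[] = {
    { 0,    0, TYPE_FLOAT3, 0, USAGE_POSITION, 0 },
    { 0,   12, TYPE_FLOAT3, 0, USAGE_NORMAL,   0 },
    { 1,    0, TYPE_FLOAT2, 0, USAGE_TEXCOORD, 0 },
    { 0xFF, 0, TYPE_END,    0, 0,              0 }, // terminator, like D3DDECL_END()
};

static unsigned typeSize(BYTE t) { return t == TYPE_FLOAT3 ? 12u : 8u; }

// The stride of one stream = highest (offset + element size) declared for it.
unsigned streamStride(unsigned stream) {
    unsigned stride = 0;
    for (const VertexElement* e = decl; e->Stream != 0xFF; ++e)
        if (e->Stream == stream && e->Offset + typeSize(e->Type) > stride)
            stride = e->Offset + typeSize(e->Type);
    return stride;
}
```

You'd then call SetStreamSource() once per stream with the matching stride (24 for stream 0, 8 for stream 1 here) - which is exactly why keeping the stream count low keeps the SetStreamSource() overhead low.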

As for which devices support multiple streams, a quick check of the Caps database yielded the following:

   VENDOR      CARD              MAX STREAMS
   ------      ----              -----------
   ATI         Rage LT Pro        1
   ATI         Radeon 7000        8
   ATI         Radeon 7500       12
   ATI         Radeon 8500       12
   ATI         Radeon 9000        8
   ATI         Radeon 9700       16
   ATI         Radeon 9800       16
   Matrox      Millennium G550    1
   Nvidia      TNT2               0
   Nvidia      GeForce2          16
   Nvidia      GeForce 3         16
   Nvidia      GeForce 4         16
   Nvidia      GeForceFX         16
   3dfx        Voodoo3            0
   S3          Virge              0
   Intel       i815/845/865       1


The maximum number of streams is 16.
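Given that spread, you should check the device caps at startup and clamp your stream usage accordingly. On a real build this is IDirect3DDevice9::GetDeviceCaps() and the D3DCAPS9::MaxStreams field; here a stand-in struct lets the fallback logic be sketched and exercised directly.

```cpp
#include <cassert>

// Stand-in for the relevant D3DCAPS9 field (assumed; filled by GetDeviceCaps).
struct Caps { unsigned MaxStreams; };

// Decide how many streams to actually use: the layout we want, clamped to
// what the hardware reports. Cards reporting 0 or 1 get a single stream.
unsigned streamsToUse(const Caps& caps, unsigned wanted) {
    unsigned supported = caps.MaxStreams ? caps.MaxStreams : 1;
    return wanted < supported ? wanted : supported;
}
```

A multi-stream renderer needs a single-stream fallback path anyway for the TNT2/Virge class of hardware in the table above, so this check just selects between the two.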
Dustin Franklin ( circlesoft :: KBase :: Mystic GD :: ApolloNL )
My current engine is set up to do either single stream, or multiple streams (geometry, bump, bones, tex coords). Stress testing has shown that multiple streams are a good bit slower than a single stream in the same situation, probably because I'm doing 2 - 4 SetStreamSource calls instead of 1 for each vb. In addition, the added complexity of managing multiple streams for each vertex buffer was becoming a headache. I'm happily back to single streams for now.

joe
image space
I'm not sure how you expect to send 1 face on stream0 and 1 face on stream1... both faces will receive data from both streams.

I've successfully used 1, 2, and 3 streams on both PC and XBox, and it's always worked as advertised. Granted, there is overhead to streams and I've since rewritten the code to use just one stream.

For example, here's a quad in 3 streams (and no, this isn't how the 3 streams I mentioned above were laid out... one was for bone data, one was for custom optional data):

Stream0: offset 0 = pos, offset 12 = normal, stride = 24
0: Pos0
12: Norm0
24: Pos1
36: Norm1
48: Pos2
60: Norm2
72: Pos3
84: Norm3

Stream1: offset 0 = D3DCOLOR, stride = 4
0: Diffuse0
4: Diffuse1
8: Diffuse2
12: Diffuse3

Stream2: offset 0 = UVFloat2, stride = 8
0: U0
4: V0
8: U1
12: V1
16: U2
20: V2
24: U3
28: V3
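The byte offsets in that layout all follow one rule: each stream is indexed with the same vertex number, so vertex i's attribute in a stream sits at i * stride + attributeOffset. A tiny sketch of that arithmetic (strides 24/4/8 taken from the layout above):

```cpp
#include <cassert>

// Byte offset of vertex i's attribute within one stream's vertex buffer.
// Each stream has its own stride, but all streams share the vertex index.
unsigned elementOffset(unsigned vertexIndex, unsigned stride, unsigned attrOffset) {
    return vertexIndex * stride + attrOffset;
}
```

This shared-index property is also why you can't put one face's vertices in stream 0 and another face's in stream 1: the same index fetches from every bound stream.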

You're all right, it is slower to use multiple stream sources, and yes, I have managed to stream vertex data over one stream and texture coords over another. This was just a simple test to see if it worked to send geometry data over several streams.
Ethereal
Hmm.. I really do think it is a good idea to use multiple streams anyway, even though it's fairly expensive. What if you'd like 5 or 6 different vertex formats: geometry data, normal data and texture coords for one mesh, and geometry data, normal data, texture coords and diffuse data for another mesh? You've got some options. You could create a rough, general vertex format covering all possible mixtures of data:

// even though some meshes lack diffuse color, we'll use this format
struct VERTEX
{
    float x, y, z;
    float nx, ny, nz;
    float u, v;
    unsigned long diffuse;
};


Or you could create one vertex structure for each and every combination of formats, with minimal overhead but tons of structures:

// vertex format for meshes without diffuse data
struct VERTEX_A
{
    float x, y, z;
    float nx, ny, nz;
    float u, v;
};


If you'd use multiple streams here, you can create one vertex buffer with geometry data, one with normal data, one with texture coords and one with diffuse data, and use multiple stream sources and vertex declarations to interleave the data you currently need:

pseudo:

if (use_vertex_data)  interleave(Pipeline::VertexData,  stream0)
if (use_normal_data)  interleave(Pipeline::NormalData,  stream1)
if (use_diffuse_data) interleave(Pipeline::DiffuseData, stream2)

Pipeline::VertexData->SetAsStream(0);
Pipeline::NormalData->SetAsStream(1);
Pipeline::DiffuseData->SetAsStream(2);
CreateAppropriateShaderDeclaration()->Bind();
DrawPrimitives();


This will be more flexible than using single-stream vertex buffers, and if you keep your vertex buffers at an optimal size, the overhead of changing stream sources will be small.
Ethereal
The problem with that is when you try to index the data... Say you use diffuse on one mesh, don't on a second, and then use it again on the third.

VB0:pos[100],pos[50],pos[75]
VB1:diffuse[100],diffuse[75]

You can't index the streams differently like that. You could use the byte offset parameter of SetStreamSource(), but that requires a DX9 card that supports stream offsets (the D3DDEVCAPS2_STREAMOFFSET cap)... instead you'll have to do this:

VB0:pos[100],pos[50],pos[75]
VB1:diffuse[100],wasted[50],diffuse[75]
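The cost of that `wasted[50]` region is easy to quantify. Here's a small sketch (names hypothetical) of the packing problem above: three meshes of 100, 50, and 75 vertices share a position VB, but only the first and third have diffuse data, so without per-stream byte offsets the diffuse VB must carry dead space for the middle mesh to keep both streams indexed by the same vertex numbers.

```cpp
#include <cassert>
#include <cstddef>

const size_t kColorSize   = 4;                    // one D3DCOLOR per vertex
const size_t meshVerts[]  = {100, 50, 75};        // the VB0 layout above
const bool   hasDiffuse[] = {true, false, true};  // the VB1 layout above

// Bytes of wasted padding in the diffuse VB: every mesh without diffuse
// still occupies vertexCount * kColorSize of dead space in the buffer.
size_t wastedDiffuseBytes() {
    size_t wasted = 0;
    for (int i = 0; i < 3; ++i)
        if (!hasDiffuse[i])
            wasted += meshVerts[i] * kColorSize;
    return wasted;
}
```

200 bytes is nothing here, but the waste scales with every format mismatch packed into the shared buffers, which is one reason to group meshes by format instead.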

Instead, I create a (or many) VB(s) per vertex format. I allow multiple streams, sure, but currently everything the engine does itself uses one stream.

Mesh needs position, normal, diffuse, and 1 UV? Ok, search for or make a format, and find or create some VB storage for the mesh.

Mesh needs position, normals, diffuse, uv, boneweights, and boneids? Ok, search for or make a format for that, and find or create some VB storage for the mesh.

Meshes with the same format can share VBs. Meshes with different formats have different VBs. I only ever *need* 1 VB, but if the user of the engine wants to use several VBs in multiple streams, then sure, I support that too... A multi-stream format is just another format. If meshes share the same multi-stream format, they can share those VBs.
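A minimal sketch of that "one VB per vertex format" bookkeeping (names hypothetical): meshes look up shared buffer storage keyed by their vertex format, so same-format meshes share a VB and different formats get their own.

```cpp
#include <cassert>
#include <map>
#include <string>

// Shared vertex-buffer storage keyed by a vertex format description.
// A real engine would hold IDirect3DVertexBuffer9 pointers and handle
// capacity/suballocation; an int id stands in for the buffer here.
struct VertexBufferPool {
    std::map<std::string, int> byFormat; // format key -> buffer id
    int nextId = 0;

    // Find the buffer for this format, or create one if none exists yet.
    int acquire(const std::string& formatKey) {
        auto it = byFormat.find(formatKey);
        if (it != byFormat.end()) return it->second;
        return byFormat[formatKey] = nextId++;
    }
};
```

A multi-stream format fits this scheme unchanged: its key just describes several streams, and the pool entry holds one buffer per stream instead of one.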
If I got your point correctly, that can be solved by sorting the meshes by vertex declaration. First batch all the meshes with diffuse, and then the meshes without diffuse...
Unfortunately I'm high on caffeine and other weird things, so I can't think clearly atm... I'll return when I've implemented a beta of the system.
Ethereal
Quote:Original post by Metus
Hmm.. I really do think it is a good idea anyway to use multiple streams, even though it's fairly expensive.


It's effective to hold groups of data separate (ie buffer for pos, normal, texcoord...buffer for skinning data...buffer for extra data), but having a buffer per element is a little much. I don't really agree with splitting the position, normal, and texcoord elements up, because:

(1) Almost all vertices have them. Think of a time when you render vertices that don't have a position, normal, and tex coords. Of course, you could be doing 2D stuff, but that isn't normal 3D rendering, anyways.

(2) If you ever need to get position and normal data out of the buffers (say, for per-triangle collision detection), your performance is going to drop dramatically, since it's going to require at least twice the amount of locks.

Quote:
This will be more flexible than using single-stream vertexbuffers and if you'll keep your vertexbuffer at optimal size, the overhead of changing streamsource will be small

This is incorrect. The cost of a SetStreamSource() call is not directly correlated to the size of the vertex buffer(s) you are setting. Just because you are setting a buffer 1/4 the size *doesn't* mean the SetStreamSource() call is going to be 4 times as fast.
Dustin Franklin ( circlesoft :: KBase :: Mystic GD :: ApolloNL )
Quote:Original post by circlesoft

This is incorrect. The cost of a SetStreamSource() call is not directly correlated to the size of the vertex buffer(s) you are setting. Just because you are setting a buffer 1/4 the size *doesn't* mean the SetStreamSource() call is going to be 4 times as fast.


Of course, but if you can lock one big vertex buffer instead of 4 smaller vertex buffers, that's good, and if you can lock 4 big vertex buffers instead of 16 smaller ones, that's even better...

But as I may not have stated: this is nothing I've seen in action, it's just theoretically good on paper.

edit: yes, it might be a little overkill to split position, normal and texture data into 3 separate streams...
Ethereal

This topic is closed to new replies.
