geometry shader discard count stream out

10 comments, last by rouncer 12 years, 7 months ago
Is this even possible?

Is there some way to detect how many verts passed through the geometry shader to a stream out? You actually have to declare a static buffer for the stream out to go into... or do you?

Last time I asked, ages ago (yes, I feel the need to ask again), someone said you could access some kind of debug structure in D3D at runtime to check?

Remember, I can't just check the size of the buffer because it's static, which only leaves it with trailing -1's. I can copy it to another buffer, but only after I've sent the whole damn array from GPU to CPU, when all I needed was the bit of the buffer I used.

You see, I need to discard some, so I can't just calculate it from the max output verts of the geometry shader; it could be different every time.

I've got no idea, that's why I'm asking.
With DX10/11 it's definitely possible to find out how much geometry was spawned by a GS. However, you still need to allocate a buffer big enough to store the highest possible number of vertices to be generated, beforehand.

One way is to use ID3D11Device::CreateQuery() with D3D11_QUERY_SO_STATISTICS_STREAM0, stream out (i.e. issue a draw call with stream-out enabled, bracketed by ID3D11DeviceContext::Begin() and End() on the query), and finally call ID3D11DeviceContext::GetData() and look at D3D11_QUERY_DATA_SO_STATISTICS::NumPrimitivesWritten. The other option is ID3D11DeviceContext::DrawAuto(), which automatically determines the amount of data in a buffer that was previously used for stream-out (you bind this buffer to the input assembler stage).

A query might inflict a performance penalty, as the driver will have to finish some things and might make the CPU wait. On the other hand, DrawAuto will not tell the CPU how much was rendered/generated. They are two completely different things, but I thought both might be relevant to your question.
Ah! Thanks HEAPS man! I'll test it out.
I tried querying, but it's just giving me garbage numbers when I ask for the primitive count. Can you see a problem with this code?



D3D11_QUERY_DESC queryDesc;
queryDesc.Query=D3D11_QUERY_SO_STATISTICS_STREAM0;
queryDesc.MiscFlags=NULL;
ID3D11Query * pQuery;
dev->CreateQuery(&queryDesc, &pQuery);
dc->Begin(pQuery);
dc->Draw(4096*4096,0);
dc->End(pQuery);
D3D11_QUERY_DATA_SO_STATISTICS queryData;
dc->GetData(pQuery, &queryData, sizeof(D3D11_QUERY_DATA_SO_STATISTICS), 0);
pQuery->Release();
chunk_size[chunks]=queryData.NumPrimitivesWritten;

You have to check the HRESULT returned by GetData to see if the data you're querying is actually there yet (it will take a while for the GPU to catch up with the commands issued by the CPU).
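For reference, the value returned by GetData falls into three cases: S_OK (0) means the result has been written, S_FALSE (1) means the GPU has not produced it yet, and any negative HRESULT is a failure. The following is only a minimal, portable sketch of that check. `HResult`, `QueryState`, `Classify`, and `PollUntilDone` are hypothetical names that are not part of D3D11, and the `poll` callable merely stands in for a call such as `dc->GetData(pQuery, &queryData, sizeof(queryData), 0)`.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the Windows HRESULT type so the sketch compiles anywhere.
using HResult = std::int32_t;
constexpr HResult kOk = 0;     // numeric value of S_OK
constexpr HResult kFalse = 1;  // numeric value of S_FALSE

enum class QueryState { Ready, Pending, Failed };

// Sorts a GetData-style return code into the three cases described above.
QueryState Classify(HResult hr)
{
    if (hr < 0) return QueryState::Failed;  // failure codes have the high bit set
    return hr == kOk ? QueryState::Ready : QueryState::Pending;
}

// Calls `poll` until it stops reporting "pending" or maxAttempts is reached.
// Returns Pending if the result never arrived within the attempt budget.
template <typename PollFn>
QueryState PollUntilDone(PollFn poll, int maxAttempts)
{
    for (int i = 0; i < maxAttempts; ++i) {
        const QueryState state = Classify(poll());
        if (state != QueryState::Pending) return state;
    }
    return QueryState::Pending;
}
```

Only read NumPrimitivesWritten when the state is Ready; in the other two cases the output struct may still hold whatever was in it before the call. Bounding the loop also avoids spinning forever if the call keeps failing.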
How badly does it hurt performance? Am I better off copying all the data?
It's returning S_FALSE... what does that mean?

OK, I read up on it and put it in a while loop until it returned S_OK. How badly does it hurt performance though?
I went from 4 seconds to 6 seconds generating a 1024x1024x1024 voxel sphere... is there some quicker way to do this?
Or is it possible to make lots of queries, then grab all the data at the end?
It'd be best to begin the query, then render some more stuff (something unrelated) and read the query at the very end, when it's really necessary. One way or another, you'll have to wait until the result is ready.
The hit here is that the driver won't have any chance of hiding latencies by reorganising work as it sees fit and will have to wait for this particular draw to end, in order to give you the results.
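The "issue many queries, read them later" idea can be sketched roughly as below. This is an assumption-laden illustration, not D3D11 code: `PendingQuery` and `CollectReady` are hypothetical names, and each `poll` callback stands in for a non-blocking GetData call on one query created per chunk (returning true and filling in the count once the result is ready, false while it is still pending).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical record for one in-flight stream-out query.
struct PendingQuery {
    std::size_t chunk;                          // index of the chunk the draw filled
    std::function<bool(std::uint64_t&)> poll;   // stand-in for a GetData call
};

// Polls every outstanding query once. Results that are ready are stored in
// chunkSize; queries that are still pending stay in the list for a later pass.
// Returns the number of queries still pending.
std::size_t CollectReady(std::vector<PendingQuery>& pending,
                         std::vector<std::uint64_t>& chunkSize)
{
    std::vector<PendingQuery> stillPending;
    for (PendingQuery& q : pending) {
        std::uint64_t count = 0;
        if (q.poll(count)) {
            chunkSize[q.chunk] = count;
        } else {
            stillPending.push_back(std::move(q));
        }
    }
    pending.swap(stillPending);
    return pending.size();
}
```

The intended use would be to push one entry per draw while generating all chunks, call CollectReady occasionally between unrelated work, and only wait on whatever is left once the counts are really needed. Each chunk needs its own query object for this to work, since a single query cannot track several draws separately.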
