CPU - GPU code execution

(Please ignore my previous post without any subject. Extremely sorry for that.) I have a basic question about how CPU and GPU code execution works. We write the following code in the application part (in the draw call, here in a *.cpp file):


for( UINT uPass = 0; uPass < uPasses; ++uPass )
{
    g_pEffect->BeginPass( uPass );
    g_pd3dDevice->SetStreamSource( 0, g_pVertexBuffer, 0, sizeof(Vertex) );
    g_pd3dDevice->SetFVF( D3DFVF_CUSTOMVERTEX );
    g_pd3dDevice->DrawPrimitive( D3DPT_TRIANGLESTRIP, 0, 2 );
    g_pEffect->EndPass();
}

Now my question is: what happens here? Is control passed to the GPU program? Are the vertices of the vertex buffer passed to the GPU in one go, or are they passed one by one every time the draw call is invoked? Does the pixel shader return one pixel (per draw call) or all pixels?
First of all, and this is important, get the SetStreamSource and SetFVF out of the loop. You only have to set them once. Hell, even if you use different shaders on objects in the same vertex buffer, you only need to set the vertex stream once before rendering various parts in it.
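For example, a minimal sketch of how that could look, assuming the same globals as the snippet above (g_pEffect, g_pd3dDevice, g_pVertexBuffer, Vertex, D3DFVF_CUSTOMVERTEX), with the effect's Begin/End shown for context:

// Set the vertex stream and vertex format once; they don't change between passes.
g_pd3dDevice->SetStreamSource( 0, g_pVertexBuffer, 0, sizeof(Vertex) );
g_pd3dDevice->SetFVF( D3DFVF_CUSTOMVERTEX );

UINT uPasses = 0;
g_pEffect->Begin( &uPasses, 0 );
for( UINT uPass = 0; uPass < uPasses; ++uPass )
{
    g_pEffect->BeginPass( uPass );
    g_pd3dDevice->DrawPrimitive( D3DPT_TRIANGLESTRIP, 0, 2 );
    g_pEffect->EndPass();
}
g_pEffect->End();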


Now then, your question. DrawPrimitive instructs the GPU to start rendering (in your case) 2 triangles (4 vertices, since it's a triangle strip) from the current vertex stream, starting at vertex #0. In principle, you can think of it like control being passed to the GPU, while the CPU waits. In practice, it's much more complex, where the CPU can still execute stuff while the GPU is busy as well, but may have to wait when it wants to use resources that the GPU is working on.


When rendering a triangle, the GPU will run the vertex shader for every vertex of the triangle. This will transform the points to some location on the screen. The GPU then rasterizes the triangle, meaning it calculates which pixels are on and off based on those three points. It effectively 'fills in' the triangle. The pixel shader is run for every pixel in the triangle. The output of the pixel shader is the color that should be written.
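As a toy, CPU-side mental model of that per-triangle flow (every name below is an illustrative stand-in, nothing here is D3D API, and real GPUs run huge numbers of these shader invocations in parallel rather than in a loop):

#include <vector>

struct Vertex     { float x, y, z; };
struct ScreenVert { float sx, sy, depth; };
struct Pixel      { int x, y; };
struct Color      { float r, g, b, a; };

// Dummy stand-ins for the shader and rasterizer stages, just to make this compile.
ScreenVert RunVertexShader(const Vertex& v)     { return { v.x, v.y, v.z }; } // "transform" to screen space
std::vector<Pixel> Rasterize(const ScreenVert*) { return {}; }                // which pixels the triangle covers
Color RunPixelShader(const Pixel&)              { return { 1, 1, 1, 1 }; }    // one output color per pixel

void DrawTriangleList(const std::vector<Vertex>& verts, Color* backBuffer, int width)
{
    for (size_t i = 0; i + 2 < verts.size(); i += 3)
    {
        // Vertex shader: one invocation per vertex of the triangle.
        ScreenVert tri[3] = { RunVertexShader(verts[i]),
                              RunVertexShader(verts[i + 1]),
                              RunVertexShader(verts[i + 2]) };

        // Rasterization "fills in" the triangle; the pixel shader then runs
        // once for every covered pixel and its result is written out.
        for (const Pixel& p : Rasterize(tri))
            backBuffer[p.y * width + p.x] = RunPixelShader(p);
    }
}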

That's basically how it works. Besides this there's stuff like culling (not rendering triangles you can't see), depth testing (checking if pixels are in front of already written pixels), alpha blending (mixing colors), and so on. But it looks like you need to get the basics first.


And vertices are typically located in the GPU's memory itself (that's typically pool DEFAULT), but if in system memory, they may be moved at the draw call, or at SetStreamSource. Either all at once, or in parts, or maybe not at all. Maybe the driver does some sneaky memory mapping. Either way, it's all details that you really shouldn't concern yourself with.
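To illustrate, the pool is picked when the buffer is created; a rough sketch (the buffer size is made up and error handling is omitted):

// Vertex buffer living in GPU-accessible memory (the usual choice for static geometry).
IDirect3DVertexBuffer9* pVBDefault = NULL;
g_pd3dDevice->CreateVertexBuffer( 4 * sizeof(Vertex), D3DUSAGE_WRITEONLY,
                                  D3DFVF_CUSTOMVERTEX, D3DPOOL_DEFAULT,
                                  &pVBDefault, NULL );

// Vertex buffer kept in system memory; the driver copies the data over as needed.
IDirect3DVertexBuffer9* pVBSysMem = NULL;
g_pd3dDevice->CreateVertexBuffer( 4 * sizeof(Vertex), 0,
                                  D3DFVF_CUSTOMVERTEX, D3DPOOL_SYSTEMMEM,
                                  &pVBSysMem, NULL );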
Million-to-one chances occur nine times out of ten!
Exactly what happens depends on what else has been happening that frame, and what the GPU driver is doing. Typically when you submit a Draw call the command is stuck at the end of a queue by the user-mode component of the D3D runtime. When this queue (called a command buffer) fills up, the runtime has to hand things over to the kernel-mode driver. Switching to kernel mode is an expensive operation...this is why it's always recommended that you limit the number of DrawPrimitive calls in order to cause as few switches to kernel mode as possible.

When the driver has commands, they may not be executed immediately. Since the GPU operates asynchronously from the CPU, the CPU may in fact be submitting commands way ahead of what the GPU is doing. This is typical in GPU-bound scenarios where VSYNC is disabled: the runtime will let your app submit calls up to 2 frames ahead of what the GPU is doing, at which point IDirect3DDevice9::Present will block your app and wait for the GPU to catch up.

Anyway the point is...definitely don't expect things to "happen" when you make the API call. If you're looking for some more info, check the docs.
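If you want to observe that asynchrony yourself, one way in D3D9 is an event query, which only completes once the GPU has chewed through everything issued before it. A rough sketch, reusing g_pd3dDevice from the original snippet:

// Issue an event query after your draw calls; it completes only when the GPU
// has actually processed everything submitted before it.
IDirect3DQuery9* pQuery = NULL;
if( SUCCEEDED( g_pd3dDevice->CreateQuery( D3DQUERYTYPE_EVENT, &pQuery ) ) )
{
    pQuery->Issue( D3DISSUE_END );

    // Busy-wait until the GPU catches up (don't do this in real code --
    // it throws away the parallelism the runtime is giving you).
    while( pQuery->GetData( NULL, 0, D3DGETDATA_FLUSH ) == S_FALSE )
        ;

    pQuery->Release();
}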

Oh and just thought I'd mention...this applies only to D3D9. D3D10 uses a completely different display driver structure.
I'll give a simpler version of the above answers:
The calls to D3D functions internally fill a command buffer and return immediately; the buffered commands are flushed to the GPU later (for example when the buffer fills up, or at Present()). But note they only send commands to the GPU. When and how those commands execute is the GPU's own business.

Pixel shader is run for every pixel of the geometry you drew.
Quote:Pixel shader is run for every pixel of the geometry you drew.

But some textbooks say the pixel shader computes one pixel and returns.
I could not understand this.

I have one more question

If the output of the pixel shader is stored in a texture (a so-called render target), can this texture be given as input to another pixel shader technique?

Thanks for all the previous posts.
Sorry for the new post. I am a student of computer science.
Quote:Original post by serious_learner07
Quote:Pixel shader is run for every pixel of the geometry you drew.


But some textbooks say the pixel shader computes one pixel and returns.
I could not understand this.

It is 'called' for every pixel. It calculates the color of a single pixel and returns. Then it gets called for another pixel, and so on.
(Actually, several instances run in parallel)

Quote:Original post by serious_learner07
If the output of the pixel shader is stored in a texture (a so-called render target), can this texture be given as input to another pixel shader technique?

Yes. You can create a texture with the render target usage flag, set it as the render target, and then, after rendering to it, bind it as a normal texture to use for something else.
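A rough sketch of that round trip, reusing g_pd3dDevice from the snippet above (error handling and saving/restoring the old render target are omitted; the texture size, format, and the "SceneTex" effect parameter name are made-up examples):

// Create a texture that can be used as a render target.
IDirect3DTexture9* pRenderTarget = NULL;
g_pd3dDevice->CreateTexture( 256, 256, 1, D3DUSAGE_RENDERTARGET,
                             D3DFMT_A8R8G8B8, D3DPOOL_DEFAULT,
                             &pRenderTarget, NULL );

// Pass 1: render into the texture.
IDirect3DSurface9* pSurface = NULL;
pRenderTarget->GetSurfaceLevel( 0, &pSurface );
g_pd3dDevice->SetRenderTarget( 0, pSurface );
// ... draw the first pass here ...
pSurface->Release();

// Pass 2: restore the back buffer and feed the result in as an ordinary texture.
IDirect3DSurface9* pBackBuffer = NULL;
g_pd3dDevice->GetBackBuffer( 0, 0, D3DBACKBUFFER_TYPE_MONO, &pBackBuffer );
g_pd3dDevice->SetRenderTarget( 0, pBackBuffer );
pBackBuffer->Release();

g_pd3dDevice->SetTexture( 0, pRenderTarget );   // or g_pEffect->SetTexture( "SceneTex", pRenderTarget );
// ... draw the second pass here ...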
Million-to-one chances occur nine times out of ten!
Quote:Original post by serious_learner07

But some textbooks say the pixel shader computes one pixel and returns.
I could not understand this.


When you render a triangle, the vertices are first run through the vertex shader to determine where those vertices are positioned (typically they're multiplied with the world, view, and projection transforms). Once the position of each vertex is figured out, rasterization is used to "fill in" all the pixels that would make up the triangle. The pixel shader is run for each of these pixels, and the output color is calculated.

Now things get a little tricky with z-buffering and overlapping triangles. Let's say we're not using a z-buffer at all, and we render two triangles. The first is rendered just as I described above. The second is also rendered just as described above. If those triangles overlap at all, then you'll have pixels where the pixel shader was run twice (once for each triangle), but only the result of the second triangle is kept. Now let's say we do have z-buffering active, and we once again render those two triangles. Once again, the first renders normally, except the depth of each pixel is stored to the depth buffer. Now for the second triangle, for any pixel that overlaps, instead of just rendering it the depth is checked against the value in the z-buffer. If the depth is lower, the new pixel is rendered. If not, the pixel is skipped, which saves pixel shader computation and fill rate.
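For completeness, that depth test is controlled through a few render states in D3D9; a minimal sketch, again assuming the g_pd3dDevice from the original snippet:

// Turn on depth testing and depth writes; D3DCMP_LESSEQUAL means a new pixel
// is kept only if it is at least as close as what's already in the z-buffer.
g_pd3dDevice->SetRenderState( D3DRS_ZENABLE,      D3DZB_TRUE );
g_pd3dDevice->SetRenderState( D3DRS_ZWRITEENABLE, TRUE );
g_pd3dDevice->SetRenderState( D3DRS_ZFUNC,        D3DCMP_LESSEQUAL );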



