Overview
This tutorial builds on the previous tutorial, and covers rendering a framerate independent rotating coloured triangle, vertex buffers, and provides a small polish point and exercises you can implement if you wish. It's assumed that you've read and completely understood the previous articles, and only the differences in code are shown for brevity. The source code (Downloadable at the end of the page) is standalone and contains the code from the previous article as well as this.
Vertices, the Flexible Vertex Format and Resource pools
Vertices
You probably already know that graphics cards deal with vertices. Everything drawn in Direct3D is made up from vertices (Usually with textures or shaders applied, but we'll cover them in a later tutorial). In the most basic sense, a vertex is just a 2D or 3D point (Depending on if the vertex is in screen space or 3D space). Vertices can also contain a host of extra information, such as diffuse and specular colour (We only use diffuse here), vertex normals for dynamic lighting, texture coordinates, weighting for skinned meshes (Meshes with a skeleton formed from bones), and various other bits and pieces.
We need to give D3D a list of vertices, and then we tell it how to use those vertices to make up shapes. D3D handles points, lines and triangles natively, and this tutorial draws triangles.
However, the GPU and CPU run independently - in fact, the GPU can run up to 2.5 frames behind the CPU if it wishes. Because of that, if we give D3D a list of vertices, the only way it can safely use them is by immediately copying them into an internal buffer. Then, that buffer can be freed when the frame using them has been rendered. That's obviously not very optimal, and although D3D does support rendering in that way (Similar to OpenGL's immediate mode), it's usually not a good idea to use unless you want to get something quick and dirty working for testing.
What D3D needs is a way of managing a list of vertices with some sort of access restrictions on it, so D3D can "own" the vertices and not have to copy them around every frame. This is where vertex buffers come in. A vertex buffer is just that; a buffer of vertices which D3D is in charge of. You can write vertices into the vertex buffer by locking the buffer. This tells D3D "I'm about to start doing things to this buffer, so I'd like you to give me a pointer to the buffer that I can read from or write into".
The Flexible Vertex Format
Because we've already said that a vertex can contain information other than a position (And even that can be 2D or 3D), D3D needs a way to know what is in a vertex, and what memory offset to find it at. That's where the Flexible Vertex Format or FVF comes in. This is a bit-mask that tells D3D what components the vertex has. As for finding the memory address for each of the components, D3D manages that by requiring that all elements in a vertex are in a particular order. Since D3D knows what components are in a vertex because of the FVF, and it knows the size of each of the types (It's implicit, a 3D position is always 3 floats, which is 12 bytes for instance), it can cleverly deduce the offset for each of the components. This does however mean that if you screw up the order of the elements in the vertex, or provide the wrong FVF code, you'll end up with either broken rendering, or nothing rendering.
There used to be a diagram in the SDK documentation that indicated the required order of elements in a vertex. Since things are moving more towards shaders and vertex declarations these days (And in D3D10, the fixed function (Default shaders effectively) and FVF codes have been removed), that diagram seems to have been removed. It's shown on the right here (Resurrected from an old SDK I have on my hard drive).
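To make the offset deduction concrete, here's a minimal sketch in plain C++ of how offsets fall out of an FVF code. It doesn't use the DirectX headers: FVF_XYZ and FVF_DIFFUSE are local stand-ins that mirror the values of D3DFVF_XYZ and D3DFVF_DIFFUSE, and VertexLayout, DeduceLayout and SketchVertex are hypothetical names made up for this illustration. It only handles the two components this tutorial uses, and it isn't how D3D itself is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Local stand-ins mirroring the values of D3DFVF_XYZ and D3DFVF_DIFFUSE,
// so this sketch compiles without the DirectX headers.
const std::uint32_t FVF_XYZ     = 0x002; // 3 floats: x, y, z
const std::uint32_t FVF_DIFFUSE = 0x040; // 1 packed 32-bit colour

// Hypothetical helper type: where each component lives inside one vertex.
struct VertexLayout
{
    std::size_t positionOffset; // Byte offset of the position
    std::size_t diffuseOffset;  // Byte offset of the diffuse colour
    std::size_t stride;         // Total size of one vertex in bytes
};

// Walk the components in their fixed order (position first, then diffuse),
// adding each one's implicit size to a running offset.
inline VertexLayout DeduceLayout(std::uint32_t fvf)
{
    // This sketch only understands the two components used in this tutorial.
    assert((fvf & ~(FVF_XYZ | FVF_DIFFUSE)) == 0);

    VertexLayout layout = { 0, 0, 0 };
    std::size_t offset = 0;
    if (fvf & FVF_XYZ)
    {
        layout.positionOffset = offset;
        offset += 3 * sizeof(float);
    }
    if (fvf & FVF_DIFFUSE)
    {
        layout.diffuseOffset = offset;
        offset += sizeof(std::uint32_t);
    }
    layout.stride = offset;
    return layout;
}

// A matching C++ struct: the members must appear in the same order.
struct SketchVertex
{
    float x, y, z;
    std::uint32_t colour;
};
```

For a position plus a diffuse colour this gives a position at offset 0, a colour at offset 12 and a 16 byte vertex, which is what sizeof(SketchVertex) comes to on typical compilers. Swap the two members of the struct and the offsets no longer match, which is exactly the "broken rendering" case described above.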
Resource memory pools
Most resources that you create with D3D - including vertex buffers - are created in a particular memory pool. The available resource pools are defined by the D3DPOOL enumeration, which is defined as follows:
typedef enum D3DPOOL {
D3DPOOL_DEFAULT = 0,
D3DPOOL_MANAGED = 1,
D3DPOOL_SYSTEMMEM = 2,
D3DPOOL_SCRATCH = 3,
D3DPOOL_FORCE_DWORD = 0x7fffffff,
} D3DPOOL, *LPD3DPOOL;
System memory is the main RAM that you have in your machine and is only accessible by the CPU. Video memory is memory that is on the graphics card, and is only accessible by the GPU, and AGP/PCI-Express memory is "shared" memory used for transferring data between the CPU and GPU. This segment of memory is accessible by both the CPU and GPU, and is managed by the video driver. In some cases such as with integrated graphics chipsets, the video memory does not exist, and the graphics chip only has a share of the system memory - usually pre-reserved in the system BIOS.
The documentation covers the meaning of all of these values in depth, but we'll quickly cover them here anyway.
- D3DPOOL_DEFAULT - This effectively means that the resource will be in video memory or AGP/PCI-Express memory. There's no way to actually force one or the other, it's up to the driver, and the driver is also free to put the resource in system memory if it feels like it. The reason for that is to keep things abstracted. We don't want to have to deal with edge cases like where this graphics card is an integrated card without dedicated video memory. Generally you'll use D3DPOOL_DEFAULT for dynamic resources (Covered in a later article), and resources that have to be in the default pool. Because resources in the default pool are often in video memory, if you create too many resources in the default pool you'll run the risk of running out of video memory (And D3D will return D3DERR_OUTOFVIDEOMEMORY), particularly on video cards with a low amount of VRAM available.
- D3DPOOL_MANAGED - This is the pool that most of your resources will be in. Resources in the managed pool are managed by D3D, and are swapped in and out of the default pool as required. Because of this, you don't need to worry about running out of video memory when using this pool (Unless you have one resource which is larger than the total available VRAM), which is good for us. The only down side of this is that D3D keeps a system memory copy of the data so it can move data to and from the default pool, which means you'll require more system RAM to keep them around. This isn't a huge problem however, and if you do find your application's RAM usage getting too high, you can always free up some managed resources and re-create them and/or load them from disk when you need them.
- D3DPOOL_SYSTEMMEM - Unsurprisingly, this means "Put my resource in system memory". These resources usually can't be used for rendering, but there are some cases where having a system memory copy of a resource can be useful - for instance using the resource to quickly update a default pool version of the resource (Which is what the managed pool does internally). You'll usually not be using this pool except for some specific cases which won't be covered in this tutorial.
- D3DPOOL_SCRATCH - This is a pool that doesn't have restrictions on the format of data you put in it, but can't be used for rendering. This is basically used for getting D3D to do the legwork for loading resources, and is rarely used.
- D3DPOOL_FORCE_DWORD - Used to make the D3DPOOL enumeration compile to 32-bits in size to avoid compile problems and isn't actually used as a pool type.
The Vertex Buffer
Creating the vertex buffer
So now we know what a vertex buffer is and what resource pools are, let's look at the code for creating a vertex buffer:
// Create the vertex buffer
hResult = m_pDevice->CreateVertexBuffer(sizeof(Vertex)*3, D3DUSAGE_WRITEONLY, Vertex::FVF,
D3DPOOL_MANAGED, &m_pVB, NULL);
// Error handling
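The Vertex structure used above holds a 3D position and a diffuse colour, in that order, along with the matching FVF code:
struct Vertex
{
	D3DXVECTOR3 vPos;
	DWORD dwColour;
	static const DWORD FVF = D3DFVF_XYZ | D3DFVF_DIFFUSE;
};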
The usage for the vertex buffer is a hint to D3D and the display driver about how you're going to use the buffer, which combined with the pool type, allows the driver to make a better judgement on where to put the resource. System memory resources always end up in system memory, but default or managed pool resources can be in video memory or AGP/PCI-Express memory. Generally, resources that aren't going to be touched often will go in VRAM, where it's very fast to render from but slow to write to and extremely slow to read from, and resources which are going to be updated frequently will go into AGP or driver managed memory, which is a good trade off between access speed from the GPU and CPU. We pass in D3DUSAGE_WRITEONLY to tell D3D that we'll only ever be writing to this resource, and we promise never to read from it. If you attempt to read from a buffer created with D3DUSAGE_WRITEONLY, it might work, or it might crash - it's undefined behaviour. Unfortunately, at the time of writing even the debug runtimes will let you try to read from a write-only buffer, so you'll need to be careful about this on your own.
It's usually a very bad idea to create and release resources in your main render loop (I.e. every frame), since resource creation is generally a very slow thing to do, will probably stall the GPU (We'll see what that means in a moment), and will end up fragmenting video memory. Some drivers might handle this well; others (I'm looking at you, Intel) will start behaving strangely, and if you're particularly lucky eventually give you a blue screen of death.
Locking the vertex buffer
If creating the vertex buffer fails, we log the error, clean up and bail out. Otherwise, we move on to lock the vertex buffer with a call to IDirect3DVertexBuffer9::Lock. As we mentioned earlier, this tells D3D that you want to read or write into the buffer. Unless you specify otherwise (By passing D3DLOCK_READONLY as the last parameter), D3D assumes you want to write into the buffer (And will let you read from it, unless the buffer was created with D3DUSAGE_WRITEONLY). So, the code for locking the buffer:
Vertex* pVertex = NULL;
hResult = m_pVB->Lock(0, 0, (void**)&pVertex, 0);
// Error handling
As always, we must check the return value and if the function fails we log the error, clean up and return.
Filling in the vertices
Now that D3D's given us a pointer to write into, we can actually write into it. You need to be very careful not to write past the end of the buffer provided to you - even the debug runtimes won't detect buffer overruns or underruns. The code for filling in the vertices goes as follows:
// Fill in the vertices...
pVertex->vPos = D3DXVECTOR3(0, 0.5f, 0); // Top vertex
pVertex->dwColour = D3DCOLOR_XRGB(255, 0, 0);
++pVertex; // Move on to the next vertex in the buffer
pVertex->vPos = D3DXVECTOR3(0.5f, -0.5f, 0); // Bottom right vertex
pVertex->dwColour = D3DCOLOR_XRGB(0, 0, 255);
++pVertex;
pVertex->vPos = D3DXVECTOR3(-0.5f, -0.5f, 0); // Bottom left vertex
pVertex->dwColour = D3DCOLOR_XRGB(0, 255, 0);
// Unlock the vertex buffer to tell the device that we're done with the pointer it gave us.
// There's not much point in checking the return value, since if something is wrong, it'll
// get caught at render time.
m_pVB->Unlock();
When you call IDirect3DVertexBuffer9::Unlock, the display driver will likely start uploading the data you provided to VRAM if the buffer exists there. For buffers in AGP memory or driver managed system memory, the driver will likely either have given you direct access to that memory with the lock, or it'll have given you a temporary buffer and will copy the memory into the buffer used by the GPU. Locking a buffer can also cause the GPU to stall - if it is currently reading from a vertex buffer and you lock it, the driver will have to finish using the buffer before it can transfer data in or out of it - and since the GPU can be up to 2.5 frames behind the CPU, that can be a (relatively) incredibly long delay.
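Going back to the dwColour values written into the vertices above: D3DCOLOR_XRGB packs three 0-255 channels into one 32-bit value laid out as 0xAARRGGBB, with the alpha forced to 255. The sketch below shows that packing in plain C++. PackXrgb and the channel helpers are hypothetical stand-ins written for this illustration, not the SDK macro itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for D3DCOLOR_XRGB: alpha is forced to 255 and the
// channels are packed as 0xAARRGGBB.
inline std::uint32_t PackXrgb(int r, int g, int b)
{
    assert(r >= 0 && r <= 255 && g >= 0 && g <= 255 && b >= 0 && b <= 255);
    return (0xFFu << 24) |
           (static_cast<std::uint32_t>(r) << 16) |
           (static_cast<std::uint32_t>(g) << 8) |
            static_cast<std::uint32_t>(b);
}

// Pull individual channels back out of a packed colour.
inline int RedOf(std::uint32_t colour)   { return (colour >> 16) & 0xFF; }
inline int GreenOf(std::uint32_t colour) { return (colour >> 8) & 0xFF; }
inline int BlueOf(std::uint32_t colour)  { return colour & 0xFF; }
```

So the top vertex's red is stored as 0xFFFF0000, and the clear colour used later in this tutorial (128, 128, 255) as 0xFF8080FF. This is why the diffuse colour takes up exactly 4 bytes in the vertex.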
3D rendering states
Setting up the render states
We've given D3D three vertices to draw with, which have 3D coordinates in them. Since the screen is only 2D, we need to tell D3D how to convert the 3D coordinates into 2D screen coordinates. That's all done in the SetupState function, which is called after all of the D3D objects are created and fully set up - although there's no reason you couldn't set up the state immediately after creating the device. Here's the SetupState function:
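// Create the projection matrix (45 degrees FOV), and set it on the device
D3DXMATRIXA16 matProj;
float fAspect = (float)m_thePresentParams.BackBufferWidth /
	(float)m_thePresentParams.BackBufferHeight;
D3DXMatrixPerspectiveFovLH(&matProj, D3DXToRadian(45.0f), fAspect, 0.1f, 1000.0f);
m_pDevice->SetTransform(D3DTS_PROJECTION, &matProj);
// Create the view matrix (Camera), and set it on the device
D3DXMATRIXA16 matView;
D3DXVECTOR3 vEye(0, 0, -3.0f); // Camera position
D3DXVECTOR3 vAt(0, 0, 0); // Camera look-at position
D3DXVECTOR3 vUp(0, 1.0f, 0); // Camera "up" direction
D3DXMatrixLookAtLH(&matView, &vEye, &vAt, &vUp);
m_pDevice->SetTransform(D3DTS_VIEW, &matView);
// Tell the device not to cull back-facing triangles
m_pDevice->SetRenderState(D3DRS_CULLMODE, D3DCULL_NONE);
// Tell the device not to do any dynamic lighting
m_pDevice->SetRenderState(D3DRS_LIGHTING, FALSE);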
The projection matrix
Let's look at the projection matrix first. You can consider the projection matrix to be like the lens of a camera. You can use different lenses to give different effects like fish-eye or a narrow (zoomed in) view. One other important thing the projection matrix is responsible for is setting the distance of the near and far clip planes (Which we'll cover shortly). The projection matrix is created with the utility function D3DXMatrixPerspectiveFovLH. This function creates a perspective projection matrix (So objects further away appear smaller), given a Field Of View, and uses a left-handed coordinate system (Which I won't go into, but means that "into" the screen is +Z, "up" the screen is +Y and to the "right" of the screen is +X). You also have to specify the aspect ratio, which is simply the width of the view area divided by the height of the view area - which we can get from the backbuffer size.
It's worth noting that the FOV is the vertical FOV, so that widescreen monitors still look correct. The horizontal FOV doesn't scale linearly with the aspect ratio: it's 2 * atan(aspect * tan(vertical FOV / 2)). So, a 45 degree vertical FOV, with an aspect ratio of 1.333... (Since the backbuffer is 640x480) gives a horizontal FOV of roughly 58 degrees.
We use the D3DX utility macro D3DXToRadian to convert from degrees to radians, since most of the D3D and D3DX functions that take angles use radians.
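If you want to check the horizontal FOV for other backbuffer sizes, the relationship between the two angles is easy to compute. The sketch below is plain C++ with no D3DX dependency. HorizontalFovDegrees is a hypothetical helper name, not something provided by the SDK.

```cpp
#include <cassert>
#include <cmath>

// Horizontal FOV (in degrees) for a given vertical FOV (in degrees) and
// aspect ratio (width / height).
inline double HorizontalFovDegrees(double verticalFovDegrees, double aspect)
{
    assert(verticalFovDegrees > 0.0 && verticalFovDegrees < 180.0 && aspect > 0.0);
    const double kPi = 3.14159265358979323846;
    // Work with half angles: tan(h / 2) = aspect * tan(v / 2)
    const double halfVertical = verticalFovDegrees * kPi / 360.0;
    const double halfHorizontal = std::atan(aspect * std::tan(halfVertical));
    return halfHorizontal * 360.0 / kPi;
}
```

For the 640x480 backbuffer used here, HorizontalFovDegrees(45.0, 640.0 / 480.0) comes out at about 57.8 degrees. With a square backbuffer the two angles are equal.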
D3DXMatrixPerspectiveFovLH just fills in the matProj matrix; we need to pass it to D3D to use. We do that with the IDirect3DDevice9::SetTransform function. This function can be used to set a number of different transform matrices as we'll see for the view matrix...
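To get a feel for what the projection matrix does to a single point, here's a rough sketch in plain C++. It follows the matrix layout given in the D3DXMatrixPerspectiveFovLH documentation, but reduced to the handful of terms that matter for one point. Projected and ProjectPoint are hypothetical names for this illustration, and the input point is assumed to already be in view space (So relative to the camera).

```cpp
#include <cassert>
#include <cmath>

// Result of projecting one view-space point: x and y end up in the range
// -1..1 when the point is on screen, and depth runs from 0 (near) to 1 (far).
struct Projected { float x, y, depth; };

inline Projected ProjectPoint(float x, float y, float z,
                              float fovYRadians, float aspect, float zn, float zf)
{
    assert(z > 0.0f && zn > 0.0f && zf > zn && aspect > 0.0f);
    const float yScale = 1.0f / std::tan(fovYRadians * 0.5f); // cot(fovY / 2)
    const float xScale = yScale / aspect;
    const float q = zf / (zf - zn);

    // After the matrix multiply, w is equal to the view-space z, and dividing
    // by it is what makes distant objects appear smaller.
    Projected p;
    p.x = x * xScale / z;
    p.y = y * yScale / z;
    p.depth = (z - zn) * q / z;
    return p;
}
```

With the values used in SetupState (A near plane of 0.1 and a far plane of 1000), a point on the near plane gets a depth of 0 and a point on the far plane gets a depth of 1. Anything outside that range is clipped, and doubling a point's distance halves its projected x and y.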
The view matrix
Next up, we set up the view matrix. This is a bit like a camera, in the same way as the projection matrix is like a camera lens. The view matrix can be used to position the camera, and determine its orientation. We also need an "up" vector, since there's an infinite number of possible orientations with different "up" directions. If you look straight ahead and roll your head so your ear is on your shoulder, you've not changed your position, or the point you're looking at, but the "up" direction has changed, which makes your view of things change.
We create the view matrix with the D3DXMatrixLookAtLH function, which takes the position, look at point and "up" vector and fills in our matView matrix for us. The values we've used here put the camera at (0, 0, -3), looking at the origin, with (0, 1, 0) (+Y) being straight up, so the camera appears to be sitting upright.
Like the projection matrix, we use the IDirect3DDevice9::SetTransform function to pass the matrix to D3D.
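If you're curious how a position, look-at point and "up" vector turn into a camera, the sketch below works out the three camera axes the way the D3DXMatrixLookAtLH documentation describes them. It's plain C++ rather than D3DX: Vec3, CameraBasis and this LookAtLH are hypothetical stand-ins, and only the axes and translation are computed, not the full 4x4 matrix.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 Sub(Vec3 a, Vec3 b) { Vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z }; return r; }
inline float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 Cross(Vec3 a, Vec3 b)
{
    Vec3 r = { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
    return r;
}
inline Vec3 Normalize(Vec3 v)
{
    const float length = std::sqrt(Dot(v, v));
    assert(length > 0.0f); // A zero-length vector has no direction
    Vec3 r = { v.x / length, v.y / length, v.z / length };
    return r;
}

// The camera's three axes in world space, plus where the world origin ends
// up in the camera's space.
struct CameraBasis { Vec3 right, up, forward, translation; };

inline CameraBasis LookAtLH(Vec3 eye, Vec3 at, Vec3 up)
{
    CameraBasis basis;
    basis.forward = Normalize(Sub(at, eye));           // +Z: from the eye towards the target
    basis.right = Normalize(Cross(up, basis.forward)); // +X
    basis.up = Cross(basis.forward, basis.right);      // +Y, now exactly perpendicular
    Vec3 t = { -Dot(basis.right, eye), -Dot(basis.up, eye), -Dot(basis.forward, eye) };
    basis.translation = t;
    return basis;
}
```

For the values used in this tutorial, the axes come out as plain +X, +Y and +Z, and the translation is (0, 0, 3): the origin, where our triangle sits, ends up 3 units in front of the camera. Note that the "up" vector only needs to point roughly upwards. The second cross product straightens it out.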
Misc. render states
The final two lines of the SetupState function disable backface culling and dynamic lighting by calling the IDirect3DDevice9::SetRenderState function, which is used to set a huge number of different rendering parameters. See the D3DRENDERSTATETYPE enumeration for a full list of parameters you can change with this function.
Normally, D3D will cull triangles that are facing away from the camera. By default, a triangle is considered "back facing" if the three vertices are rasterized in anti-clockwise order. Because our triangle is going to be spinning on the Y axis, after half a turn, it'll be back facing and D3D won't draw it. Back facing triangles are culled by default as an optimisation, since usually you don't care what the back of a model looks like because it's obscured by the front.
Dynamic lighting is also disabled, since we don't set up any lights in our scene, and if we don't tell D3D not to do dynamic lighting it'll draw the triangle as black.
As the triangle spins, the order the vertices are rendered in changes from clockwise to anti-clockwise and then back again over a full rotation.
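The change in winding order can be checked with a little arithmetic. The sketch below (Plain C++, with hypothetical helper names) rotates the tutorial's three vertices around the Y axis and uses the sign of the triangle's 2D area to decide whether they appear clockwise from the camera. Perspective is ignored to keep things short, which is an assumption that's fine for this triangle since it's centred on the rotation axis.

```cpp
#include <cassert>
#include <cmath>

// Twice the signed area of a 2D triangle with +Y pointing up: negative means
// the vertices run clockwise, positive means anti-clockwise.
inline float SignedArea2(const float x[3], const float y[3])
{
    return (x[1] - x[0]) * (y[2] - y[0]) - (x[2] - x[0]) * (y[1] - y[0]);
}

// Rotate the tutorial's triangle around the Y axis and report whether it
// still appears clockwise (Front facing under the default cull mode).
inline bool IsFrontFacing(float angleDegrees)
{
    assert(std::isfinite(angleDegrees));
    const float kPi = 3.14159265f;
    const float c = std::cos(angleDegrees * kPi / 180.0f);
    // Top, bottom right, bottom left. Every z is 0, so rotating about the
    // Y axis just scales each x by cos(angle) as far as the screen is concerned.
    const float x[3] = { 0.0f * c, 0.5f * c, -0.5f * c };
    const float y[3] = { 0.5f, -0.5f, -0.5f };
    return SignedArea2(x, y) < 0.0f;
}
```

The triangle is front facing from 0 up to 90 degrees, back facing from there until 270 degrees, and front facing again for the last quarter turn. With culling left at its default, it would vanish for half of every rotation.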
The D3DX library
The D3DX utility functions come in a separate library to make things a bit more lightweight in the core D3D library. Because of that, we have to link to a new .lib file, which pulls in the D3DX DLL file (The exact filename depends on the SDK version). There's also a debug version of the D3DX library which performs more validation on the data you pass it, but performs a bit slower. Since we want as much debug info as we can get, we'd like the debug version for debug builds - which we can get with some pre-processor work:
#pragma comment(lib, "d3d9.lib")
#ifdef _DEBUG
# pragma comment(lib, "d3dx9d.lib")
#else
# pragma comment(lib, "d3dx9.lib")
#endif
Rendering
Finally it's time to render our triangle. We need to render the entire scene every frame, since the scene will (most likely) be changing every frame. For reasons we'll see later, it's a good idea to have a function we can call that will render the entire scene in one go, and show the result (By Present()ing). So, if you compare the Tick() function from the last tutorial and this one, you'll see that we've moved some of the rendering out into its own function, DrawFrame(), which unsurprisingly is the function we call to draw a frame. Let's start dissecting that function.
Clearing the screen and preparing for rendering
The first thing this function does is call Clear(). This has been covered in the previous tutorial; have a look back if you need a refresher on what it does:
// Clear the screen
m_pDevice->Clear(0, NULL, D3DCLEAR_TARGET, D3DCOLOR_XRGB(128, 128, 255), 0.0f, 0);
// Tell the device we want to start rendering
HRESULT hResult = m_pDevice->BeginScene();
// Error handling
Rotating the triangle
Following that, we do a bit of framerate independent rendering - that is, we render an animation that will animate at the same rate, no matter what framerate we get; it'll depend on the elapsed time instead. If we have a higher frame rate (I.e. we're running on a better system), then the animation will be smooth. This makes things look nicer on better hardware, which is a better option than rendering at a fixed 30 FPS or so, and then dropping to 15 FPS if we can't sustain that frame rate.
// Create a rotation matrix, with a rotation based on the current time (180 degrees per
// second), and set it as the current world matrix
float fTimeDifferenceInSeconds = (float)(GetTickCount() - m_dwStartTime) / 1000.0f;
float fAngleInDegrees = fTimeDifferenceInSeconds * 180.0f;
D3DXMATRIXA16 matWorld;
D3DXMatrixRotationY(&matWorld, D3DXToRadian(fAngleInDegrees));
m_pDevice->SetTransform(D3DTS_WORLD, &matWorld);
GetTickCount returns the number of milliseconds since the system was started, which isn't much use on its own. What is more useful is the difference in time, which is what we use to initialise the fTimeDifferenceInSeconds variable. The m_dwStartTime variable is the tick count from when the window was created, which we recorded at the very end of the Create function:
// Done - copy HINSTANCE variable and record current time
m_hInstance = hInstance;
m_dwStartTime = GetTickCount();
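The same calculation can be sketched in portable C++, which makes two details easier to see. AngleInDegrees is a hypothetical helper, and the tick values are passed in rather than read from GetTickCount so that it can be tried anywhere. The wrap-to-360 step is an addition of mine, not something the tutorial's code does.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Rotation angle in degrees for a pair of millisecond tick values.
inline float AngleInDegrees(std::uint32_t nowMs, std::uint32_t startMs,
                            float degreesPerSecond)
{
    assert(degreesPerSecond >= 0.0f);
    // Unsigned subtraction is modulo 2^32, so the elapsed time is still right
    // if the 32-bit tick counter has wrapped around since startMs was taken.
    const std::uint32_t elapsedMs = nowMs - startMs;
    const double seconds = elapsedMs / 1000.0;
    // Keep the result in [0, 360) so the float doesn't lose precision once
    // the program has been running for a long time.
    return static_cast<float>(std::fmod(seconds * degreesPerSecond, 360.0));
}
```

The angle depends only on the two timestamps, not on how many frames were drawn in between, which is what makes the animation framerate independent. One second after the start, the triangle has turned 180 degrees whether that took 30 frames or 300. The wrap-around case matters because a 32-bit millisecond counter like GetTickCount's rolls over after about 49.7 days of uptime.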
Now we have the angle to spin the triangle at, we have to get it spinning. We could do it the hard way, which would mean locking the vertex buffer and updating the position of all the vertices by hand, using some trigonometry. Thankfully, D3D gives us an easier and more efficient way to do things by using another transform matrix. In this case we use the world transform to move all of the vertices in such a way that they'll be in the same place as if we'd rotated them all by the rotation angle above.
I'm not going to go into depth about matrices just now - that'll come in a later tutorial. For now, all you need to know is that you can use the various D3DXMatrix* functions to manipulate a D3DMATRIX struct. Here we use the D3DXMatrixRotationY function to create a matrix that describes a rotation around the Y axis. This function takes a pointer to a D3DMATRIX structure which it fills in, and a rotation angle in radians. We use the D3DXToRadian macro to convert from degrees to radians just like we did with the projection matrix.
Now, those of you who are following closely may notice that I said that D3DXMatrixRotationY takes a D3DMATRIX structure, but we pass in a D3DXMATRIXA16 structure. So how does that work? Well, the D3DXMATRIXA16 structure is derived from the D3DMATRIX structure, which allows us to pass a pointer to a D3DXMATRIXA16 where a D3DMATRIX is expected. The D3DXMATRIXA16 structure is aligned to a 16-byte boundary when you declare it as a local, static or global variable (But not if you declare it as a member variable), which allows the D3DX library to use some SSE2 instructions if the CPU supports it. SSE2 is an advanced instruction set available on newer processors which can perform up to 4 mathematical operations in a single instruction, so long as all 4 are sequential in memory and the first one is aligned on a 16-byte memory address. So, since it costs us next to nothing, and speeds up matrix operations, we should use D3DXMATRIXA16 wherever we can.
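To see what "aligned to a 16-byte boundary" means in practice, here's a small sketch using standard C++11 alignas. AlignedMatrix is a hypothetical stand-in for illustration and is not how D3DXMATRIXA16 is actually defined in the D3DX headers. IsAligned16 is likewise a made-up helper.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 16-byte aligned 4x4 matrix of floats.
struct alignas(16) AlignedMatrix
{
    float m[4][4];
};

// True if the address is a multiple of 16 bytes.
inline bool IsAligned16(const void* p)
{
    assert(p != nullptr);
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

Declaring an AlignedMatrix as a local or static variable and passing its address to IsAligned16 should return true, since the compiler places it on a 16-byte boundary for us. Each row of the matrix is 4 floats (16 bytes), so every row then starts on a 16-byte boundary too, which is the layout the 4-at-a-time instructions want.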
Now that we've got our rotation matrix set up, we tell D3D to transform everything from now on by that matrix by setting it as the world matrix through the very same IDirect3DDevice9::SetTransform function as we used to set our view and projection matrices earlier.
Telling D3D what vertices to use
Although we created a vertex buffer earlier, D3D won't automatically use it for rendering because you might have several vertex buffers created. So, we need to tell D3D to use it, and what format the vertices are in, because for some reason D3D doesn't actually use the FVF code that the vertex buffer was created with unless you explicitly tell it to. Here's the code for all that:
// Tell the device the format of the stream of vertices it's getting
m_pDevice->SetFVF(Vertex::FVF);
// Tell the device where to read the stream of vertices from, and the size of one vertex
m_pDevice->SetStreamSource(0, m_pVB, 0, sizeof(Vertex));
Finally we can actually draw the triangle. That's done with the IDirect3DDevice9::DrawPrimitive function:
// Draw a single triangle
m_pDevice->DrawPrimitive(D3DPT_TRIANGLELIST, 0, 1);
If you like, try passing D3DPT_LINESTRIP as the first parameter and 2 for the last to get D3D to interpret the vertices as a line strip (Sometimes called a poly-line), or pass D3DPT_POINTLIST as the first and 3 as the last parameters to render the vertices as points.
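The last parameter to DrawPrimitive is a primitive count, not a vertex count, and how many primitives you can get out of a buffer depends on the primitive type. Here's a small sketch of that relationship in plain C++. PrimitiveKind and MaxPrimitiveCount are hypothetical names standing in for the three D3DPRIMITIVETYPE values mentioned above.

```cpp
#include <cassert>

// Local stand-ins for D3DPT_POINTLIST, D3DPT_LINESTRIP and D3DPT_TRIANGLELIST.
enum PrimitiveKind { PointList, LineStrip, TriangleList };

// Largest primitive count that the given number of vertices can supply.
inline unsigned MaxPrimitiveCount(PrimitiveKind kind, unsigned vertexCount)
{
    switch (kind)
    {
    case PointList:    return vertexCount;                           // One point per vertex
    case LineStrip:    return vertexCount > 0 ? vertexCount - 1 : 0; // Each line shares a vertex with the last
    case TriangleList: return vertexCount / 3;                       // Three separate vertices per triangle
    }
    assert(!"Unknown primitive kind");
    return 0;
}
```

With our 3 vertices that's 1 triangle, 2 lines or 3 points, matching the values suggested above. It also shows why a closed outline of the triangle needs a fourth vertex: a line strip of 3 lines takes 4 vertices.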
As mentioned near the start of this tutorial, there's also a far less efficient way of rendering without using a vertex buffer: the IDirect3DDevice9::DrawPrimitiveUP function (Where UP stands for User Pointer). I only mention it here for completeness. If you're interested in the details, see the documentation for a description of the parameters. They're similar to the parameters for IDirect3DDevice9::DrawPrimitive, but instead of a start vertex it takes a pointer to the start of the vertex data (Which is just what would be the contents of the vertex buffer) and the size of one vertex.
Telling D3D we're done
Since we had to tell D3D that we were starting to render by calling IDirect3DDevice9::BeginScene, it's probably not too much of a surprise that there's a matching IDirect3DDevice9::EndScene function that we have to call to let D3D know that we've finished doing 3D rendering. Like BeginScene, this function takes no parameters:
// Tell the device we're finished rendering. Don't bother checking return value, there's
// nothing we can really do if it fails, and if there's a real problem it'll be caught on the
// next frame anyway.
m_pDevice->EndScene();
// Show the frame we've just rendered
hResult = m_pDevice->Present(NULL, NULL, NULL, NULL);
// Error handling
The Result
If all goes well, you should have a display like the following:
The image doesn't really do it justice, but the triangle should be spinning clockwise around the Y axis.
A Bit of Polish
There's a small "feature" with the code in its current state - if you try moving the window, you'll notice that rendering stops, and if you're on Windows XP or below, the window doesn't redraw. This is because when the user drags your window, the OS goes into its own little world and DefWindowProc doesn't return until the user stops moving the window. That means that your Tick function stops being called, which means rendering stops.
There's a (somewhat horrible, unfortunately) way around this though. When the user starts to move or resize a window, the OS sends a WM_ENTERSIZEMOVE message to your window procedure. It then processes messages for your window internally, and doesn't return until the user stops moving the window, at which point it sends WM_EXITSIZEMOVE. Because the OS is pumping window messages for us in the meantime, we can set up a timer to get the OS to repeatedly send us a WM_TIMER notification message, and from there we can call the DrawFrame function, which will update the window. To do that, it's simply a case of calling the SetTimer function in our WM_ENTERSIZEMOVE handler, and then calling KillTimer in our WM_EXITSIZEMOVE handler to stop getting the timer messages:
SetTimer(m_hWnd, 1, USER_TIMER_MINIMUM, NULL);
If you put the above code into your message procedure, the window should still update when the user drags it. There's still a short delay if the user holds down the mouse button on the title bar of the window. This is a "feature" of the OS, and isn't really something that can be easily fixed without risking breaking things on future versions of Windows. Take a look on Google if you're interested in the details - long story short is that DefWindowProc doesn't return for around 500ms.
Exercises
Since the result of this tutorial is a lot more interesting than the last one (I hope!), I'll suggest a few exercises you can try yourself if you're keen. Each one is quite simple, partly because I lack imagination, but mostly because you shouldn't try diving in too deep or you'll just drown. So, here are a few things you can try:
- Play around with the vertex positions and colours that are put into the vertex buffer to see what different colours and shapes of triangle you can get.
- Play around with the projection matrix FOV setting to see how this affects the rendered image
- Make the triangle rotate around the X or the Z axis instead (Hint, D3DXMatrixRotationX and D3DXMatrixRotationZ).
- Add another vertex to the vertex buffer, and copy the first vertex (So the vertices in the buffer are top, bottom right, bottom left, top), and draw an outline of the triangle instead of the filled triangle (Hint: You'll want to use D3DPT_LINESTRIP as the first parameter to IDirect3DDevice9::DrawPrimitive).
- Add another 3 vertices to the vertex buffer with positions and colours of your choosing, and draw two triangles instead of one in a single DrawPrimitive() call.
- Add another 3 vertices to the vertex buffer with positions and colours of your choosing, and draw two triangles instead of one using two DrawPrimitive() calls (Hint: You'll need to set the second parameter to DrawPrimitive() to 3 for the second draw call).
- Try to draw the triangle with the IDirect3DDevice9::DrawPrimitiveUP function instead of IDirect3DDevice9::DrawPrimitive. You won't need to call IDirect3DDevice9::SetStreamSource.
Source Code
The source code for this article is available Here with project files for Microsoft Visual Studio 2008 and 2005.
The end. Let me know what you think!
EDIT, 22/02/09: Fixed typo (+X is right, not +Z, thanks to NewBreed)