Efficient Smoke Trail Rendering

Started by
6 comments, last by unbird 14 years, 1 month ago
I am trying to include smoke trails in my PC game, which are an alpha-blended particle effect, and I'm having problems with efficient rendering. I'm using DirectX 9, but I suspect what I need is a new methodology or algorithm, not a change in the DirectX-specific graphics code, so I thought it best to post here (if we can't figure this out, I'll post in the DirectX forum later). Allow me to explain how my smoke trail algorithm works. If you have a suggestion for improvement or an entirely different way of doing it (which I suspect will be the best solution), please let me know.

Smoke trails are to follow moving missiles in my game. They take a long time to die away (~20 seconds), so they will end up being quite long and may be visible in a portion of the game universe long after the missile itself is beyond view. They are relatively thin (from 4 to 15 pixels, depending on how closely one zooms in with the camera), whereas they can easily be tens (or even hundreds) of thousands of pixels long (though only up to 32,000 pixels of that length can be on-screen at once, when the camera is zoomed out to maximum).

The smoke trail is saved as an array of points. Each frame, the last point in the array is overwritten with the current missile's position. Every so often, either once the missile has traveled ~500 pixels or if the missile has turned sharply enough, the last point in the array won't be overwritten - instead, the array will be incremented to include one more point (the new "last point" is given the missile's position and is continually overwritten until the array must be expanded once more).

This array of distant points is great for describing the general trajectory of the smoke trail, but they are too far separated for rendering. In order to render, I first determine which points are visible on the screen during the given frame. Then, for each of these points, I use the size of my particle texture and my camera zoom level, as well as the distance to the next smoke trail point, to determine how many particle textures I must render between the points to make a smooth column of smoke, the position of the first particle texture (in camera coordinates), and the separation (in camera coordinates) between particle textures that will be rendered in sequence.

For instance, let's say only two points are visible on the screen - at camera coordinates (0, 100) and (100, 100). Let's say that the particle width and height are 2. Based on the creation time of the two visible points in my smoke trail, I know I need to draw them with an alpha of 0.5 and 0.4, respectively. Then, for rendering, I calculate the following:


vector_2d First_particle_cam_coords(0, 100);
vector_2d Final_particle_cam_coords(100, 100);
float First_particle_alpha = 0.5f;
float Final_particle_alpha = 0.4f;
float particle_size = 2;

// Number of particles needed to cover the distance between the two points
int Num_particles_to_render = (int)(Length(First_particle_cam_coords, Final_particle_cam_coords) / particle_size); // = 100 / 2 = 50
// Per-particle step in position and alpha
vector_2d delta_position = (Final_particle_cam_coords - First_particle_cam_coords) / Num_particles_to_render; // = (100, 0) / 50 = (2, 0)
float delta_alpha = (Final_particle_alpha - First_particle_alpha) / Num_particles_to_render; // = (0.4 - 0.5) / 50 = -0.002


In the actual rendering loop, I simply draw my particle texture in a for loop:

vector_2d particle_pos = First_particle_cam_coords;
float particle_alpha = First_particle_alpha;

for (int i = 0; i < Num_particles_to_render; i += 1)
{
	Draw_Particle_Texture(particle_pos, particle_alpha);
	particle_pos += delta_position;
	particle_alpha += delta_alpha;
}
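
The setup and loop above can be folded into one self-contained sketch. This is only an illustration under assumptions: Vec2, Particle, and BuildSegmentParticles are hypothetical names, not part of the game's code, and the particles are collected in a std::vector instead of being drawn one at a time, so the whole list can later be handed to whatever drawing routine is used.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-ins for the post's vector_2d and per-particle data.
struct Vec2 { float x, y; };
struct Particle { Vec2 pos; float alpha; };

// Interpolates position and alpha from one trail point toward the next,
// placing one particle every particle_size pixels (as in the loop above).
std::vector<Particle> BuildSegmentParticles(Vec2 first, Vec2 last,
                                            float first_alpha, float last_alpha,
                                            float particle_size)
{
    assert(particle_size > 0.f);

    std::vector<Particle> out;
    const float dx = last.x - first.x;
    const float dy = last.y - first.y;
    const float length = std::sqrt(dx * dx + dy * dy);
    const int count = static_cast<int>(length / particle_size);
    if (count <= 0) return out;  // points closer together than one particle

    out.reserve(count);
    for (int i = 0; i < count; ++i) {
        // Computing from i (rather than adding a delta each step) avoids
        // accumulating floating-point error over long segments.
        const float t = static_cast<float>(i) / count;
        out.push_back({{first.x + dx * t, first.y + dy * t},
                       first_alpha + (last_alpha - first_alpha) * t});
    }
    return out;
}
```

With the example values from the post, BuildSegmentParticles({0, 100}, {100, 100}, 0.5f, 0.4f, 2.f) yields 50 particles spaced 2 pixels apart, matching the hand calculation.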

I've investigated my code and found the following:

- When several smoke trails are on the screen, the frame rate drops significantly.
- The problem is exacerbated by larger zoom-outs and larger resolutions (more is visible to the camera). A lot more time is spent in the for loop (as expected) in this case. In these situations, I can expect to be in the for loop for as many as ~250 iterations per smoke trail, so ~2500 iterations for 10 smoke trails.
- The code in the for loop is the major culprit (my code for maintaining the general smoke trail array, determining which array points are visible each frame, and calculating initial values is fairly fast).

By "frame rate drops significantly," I mean that I can go from a consistent ~400 FPS to ~60 FPS or lower. Now while 60 is not shabby by itself, my game does not yet implement any AI, has no other special effects, barely runs any collision detection, and is missing about 98% of the actual "game" that has yet to be built. Basically, the smoke trails are one of the first things I've added, and I can't afford to spend so much computation time on them when most of the heavy hitting needs to be saved for other, more intensive needs.

So I'm interested in hearing what alternative methods there are for rendering smoke trail effects. Thanks.
Check out Cliffki's post on this.
Anthony Umfer
Quote:Original post by CadetUmfer
Check out Cliffki's post on this.


It looks like he does pretty much the same thing I do - only his smoke trails are a lot more spaced out and he doesn't zoom out as far (so a much smaller portion of his smoke trail is visible at one time). He has puffs of smoke with gaps in between, whereas I'm looking for a smooth column or stream of white smoke (that's faded to invisibility at one end and is brightly visible just as it leaves the missile's tail). Thus, I need to render far more particles (especially when zoomed out), so I wouldn't be surprised to learn that he isn't running into performance issues whereas I am.

(To get an idea of the effect I'm going for, think "Battlestar Galactica" missile trails:)




So unless I'm missing something, I'm not sure what bearing that post has on my issue...
You could render your smoke trails in a texture at a lower resolution and without blending, then blend it on screen at a certain layer of your scene.
If you need depth it gets a little more complicated though, but not impossible.
Quote:...not a change in the DirectX-specific graphics code...

Maybe you do, actually. From your loop, it's not clear how the particles are finally drawn. Does Draw_Particle_Texture(...) batch your particles, or do you call device.DrawPrimitive (or whatever) every time? If the latter is the case, your bottleneck is probably there. Can you show us the code of this function, please?
Quote:Original post by unbird
Quote:...not a change in the DirectX-specific graphics code...

Maybe you do, actually. From your loop, it's not clear how the particles are finally drawn. Does Draw_Particle_Texture(...) batch your particles, or do you call device.DrawPrimitive (or whatever) every time? If the latter is the case, your bottleneck is probably there. Can you show us the code of this function, please?


Certainly. I'm using the ID3DXSprite interface, because this is a 2D game.

// The following is called before any smoke trail rendering:
pD3DDevice9->SetRenderState(D3DRS_SRCBLEND, D3DBLEND_SRCALPHA);
pD3DDevice9->SetRenderState(D3DRS_DESTBLEND, D3DBLEND_ONE);

// The following renders a smoke trail
void CSmokeTrail::Render()
{
	for (int i = 0; i < m_Num_Render_Entries; i += 1)
	{
		D3DXVECTOR2 pos = m_Render_Entries[i].m_vPosition;
		float alpha = m_Render_Entries[i].m_Alpha;

		for (int j = 0; j < m_Render_Entries[i].m_Steps; j += 1)
		{
			Graphics_System->Draw_Particle(pos, alpha, m_Render_Size);
			pos += m_Render_Entries[i].m_dPosition;
			alpha += m_Render_Entries[i].m_dAlpha;
		}
	}
}

void CGraphicsSystem::Draw_Particle(D3DXVECTOR2 i_vPosition, float i_Alpha, float i_Size)
{
	D3DXVECTOR2 t_vScale(i_Size, i_Size);
	DWORD t_color = D3DCOLOR_RGBA(255, 255, 255, (int)(i_Alpha * 255.f));

	// Texture_Manager->m_Sparticle_Texture.m_pTexture is simply a pointer to a particle texture, LPDIRECT3DTEXTURE9
	if(FAILED(pD3DXSprite->Draw(Texture_Manager->m_Sparticle_Texture.m_pTexture,
								NULL, &t_vScale, NULL, 0, &i_vPosition, t_color)))
	{
		char buffer[512];
		sprintf(buffer, "Could not draw a sparticle sprite using the particle interface, at x = %f, y = %f, size = %f, alpha = %f.",
				i_vPosition.x, i_vPosition.y, i_Size, i_Alpha);
		throw Exception(buffer, __FILE__, __LINE__);
	}
}

// The following is called after all smoke trail rendering
pD3DDevice9->SetRenderState(D3DRS_SRCBLEND, D3DBLEND_SRCALPHA);
pD3DDevice9->SetRenderState(D3DRS_DESTBLEND, D3DBLEND_INVSRCALPHA);

// The following is called after all rendering is done
pD3DDevice9->EndScene();
pD3DDevice9->Present(NULL, NULL, NULL, NULL);

// Note: I removed most of my error-checking code for the purposes of this post


I tried commenting out all the code in the for loop EXCEPT for the call to Draw_Particle() - the smoke trail obviously won't come out right, but it's a good test of how the Draw_Particle() call is affecting speed. It turns out that the bottleneck does indeed lie in the multiple calls to Draw_Particle() - all the D3DXVECTOR2 and float calculations done in the for loop have no noticeable effect on frame rate. When I comment out Draw_Particle() instead and keep everything else, the frame rate no longer drops.

But again, I'm not sure if the problem is with my Draw_Particle() code or in my basic methodology.

Is there any code-related suggestion you may have?

Quote:Original post by LogicalError
You could render your smoke trails in a texture at a lower resolution and without blending, then blend it on screen at a certain layer of your scene.
If you need depth it gets a little more complicated though, but not impossible.


Where should I render to, if not in the screen? Is there a term for the technique you're describing? Sorry, but I'm not entirely clear on what it is you're suggesting and would like to understand more.

I don't think depth will be necessary. This is a 2D game.

Instead of rendering particles, you could render a trail. You would need fewer vertices to get a consistent line:

So instead of individual quads:

[] [] [] [] [] [] [] >====>

Generate a strip:
 ______________________________________________
|________|_________|_________|________|________|>====|>
with a smoke trail texture that does something like:
.,.-``,.,-`-,`-`,._.-|
|                    |
``,.,-`-,`-`,._.-'.,-|

You might need two strips, each at 90 degrees to the other, so the cross section of the strip would probably need to look like a + or X. You could also still draw some particle sprites as well (but much lower frequency than you do now) so that it looks round when you view it head on.

[EDIT] Just noticed you said it's a 2D game. In that case, drop the particles, and you'll only need one strip. [/EDIT]
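
A minimal CPU-side sketch of the single-strip idea for the 2D case, under stated assumptions: Vec2, StripVertex, and BuildTrailStrip are hypothetical names, and the code only builds the vertex data (two vertices per trail point, offset perpendicular to the trail direction). Uploading and drawing the result, for example as a D3DPT_TRIANGLESTRIP, is left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// One strip vertex: position, texture coordinates, and per-vertex alpha.
struct StripVertex { float x, y, u, v, alpha; };

// Builds triangle-strip vertices (two per trail point) for a ribbon of the
// given half-width. alphas[i] fades the ribbon at points[i].
std::vector<StripVertex> BuildTrailStrip(const std::vector<Vec2>& points,
                                         const std::vector<float>& alphas,
                                         float half_width)
{
    assert(points.size() == alphas.size());

    std::vector<StripVertex> strip;
    const std::size_t n = points.size();
    if (n < 2) return strip;  // need at least one segment

    strip.reserve(n * 2);
    for (std::size_t i = 0; i < n; ++i) {
        // Direction from the previous neighbor to the next one (clamped at the ends).
        const Vec2& a = points[i == 0 ? 0 : i - 1];
        const Vec2& b = points[i + 1 < n ? i + 1 : n - 1];
        float dx = b.x - a.x, dy = b.y - a.y;
        const float len = std::sqrt(dx * dx + dy * dy);
        if (len > 0.f) { dx /= len; dy /= len; } else { dx = 1.f; dy = 0.f; }

        // Perpendicular to the direction, scaled to the ribbon half-width.
        const float nx = -dy * half_width, ny = dx * half_width;
        // u runs 0..1 along the trail; v is 0 on one edge and 1 on the other.
        const float u = static_cast<float>(i) / static_cast<float>(n - 1);

        strip.push_back({points[i].x + nx, points[i].y + ny, u, 0.f, alphas[i]});
        strip.push_back({points[i].x - nx, points[i].y - ny, u, 1.f, alphas[i]});
    }
    return strip;
}
```

For the two example points (0, 100) and (100, 100), this produces 4 vertices (2 triangles) where the per-particle approach needs 50 quads. The alpha fade is interpolated across each segment by the hardware, so the vertex count depends on the number of trail points, not on the zoom level.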

Regards,

Alex (I'm in an ascii art kinda mood!)

Twitter: CaffinePwrdAl

Website: (Closed for maintenance and god knows what else)

Quote:It turns out that the bottleneck does indeed lie in the multiple calls to Draw_Particle()...
Is there any code-related suggestion you may have?


Maybe. As far as I know, the DirectX Sprite class is not that bad; at least I think it will batch your drawing calls (it collects "similar" calls and sends them to the GPU together as soon as sprite->End() is called). Anyway, one way to draw particles is to use so-called point sprites (search the DirectX docs). These suit your purpose since your particles

- use the same texture
- aren't rotated
- don't change any render states between calls (you only change the tint, i.e. the color/alpha and size)

The key point is to collect all the particles in a vertex buffer and draw them with one call (device->DrawPrimitive() with point list type).

I can't say whether this really is the solution for you; you'll have to see for yourself. I don't consider 2500 particles to be that much of a problem. Just to give you an estimate: I can do several thousand particles at 60 Hz with physics (simple motion with friction), size and tint in C# on a 2 GHz Intel machine with a GeForce 8500 GT, without a shader.

Dunno if you know how to use vertex buffers, though.
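
The "collect first, draw once" idea can be sketched on the CPU side as follows. This is an assumption-laden illustration, not tested against a device: PointSpriteVertex, PackWhiteWithAlpha, and AppendParticle are hypothetical names, and the struct layout is intended to match the Direct3D 9 FVF D3DFVF_XYZRHW | D3DFVF_PSIZE | D3DFVF_DIFFUSE. The actual Direct3D calls are deliberately left out so the block compiles on its own.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical vertex layout: pre-transformed position, per-vertex point
// size, then packed ARGB color (the order the FVF named above expects).
struct PointSpriteVertex {
    float x, y, z, rhw;
    float size;
    std::uint32_t color;
};

// Packs white with the given alpha (0..1) into ARGB, the same bit layout
// that D3DCOLOR_RGBA(255, 255, 255, a) produces.
std::uint32_t PackWhiteWithAlpha(float alpha)
{
    if (alpha < 0.f) alpha = 0.f;
    if (alpha > 1.f) alpha = 1.f;
    const std::uint32_t a = static_cast<std::uint32_t>(alpha * 255.f);
    return (a << 24) | 0x00FFFFFFu;
}

// Appends one particle to the CPU-side batch instead of drawing it right away.
void AppendParticle(std::vector<PointSpriteVertex>& batch,
                    float x, float y, float alpha, float size)
{
    assert(size > 0.f);
    batch.push_back({x, y, 0.f, 1.f, size, PackWhiteWithAlpha(alpha)});
}
```

In the rendering loop, the call to Draw_Particle() would become a call to something like AppendParticle(); once per frame the batch would be copied into a vertex buffer and drawn with a single DrawPrimitive() call using D3DPT_POINTLIST, with the D3DRS_POINTSPRITEENABLE render state turned on. Note that point sprites are positioned by their center and their maximum size is limited by the hardware, so the result may need adjusting compared with the sprite version.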

Quote:Where should I render to, if not in the screen?


This is called a render target. Simply put, it is a "frame buffer" you draw to which is not your "screen". Later, that buffer is used as a texture for other draw calls to your actual back buffer. I think LogicalError's idea was that when you draw at a lower resolution, there's less to draw, so you may gain speed. But then again, a render target may have other drawbacks (in this case you draw the particles twice and the final resolution will be the same, so probably no gain).

This topic is closed to new replies.
