Kurt-olsson

DX11
8800GTX and only 150 Particles before lagging? Help. =)

17 posts in this topic

My method of rendering a particle system is:

Keep a list of particles, each with
position
velocity.

Loop over the list and build two triangles for each particle in a vertex buffer. (At this stage the triangles already have the right position and rotation.)
Copy the vertex buffer to the GPU every frame.

This gives me very poor performance.

I have an 8800GTX card and can only render 150 particles... come on... 150 particles before it starts lagging. There must be some big problem with my code.

The code is simple and I have tried to comment every function.
Please let me know if you see something bad.

Another thing: how come the movement is "slow" when particles are visible? I scale everything by DeltaTime, so even if it lags, shouldn't the movement/velocity of my player stay the same no matter how much is drawn in the scene?

Here is my Particle class.

[source lang="cpp"]#pragma once
#include <d3d11.h>
#include <d3dx11.h>
#include <d3dx10.h>
#include <vector>

class Particle {
public:
D3DXVECTOR3 position;
D3DXVECTOR3 velocity;
float time;
};

class ParticleSystem
{
private:
struct VERTEX {FLOAT X, Y, Z; D3DXVECTOR3 Normal; FLOAT U, V; D3DXCOLOR Color;};

D3D11_MAPPED_SUBRESOURCE ms;
ID3D11Buffer *m_vertexBuffer, *m_indexBuffer;
int m_vertexCount;
int m_indexCount;
int number_of_particles;
VERTEX* model_vertices;
DWORD* model_indicies;
std::vector<Particle> lstParticles;
int CurrentParticle;

public:

//This runs only once, to create all particles.
void AddParticles() {

float width = 1.0f;
float height = 1.0f;

for (int i = 0; i < 1150;i++) {

/*float rx = (float)rand()/((float)RAND_MAX/0.01f);
float ry = (float)rand()/((float)RAND_MAX/0.001f);
float rz = (float)rand()/((float)RAND_MAX/0.01f);*/
Particle p;
p.position = D3DXVECTOR3(0,0,0);
p.velocity = D3DXVECTOR3(0,0,0);
lstParticles.push_back(p);
}
}


//Set new position and new Velocity.
void Reset(D3DXVECTOR3 start, D3DXVECTOR3 velocity) {

lstParticles[CurrentParticle].position = start;
lstParticles[CurrentParticle].velocity = velocity;
CurrentParticle++;
if (CurrentParticle>=lstParticles.size())
CurrentParticle=0;

}

//This runs every frame. Here I update the position and create
//two triangles from the position of each particle.
//This makes it easy to maintain a list of particles with one position each instead of 6.

void UpdateParticles(D3DXVECTOR3 mPos,D3DXVECTOR3 mView) {
//float width = 1.0f;
//float height = 1.0f;

D3DXCOLOR particleColor(1.0f,1.0f,1.0f,0.5f);

for (int i=0;i<lstParticles.size();i++) {
int v_index = i*6;
D3DXVECTOR3 particlePos = lstParticles[i].position;

D3DXVECTOR3 look = mView - mPos;
D3DXVec3Normalize(&look,&look);

//This I could move outside the loop because it is the same for every particle
D3DXVECTOR3 camUp(0,1,0);
D3DXVec3Normalize(&camUp,&camUp);

D3DXVECTOR3 right;
D3DXVec3Cross(&right,&camUp,&look);
D3DXVec3Normalize(&right,&right);

D3DXVECTOR3 up;
D3DXVec3Cross(&up,&look,&right);
D3DXVec3Normalize(&up,&up);

//up = up * height;
//right = right * width;

model_vertices[v_index].Color = particleColor;
model_vertices[v_index].U = 0;
model_vertices[v_index].V = 0;
model_vertices[v_index].X = particlePos.x - right.x * 0.5f + up.x;
model_vertices[v_index].Y = particlePos.y - right.y * 0.5f + up.y;
model_vertices[v_index].Z = particlePos.z - right.z * 0.5f + up.z;
v_index++;

model_vertices[v_index].Color = particleColor;
model_vertices[v_index].U = 0;
model_vertices[v_index].V = 1;
model_vertices[v_index].X = particlePos.x + right.x * 0.5f + up.x;
model_vertices[v_index].Y = particlePos.y + right.y * 0.5f + up.y;
model_vertices[v_index].Z = particlePos.z + right.z * 0.5f + up.z;
v_index++;

model_vertices[v_index].Color = particleColor;
model_vertices[v_index].U = 1;
model_vertices[v_index].V = 0;
model_vertices[v_index].X = particlePos.x - right.x * 0.5f;
model_vertices[v_index].Y = particlePos.y - right.y * 0.5f;
model_vertices[v_index].Z = particlePos.z - right.z * 0.5f;
v_index++;

//Second Triangle

model_vertices[v_index].Color = particleColor;
model_vertices[v_index].U = 1;
model_vertices[v_index].V = 0;
model_vertices[v_index].X = particlePos.x - right.x * 0.5f;
model_vertices[v_index].Y = particlePos.y - right.y * 0.5f;
model_vertices[v_index].Z = particlePos.z - right.z * 0.5f;
v_index++;

model_vertices[v_index].Color = particleColor;
model_vertices[v_index].U = 0;
model_vertices[v_index].V = 1;
model_vertices[v_index].X = particlePos.x + right.x * 0.5f + up.x;
model_vertices[v_index].Y = particlePos.y + right.y * 0.5f + up.y;
model_vertices[v_index].Z = particlePos.z + right.z * 0.5f + up.z;
v_index++;


model_vertices[v_index].Color = particleColor;
model_vertices[v_index].U = 1;
model_vertices[v_index].V = 1;
model_vertices[v_index].X = particlePos.x + right.x * 0.5f;
model_vertices[v_index].Y = particlePos.y + right.y * 0.5f;
model_vertices[v_index].Z = particlePos.z + right.z * 0.5f;
v_index++;

//update position with velocity
lstParticles[i].position+=lstParticles[i].velocity;
}

}

//Create the vertex buffer with number of particles * 6 vertices, because we render two triangles per quad.
//This is because I don't know how to draw a TRIANGLE_STRIP at different positions; something with RestartStrip, but I think
//it only works with shaders.
void Init(ID3D11Device* dev) {
CurrentParticle = 0;

number_of_particles = lstParticles.size();
m_vertexCount = (number_of_particles * 6);
m_indexCount = (number_of_particles * 6);

model_vertices = new VERTEX[m_vertexCount];
model_indicies = new DWORD[m_indexCount];

//This might be a problem? No index ever reuses a vertex, so the index buffer is as big as the vertex buffer.
for (int i = 0; i<(number_of_particles * 6);i++) {
model_indicies[i] = i;
}

// create the vertex buffer
D3D11_BUFFER_DESC bd;
ZeroMemory(&bd, sizeof(bd));

bd.Usage = D3D11_USAGE_DYNAMIC;
bd.ByteWidth = sizeof(VERTEX) * m_vertexCount;
bd.BindFlags = D3D11_BIND_VERTEX_BUFFER;
bd.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;

dev->CreateBuffer(&bd, NULL, &m_vertexBuffer);

// create the index buffer
bd.Usage = D3D11_USAGE_DYNAMIC;
bd.ByteWidth = sizeof(DWORD) * m_indexCount;
bd.BindFlags = D3D11_BIND_INDEX_BUFFER;
bd.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
bd.MiscFlags = 0;

dev->CreateBuffer(&bd, NULL, &m_indexBuffer);



}

int GetIndexCount() {
return m_indexCount;
}

//This method runs EVERY frame. It takes the updated vertex data and copies it into the GPU vertex buffer.
void CopyAndSetBuffers(ID3D11DeviceContext* devcon) {


// select which vertex buffer to display
UINT stride = sizeof(VERTEX);
UINT offset = 0;

// copy the vertices into the buffer
//THIS uses the D3D11_MAP_WRITE_DISCARD so it should be ok for updating every frame, right?
devcon->Map(m_vertexBuffer, NULL, D3D11_MAP_WRITE_DISCARD, NULL, &ms); // map the buffer
memcpy(ms.pData, model_vertices, sizeof(VERTEX) * m_vertexCount); // copy the data
devcon->Unmap(m_vertexBuffer, NULL);
//copy the index buffer
//THIS uses the D3D11_MAP_WRITE_DISCARD so it should be ok for updating every frame, right?
devcon->Map(m_indexBuffer, NULL, D3D11_MAP_WRITE_DISCARD, NULL, &ms); // map the buffer
memcpy(ms.pData, model_indicies, sizeof(DWORD) * m_indexCount); // copy the data
devcon->Unmap(m_indexBuffer, NULL);

devcon->IASetVertexBuffers(0, 1, &m_vertexBuffer, &stride, &offset);
devcon->IASetIndexBuffer(m_indexBuffer, DXGI_FORMAT_R32_UINT, 0);
}

void Clean() {
m_indexBuffer->Release();
m_vertexBuffer->Release();
}


};[/source]

Can you please define "lag"; do you mean that the time per frame increases?
Have you timed [font=courier new,courier,monospace]UpdateParticles[/font] to see how much CPU time it's consuming?

First, it seems like you have 1150 particles, not 150. Still, that shouldn't be too slow. How much is it lagging?

Make sure you compile in Release, not Debug, and move those things you commented yourself outside the loop.
Then switch to only creating 4 vertices per quad instead of 6, but still use 6 indices. Indices can re-use vertices, so you only need 4 vertices and indices [0, 1, 2] and [0, 2, 3] for example, to make 2 triangles. This saves you some bandwidth.

If it's still not good enough, look into using a geometry shader, which can save you a lot of CPU time.
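The first two suggestions above (compute the camera basis once, and use 4 vertices plus 6 indices per quad) can be sketched without any DirectX types. This is only an illustration under assumptions: Vec3 is a stand-in for D3DXVECTOR3, the function names are made up, and the quad is centered on the particle position rather than offset as in the posted code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal stand-in for D3DXVECTOR3 so the sketch compiles without DirectX headers.
struct Vec3 { float x, y, z; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }

inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

inline Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    assert(len > 0.0f);  // fails if the camera looks straight along the world up axis
    return v * (1.0f / len);
}

// The indices never change, so they can be built once at init time:
// quad q uses vertices 4q..4q+3 and the triangles [0,1,2] and [0,2,3].
std::vector<std::uint32_t> buildQuadIndices(std::size_t quadCount) {
    std::vector<std::uint32_t> indices;
    indices.reserve(quadCount * 6);
    for (std::size_t q = 0; q < quadCount; ++q) {
        std::uint32_t base = static_cast<std::uint32_t>(q * 4);
        for (std::uint32_t offset : {0u, 1u, 2u, 0u, 2u, 3u})
            indices.push_back(base + offset);
    }
    return indices;
}

// Writes 4 corners per particle. The camera basis is computed once, before the loop.
std::vector<Vec3> buildQuadCorners(const std::vector<Vec3>& positions,
                                   Vec3 cameraPos, Vec3 cameraTarget, float size) {
    Vec3 look  = normalize(cameraTarget - cameraPos);
    Vec3 right = normalize(cross(Vec3{0.0f, 1.0f, 0.0f}, look));
    Vec3 up    = cross(look, right);  // already unit length: look and right are orthonormal
    Vec3 halfRight = right * (0.5f * size);
    Vec3 halfUp    = up * (0.5f * size);

    std::vector<Vec3> corners;
    corners.reserve(positions.size() * 4);
    for (const Vec3& p : positions) {
        corners.push_back(p - halfRight + halfUp);  // 0: top-left
        corners.push_back(p + halfRight + halfUp);  // 1: top-right
        corners.push_back(p + halfRight - halfUp);  // 2: bottom-right
        corners.push_back(p - halfRight - halfUp);  // 3: bottom-left
    }
    return corners;
}
```

With this layout the per-frame upload shrinks from 6 to 4 vertices per particle, and the index data only has to be uploaded once instead of being mapped every frame.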

HodgeMan:
Can you please define "lag"; do you mean that the time per frame increases?
Have you timed UpdateParticles to see how much CPU time it's consuming?

My lag is like this:
I move my camera with a velocity vector of, let's say, (0,0,0.001f*deltaTime).
Without particles it feels like I am moving "fast".
But with all the particles I am moving "slow", even though the velocity vector is still the same.
I have not timed my particles; I don't know how.
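A note on the "slow" movement: motion only looks the same at every frame rate if each per-frame step is multiplied by the duration of that frame. In the posted UpdateParticles the position is advanced with position += velocity, with no time factor, so at least the particles cover less distance per second when the frame rate drops. Whether the camera code has the same issue cannot be told from the post. The sketch below is a DirectX-free illustration of the difference; the function names and the 1D position are invented for the example.

```cpp
#include <cassert>
#include <cmath>

// Advances a 1D position over `totalSeconds` of simulated time, split into
// `frames` equally long frames. Scaling each step by dt makes the result
// (almost) independent of how many frames were rendered.
float moveScaledByDt(float unitsPerSecond, float totalSeconds, int frames) {
    assert(frames > 0);
    float dt = totalSeconds / static_cast<float>(frames);
    float position = 0.0f;
    for (int i = 0; i < frames; ++i)
        position += unitsPerSecond * dt;
    return position;
}

// Same loop without the dt factor: the distance now depends on the frame count.
float movePerFrame(float unitsPerFrame, int frames) {
    assert(frames > 0);
    float position = 0.0f;
    for (int i = 0; i < frames; ++i)
        position += unitsPerFrame;
    return position;
}
```

Over one second, moveScaledByDt(2.0f, 1.0f, 14) and moveScaledByDt(2.0f, 1.0f, 500) both end up at about 2 units, while movePerFrame(0.01f, 14) and movePerFrame(0.01f, 500) end up about 0.14 and 5 units away.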

Erik Rufelt:
1150 particles is correct, my mistake.
I also forgot to mention that I do a render-to-texture pass and use that texture to map a cube.
So I render everything twice, which should cut my performance by 50%, but I still think it is too slow.
The only things I draw are a 1500-vertex model, my particles, and the cube.

I think the index buffer optimization is the next thing to look into, but I still think something is wrong.
My plan is to draw at least 10 more 1000-vertex models in my level.

Hm...
I will move the code as I noted in my comments and try Release mode.
Edited by KurtO

Try displaying deltaTime on the screen, and measure the difference in milliseconds. If you compare drawing 1000 particles to drawing nothing at all, then of course it will be much slower: even something very fast is infinitely slower than something that takes zero time, and drawing nothing is close to zero.
If you aim for 60 frames per second, that gives you a max deltaTime of ~16.7 milliseconds, so compare the time taken to draw 1000 particles to that, and see what percentage of the target time is spent.
Edited by Erik Rufelt
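One possible way to get such numbers, sketched with the standard library only: time a piece of work with std::chrono and express it as a share of the frame budget. The helper names are made up for this example. Note that wrapping a call such as UpdateParticles this way measures CPU time only; GPU work runs asynchronously, so the frame-to-frame deltaTime is the better number when the GPU is the bottleneck.

```cpp
#include <cassert>
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
template <typename Fn>
double measureMilliseconds(Fn&& work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Share of the per-frame time budget that was used, in percent.
double percentOfFrameBudget(double elapsedMs, double targetFps) {
    assert(targetFps > 0.0);
    double budgetMs = 1000.0 / targetFps;  // about 16.7 ms at 60 FPS
    return 100.0 * elapsedMs / budgetMs;
}
```

A call could look like measureMilliseconds([&] { particles.UpdateParticles(camPos, camTarget); }), where particles, camPos and camTarget are whatever the application already has.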

[quote name='KurtO' timestamp='1352035146' post='4997157']
Without particles it feels like i am moving "fast".
[/quote]
We need some numbers. Get the free version of FRAPS to display the FPS at least, or better, incorporate some kind of time measurement in your code.

Do you send the particles to the GPU in a single batch, or are you using one batch per particle? The latter will most likely slow down your performance even for only 1150 particles. Another issue would be painting 1150 large particles, which could result in a huge overdraw rate, another reason for a slowdown.

Best to provide some more data and a screenshot.

FRAPS was a very good idea!

When I have 1500 particles all starting at the same place (0,0,0) and the player is really close to them, my frame rate drops to 14 FPS.
But when I shoot them away and they are farther from the player, I get around 250~400 FPS.
When the particles are far, far away I get as high as 550 FPS.

It feels like I can't draw my particles close up at the same place...
Edited by KurtO

That's normal enough - you're getting heavy overdraw and bottlenecking on fillrate here. You are probably covering a good-ish percentage of the entire screen area 1500 times, which will bring any GPU to its knees.
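To get a feel for the numbers, here is a rough back-of-the-envelope estimate of how many pixels have to be blended per frame. The resolution and the screen fraction per particle are assumptions, since the thread gives neither, and the function name is made up.

```cpp
#include <cassert>
#include <cstdint>

// Rough number of pixels the GPU has to blend per frame if every particle
// covers `screenFraction` of a `width` x `height` render target.
// Ignores clipping, depth rejection and multisampling.
std::uint64_t estimateShadedPixels(std::uint32_t particleCount, double screenFraction,
                                   std::uint32_t width, std::uint32_t height) {
    assert(screenFraction >= 0.0 && screenFraction <= 1.0);
    double perParticle = screenFraction * width * height;
    return static_cast<std::uint64_t>(perParticle * particleCount);
}
```

For example, 1500 particles each covering a quarter of an assumed 1280x720 target come to 345,600,000 blended pixels per frame, compared with 921,600 pixels for one full-screen pass. Rendering the scene twice, as the render-to-texture setup described earlier does, doubles that again.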

Suddenly I have more respect for the game engines out there. It feels impossible to get the visuals they do from my hardware. =)

I will try to implement an indexed vertex buffer that shares 2 of the 6 vertices of my two triangles, as Erik said.
Maybe that will lift the performance a little bit.

Also, how do you make the color black transparent?

If I turn alpha blending on, the FPS drops even more...

I would recommend that you also make use of the geometry shader stage; that way you only need one vertex for each sprite. Here's a good article on how to do it:
[url="http://takinginitiative.net/2011/01/12/directx10-tutorial-9-the-geometry-shader/"]http://takinginitiative.net/2011/01/12/directx10-tutorial-9-the-geometry-shader/[/url]

In your pixel-shader, try something like:
if(color.a == 0)
discard;

Whether it's faster or not is hard to say. As your problem is clearly fillrate, and your card is a few years old, there might not be all too much that you can do, other than making the particles smaller on the screen.

One technique you can try to reduce fillrate is to draw polygons that aren't squares or quads, so that your particles cover as little screen area as possible, as shown for example here: [url="http://www.humus.name/index.php?page=Comments&ID=266"]http://www.humus.name/index.php?page=Comments&ID=266[/url]

papulko, using a geometry shader is clearly my next step. When the game is finished I might "upgrade" that part. It seems really nice to render all particles on the GPU.

Erik, the color.a == 0 check looks like a good way to sort this out.

I will definitely try using a single triangle, with texture coords chosen so that my texture sits in the middle. Because of my transparency I really don't need a quad if my texture fits inside my triangle! This is really smart!

Correct me if I am wrong, but if I render all my triangles at different positions, I won't gain any performance from an index buffer, because all my vertices will be in separate places. Right?

By the way:
Is it better to have a vertex buffer that contains ALL particles and only update the positions,

OR

to create a new vertex buffer with only the particles that are alive and then swap in that vertex buffer each frame?

Probably only update the alive ones.
However, in your case this is most likely irrelevant. As you get high FPS when particles are far away, your vertices are not limiting your performance. Because of this, both index buffers and geometry shaders will gain you very little.
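For completeness, "only update the alive ones" could look roughly like the sketch below. The Particle layout with a life field is hypothetical (the posted class has a time member whose meaning is not shown), and the staging array stands in for the data that would be copied into the mapped vertex buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle layout; `life` is an assumed field (seconds remaining).
struct Particle {
    float x, y, z;
    float life;
};

// Copies the positions of live particles to the front of `staging` and
// returns how many were written. Only that many quads need to be uploaded
// and drawn this frame.
std::size_t packAliveParticles(const std::vector<Particle>& particles,
                               std::vector<float>& staging) {
    staging.clear();
    std::size_t alive = 0;
    for (const Particle& p : particles) {
        if (p.life <= 0.0f) continue;  // dead: contributes no vertices
        staging.push_back(p.x);
        staging.push_back(p.y);
        staging.push_back(p.z);
        ++alive;
    }
    assert(staging.size() == alive * 3);
    return alive;
}
```

The same dynamic buffer can be reused every frame; only the number of vertices written and drawn changes, so no new buffer has to be created per frame.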

Using triangles instead of quads could be better or worse, and you probably want to use something like 8-corner polygons. Look again at the page I linked. The only thing that matters for you is how many pixels are covered on the screen. If you use 10 vertices to cover 80% as many pixels, then that is a win.
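How much area a tighter polygon saves can be checked with the shoelace formula. The sketch below compares a unit quad with the same quad after cutting off its four corners; the cut size is an arbitrary example value, and whether the trimmed corners are really empty depends on the particle texture.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point2 { double x, y; };

// Area of a simple polygon via the shoelace formula (either winding order).
double polygonArea(const std::vector<Point2>& pts) {
    assert(pts.size() >= 3);
    double twiceArea = 0.0;
    for (std::size_t i = 0; i < pts.size(); ++i) {
        const Point2& a = pts[i];
        const Point2& b = pts[(i + 1) % pts.size()];
        twiceArea += a.x * b.y - b.x * a.y;
    }
    return std::fabs(twiceArea) * 0.5;
}

// Unit quad with each corner cut off at distance `cut` along both edges,
// which gives an 8-corner polygon.
std::vector<Point2> trimmedQuad(double cut) {
    assert(cut >= 0.0 && cut <= 0.5);
    return {
        {cut, 0.0}, {1.0 - cut, 0.0}, {1.0, cut}, {1.0, 1.0 - cut},
        {1.0 - cut, 1.0}, {cut, 1.0}, {0.0, 1.0 - cut}, {0.0, cut},
    };
}
```

With cut = 0.3 the octagon has area 0.82, so it covers 82% of the pixels of the full quad at the cost of 8 vertices instead of 4, which is in the range described above.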

Your graphics card does two things for you:
1. Transform vertices
2. Fill pixels

As your performance is much worse when your particles are close, it means that step 1 is cheap for you and doesn't matter very much. Index buffers and geometry shaders only make step 1 cheaper still. If you get 500 FPS when particles are far away (2 ms per frame) and 14 FPS when particles are close (about 71 ms per frame), that gives approximately:
Step 1: 2 milliseconds
Step 2: 70 milliseconds

That means if you make step 1 twice as fast, your FPS up close will still be about 14. So it does not matter much at all.
If you make step 2 twice as fast, that makes a much larger difference, even if step 1 gets slower because of the increased vertex count. So choose vertices so that you cover the fewest pixels, if you want many particles covering a large part of the screen.
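The arithmetic behind this estimate can be written down as a small sketch. It is a simplification: the far-away frame time also contains everything else in the scene, so the "step 1" figure is an upper bound. The struct and function names are made up.

```cpp
#include <cassert>

struct FrameCostEstimate {
    double vertexMs;  // cost that remains when particles are far away
    double fillMs;    // extra cost that appears when particles fill the screen
};

// Splits the close-up frame time into the two parts, from two FPS readings.
FrameCostEstimate splitFrameCost(double fpsFar, double fpsNear) {
    assert(fpsFar > 0.0 && fpsNear > 0.0 && fpsFar >= fpsNear);
    double farMs = 1000.0 / fpsFar;
    double nearMs = 1000.0 / fpsNear;
    return {farMs, nearMs - farMs};
}

// Close-up frame rate if each part is made faster by the given factor.
double fpsAfterSpeedup(FrameCostEstimate cost, double vertexSpeedup, double fillSpeedup) {
    assert(vertexSpeedup > 0.0 && fillSpeedup > 0.0);
    return 1000.0 / (cost.vertexMs / vertexSpeedup + cost.fillMs / fillSpeedup);
}
```

With the readings from this thread, splitFrameCost(500.0, 14.0) gives 2 ms and roughly 69 ms. Doubling the vertex speed lifts the close-up frame rate only to about 14.2 FPS, while doubling the fill speed lifts it to about 27 FPS.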

However, no matter what you do it is likely impossible to get 1000 particles covering a large part of the screen on your graphics card, it's simply too many pixels. You have to make your particles a bit smaller or draw fewer particles when they get close. If you have 1000 particles very close to the screen, most won't be visible, so you can maybe sort them and remove those behind others or similar.

Erik, thank you so much for your explanation and for taking the time to write your answer.
Now I finally understand that the screen pixel coverage is my problem.

My optimization will be smaller particles and drawing fewer of them when close; that should do the trick!

Again, thank you very much.
Edited by KurtO

Holy shit!

You know what you are talking about!
Making the particles 0.05f in width/height instead of 1.0f makes them SUPERFAST!
The fill-rate load is down and the speed is UP!

5000 particles at the same position: ~200 FPS
and spread all around the place: 450 FPS, hardly any drop at all!

COOOOL!

As you said, Erik, I have not optimized the indices or quads etc.; just changing the particle size made it superfast!
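The size change helps this much because covered area shrinks with the square of the edge length. A minimal sketch of that relation, assuming the camera and particle positions stay the same (the function name is made up):

```cpp
#include <cassert>

// Ratio of screen area covered after resizing a square billboard,
// assuming the camera and particle positions stay the same.
double coveredAreaRatio(double oldSize, double newSize) {
    assert(oldSize > 0.0 && newSize >= 0.0);
    double scale = newSize / oldSize;
    return scale * scale;  // area grows with the square of the edge length
}
```

Going from 1.0 to 0.05 gives a ratio of 0.0025, so each particle covers about 400 times fewer pixels.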

Thanks again.

Another fairly easy thing you can do is when particles get closer and take up large portions of the screen, you can automatically fade them out, until the point where you don't draw them anymore. Of course this decision has to be made in the vertex shader (or earlier) to avoid the pixel shading cost.
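When the vertices are built on the CPU, as in the code posted at the top of this thread, the "or earlier" option can be a small distance test before a particle is written to the vertex buffer. The following is only a sketch: both thresholds are hypothetical tuning values and the function names are made up.

```cpp
#include <algorithm>
#include <cassert>

// Alpha multiplier for a particle at `distance` from the camera.
// Fully visible at or beyond `fadeStart`, fully transparent at or inside `fadeEnd`.
// Both thresholds are hypothetical tuning values, with fadeEnd < fadeStart.
float nearFadeAlpha(float distance, float fadeEnd, float fadeStart) {
    assert(fadeEnd < fadeStart);
    float t = (distance - fadeEnd) / (fadeStart - fadeEnd);
    return std::max(0.0f, std::min(1.0f, t));
}

// Particles that have faded out completely can be skipped before any
// vertices are generated for them, so they cost no fill rate at all.
bool shouldDraw(float distance, float fadeEnd, float fadeStart) {
    return nearFadeAlpha(distance, fadeEnd, fadeStart) > 0.0f;
}
```

With fadeEnd = 1 and fadeStart = 3, a particle 2 units away is drawn at half opacity and one closer than 1 unit is not drawn at all.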

Another much more complicated optimization is to render the particles to a lower resolution render target and apply them to the scene afterward: http://http.developer.nvidia.com/GPUGems3/gpugems3_ch23.html