Basic Questions

Graphics and GPU Programming

Started by LostTime77 June 27, 2008 01:15 PM

12 comments, last by LostTime77 15 years, 10 months ago

112

Author

June 27, 2008 01:15 PM

Hello everybody! This is my first time posting on these forums, but I must say that GameDev and a lot of forum posters have helped me in the past with programming woes, so this is not the first time I have seen this site. Hopefully what I am about to post / ask is in the right forum; if it is not, somebody can move it :P. Anyway, on to the questions. A little background: I am currently trying to develop a small physics engine that supports at least 1,000 moving objects (in this case circles), using Direct X. You can see some of my work here

(Is it OK to post that link?) Now, in the video, collision detection is off and it uses an O(n^2) algorithm because I wanted to get a simple demo up quickly. This engine also does not use a vertex buffer. My challenge now is getting that engine to support 1,000 objects, maybe up to 10,000 objects. As you know, I had to write the circle code myself (41 polygons each) because Direct X does not draw circles.

My dilemma is this: how do I efficiently load that many objects into a vertex buffer, update the positions of all of them each frame, and still have time left over for collision detection, response, and the other things I need to do? The engine in the video uses DrawPrimitiveUP calls to Direct X, in a simple for loop, that iterates through each object's vertices and does a draw call. This works for about 500 objects (assuming I take the collision detection and response with other objects out), and paints them nicely on the screen. I have a vertex buffer that is dynamic (although I have tried static), and basically I load the vertices on program start-up and then lock the buffer once each frame to update all object positions. This, however, does not work very well in the current project. Only about 3,100 vertices can be supported, if that.

Now assume we take the matrix transformation bit out and use basic pixel particles for a second. I update the positions of the particles manually on the CPU. What is the most efficient way to load, say, 20,000 or more pixel particle vertices into the vertex buffer and still achieve high frame rates? Right now I have a class called particle that has properties such as velocity and acceleration, and of course a single vertex that stores the xyz position data. This vertex gets updated by the CPU each frame. I also have a vector of the particles in the scene. I am filling them into the vertex buffer one by one using the methods described in this article -> http://www.gamedev.net/reference/articles/article1946.asp. Of course this cannot support as many particles as I need. It slows down very quickly as the count goes up.

What I would really like to do is update the positions of the particles and then memcpy them all in one go to the vertex buffer. Would this be efficient for, say, 20,000 particles? How would I go about memcpying the vertex positions that are stored in a vector of class particle? By the way, I know all of this is possible. I have seen, for example, the chipmunk physics engine do this rather easily. It supports a ton of objects. I would like to know how people do the rendering calls for each object while still being able to update all those positions. Look at this video ->

I think geometry instancing may be out of the question because each of those objects is unique. How does the maker render all those unique objects, each with its own position, velocity, acceleration, etc.? [Edited by - LostTime77 on June 27, 2008 2:21:36 PM]

Buckeye

10,754

June 27, 2008 01:35 PM

Quote:uses DrawPrimitiveUP calls to Direct X, in a simple for loop

If the vertex buffer has been updated, can you draw all of them with one call, rather than in a loop? I'm assuming that your "for loop" has something to do with drawing objects separately. Eliminating a loop should save some time.

If that's not practical (for some reason) but you're still using a single vertex buffer, you only need to set the stream source once. You may already be doing that, but it's worth checking.

Please don't PM me with questions. Post them in the forums for everyone's benefit, and I can embarrass myself publicly.

You don't forget how to play when you grow old; you grow old when you forget how to play.

LostTime77

112

Author

June 27, 2008 01:46 PM

Thanks for the response.

The new engine does not use a loop to draw all the objects. I simply load the vertices in the buffer and use a single draw call with a point list to draw the points on the screen. I set the stream source once. The real problem I think is the method in which I load the vertices in the buffer. I don't know how to memcpy all of them in at once because of my vector. I don't know how to memcpy one specific part of a class (which is the vertex position) from the vector for all of the vertices in that vector. As a result, I am stuck plugging them in one by one with the pVoid++ method.

Buckeye

10,754

June 27, 2008 02:56 PM

Quote:I don't know how to memcpy one specific part of a class

[smile] Nor do I.

Telling you something you already know --> you won't be able to use memcpy to do that. Sounds like the best you can do is ensure the vertex position is the first variable in the class and loop through the class array, casting each class pointer as BYTE* (or similar) and copying sizeof(D3DXVECTOR3).
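A minimal sketch of that per-particle copy, assuming a plain three-float position. `Vec3`, `Particle`, and `copyPositions` are hypothetical names for illustration (with `Vec3` standing in for D3DXVECTOR3), and `dst` stands in for the pointer you get back from locking the vertex buffer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for D3DXVECTOR3: three packed floats.
struct Vec3 { float x, y, z; };

// Hypothetical particle layout; only `position` goes to the GPU.
struct Particle {
    Vec3 position;
    Vec3 velocity;
    Vec3 acceleration;
};

// Copies just the position of each particle into `dst`, which stands in
// for the locked vertex buffer memory. `dstCapacity` is the number of
// Vec3 slots available there. Returns how many positions were written.
std::size_t copyPositions(const std::vector<Particle>& particles,
                          Vec3* dst, std::size_t dstCapacity)
{
    assert(dst != nullptr);
    const std::size_t count =
        particles.size() < dstCapacity ? particles.size() : dstCapacity;
    for (std::size_t i = 0; i < count; ++i) {
        // A strided copy: one sizeof(Vec3) chunk per particle, because the
        // positions are not contiguous inside the particle array.
        std::memcpy(&dst[i], &particles[i].position, sizeof(Vec3));
    }
    return count;
}
```

It is still one loop iteration per particle, so it will not be faster than assigning the positions one by one; it only shows why a single memcpy cannot work while velocity and acceleration sit between the positions.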

Mike nl

390

June 27, 2008 03:23 PM

Why not use the vertex buffer for the particle's position? So velocity and acceleration are in your Particles vector, and the position is in the vertex buffer. It's easy to make sure Particles corresponds with VertexBuffer.

On a side note, I'm wondering how useful it is to have a vertex buffer if you're gonna lock and update it every frame. Would or could a (dynamic) vertex buffer have an advantage over DrawPrimitiveUP?
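A related layout, sketched here under assumptions rather than as anything from the thread's actual code: keep the positions in their own contiguous array, separate from velocity and acceleration. Then the whole position array can go into the locked buffer with one memcpy. `Vec3`, `ParticleSystem`, `step`, and `upload` are hypothetical names, and `dst` stands in for the locked vertex buffer pointer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for D3DXVECTOR3.
struct Vec3 { float x, y, z; };

// Hypothetical structure-of-arrays store: element i of each vector
// belongs to particle i.
struct ParticleSystem {
    std::vector<Vec3> positions;      // exactly what the vertex buffer needs
    std::vector<Vec3> velocities;
    std::vector<Vec3> accelerations;

    void add(Vec3 p, Vec3 v, Vec3 a) {
        positions.push_back(p);
        velocities.push_back(v);
        accelerations.push_back(a);
    }

    // Same integration order as the loop in the thread:
    // position first, then velocity.
    void step(float dt) {
        for (std::size_t i = 0; i < positions.size(); ++i) {
            positions[i].x += velocities[i].x * dt;
            positions[i].y += velocities[i].y * dt;
            positions[i].z += velocities[i].z * dt;
            velocities[i].x += accelerations[i].x * dt;
            velocities[i].y += accelerations[i].y * dt;
            velocities[i].z += accelerations[i].z * dt;
        }
    }

    // Copies every position in one go. Returns the number of bytes written.
    std::size_t upload(void* dst, std::size_t dstBytes) const {
        const std::size_t bytes = positions.size() * sizeof(Vec3);
        assert(bytes <= dstBytes);
        if (bytes > 0) {
            std::memcpy(dst, positions.data(), bytes);
        }
        return bytes;
    }
};
```

The update loop still touches every particle, so this only removes the second per-particle loop; whether that matters depends on where the time is really going.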

Million-to-one chances occur nine times out of ten!

LostTime77

112

Author

June 27, 2008 03:30 PM

I'm not sure what you mean by velocity and acceleration can be in the particles vector. Do you mean have one velocity and have it constantly changing? To the other question, I do have a dynamic buffer. I am not using DrawPrimitiveUP, I am using DrawPrimitive with the vertex buffer. The DrawPrimitiveUP calls were in my first physics engine. I am redesigning it so hopefully it can handle more objects. The question is, what's the most efficient way to update the position of, say, 20,000 particles (each having its own velocity, forces, acceleration) and stick them in the vertex buffer to be rendered? I know people do this, because the videos on YouTube tell me so :)

Thanks for the responses guys.

Mike nl

390

June 27, 2008 03:43 PM

I mean, from what I understand you currently do something like:

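    // Integrate every particle on the CPU (dt is the time step).
    for (size_t i = 0; i < particles.size(); i++)
    {
        particles[i].position += particles[i].velocity * dt;
        particles[i].velocity += particles[i].acceleration * dt;
    }

    // Then copy the positions into the vertex buffer one by one.
    // Lock arguments assume the Direct3D 9 signature; 0, 0 locks the whole buffer.
    D3DXVECTOR3* pVertices;
    pVB->Lock(0, 0, (void**)&pVertices, 0);
    for (size_t i = 0; i < particles.size(); i++)
    {
        pVertices[i] = particles[i].position;
    }
    pVB->Unlock();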

And I'm saying, how about this:

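    // The positions live only in the vertex buffer; update them in place.
    // Lock arguments assume the Direct3D 9 signature; 0, 0 locks the whole buffer.
    D3DXVECTOR3* pVertices;
    pVB->Lock(0, 0, (void**)&pVertices, 0);
    for (size_t i = 0; i < particles.size(); i++)
    {
        pVertices[i]          += particles[i].velocity * dt;
        particles[i].velocity += particles[i].acceleration * dt;
    }
    pVB->Unlock();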

And the question about vertex buffer vs DrawPrimitiveUP was a general one, I'm wondering which results in better performance under what circumstances.

MasterWorks

496

June 28, 2008 12:44 AM

You also might want to look into using the DISCARD / NOOVERWRITE flags when locking your vertex buffer, although I'm not sure if this is going to help you with that quantity of particles. The idea is that while the GPU is rendering one 'chunk' of vertices, you are modifying/updating a different 'chunk' so as to avoid a stall.
In other words, using the straightforward approach, the GPU might not yet be finished rendering a previous frame while you are trying to send it new data; you can avoid this by using sequential subsets of the vertex buffer with the appropriate locking flags, and in some cases get a very significant speed improvement.
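Here is a rough sketch of the bookkeeping behind that "sequential subsets" idea, with no Direct3D calls in it. In Direct3D 9 the usual pairing for this pattern is D3DLOCK_NOOVERWRITE while appending and D3DLOCK_DISCARD when wrapping back to the start; the `LockMode` enum below just mirrors those two flags, and `DynamicBufferCursor` and `reserve` are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>

// Which kind of lock the caller should request for a chunk. Local,
// hypothetical names that mirror D3DLOCK_NOOVERWRITE and D3DLOCK_DISCARD.
enum class LockMode { NoOverwrite, Discard };

struct LockRange {
    std::size_t offset;  // first vertex of the chunk
    std::size_t count;   // number of vertices in the chunk
    LockMode mode;       // flag to pass when locking this range
};

// Hypothetical cursor over a dynamic vertex buffer of `capacity` vertices.
class DynamicBufferCursor {
public:
    explicit DynamicBufferCursor(std::size_t capacity)
        : capacity_(capacity), next_(0) {}

    // Reserves `count` vertices. While there is room, the chunk is appended
    // after the previous one (NoOverwrite: earlier chunks that the GPU may
    // still be reading are left untouched). When the chunk would run past
    // the end, the cursor wraps to 0 and asks for Discard instead.
    LockRange reserve(std::size_t count) {
        assert(count > 0 && count <= capacity_);
        LockMode mode = LockMode::NoOverwrite;
        if (next_ + count > capacity_) {
            next_ = 0;
            mode = LockMode::Discard;
        }
        LockRange range{next_, count, mode};
        next_ += count;
        return range;
    }

private:
    std::size_t capacity_;
    std::size_t next_;
};
```

Each returned range would then be locked with the matching flag, filled, unlocked, and drawn using `offset` as the start vertex. Whether it helps at 20,000 particles is something to measure rather than assume.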

LostTime77

112

Author

June 28, 2008 12:08 PM

Thanks guys ->

I think I figured out the problem. After testing out Mike nl's algorithm I did a little further testing. It seems my CPU can't handle the number of particles I am asking it to. It all comes down to the for loops I use to update the particle positions. If I add more work to the loop, such as updating the vertex buffer, it bogs down even more at these particle counts. The GPU rendering is not the problem -> it is the CPU.

LostTime77

112

Author

June 28, 2008 03:26 PM

I've got an idea! It's obviously been done before, but here goes. Right now I've got a circle class that has an array of vertices, velocity, acceleration, and a few member functions. Upon creation, the circle is constructed along with its vertex array, based on sin and cos and the desired radius. My idea is to fill the vertex buffer at program start-up with the initial vertices of however many circle objects I want in the scene. Then, instead of locking and unlocking each frame to move the vertices, I use a transform such as a matrix to move all the vertices of a circle at once. Currently, after I find the new position of the central vertex of a circle, I have to update and offset the rest of the vertices that make up the circle so they move with it. I am thinking that if I find the new position of the center vertex and then use a matrix for the transform, I can skip the entire offset step. If I have 31 vertices in a circle, that's 31 vertices I have to offset after I figure out the new position. With a matrix, I could set the transform and have the GPU take care of the offset for each circle. That would save me 31 iterations every time I move a circle.
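To make the "build it once" part concrete, here is a small sketch of generating one circle around the origin with sin and cos. It assumes a triangle-fan layout (center vertex first, then the rim); `Vec2` and `makeCircleFan` are hypothetical names, not code from the engine:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// Builds a triangle fan for a circle centred on the origin: the centre
// vertex, then `segments` rim vertices, then the first rim vertex again
// to close the fan. The result never changes, so it can be uploaded once.
std::vector<Vec2> makeCircleFan(float radius, std::size_t segments)
{
    assert(segments >= 3);
    const float twoPi = 6.28318530718f;
    std::vector<Vec2> verts;
    verts.reserve(segments + 2);
    verts.push_back(Vec2{0.0f, 0.0f});
    for (std::size_t i = 0; i <= segments; ++i) {
        // i % segments makes the closing vertex identical to the first one.
        const float angle = twoPi * static_cast<float>(i % segments)
                                  / static_cast<float>(segments);
        verts.push_back(Vec2{radius * std::cos(angle),
                             radius * std::sin(angle)});
    }
    return verts;
}
```

Because every vertex is relative to the circle's own center, moving the circle only needs a new center position; the rim vertices themselves stay as they are.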

My second question -> So I have a matrix that gets updated with the specific position of each object when I update the object positions. Then I set a D3DDevice->SetTransform(&translation, Xpos, yPos, zPos). Will this automatically transform only the specific vertices I ask it to? Say I am updating object 4's position and I set the transform in the update: will that transform only object 4's vertices, and not object 1, 2, and 3's vertices? I am thinking about writing a custom vertex shader to do the transformations for me.

Third question -> Is it possible (on each position update of each object) to send a vertex shader the positions of the 31 vertices of the object and, instead of sending a matrix, send a value for how much each vertex has moved? I know matrix math is easiest, but it seems like overkill for what I'm doing. What do you all think?
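On the matrix versus offset question, a small CPU-side sketch may help. As far as I know, a transform set on the device applies to whatever is drawn after it, not to selected vertices already in the buffer, so each object would need its own transform and draw call. For a pure move, though, a translation matrix and a plain per-object offset give the same result. `Vec3`, `Mat4`, `makeTranslation`, `transformPoint`, and `addOffset` are hypothetical names; the matrix uses the row-vector convention with the translation in the last row:

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// 4x4 matrix used with row vectors: [x y z 1] * M.
struct Mat4 { float m[4][4]; };

// Translation matrix with the offset in the last row.
Mat4 makeTranslation(float tx, float ty, float tz)
{
    Mat4 r = {{ {1.0f, 0.0f, 0.0f, 0.0f},
                {0.0f, 1.0f, 0.0f, 0.0f},
                {0.0f, 0.0f, 1.0f, 0.0f},
                {tx,   ty,   tz,   1.0f} }};
    return r;
}

// Computes [x y z 1] * M and drops w. This sketch only handles affine
// matrices, whose last column is (0, 0, 0, 1), so w stays 1.
Vec3 transformPoint(const Vec3& v, const Mat4& M)
{
    assert(M.m[0][3] == 0.0f && M.m[1][3] == 0.0f &&
           M.m[2][3] == 0.0f && M.m[3][3] == 1.0f);
    return Vec3{
        v.x * M.m[0][0] + v.y * M.m[1][0] + v.z * M.m[2][0] + M.m[3][0],
        v.x * M.m[0][1] + v.y * M.m[1][1] + v.z * M.m[2][1] + M.m[3][1],
        v.x * M.m[0][2] + v.y * M.m[1][2] + v.z * M.m[2][2] + M.m[3][2]};
}

// The "just send how far it moved" alternative: one add per component.
Vec3 addOffset(const Vec3& v, const Vec3& offset)
{
    return Vec3{v.x + offset.x, v.y + offset.y, v.z + offset.z};
}
```

So an offset is enough as long as the circles only translate; a matrix starts to pay off once rotation or scaling is needed.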

This topic is closed to new replies.