poigwym

Vulkan render huge amount of objects


Hello!

On a modern GPU with a modern graphics API (DX12, Vulkan), how many objects can be drawn at most at 60 fps, with one light?

My scene with 100 boxes and a directional light runs at 15 fps. I'm not sure whether that is normal.

I had a look at the Horde3D engine. It seems to draw 100 crowded animated models without instancing and still runs smoothly, probably faster than 60 fps. How does it do that?

I need tutorials/links about rendering very large scenes.

Sounds like something is wrong as that is a very low framerate for such a simple scene.

Have you run it through a profiler yet?

Do that first and let us know your results :)


What are you using to render (API/libraries etc)? Are you using any form of instancing? Posting your render code might allow people to spot some simple mistakes.

Same here, show some details and we'll take a peek. Assuming the boxes are made up of 8 vertices and 12 triangles, I agree that it's not OK.

It also helps to know which hardware/GPU you're running it on (just to be sure).


What cozzie said. Not all 66 ms of work are born equal. 66 ms of work on a GTX 980 isn't the same as 66 ms of work on an Intel HD 2000.


I have a profiler in my engine, and found that matrix multiplication and cbuffer commits take the most time of all operations.

After I turned off all lights, the process is simple: for every object, update the transform cbuffer, commit it, and draw.

In my engine, all shaders share 10 cbuffers (5 for the VS, 5 for the PS).

One cbuffer looks like this:
struct CBTransform /*: register(b0)*/
{
  Matrix4f world_matrix;
  Matrix4f world_invTrans_matrix;
  Matrix4f world_view_proj_matrix;
  Matrix4f light_view_proj_matrix;
};
 
 
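void setTransform(Renderable node)
{
    // Map the shared transform cbuffer. WRITE_DISCARD returns a fresh region
    // whose previous contents are undefined, so every field should be written.
    CBTransform *p = reinterpret_cast<CBTransform*>(renderbase->MapResource(_cbtrans, 0, D3D11_MAP_WRITE_DISCARD, 0));

    // Nodes with bones get an identity world matrix here.
    Matrix4f world;
    if (node->_hasBone)
        world.initIdentity();
    else
        world = node->getTransform();
    p->world_matrix = world;
    // Note: this stores the world matrix itself, not its inverse transpose.
    p->world_invTrans_matrix = world;

    Camera *camera = Engine::sceneManager().getMainCamera();
    Matrix4f view = camera->getView();
    Matrix4f proj = camera->getProjection();
    // view * proj is identical for every object in the frame.
    Matrix4f mwvp = world * view * proj;
    p->world_view_proj_matrix = mwvp;

    // light_view_proj_matrix is left unwritten when there is no current light.
    if (_curlight) {
        Matrix4f lightTrans = world * _curlight->getLightTransform(); // world * viewproj
        p->light_view_proj_matrix = lightTrans;
    }

    renderbase->unMapResource(_cbtrans, 0);
}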
 
// Since all shaders share the 10 cbuffers, I bind all of them on every draw call. I'm not sure whether this is the right approach.
_context->VSSetConstantBuffers(0, (int)CBufType::MAX_CBUF_GROUP, _cBufs[(int)ShaderType::VERTEX_SHADER]);
_context->PSSetConstantBuffers(0, (int)CBufType::MAX_CBUF_GROUP, _cBufs[(int)ShaderType::PIXEL_SHADER]);
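One thing that stands out in setTransform above is that view * proj is recomputed for every object even though it only changes once per frame. Since matrix multiplication is associative, world * view * proj equals world * (view * proj), so the product can be hoisted out of the per-object loop. Below is a minimal sketch of that idea. It assumes a row-major 4x4 matrix; `Mat4`, `FrameData`, `beginFrame` and `worldViewProj` are hypothetical names for illustration, not part of the engine shown in this thread.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for the engine's Matrix4f (row-major 4x4).
struct Mat4 {
    std::array<float, 16> m{};
};

Mat4 identity() {
    Mat4 r;
    for (int i = 0; i < 4; ++i) r.m[i * 4 + i] = 1.0f;
    return r;
}

// Translation matrix using the row-vector convention (translation in the last row).
Mat4 translation(float x, float y, float z) {
    Mat4 r = identity();
    r.m[12] = x;
    r.m[13] = y;
    r.m[14] = z;
    return r;
}

// Plain 4x4 matrix product: 64 multiply-adds per call.
Mat4 operator*(const Mat4& a, const Mat4& b) {
    Mat4 r;
    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k) sum += a.m[row * 4 + k] * b.m[k * 4 + col];
            r.m[row * 4 + col] = sum;
        }
    }
    return r;
}

bool nearlyEqual(const Mat4& a, const Mat4& b, float eps = 1e-4f) {
    for (std::size_t i = 0; i < 16; ++i) {
        if (std::fabs(a.m[i] - b.m[i]) > eps) return false;
    }
    return true;
}

// Data that only changes once per frame.
struct FrameData {
    Mat4 viewProj;
};

// Called once per frame: one matrix multiply in total.
FrameData beginFrame(const Mat4& view, const Mat4& proj) {
    return FrameData{view * proj};
}

// Called once per object: one multiply instead of two.
Mat4 worldViewProj(const Mat4& world, const FrameData& frame) {
    return world * frame.viewProj;
}
```

With 100 objects this goes from 200 matrix multiplies per frame to 101. The results can differ from the original order of operations in the last few floating-point bits, which is why the comparison helper uses a tolerance. Whether this alone explains 15 fps is doubtful, so it is still worth measuring.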
 


Hehe, I forgot to say that the 100 boxes I draw all look the same and use the same vertex buffer, but I don't use instancing.

I need to update and commit the cbuffer 100 times per frame.


Is it possible to draw 100 dynamic boxes that have different vertex buffers and textures, without instancing, at 60 fps on a modern GPU? My CPU and GPU are a little old.


Put some timing code into the hot-spots that you've found (setTransform, etc) and find out exactly how many microseconds per frame you spend on that logic.
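As a rough illustration of what such timing code could look like, here is a small scoped timer built on `std::chrono::steady_clock`. The names `TimingStats`, `ScopedTimer` and `frameBudgetMicroseconds` are made up for this sketch and are not from the engine in this thread.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical accumulator for one hot spot (e.g. setTransform).
struct TimingStats {
    std::int64_t totalNanoseconds = 0;
    std::int64_t calls = 0;

    // Accumulated time in microseconds since the last reset.
    double microseconds() const { return totalNanoseconds / 1000.0; }

    void reset() {
        totalNanoseconds = 0;
        calls = 0;
    }
};

// Measures the lifetime of the enclosing scope and adds it to a TimingStats.
class ScopedTimer {
public:
    explicit ScopedTimer(TimingStats& stats)
        : stats_(stats), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto elapsed = std::chrono::steady_clock::now() - start_;
        stats_.totalNanoseconds +=
            std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count();
        ++stats_.calls;
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    TimingStats& stats_;
    std::chrono::steady_clock::time_point start_;
};

// Whole-frame budget in microseconds for a target frame rate (integer division).
constexpr std::int64_t frameBudgetMicroseconds(std::int64_t fps) {
    return 1000000 / fps;
}
```

The intended use would be to declare a `ScopedTimer` at the top of the function being measured, then print `microseconds()` and call `reset()` once per frame. For reference, 60 fps leaves about 16,666 microseconds for the whole frame, while 15 fps corresponds to about 66,666. Note that this only measures CPU time spent in the call, not the GPU work it queues.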


If you are on a desktop you can see an old Flash demo I did here to test what your GPU can handle.

This is in Flash, by the way, so natively you should be able to beat what you see here (it is not massively optimised either).

 

There is no instancing and each object has a unique transform, the only thing constant between draws is the material.

 

Lower-end GPUs should manage 500-1000 with no problems, mid-range 1500-3000, and high-end can hit 8,000+.

 

http://blog.bwhiting.co.uk/?p=314 


Not sure if I understood you correctly, but if you're using 10 CBuffers to render 10 boxes with some (forward) lighting, you could do with 2 constant buffers (not 10):

 

1. a CB per frame, possibly containing the viewProjection matrix and your light properties (for multiple lights)

2. a CB per object, which you update for each object, after the previous one is drawn

 

Both having a corresponding C++ struct in your code.

If you're using 10 different CBuffers, that might explain a part of the unexpected performance.
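To make the split concrete, here is one possible layout for the two C++ structs. This is only a sketch under assumptions: `Mat4` and `Float4` stand in for the engine's own math types, the field names are invented, and it assumes the vertex shader computes world * viewProj itself so the combined matrix no longer has to be uploaded per object.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the engine's math types.
struct Mat4 { float m[16]; };          // 64 bytes
struct Float4 { float x, y, z, w; };   // 16 bytes

// Updated once per frame.
struct CBPerFrame {
    Mat4 viewProj;
    Mat4 lightViewProj;
    Float4 lightDirection;  // w unused
    Float4 lightColor;
};

// Updated once per object, before its draw call.
struct CBPerObject {
    Mat4 world;
    Mat4 worldInvTranspose;
};

// D3D11 requires a constant buffer's ByteWidth to be a multiple of 16.
static_assert(sizeof(CBPerFrame) % 16 == 0, "CBPerFrame must be 16-byte sized");
static_assert(sizeof(CBPerObject) % 16 == 0, "CBPerObject must be 16-byte sized");

// Bytes written to constant buffers in one frame with this layout.
constexpr std::size_t bytesUploadedPerFrame(std::size_t objectCount) {
    return sizeof(CBPerFrame) + objectCount * sizeof(CBPerObject);
}
```

With this layout, 100 objects come to 160 + 100 * 128 = 12,960 bytes per frame, compared with 100 * 256 = 25,600 bytes for a four-matrix struct like CBTransform uploaded per object. The byte count is unlikely to be the main cost by itself; the point is more that per-frame data gets written and bound once instead of on every draw.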
