slow instancing

15 comments, last by _Flame_ 4 years, 11 months ago

Hello.

I'm learning instancing from this article: https://learnopengl.com/Advanced-OpenGL/Instancing

In this article the author draws 100,000 objects.

I'm drawing grass billboards where each object consists of 2 quads. I get about 9-11 fps when I draw only 20,000 objects, and maybe 13-15 fps if I don't update the billboard transformations.

I tried increasing the number of triangles per quad, but that made it worse.

Can someone shed light on why it is so slow in my case? Maybe my hardware is insanely slow? I have an NVIDIA GeForce 940MX.

 

Crap, I forgot to switch to "High performance NVidia processor" after the driver update.

Now I get 25-60 fps (no billboard transformation update): 60 fps when I move the camera away from the grass and 25 fps when all the grass is visible.

Is the GPU just slow anyway?

I've read a couple of times already that instancing with low triangle counts is not a good idea (see here for an example). Maybe try using a geometry shader to build your quads instead? Or maybe your GPU really is simply too slow?

There is an example in the blue book (the OpenGL SuperBible) that renders a million "grass blades". I just copied it as a test but don't have a frame-rate figure. It runs faster than the vblank rate (on a GTX 970-class card).

Edit: at 10 million it stutters ...

21 hours ago, _Flame_ said:

Now I get 25-60 fps (no billboard transformation update): 60 fps when I move the camera away from the grass and 25 fps when all the grass is visible.

Is it worse or better without instancing?

3 hours ago, _Silence_ said:

Is it worse or better without instancing?

It's definitely better, but 25 fps for 20,000 quads is still very far from what I expect.

6 hours ago, Koen said:

I've read a couple of times already that instancing with low triangle counts is not a good idea (see here for an example). Maybe try using a geometry shader to build your quads instead? Or maybe your GPU really is simply too slow?

I can definitely do grass/billboards better (without instancing at all), but here I'm focused on instancing. I tried using more triangles per quad, but that made it worse. I would be relieved if the problem is just GPU performance.

Btw, I'm using instanced arrays.
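
For readers unfamiliar with the term: with instanced arrays, the per-instance data (here, one offset per grass object) lives in its own vertex buffer that advances once per instance rather than once per vertex. Below is a minimal CPU-side sketch in C++ of building such a buffer. It is not the poster's actual code; `InstanceOffset`, `makeGrassOffsets`, and the grid placement are hypothetical, and the OpenGL calls appear only in comments because they need a live context.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-instance record: a world-space offset for one grass object.
struct InstanceOffset {
    float x, y, z;
};

// Lay out `count` grass objects on a square grid, `spacing` units apart.
// (Illustrative placement only; the thread does not say how the grass is placed.)
std::vector<InstanceOffset> makeGrassOffsets(std::size_t count, float spacing) {
    std::vector<InstanceOffset> offsets;
    offsets.reserve(count);
    std::size_t side = 1;
    while (side * side < count) ++side;  // smallest grid side that fits `count`
    for (std::size_t i = 0; i < count; ++i) {
        float gx = static_cast<float>(i % side) * spacing;
        float gz = static_cast<float>(i / side) * spacing;
        offsets.push_back({gx, 0.0f, gz});
    }
    return offsets;
}

// With a GL context, this array would typically be uploaded once with
// glBufferData(GL_ARRAY_BUFFER, ...), described with glVertexAttribPointer,
// marked as per-instance with glVertexAttribDivisor(attribIndex, 1), and then
// drawn with glDrawArraysInstanced or glDrawElementsInstanced.
```

Since the instance transformations are not updated in this case, a buffer like this only needs to be uploaded once, not every frame.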

3 hours ago, _Flame_ said:

I tried using more triangles per quad, but that made it worse.

Artificially adding triangles that are not needed to model the required shape is definitely a bad idea. What is meant by the statement about low polycounts is that performance gains relative to other techniques go down as polycount decreases -- not absolute performance. So instancing with only two triangles per quad should not be slower than using one thousand triangles per quad. It's just that compared to rendering the same geometry without instancing -- so "duplicating" vertex buffers, or at least using more draw calls -- the relative gains will be bigger if your instances have more triangles.
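
To put rough numbers on the "relative gains" point, here is a small C++ sketch that compares the amount of vertex data needed when every instance gets its own copy of the mesh against one shared mesh plus a per-instance record. The byte sizes and function names are assumptions chosen for illustration, and the comparison covers memory only, not frame time.

```cpp
#include <cstddef>

// Hypothetical sizes, chosen only for illustration.
constexpr std::size_t kVertexBytes = 32;    // e.g. position + normal + UV
constexpr std::size_t kInstanceBytes = 64;  // e.g. one 4x4 float matrix

// Bytes of vertex data if every instance gets its own copy of the mesh.
constexpr std::size_t duplicatedBytes(std::size_t instances, std::size_t vertsPerInstance) {
    return instances * vertsPerInstance * kVertexBytes;
}

// Bytes with instancing: one shared mesh plus one small record per instance.
constexpr std::size_t instancedBytes(std::size_t instances, std::size_t vertsPerInstance) {
    return vertsPerInstance * kVertexBytes + instances * kInstanceBytes;
}

// How many times smaller the instanced data is than the duplicated data.
constexpr double savingsRatio(std::size_t instances, std::size_t vertsPerInstance) {
    return static_cast<double>(duplicatedBytes(instances, vertsPerInstance)) /
           static_cast<double>(instancedBytes(instances, vertsPerInstance));
}
```

Under these assumed sizes, `savingsRatio(20000, 4)` comes out just under 2 for a 4-vertex quad, while `savingsRatio(20000, 3000)` is above 1,000 for a 3,000-vertex mesh. The quad is not slower in absolute terms; it simply has less to gain.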

3 hours ago, _Flame_ said:

I would be relieved if the problem is just GPU performance.

In the comments under the tutorial from your original post some people shared their framerates for the demo. Maybe that can give an indication. I'm assuming your GPU is a mobile one (you can switch between integrated and discrete, and it has an 'M' in its name?), so I guess it's kind of expected that it performs a bit worse than a desktop GPU of the same generation...
 

Maybe the way you update the billboard transformations each frame is not optimal? Look here for more info.

36 minutes ago, Koen said:

Artificially adding triangles that are not needed to model the required shape is definitely a bad idea. What is meant by the statement about low polycounts is that performance gains relative to other techniques go down as polycount decreases -- not absolute performance. So instancing with only two triangles per quad should not be slower than using one thousand triangles per quad. It's just that compared to rendering the same geometry without instancing -- so "duplicating" vertex buffers, or at least using more draw calls -- the relative gains will be bigger if your instances have more triangles.

In the comments under the tutorial from your original post some people shared their framerates for the demo. Maybe that can give an indication. I'm assuming your GPU is a mobile one (you can switch between integrated and discrete, and it has an 'M' in its name?), so I guess it's kind of expected that it performs a bit worse than a desktop GPU of the same generation...
 

Maybe the way you update the billboard transformations each frame is not optimal? Look here for more info.

Thanks, I hadn't looked at the comments under the tutorial. There is a guy who gets 60 fps with only 5,000 instances, and his GPU is a GeForce GT625.

Another guy stated that NVIDIA cards below x50 are crap. I think my GPU falls into that category, especially because it's a mobile one.

I didn't know there was such a huge difference between GPUs.

No, I'm not updating the instance transformations at all. Thanks again!

"Talk is cheap, show me your code"

L. Torvalds (I think)

:-)

Try using a geometry shader: feed it a vertex buffer of points, expand each point to a quad in the shader, and pass the result on to the fragment shader, to see which is faster :)
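
The expansion step suggested above boils down to a few vector additions per point. A geometry shader would do this in GLSL on the GPU; the following is only a CPU-side C++ illustration of the same corner math, assuming camera-facing billboards. `Vec3`, `expandBillboard`, and the parameter names are hypothetical.

```cpp
#include <array>

struct Vec3 {
    float x, y, z;
};

inline Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 scale(Vec3 v, float s) { return {v.x * s, v.y * s, v.z * s}; }

// Expand one point into the four corners of a camera-facing quad.
// `right` and `up` are the camera's unit right/up vectors in world space.
// Corner order suits a triangle strip:
// bottom-left, bottom-right, top-left, top-right.
std::array<Vec3, 4> expandBillboard(Vec3 center, Vec3 right, Vec3 up,
                                    float halfWidth, float halfHeight) {
    Vec3 r = scale(right, halfWidth);
    Vec3 u = scale(up, halfHeight);
    return {{
        add(center, add(scale(r, -1.0f), scale(u, -1.0f))),
        add(center, add(r, scale(u, -1.0f))),
        add(center, add(scale(r, -1.0f), u)),
        add(center, add(r, u)),
    }};
}
```

With this approach the vertex buffer holds one point per billboard instead of four vertices, at the cost of running the extra shader stage; whether that is faster on a given GPU has to be measured.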

.:vinterberg:.

This topic is closed to new replies.
