The 10000 box challenge

13 comments, last by Narf the Mouse 11 years, 10 months ago
Here's something slightly stupid but potentially fun you can do. Using your own rendering code, create a scene with 10000 similarly-sized boxes in random positions so that they are all visible in the camera view. Have them all use the same as-simple-as-possible material with no textures, and have one unshadowed directional light shine on them. You can use instancing if you want, but make sure your engine is otherwise doing everything it usually does (e.g. frustum culling, batch grouping/sorting, etc.).
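For anyone who wants a starting point for the scene setup, here is a minimal sketch of one way to scatter the boxes so that every one of them lies fully inside the view frustum. Everything in it is an assumption for illustration: the function name `random_visible_boxes`, the camera sitting at the origin looking down +z, the field of view and aspect ratio, and a near plane that is closer to the camera than the nearest box. Boxes are treated as bounding spheres so the check also holds if they are rotated.

```python
import math
import random


def random_visible_boxes(count, half_size, fov_y_deg, aspect, z_far, seed=0):
    """Return box centers (x, y, z) in camera space.

    Assumed convention: camera at the origin looking down +z, symmetric
    perspective frustum, near plane closer than the nearest generated box.
    """
    rng = random.Random(seed)
    # Bounding-sphere radius of a cube with the given half edge length.
    radius = half_size * math.sqrt(3.0)
    half_v = math.radians(fov_y_deg) / 2.0
    half_h = math.atan(math.tan(half_v) * aspect)
    # Smallest depth at which a sphere of this radius fits between the
    # side planes of the narrower field-of-view axis.
    z_min = radius / math.sin(min(half_v, half_h))
    centers = []
    for _ in range(count):
        z = rng.uniform(z_min, z_far - radius)
        # Largest |x| and |y| that keep the sphere inside the side planes.
        x_lim = z * math.tan(half_h) - radius / math.cos(half_h)
        y_lim = z * math.tan(half_v) - radius / math.cos(half_v)
        centers.append((rng.uniform(-x_lim, x_lim),
                        rng.uniform(-y_lim, y_lim),
                        z))
    return centers


boxes = random_visible_boxes(10000, half_size=0.5, fov_y_deg=60.0,
                             aspect=16.0 / 9.0, z_far=200.0)
print(len(boxes), "boxes generated")
```

Sampling the depth uniformly makes the boxes denser close to the camera; that is fine for a stress test, but it is a design shortcut, not a uniform distribution over the frustum volume.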

Now break out the profiler and check where you are bottlenecked, and if you have the time and the possibility, replicate the same scene in commercial or open-source rendering engines and check which of them you just beat :)

Naturally, this does not have direct real-world applicability, as usually there are many different objects, materials, lights etc. in a scene, but it should still show the raw upper limit of your rendering code's CPU throughput. Personally, this helped me identify cache miss issues in my own code that would otherwise have gone unnoticed.
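Frustum culling is one of the per-object CPU steps the challenge asks you to leave enabled. As a rough illustration of what that step does for each box, here is a minimal sketch of a bounding-sphere test against six planes. The plane convention, the hard-coded `FRUSTUM` (90-degree field of view, near plane at z = 1, far plane at z = 100) and the function names are assumptions made up for this example, not taken from any particular engine.

```python
import math

S = 1.0 / math.sqrt(2.0)

# Hypothetical camera-space frustum, camera looking down +z.
# Each plane is (nx, ny, nz, d) with a unit normal pointing inward,
# so a point p is on the inside when dot(n, p) + d >= 0.
FRUSTUM = [
    (0.0, 0.0, 1.0, -1.0),    # near:   z >= 1
    (0.0, 0.0, -1.0, 100.0),  # far:    z <= 100
    (-S, 0.0, S, 0.0),        # right:  x <= z
    (S, 0.0, S, 0.0),         # left:   x >= -z
    (0.0, -S, S, 0.0),        # top:    y <= z
    (0.0, S, S, 0.0),         # bottom: y >= -z
]


def sphere_in_frustum(center, radius, planes):
    """Return False only if the sphere is entirely behind some plane."""
    cx, cy, cz = center
    for nx, ny, nz, d in planes:
        if nx * cx + ny * cy + nz * cz + d < -radius:
            return False
    return True


def visible_indices(centers, radius, planes):
    """Indices of the objects that survive culling."""
    return [i for i, c in enumerate(centers)
            if sphere_in_frustum(c, radius, planes)]


centers = [(0.0, 0.0, 50.0), (200.0, 0.0, 50.0), (0.0, 0.0, -5.0)]
print(visible_indices(centers, 0.87, FRUSTUM))
```

The test is conservative: a sphere just outside a frustum corner can pass all six plane checks, which costs a wasted draw but never drops a visible box.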
Lately I was trying to figure out the best way to draw a metric ton of cubes (or quads, actually) from a GPU perspective. Basically, put everything into one huge VBO and draw that. I was using 256000 cubes and concluded that on my GTX 560 the fastest was a plain indexed VBO (~6.3ms), followed by "instancing" via geometry shader (~7ms), an unindexed VBO (~10ms) and instancing via divisors in OpenGL (24ms).
So my preferred approach in the end was the geometry shader, because it also has the lowest storage requirements in VRAM.
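To put the storage point in numbers, here is a back-of-the-envelope sketch. The layouts are assumptions, not the poster's actual buffers: position-only vertices of three 32-bit floats, 32-bit indices, 8 shared corner vertices and 36 indices per cube in the indexed case, 36 vertices per cube in the unindexed case, and a single point per cube that the geometry shader expands.

```python
FLOAT_BYTES = 4
INDEX_BYTES = 4  # 32-bit indices; 256000 * 8 vertices do not fit in 16 bits


def bytes_per_cube(layout, floats_per_vertex=3):
    """Buffer bytes needed for one cube under the assumed layouts."""
    vertex = floats_per_vertex * FLOAT_BYTES
    if layout == "indexed":
        return 8 * vertex + 36 * INDEX_BYTES  # 8 corners + 12 triangles
    if layout == "unindexed":
        return 36 * vertex                    # 12 triangles * 3 vertices
    if layout == "geometry_shader":
        return vertex                         # one point, expanded on the GPU
    raise ValueError("unknown layout: " + layout)


CUBES = 256000
for layout in ("indexed", "unindexed", "geometry_shader"):
    total = CUBES * bytes_per_cube(layout)
    print(f"{layout:16s} {total / (1024 * 1024):8.1f} MiB")
```

Per-face normals or colours would raise the indexed and unindexed figures further (24 unique vertices per cube instead of 8), while the point-per-cube layout stays small as long as the extra attributes are per cube.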
In the geometry shader case, did you also do the culling in the GS? Or in the plain indexed case, would you modify the index buffer to select what to draw? (Disregard if you were always drawing everything.)
I modified my gfxapi Geometry demo scene to render 10k cubes (instead of the default 50). Without any other changes to my render code, I get 15-20fps on my MacBook Air. According to the Very Sleepy profiler, the majority of the time is spent inside the Intel HD 3000 GPU driver.

The test code shader computes two directional light contributions (one from camera, one towards the camera).

Note though that my code is not apples-to-apples comparable to rendering engines - it does not have a renderer or a scene system: it's simply a hard-coded rendering loop on top of a low-level graphics API abstraction (see gfxapi in my sig).
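Since the thread mixes frames per second with milliseconds, a small conversion helps when comparing results. This is only a sketch with made-up helper names; the per-object figure is the whole frame time divided by the object count, so it includes driver and GPU time and is not the cost of a single draw call.

```python
def frame_time_ms(fps):
    """Average time per frame in milliseconds."""
    return 1000.0 / fps


def per_object_us(fps, object_count):
    """Average frame time attributed to each object, in microseconds."""
    return frame_time_ms(fps) * 1000.0 / object_count


for fps in (15, 20):
    print(f"{fps} fps = {frame_time_ms(fps):.1f} ms/frame, "
          f"{per_object_us(fps, 10000):.2f} us per cube at 10000 cubes")
```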
I wrote a sort of 3D benchmark in Flash that uses the GPU.

post here:
http://blog.bwhiting.co.uk/?p=362
demo here:
http://bwhiting.co.uk/b3d/stress2/ <-----

press "n" twice to select a cube mesh
press "+" to keep adding 500 cubes
press "m" to change material, from very simple colour to normal mapped

wasd/up down left right to fly around and get all the cubes into the viewport (scene stats on top left)

maybe someone with an EPIC graphics card and an i7 could hit the 10,000 cube mark (on my machine it really starts to chug - 25 fps with the Flash Player 11.3 release build)

press "space" to toggle the rotations (10,000 of these will be quite intensive)


good luck and hope no machines explode
Nice demo, bwhiting. On my MacBook Air, upping the content amount until 10k cubes were visible, I got about 18fps (I pressed the spacebar to stop the animation, which helps a bit). The fan got quite audible, but no explosions at least :)
Cool demo! On a fairly powerful notebook (GTX 670M) I got 50fps with 10000 objects, which is roughly as fast as Unity :)
I hit 100k no problem with my i7 2600K and AMD 6950, even with rotations and normal mapping turned on. You puny mortals with your laptops can bow before the might of my desktop! :P
100k visible, normal maps + animation = 35fps
Core 2 Quad 2.5GHz, GTX 550 Ti

"I hit 100k no problem with my i7 2600K and AMD 6950, even with rotations and normal mapping turned on. You puny mortals with your laptops can bow before the might of my desktop! :P"

Pathetic desktop. :D

My laptop hit 100k no problem, normal mapped + anim. 45fps.

i7 quadcore, Nvidia 4200M

As an additional note: there is no difference in framerate on my machine between any of the stages. No shading, normal mapped, or other.

In time the project grows, the ignorance of its devs it shows, with many a convoluted function, it plunges into deep compunction, the price of failure is high, Washu's mirth is nigh.

This topic is closed to new replies.
