AgentC

The 10000 box challenge

14 posts in this topic

Here's something slightly stupid but potentially fun you can do. Using your own rendering code, create a scene with 10000 similarly-sized boxes in random positions so that they are all visible in the camera view. Have them all use the same as-simple-as-possible material, no textures, and have 1 unshadowed directional light shine on them. You can use instancing if you want, but make sure your engine is otherwise doing everything it usually does (i.e. frustum culling, batch grouping/sorting, etc.).

Now break out the profiler and check where you are bottlenecked. If you have the time and the opportunity, replicate the same scene in commercial or open-source rendering engines and check which of them you just beat :)

Naturally, this does not have direct real-world applicability as usually there are many different objects, materials, lights etc. in a scene but still it should show the raw upper limit of your rendering code's CPU throughput. Personally, this helped me identify cache miss issues in my own code that would otherwise have gone unnoticed.
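
For anyone who wants a starting point for the scene setup, here is a minimal sketch of scattering box centres so that every box lies fully inside the view frustum. The camera convention (looking down -z), the field of view, the near/far distances and all names are assumptions for illustration, not taken from any particular engine. Depth is sampled uniformly, so boxes end up denser near the camera.

```python
import math
import random

# All camera parameters below are illustrative assumptions.
FOV_Y_DEG = 60.0
ASPECT = 16 / 9
NEAR, FAR = 10.0, 100.0
HALF_SIZE = 0.5  # half the edge length of each box
TAN_HALF = math.tan(math.radians(FOV_Y_DEG) / 2.0)


def random_box_centers(count, seed=1):
    """Return camera-space centres (camera looks down -z) for fully visible boxes."""
    rng = random.Random(seed)
    centers = []
    for _ in range(count):
        depth = rng.uniform(NEAR + HALF_SIZE, FAR - HALF_SIZE)
        # The box's front face is the narrowest frustum slice it touches,
        # so limit x/y by the cross-section there, minus the box half-size.
        front = depth - HALF_SIZE
        half_h = front * TAN_HALF - HALF_SIZE
        half_w = front * TAN_HALF * ASPECT - HALF_SIZE
        centers.append((rng.uniform(-half_w, half_w),
                        rng.uniform(-half_h, half_h),
                        -depth))
    return centers


def box_fully_visible(center, eps=1e-9):
    """Check a camera-axis-aligned box of half-size HALF_SIZE against the frustum."""
    x, y, z = center
    front = -z - HALF_SIZE
    back = -z + HALF_SIZE
    return (front >= NEAR - eps and back <= FAR + eps
            and abs(x) + HALF_SIZE <= front * TAN_HALF * ASPECT + eps
            and abs(y) + HALF_SIZE <= front * TAN_HALF + eps)


centers = random_box_centers(10000)
```

Using a fixed seed keeps the scene identical between runs, which makes profiler results easier to compare.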


What I did lately was try to figure out the best way to draw a metric ton of cubes (or quads, actually) from a GPU perspective. Basically, put everything into one huge VBO and draw that. I was using 256000 cubes and concluded that on my GTX 560 the fastest was a plain indexed VBO (~6.3ms), followed by "instancing" via geometry shader (~7ms), unindexed VBO (~10ms) and instancing via divisors in OpenGL (24ms).
So my preferred approach in the end was the geometry shader, because it also has the lowest storage requirements in VRAM.
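
To put rough numbers on the storage point, here is a back-of-the-envelope sketch. The vertex layout is an assumption (position-only float32 vertices, 32-bit indices, 8 shared corners per cube, and one point per cube for the geometry shader path); the actual layout used in the test above is not stated, so treat the figures as orders of magnitude only.

```python
FLOAT_BYTES = 4
INDEX_BYTES = 4                    # assumption: 32-bit indices
POSITION_BYTES = 3 * FLOAT_BYTES   # assumption: position-only vertices


def vram_bytes(cube_count):
    """Estimate buffer sizes for three ways of storing `cube_count` cubes."""
    return {
        # 8 shared corner vertices plus 36 indices (12 triangles) per cube
        "indexed": cube_count * (8 * POSITION_BYTES + 36 * INDEX_BYTES),
        # 36 fully expanded vertices per cube, no index buffer
        "unindexed": cube_count * 36 * POSITION_BYTES,
        # one point per cube, expanded into a cube by the geometry shader
        "geometry_shader": cube_count * POSITION_BYTES,
    }


sizes = vram_bytes(256_000)
for name, size in sizes.items():
    print(f"{name}: {size / 1e6:.1f} MB")
```

Under these assumptions the indexed buffer comes to about 61 MB, the unindexed one to about 111 MB, and the geometry shader input to about 3 MB. Note that 256000 cubes times 8 vertices exceeds the 16-bit index range, which is why 32-bit indices are assumed.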


In the geometry shader case, did you do culling also in the GS? Or in the plain indexed case, would you modify the index buffer to select what to draw? (Disregard if you were always drawing everything)


I modified my [url="https://dl.dropbox.com/u/40949268/emcc/aabb_obb_sphere.html"]gfxapi Geometry demo scene[/url] to render 10k cubes (instead of the default 50). Without any other changes to my render code, I get [url="https://dl.dropbox.com/u/40949268/code/10kCubes.png"]15-20fps on my Macbook Air[/url]. According to Very Sleepy profiler, the majority of the time is spent [url="https://dl.dropbox.com/u/40949268/code/10kCubes_VerySleepy.png"]inside the Intel HD 3000 GPU driver[/url].

The test code shader computes two directional light contributions (one from camera, one towards the camera).

Note though that my code is not apples-to-apples comparable to rendering engines - it does not have a renderer or a scene system: it's simply a hard-coded rendering loop on top of a low-level graphics API abstraction (see gfxapi in my sig).


I wrote a sort of benchmark for 3D with flash using the GPU

post here:
[url="http://blog.bwhiting.co.uk/?p=362"]http://blog.bwhiting.co.uk/?p=362[/url]
demo here:
[url="http://bwhiting.co.uk/b3d/stress2/"]http://bwhiting.co.uk/b3d/stress2/[/url] <-----

press "n" twice to select a cube mesh
press "+" to keep adding 500 cubes
press "m" to change material, from very simple colour to normal mapped

wasd/up down left right to fly around and get all the cubes into the viewport (scene stats on top left)

maybe someone with an EPIC graphics card and an i7 could hit the 10,000 cube mark (on my machine it really starts to chug - 25 fps with flash player 11.3 release build)

press "space" to toggle the rotations (10,000 of these will be quite intensive)


good luck and hope no machines explode


Nice demo bwhiting. On my macbook air, upping the content amount until 10k cubes were visible, I got about 18fps (pressed spacebar to stop the animation, which helps a bit). The fan got quite audible, but no explosions at least :)


Cool demo! On a fairly powerful notebook (GTX 670M) I got 50fps with 10000 objects, which is roughly as fast as Unity :)


I hit 100k no problem with my i7 2600K and AMD 6950, even with rotations and normal mapping turned on. You puny mortals with your laptops can bow before the might of my desktop! :P


100k visible, normal maps + animation = 35fps
Core 2 Quad 2.5GHz, 550GTX Ti


[quote name='MJP' timestamp='1339874312' post='4949866']
I hit 100k no problem with my i7 2600K and AMD 6950, even with rotations and normal mapping turned on. You puny mortals with your laptops can bow before the might of my desktop! :P
[/quote]
Pathetic desktop. :D

My laptop hit 100k no problem, normal mapped + anim. 45fps.

i7 quadcore, Nvidia 4200M

As an additional note: there is no difference in framerate on my machine between any of the stages. No shading, normal mapped, or other.


intel i5 2500

Radeon HD 6870 (I think it's factory overclocked by a bit)

Got something like 17k visible without fps dropping below 60. If I remember right, one or more of (material/shape/rotation) didn't really affect fps.

I remember testing this same thing earlier...


[quote name='Waterlimon' timestamp='1339877683' post='4949883']
intel i5 2500

radeon hd 6870 (i think its factory overclocked by a bit)

got something like 17k visible without fps dropping below 60. If i remember right one or more of (material/shape/rotation) didnt really affect fps.

I remember testing this same thing earlier...
[/quote]
Yes, just went back and tested again, none of the shapes changed the framerate at all (nor the ms per frame), nor did any of the other stages. Which means this demo is CPU bound most likely, probably the culling code.


[quote name='Washu' timestamp='1339878025' post='4949885']
[quote name='Waterlimon' timestamp='1339877683' post='4949883']
intel i5 2500

radeon hd 6870 (i think its factory overclocked by a bit)

got something like 17k visible without fps dropping below 60. If i remember right one or more of (material/shape/rotation) didnt really affect fps.

I remember testing this same thing earlier...
[/quote]
Yes, just went back and tested again, none of the shapes changed the framerate at all (nor the ms per frame), nor did any of the other stages. Which means this demo is CPU bound most likely, probably the culling code.
[/quote]

Thanks for giving it a whirl. The timing for the culling in ms is displayed in the top left (1st line of green text) and for me doesn't usually go over 2ms, even for very large numbers of objects. As far as I know, the culling cannot be sped up any more without using a hierarchical bounding structure. I am currently writing a post about the technique I use. It's nothing new by any means, but having the code out there might help someone, and someone might be able to improve it.
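
For readers unfamiliar with non-hierarchical culling, here is a minimal sketch of the generic brute-force approach: test every object's bounding sphere against each frustum plane. This is a textbook version and not necessarily the technique used in the demo. The plane representation (unit normals pointing into the visible volume) and the box-shaped stand-in volume are assumptions made to keep the example small.

```python
def sphere_visible(center, radius, planes):
    """Return False as soon as the sphere lies fully behind any plane.

    planes: iterable of (nx, ny, nz, d) with unit normals pointing inward,
    so a point p is inside when dot(n, p) + d >= 0.
    """
    cx, cy, cz = center
    for nx, ny, nz, d in planes:
        if nx * cx + ny * cy + nz * cz + d < -radius:
            return False
    return True


def cull(spheres, planes):
    """Brute-force O(objects * planes) pass; returns indices of visible spheres."""
    return [i for i, (center, radius) in enumerate(spheres)
            if sphere_visible(center, radius, planes)]


# Stand-in for a real frustum: the axis-aligned volume -10 <= x, y, z <= 10.
BOX_PLANES = [
    (1, 0, 0, 10), (-1, 0, 0, 10),
    (0, 1, 0, 10), (0, -1, 0, 10),
    (0, 0, 1, 10), (0, 0, -1, 10),
]

spheres = [
    ((0, 0, 0), 1),      # fully inside
    ((10.5, 0, 0), 1),   # straddles the +x plane, still visible
    ((12, 0, 0), 1),     # fully outside +x
    ((0, -30, 0), 5),    # fully outside -y
]
visible = cull(spheres, BOX_PLANES)
```

The test is conservative: a sphere near a frustum corner can pass all six plane checks while being just outside, which costs a wasted draw but never drops a visible object.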

The demo is definitely CPU bound, though, for anyone with a good graphics card, and the bulk of the time is spent issuing drawTriangles() calls. If I remember rightly, it's around the 30% mark, maybe even more.

This is a shame, as it really eats into the time left for the CPU to work on anything else.

For those of you who have a good idea about bottlenecks: is this commonplace, and is it usually that high? I imagine it's to do with how Adobe wraps and implements things under the hood, or it might simply be a limitation of the speed of ActionScript.

If you were to natively issue draw calls, without changing state, each rendering a single triangle, would you expect it to still be so expensive that 5,000 calls are a problem on a mid-range machine?
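
One way to make that question concrete is to turn a profiler percentage into a per-call cost. The sketch below is plain arithmetic; the function name is made up, and the sample inputs (a 40 ms frame, a 30% share, 5,000 calls) are only loosely borrowed from figures mentioned in this thread and were not measured together.

```python
def per_call_cost_us(frame_ms, draw_share, draw_calls):
    """Average CPU cost of one draw call in microseconds.

    frame_ms:   total CPU time per frame in milliseconds
    draw_share: fraction of that time spent issuing draw calls (0..1)
    draw_calls: number of draw calls issued per frame
    """
    return frame_ms * draw_share / draw_calls * 1000.0


# Hypothetical inputs: 25 fps (40 ms per frame), 30% in draw submission, 5,000 calls.
cost = per_call_cost_us(40.0, 0.30, 5000)
print(f"{cost:.1f} us per draw call")
```

With those inputs the result is about 2.4 microseconds per call, which gives a number to compare against whatever a native API manages on the same machine.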


Neat. I can handle them at about 530fps on a puny laptop with a Radeon 6490m. 100k boxes gives me 73fps, which comfortably sits ~10fps above my refresh rate so I've some headroom for transients.

This is just using some pretty standard D3D11 instancing; per-instance data consists of a matrix and a colour, each box shares a static vertex buffer (8 vertexes, position only) and index buffer (36 indexes), they start out as 1x1 cubes and the matrix expands them to their proper scale.

I haven't profiled but I already know that I'm bottlenecking on CPU-side matrix transforms and instance buffer uploads. I could reduce that by just using the position and the scale as per-instance data and constructing a matrix in the vertex shader, but that would be cheating by optimizing for this benchmark. I might do it anyway for fun.

Update: yeah, that was a useful boost. Trading a CPU-side matrix transform per box and a larger per-instance vertex for an extra GPU-side matrix transform per vertex (as well as constructing a matrix on the fly in my vertex shader) was well worth it: the 100k case is up to 120fps. Now I gotta look for other areas I can similarly optimize...
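
To illustrate the trade described above, here is a small sketch (written in Python rather than shader code) showing that a scale-plus-translate matrix and the compact position/scale form give the same result for a unit-cube vertex, along with the per-instance size of each layout. The byte counts assume a full 4x4 float matrix and a 4-float colour, and the column-vector convention is also an assumption; the actual D3D11 layout is not given in the post. The compact form cannot express rotation.

```python
FLOAT_BYTES = 4
COLOUR_BYTES = 4 * FLOAT_BYTES  # assumption: RGBA stored as four floats


def scale_translate_matrix(position, scale):
    """4x4 matrix, column-vector convention: translation in the last column."""
    px, py, pz = position
    sx, sy, sz = scale
    return [
        [sx, 0.0, 0.0, px],
        [0.0, sy, 0.0, py],
        [0.0, 0.0, sz, pz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_with_matrix(matrix, vertex):
    """Full matrix path: what the matrix-per-instance layout computes per vertex."""
    v = (vertex[0], vertex[1], vertex[2], 1.0)
    return tuple(sum(row[i] * v[i] for i in range(4)) for row in matrix[:3])


def transform_compact(position, scale, vertex):
    """Compact path: one multiply-add per component, no matrix stored."""
    return tuple(v * s + p for v, s, p in zip(vertex, scale, position))


matrix_instance_bytes = 16 * FLOAT_BYTES + COLOUR_BYTES          # 80
compact_instance_bytes = (3 + 3) * FLOAT_BYTES + COLOUR_BYTES    # 40

position, scale = (3.0, -2.0, 5.0), (2.0, 4.0, 0.5)
corner = (0.5, -0.5, 0.5)  # one corner of a 1x1x1 cube centred on the origin
via_matrix = transform_with_matrix(scale_translate_matrix(position, scale), corner)
via_compact = transform_compact(position, scale, corner)
```

Under these assumptions the per-instance data halves, which for 100k boxes means roughly 4 MB uploaded per frame instead of 8 MB, on top of the CPU-side matrix work that is no longer needed.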