What kind of code allows for 1m entities?

5 comments, last by Hodgman 6 years, 1 month ago

This video shows over 1 million monsters being spawned in Doom, and the game still runs perfectly smoothly.

How does Doom handle physics so that 1 million monsters don't cause any lag? Each of the monsters seems to behave individually. How is it possible for it to run so smoothly?

For those who might have followed some of my threads: I'm working on a strategy game in which I'm trying to push for the maximum number of troops possible. What kind of tricks did John Carmack use in Doom to handle 1 million monsters smoothly?


From this distance it seemed to always be the same model, so you have just one mesh in graphics memory and only push different animation states and positions to the shader (or whatever was rendering this), so in the end drawing isn't the problem here. I don't know much about old Doom's background code, but physics wasn't as hard back then as it is today with rigid bodies and accurate timing. You can have simple physics just by doing some basic hit testing, and that isn't much in terms of complexity or computation. A simple quad-tree with small cells allows a fast lookup of just a handful of possible collision targets, and that's it (see the sketch below).

At least that's how I would approach it in this kind of game, which is "simple" by modern standards ;)
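To make that concrete, here is a minimal broad-phase sketch in the spirit of the suggestion above, using a uniform grid rather than a full quad-tree; the names, cell size and circular hit box are illustrative assumptions, not anything from Doom's code:

```cpp
// Minimal broad-phase sketch: a uniform grid with small cells, so each entity
// only has to hit-test against the handful of neighbours near its own cell.
// Purely illustrative -- not taken from Doom or any real engine.
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Entity {
    float x = 0.0f, y = 0.0f;  // world position
    float radius = 16.0f;      // crude circular hit box for the narrow phase
};

class SpatialGrid {
public:
    explicit SpatialGrid(float cellSize) : cellSize_(cellSize) {}

    void clear() { cells_.clear(); }

    void insert(std::size_t entityIndex, const Entity& e) {
        cells_[keyFor(e.x, e.y)].push_back(entityIndex);
    }

    // Collect indices of entities in the cell containing (x, y) and its
    // 8 neighbours -- the only candidates that could possibly collide.
    std::vector<std::size_t> query(float x, float y) const {
        std::vector<std::size_t> out;
        for (int dy = -1; dy <= 1; ++dy)
            for (int dx = -1; dx <= 1; ++dx) {
                auto it = cells_.find(keyFor(x + dx * cellSize_, y + dy * cellSize_));
                if (it != cells_.end())
                    out.insert(out.end(), it->second.begin(), it->second.end());
            }
        return out;
    }

private:
    std::int64_t keyFor(float x, float y) const {
        const auto cx = static_cast<std::int64_t>(std::floor(x / cellSize_));
        const auto cy = static_cast<std::int64_t>(std::floor(y / cellSize_));
        return (cx << 32) ^ (cy & 0xffffffff);
    }

    float cellSize_;
    std::unordered_map<std::int64_t, std::vector<std::size_t>> cells_;
};
```

Each frame you would clear the grid, insert every entity, and then each entity's hit test only touches the few candidates returned by query() instead of all the others.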

Is there any evidence this is running in real time on an unmodified version of the original code? All I can find are videos, not downloads, and some of the movement in the video is clearly not vanilla Doom running in real-time.

7 hours ago, Shaarigan said:

From this distance it seemed to always be the same model, so you have just one mesh in graphics memory

Doom's enemies are all sprites, usually with no more than a couple of frames of animation for each of the 8 angles you view them from. So it's barely more expensive than a particle system in rendering terms.

26 minutes ago, Kylotan said:

Is there any evidence this is running in real time on an unmodified version of the original code? All I can find are videos, not downloads, and some of the movement in the video is clearly not vanilla Doom running in real-time.

My bad. You are right.

I read the mod thread and the video was indeed pre-rendered. The author said it took "15 hours to render":

Quote

The preview video was recorded with PrBoom+'s (actually GLBoom+) handy -viddump parameter, as mentioned above by Linguica. The end output message was "Timed 3227 gametics in 1897887 realtics = 0.1 frames per second". If my calculations are correct, this means the 92.2 seconds of footage took roughly 15 hours to render.

 

It should be possible to achieve something along these lines in realtime, by suitably leveraging the GPU. It won't necessarily be easy though :)

One can bake all the frames of animation into a single texture array, which should enable rendering this entire thing as a single particle simulation. Simulations of 1 million particles with basic physics are pretty achievable on modern GPUs.
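As a rough sketch of that idea (an OpenGL-flavoured example; the function, buffer names and data layout are my assumptions, not anything from the actual mod): every monster becomes one instance holding a position and an animation-frame index, and the whole crowd goes out in a single instanced draw against a sprite texture array.

```cpp
// Sketch only: one quad drawn N times with glDrawArraysInstanced, with all
// per-monster data (position + animation frame) in a single instance buffer.
// Assumes a GL 3.3+ context and a quadVao already set up with per-instance
// attributes (glVertexAttribDivisor) matching SpriteInstance.
#include <GL/glew.h>
#include <vector>

struct SpriteInstance {
    float x, y, z;   // world position of the monster
    float frame;     // layer index into the sprite texture array
};

void drawMonsters(GLuint quadVao, GLuint instanceVbo, GLuint spriteArrayTex,
                  const std::vector<SpriteInstance>& instances)
{
    // Upload this frame's instance data (position + frame per monster).
    glBindBuffer(GL_ARRAY_BUFFER, instanceVbo);
    glBufferData(GL_ARRAY_BUFFER,
                 static_cast<GLsizeiptr>(instances.size() * sizeof(SpriteInstance)),
                 instances.data(), GL_STREAM_DRAW);

    // One draw call renders every monster as a camera-facing quad; the vertex
    // shader reads the instance attributes and the fragment shader samples
    // the matching layer of the GL_TEXTURE_2D_ARRAY.
    glBindTexture(GL_TEXTURE_2D_ARRAY, spriteArrayTex);
    glBindVertexArray(quadVao);
    glDrawArraysInstanced(GL_TRIANGLE_STRIP, 0, 4,
                          static_cast<GLsizei>(instances.size()));
}
```

Moving the per-monster position update into a compute shader (a classic GPU particle simulation) would keep the whole loop off the CPU.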

Your main problem is likely to be fill-rate, since those 1 million sprites are quite large and overlapping. You could potentially use a GPU sorting algorithm to sort the particles front-to-back, and let the depth test reject the fragments of sprites hidden behind nearer ones (Doom's sprites are essentially alpha-tested cut-outs rather than alpha-blended, so drawing them out of order doesn't break the image)...
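For illustration only, here is the CPU version of that sort; a real million-sprite renderer would keep it on the GPU (e.g. a radix or bitonic sort in a compute pass), and the SpriteInstance layout is the same hypothetical one as in the previous sketch.

```cpp
// Sort sprites nearest-first so that, with depth testing enabled, fragments of
// sprites hidden behind nearer ones can be rejected before they cost any fill.
#include <algorithm>
#include <vector>

struct SpriteInstance { float x, y, z, frame; };  // as in the previous sketch

void sortFrontToBack(std::vector<SpriteInstance>& instances,
                     float camX, float camY, float camZ)
{
    auto depthSq = [&](const SpriteInstance& s) {
        const float dx = s.x - camX, dy = s.y - camY, dz = s.z - camZ;
        return dx * dx + dy * dy + dz * dz;
    };
    std::sort(instances.begin(), instances.end(),
              [&](const SpriteInstance& a, const SpriteInstance& b) {
                  return depthSq(a) < depthSq(b);
              });
}
```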

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

Even though that video is not 'real' --

A CPU from Doom's era could do maybe 11 million general instructions per second. A CPU from Doom (2016)'s era could do maybe 400,000 million floating-point instructions per second. Given that speed-up, if the original game could handle 30 enemies on screen, a modern CPU should cope with a million.

However, CPU speed is not the only factor. Memory bandwidth has only gone from maybe 50 MB/s to 50 GB/s.

If we assume the original game ran at 60 Hz, maxed out its memory bandwidth, and its performance depended only on the number of entities (30), we get 50 MB / 60 Hz / 30 = about 28 KB of memory transfers per entity per frame. Scale that up to a million entities and suddenly you need 1.6 TB/s of memory bandwidth! Seeing as we've only got 50 GB/s on our modern PC, that means we can only cope with around 31k entities due to the RAM bottleneck, even though our CPU is capable of processing a million!
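Spelling that estimate out (all inputs are the rough assumptions from this post, not measurements):

```cpp
// Back-of-envelope check of the bandwidth argument above.
#include <cstdio>

int main() {
    const double oldBandwidth = 50e6;   // ~50 MB/s, Doom-era machine
    const double newBandwidth = 50e9;   // ~50 GB/s, modern PC
    const double frameRate    = 60.0;   // assumed update rate (Hz)
    const double oldEntities  = 30.0;   // enemies on screen in old Doom

    // Bytes of memory traffic the old game could spend per entity per frame.
    const double bytesPerEntity = oldBandwidth / frameRate / oldEntities;

    // Bandwidth needed to give 1 million entities the same per-entity budget.
    const double neededForMillion = bytesPerEntity * 1e6 * frameRate;

    // Entities a modern PC can afford at that per-entity cost.
    const double affordable = newBandwidth / (bytesPerEntity * frameRate);

    std::printf("%.0f KB/entity/frame, %.2f TB/s needed, ~%.0fk entities affordable\n",
                bytesPerEntity / 1e3, neededForMillion / 1e12, affordable / 1e3);
    // Prints roughly: 28 KB/entity/frame, 1.67 TB/s needed, ~30k entities affordable
}
```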

So, as for programming techniques to cope with a million entities, focus on memory bandwidth. Keep the amount of storage required per entity as low as possible. Keep the amount of memory touched (read + written) per entity as low as possible. Leverage the CPU cache as much as possible. Optimise for locality of reference. If two entities are both going to perform similar spatial queries on the world geometry, batch them up and run both queries back to back so that the world geometry is more likely to be present in cache for the second query. 
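A minimal sketch of what that advice can look like in code (structure-of-arrays layout plus batched queries; none of it is from a real engine, and the query interface is hypothetical):

```cpp
// Structure-of-arrays entity storage: each per-entity field is packed tightly,
// so a pass that only needs positions streams through contiguous memory and
// touches nothing else. Purely illustrative.
#include <cstddef>
#include <cstdint>
#include <vector>

struct Entities {
    std::vector<float> posX, posY;     // positions, packed together
    std::vector<float> velX, velY;     // velocities
    std::vector<std::uint8_t> state;   // 1 byte of AI state, not a fat struct
};

// Movement pass: reads and writes only the four position/velocity arrays, so
// the bytes touched per entity stay small and every cache line is fully used.
void integrate(Entities& e, float dt) {
    const std::size_t n = e.posX.size();
    for (std::size_t i = 0; i < n; ++i) {
        e.posX[i] += e.velX[i] * dt;
        e.posY[i] += e.velY[i] * dt;
    }
}

// Batching similar queries: gather every entity that needs a world query this
// frame, then run the queries back to back while the world geometry is still
// hot in cache, instead of interleaving them with unrelated per-entity work.
template <typename World, typename Query>
void runQueriesBatched(const World& world, const std::vector<Query>& queries) {
    for (const Query& q : queries) {
        world.trace(q);  // hypothetical query entry point
    }
}
```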

This topic is closed to new replies.
