Even though that video is not 'real' --
A CPU from Doom's era could do maybe 11 million general instructions per second. A CPU from Doom (2016)'s era could do maybe 400,000 million floating-point instructions per second. That's roughly a 36,000× speed-up, so if the original game could handle 30 enemies on screen, a modern CPU should cope with around a million (30 × 36,000 ≈ 1.1 million).
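For the curious, here is that back-of-the-envelope arithmetic as a compilable sketch. The 11 MIPS and 400 GFLOPS figures are the rough guesses above, not measurements:

```cpp
#include <cstdio>

int main() {
    // Rough, order-of-magnitude guesses from the paragraph above.
    const double doom_era_ips = 11e6;   // ~11 million instructions/s (early-90s CPU)
    const double modern_flops = 400e9;  // ~400,000 million FLOP/s (2016-era CPU)

    const double speedup  = modern_flops / doom_era_ips; // ~36,000x
    const double entities = 30.0 * speedup;              // scale up 30 enemies

    std::printf("speed-up: ~%.0fx\n", speedup);
    std::printf("entities a modern CPU could process: ~%.1f million\n",
                entities / 1e6);
}
```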
However, CPU speed is not the only factor. Memory bandwidth stats have gone from maybe 50MB/s to 50GB/s over the same period.
If we assume the original game ran at 60Hz, maxed out its memory bandwidth, and that performance scales only with the number of entities (30 of them), we get 50MB / 60Hz / 30 ≈ 28KB of memory traffic per entity per frame. Scale that up to a million entities and suddenly you need about 1.7TB/s of memory bandwidth! Seeing as we've only got 50GB/s on our modern PC, the RAM bottleneck caps us at around 30k entities, even though our CPU is capable of processing a million.
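The same arithmetic for the bandwidth side, as a sketch under the same assumptions (60Hz, 30 entities, 50MB/s then, 50GB/s now):

```cpp
#include <cstdio>

int main() {
    // Assumptions carried over from the estimate above.
    const double old_bandwidth = 50e6;  // ~50 MB/s, Doom-era memory bandwidth
    const double new_bandwidth = 50e9;  // ~50 GB/s, modern memory bandwidth
    const double frame_rate    = 60.0;  // assumed frame rate
    const double old_entities  = 30.0;  // assumed on-screen enemies

    // Memory traffic each entity can account for, per frame.
    const double bytes_per_entity = old_bandwidth / frame_rate / old_entities;

    // Bandwidth a million such entities would need at 60Hz.
    const double needed = bytes_per_entity * 1e6 * frame_rate;

    // How many entities today's bandwidth actually supports.
    const double ceiling = new_bandwidth / (bytes_per_entity * frame_rate);

    std::printf("per-entity traffic: ~%.0f KB/frame\n", bytes_per_entity / 1e3);
    std::printf("needed for 1M entities: ~%.2f TB/s\n", needed / 1e12);
    std::printf("entity ceiling at 50 GB/s: ~%.0f\n", ceiling);
}
```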
So, as for programming techniques to cope with a million entities, focus on memory bandwidth:

- Keep the amount of storage required per entity as low as possible.
- Keep the amount of memory touched (read + written) per entity as low as possible.
- Leverage the CPU cache as much as possible; optimise for locality of reference.
- If two entities are both going to perform similar spatial queries on the world geometry, batch them up and run the queries back to back, so the world geometry is more likely to still be in cache for the second query (see the sketch below).
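To make that concrete, here is a minimal sketch of one common way to apply these ideas: a structure-of-arrays layout plus batched spatial queries. Everything in it (`Entities`, `World::raycast`, the field names) is hypothetical illustration, not code from either game:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Structure-of-arrays storage: each field lives in its own tightly packed
// array, so a pass that only needs positions touches only position memory,
// and neighbouring entities share cache lines.
struct Entities {
    std::vector<float>   x, y;    // hot: read and written every frame
    std::vector<float>   vx, vy;
    std::vector<uint8_t> state;   // one byte, not a padded struct
    // Cold data (names, inventory, AI config, ...) belongs in a separate
    // structure so it never pollutes the cache during the hot loops.
};

// Stand-in for the engine's spatial structure (BSP, grid, BVH, ...).
struct World {
    bool raycast(float x, float y, float dx, float dy) const {
        (void)x; (void)y; (void)dx; (void)dy;
        return false; // stub; a real engine would walk its geometry here
    }
};

void update(Entities& e, const World& world, float dt) {
    const std::size_t n = e.x.size();

    // Pass 1: movement only. Streams linearly through four float arrays,
    // a pattern the hardware prefetcher handles almost for free.
    for (std::size_t i = 0; i < n; ++i) {
        e.x[i] += e.vx[i] * dt;
        e.y[i] += e.vy[i] * dt;
    }

    // Pass 2: all the line-of-sight queries, batched back to back, so the
    // world geometry pulled into cache by entity i's query is likely still
    // there for entity i+1's query.
    for (std::size_t i = 0; i < n; ++i) {
        if (world.raycast(e.x[i], e.y[i], e.vx[i], e.vy[i])) {
            e.state[i] = 1; // e.g. "target visible"
        }
    }
}

int main() {
    Entities e;
    e.x  = {0.f, 1.f};  e.y  = {0.f, 1.f};
    e.vx = {1.f, -1.f}; e.vy = {0.f, 0.f};
    e.state = {0, 0};
    World w;
    update(e, w, 1.0f / 60.0f);
}
```

The key point is the two separate passes: interleaving unrelated work between queries would evict the geometry from cache before the next entity's query ran.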