This is how my tiny brain understands the problem.
Ultimately locality is at the core of it (server side).
Suppose an entity is processed 30 times in 1 second.
For this entity to process its behaviour within that slice of time, it must take in information about its nearby space. For example, a brick falling through space will need to know about its neighbours so it can go bumpity-bump-bump with other bricks. We can cheat a bit by generally disregarding entities a long way away, which reduces the number of interactions from quadratic to roughly linear in the number of entities.
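A rough sketch of that cheat, assuming a 2D world and a uniform grid (spatial hash) as the lookup structure. The grid is just one common way to do it, and the names `build_grid`, `neighbours` and `cell_size` are made up for illustration. Each entity only checks the 3x3 block of cells around it instead of every other entity.

```python
from collections import defaultdict
from itertools import product


def build_grid(positions, cell_size):
    """Bucket entity ids by the integer grid cell that contains them."""
    grid = defaultdict(list)
    for entity_id, (x, y) in positions.items():
        grid[(int(x // cell_size), int(y // cell_size))].append(entity_id)
    return grid


def neighbours(entity_id, positions, grid, cell_size):
    """Return ids of other entities within cell_size of entity_id.

    Only the 3x3 block of cells around the entity is scanned, so far-away
    entities are never looked at.
    """
    x, y = positions[entity_id]
    cx, cy = int(x // cell_size), int(y // cell_size)
    found = []
    for dx, dy in product((-1, 0, 1), repeat=2):
        for other in grid.get((cx + dx, cy + dy), ()):
            if other == entity_id:
                continue
            ox, oy = positions[other]
            if (ox - x) ** 2 + (oy - y) ** 2 <= cell_size ** 2:
                found.append(other)
    return found


# Hypothetical bricks: two close together, one far away.
positions = {"brick_a": (0.5, 0.5), "brick_b": (1.2, 0.7), "brick_c": (40.0, 40.0)}
grid = build_grid(positions, cell_size=2.0)
near_a = neighbours("brick_a", positions, grid, cell_size=2.0)  # ["brick_b"]
```

The work per entity now depends on how crowded its local cells are, not on the total entity count, which is where the roughly linear behaviour comes from as long as entities do not all pile into one spot.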
When working within a single process we can do entity neighbour lookups very quickly, as RAM access is pretty quick (we have seen this used to great effect with CUDA PhysX demos and so on). 10,000 entities at 30 updates per second means 300,000 neighbour queries per second, on top of physics, behaviour calculations, etc. Quite manageable within one process.
However, to scale up to more entities we want to split the workload across two or more nodes, and the processes cannot access each other's RAM. Non-local entities (entities owned by a different process) must talk to each other by some other medium.
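To make the local versus non-local split concrete, here is a minimal sketch assuming the world is cut into equal-width vertical strips, one per node. Real systems partition in many different ways, and `owner_node`, `ghost_targets`, `strip_width` and `radius` are hypothetical names. An entity sitting within its interaction radius of a strip edge has neighbours on another node, so that node needs to be told about it over the network (often called a ghost or halo copy).

```python
def owner_node(x, strip_width, node_count):
    """Index of the node owning coordinate x, clamped to the valid range."""
    return max(0, min(int(x // strip_width), node_count - 1))


def ghost_targets(x, strip_width, node_count, radius):
    """Return (home node, other nodes that must hear about an entity at x).

    A node other than the home node is included when the entity's
    interaction radius reaches into that node's strip.
    """
    home = owner_node(x, strip_width, node_count)
    targets = set()
    for probe in (x - radius, x + radius):
        node = owner_node(probe, strip_width, node_count)
        if node != home:
            targets.add(node)
    return home, sorted(targets)


# Hypothetical setup: 4 nodes, each owning a strip 100 units wide.
deep_inside = ghost_targets(50.0, strip_width=100.0, node_count=4, radius=5.0)
near_edge = ghost_targets(98.0, strip_width=100.0, node_count=4, radius=5.0)
```

Entities deep inside a strip cost nothing extra. Every entity near an edge costs network traffic on every tick, which is exactly the part that hurts.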
Quick, large distributed shared memory isn't an option with current hardware (although surely some hardware guru could build it), so we use something like standard networking. Because of this our communication speed between non-local entities has dropped by a factor of about 200 or worse.
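A back-of-envelope version of those numbers, as a sketch only. The 30 updates per second, the 10,000 entities and the factor of 200 come from the post; `LOCAL_LOOKUP_SECONDS` is an assumed placeholder and not a measurement, so only the ratio between the two results means anything.

```python
TICKS_PER_SECOND = 30        # from the post
ENTITY_COUNT = 10_000        # from the post
REMOTE_SLOWDOWN = 200        # "a factor of about 200" from the post
LOCAL_LOOKUP_SECONDS = 1e-6  # ASSUMPTION: placeholder value, not measured

# One neighbour query per entity per tick.
queries_per_second = ENTITY_COUNT * TICKS_PER_SECOND

# Time available to finish one tick.
tick_budget_seconds = 1 / TICKS_PER_SECOND
remote_lookup_seconds = LOCAL_LOOKUP_SECONDS * REMOTE_SLOWDOWN


def sequential_lookups_per_tick(lookup_seconds):
    """How many back-to-back lookups fit into a single tick."""
    return int(tick_budget_seconds / lookup_seconds)


local_per_tick = sequential_lookups_per_tick(LOCAL_LOOKUP_SECONDS)
remote_per_tick = sequential_lookups_per_tick(remote_lookup_seconds)
```

Whatever the real local cost is, the number of one-at-a-time remote lookups that fit in a tick is about 200 times smaller, so cross-node chatter has to be batched or avoided rather than done query by query.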
To compound this latency, we find that highly dynamic environments such as MMOs will vary the load, which can mean high volumes of traffic in concentrated areas. Some entities will travel insanely fast across multiple nodes (speeding bullets, airplanes, cars).
Also, some very selfish entities are particularly inconsiderate: they want to exchange HIGH volumes of data and want to do it instantly. Transactions in marketplaces spring to mind. Or perhaps a car with a 10 hour pre-planned route. Or an entity with a daily schedule.
We can't really ever get around these limitations until we (somehow) increase the speed of access across all nodes to be as quick as if they were all a single unified node.
Meanwhile, designing the game/simulation is critical to making the whole experience balance out. I don't even think it's possible in the generic sense to make a scalable MMO with current hardware. Actors, services, workers, etc. I view as a kind of syntax flim-flam, froo-froo. They do little to address the limitations.
My personal pet favourite tech to tackle this problem is MPI https://en.wikipedia.org/wiki/Message_Passing_Interface