See if you can handle some of the operations as separate staged (phased) waves of data within the server, with a separate core-thread working each 'phase' in pipeline fashion.
That allows the use of coarse locks on large chunks of data, which get handed off to the next phase en masse (with minimal lock manipulation).
That is, independent groupings of processing such as network processing (client sessions) and decoding inbound commands, actioning and arbitrating game mechanics events, outbound data to clients, etc., with the 'turns' pipelined (internet delays mean the game can probably be processed in steps without affecting perceived timing). When each thread finishes with its turn data (and is waiting for its next turn to start), it might do some secondary, less time-dependent tasks as filler.
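A rough sketch of that handoff, in Python purely for illustration (Python threads show the structure, not real multi-core speedup). The phase functions `decode` and `apply_mechanics`, the "player:command" wire format, and the `filler` hook are all made-up names for the example, not anything from a real server. Each phase owns a whole turn's batch while it works on it, and the only synchronization is the queue put/get at the handoff.

```python
import queue
import threading

STOP = object()  # sentinel marking the end of the turn stream


def run_phase(work, inbox, outbox, filler=None):
    """Process one whole turn batch at a time, then hand it on en masse."""
    while True:
        try:
            batch = inbox.get(timeout=0.01)
        except queue.Empty:
            if filler is not None:
                filler()  # secondary, less time-dependent work while idle
            continue
        if batch is STOP:
            outbox.put(STOP)  # propagate shutdown down the pipeline
            return
        outbox.put(work(batch))  # single handoff per turn, no per-item locking


def decode(raw_batch):
    # Hypothetical wire format: "player:command"
    return [tuple(line.split(":")) for line in raw_batch]


def apply_mechanics(commands):
    # Stand-in for arbitrating game events
    return [f"{player} did {cmd}" for player, cmd in commands]


def run_pipeline(turns, filler=None):
    q_in, q_mid, q_out = queue.Queue(), queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=run_phase, args=(decode, q_in, q_mid, filler)),
        threading.Thread(target=run_phase, args=(apply_mechanics, q_mid, q_out, filler)),
    ]
    for t in threads:
        t.start()
    for turn in turns:
        q_in.put(turn)
    q_in.put(STOP)

    outbound = []
    while True:
        item = q_out.get()
        if item is STOP:
            break
        outbound.append(item)
    for t in threads:
        t.join()
    return outbound
```

Because each phase has one thread and the queues are FIFO, turns come out in the order they went in. While the mechanics thread is working on turn N, the decode thread can already be chewing on turn N+1.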
Note that this frequently does require some data replication and buffering between the pipelined phases to keep them independent (the data a phase processes is taken in and integrated, and what it produces is then marshalled to be passed on to the next phase). Some heavy phase processing might be run on more than one core, with the inbound data read-locked and the outbound data designed to be independent (it just gets queued up for the next step down the pipeline).
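Something like the following is one way to picture the heavy-phase case. Note the substitution: Python's standard library has no reader/writer lock, so this sketch uses a replicated, read-only snapshot (`MappingProxyType` over a copy) in place of a read lock. `resolve_chunk`, `heavy_phase`, and the "add 1 to each value" work are placeholders invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType


def resolve_chunk(world, entity_ids):
    # Reads the shared snapshot only; writes go to a list this worker owns.
    return [(eid, world[eid] + 1) for eid in entity_ids]


def heavy_phase(world_state, workers=2):
    # Replicate the inbound data so the previous phase can move on,
    # and expose it read-only so workers cannot step on each other.
    snapshot = MappingProxyType(dict(world_state))
    ids = sorted(snapshot)
    chunks = [ids[i::workers] for i in range(workers)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(resolve_chunk, [snapshot] * workers, chunks)
        # Independent per-worker outputs are simply concatenated
        # and would then be queued for the next phase.
        return [item for part in parts for item in part]
```

The cost is the copy made when taking the snapshot, which is the replication overhead mentioned above. What it buys is that no worker ever needs a lock while it is actually processing.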
This kind of thing is used more in clients, which may be doing prep work 2 or 3 rendering frames ahead (with separate core-threads working in parallel), handing off as each phase completes, and with tasks broken up to keep all the cores busy doing useful work as much as possible.
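The "frames ahead" idea can be sketched with a bounded queue, assuming a simple two-thread split between prep and render. The names `prepare_frames`, `render_all`, and `frames_ahead` are hypothetical, and the string "frames" stand in for real prepared render data. The queue size is what caps how far ahead the prep thread is allowed to run.

```python
import queue
import threading


def prepare_frames(frame_count, ready):
    for n in range(frame_count):
        # put() blocks once the queue is full, so prep never gets
        # more than `frames_ahead` frames in front of the renderer.
        ready.put(f"frame-{n}")
    ready.put(None)  # no more frames


def render_all(frame_count, frames_ahead=3):
    ready = queue.Queue(maxsize=frames_ahead)
    prep = threading.Thread(target=prepare_frames, args=(frame_count, ready))
    prep.start()

    rendered = []
    while (frame := ready.get()) is not None:
        rendered.append(frame)  # stand-in for the actual draw call
    prep.join()
    return rendered
```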