If the sounds are completely in sync, then the issue is clipping: identical sample values get added together, which doubles the amplitude. The summed waveform is then too loud for playback, so its peaks get truncated (clipped), which sounds like distortion (in fact, this is how guitar distortion works, in very broad terms).
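To make the clipping case concrete, here is a minimal sketch in Python. It assumes samples are floats normalised to the range -1.0 to 1.0, and the helper names `mix` and `clip` are made up for illustration; they don't come from any particular audio engine.

```python
import math

def mix(a, b):
    """Sum two sample buffers element by element, as a mixer would."""
    return [x + y for x, y in zip(a, b)]

def clip(samples, limit=1.0):
    """Hard-clip samples to [-limit, limit], as an output stage would."""
    return [max(-limit, min(limit, s)) for s in samples]

# One cycle of a sine wave peaking at 0.8 of full scale (hypothetical test signal).
N = 16
sound = [0.8 * math.sin(2 * math.pi * i / N) for i in range(N)]

# Two perfectly synchronised copies double every sample value.
summed = mix(sound, sound)
peak_before = max(abs(s) for s in summed)  # about 1.6, beyond full scale

# The output can't represent that, so the tops of the wave are flattened.
output = clip(summed)
peak_after = max(abs(s) for s in output)   # 1.0
clipped_count = sum(1 for s, o in zip(summed, output) if s != o)
```

Every sample counted in `clipped_count` is a spot where the wave shape was changed, and that changed shape is what you hear as distortion.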
If the sounds are being played almost in sync, but not completely, then you're getting what is known as comb filtering: the slightly delayed copy cancels some frequencies and reinforces others. This generally sounds even "worse" than outright clipping. Comb filtering is used deliberately in music and sound production to achieve certain effects, but in general it's considered undesirable.
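To get a feel for what a small offset does: when a sound is summed with an equally loud copy delayed by t seconds, frequencies at odd multiples of 1/(2t) cancel out. The sketch below lists those notch frequencies for a given delay. The function names `comb_notches` and `comb_gain` are hypothetical, and the model assumes exactly two copies at the same level.

```python
import math

def comb_notches(delay_ms, max_hz=20000.0):
    """Notch frequencies (Hz) up to max_hz for two equal copies offset by delay_ms."""
    if delay_ms <= 0:
        raise ValueError("delay_ms must be positive")
    delay_s = delay_ms / 1000.0
    notches = []
    k = 0
    while True:
        f = (2 * k + 1) / (2 * delay_s)  # odd multiples of 1 / (2 * delay)
        if f > max_hz:
            break
        notches.append(f)
        k += 1
    return notches

def comb_gain(freq_hz, delay_ms):
    """Level of original + delayed copy at freq_hz: 2.0 is doubled, 0.0 is cancelled."""
    return abs(2 * math.cos(math.pi * freq_hz * delay_ms / 1000.0))

one_ms = comb_notches(1.0)     # first notch near 500 Hz, then one every 1000 Hz
fifty_ms = comb_notches(50.0)  # first notch near 10 Hz, then one every 20 Hz
```

With a 1 ms offset the notches are wide and sit 1 kHz apart across the whole audible range. With a 50 ms offset they are only 20 Hz apart.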
The problem is that you can't just adjust the volume of your triggered samples, as that'll mess up any balance (and positioning) you might have. It also wouldn't take care of comb filtering. The solution, therefore, is to store your sound files at something like half volume, which leaves headroom for some layering, and to limit synchronous sounds to something like two instances. You'll also need to add a "minimum delay" that the game waits before triggering the same sound a third time. The worst comb filtering happens with delays of 10 ms or less, so something like a 50 or 100 ms delay would be a good place to start experimenting.
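One possible way to wire up the instance limit and the minimum delay is sketched below. This is only one reading of the idea: it lets at most two copies of the same sound start within any 50 ms window and skips further triggers until the window has passed. The class name `TriggerLimiter`, its parameters, and the millisecond timestamps are all assumptions for illustration; a real engine would supply its own clock and playback call.

```python
from collections import defaultdict, deque

class TriggerLimiter:
    """Allows at most max_in_window starts of one sound per min_delay_ms window."""

    def __init__(self, max_in_window=2, min_delay_ms=50.0):
        self.max_in_window = max_in_window
        self.min_delay_ms = min_delay_ms
        # sound name -> start times (ms) of recently accepted triggers
        self._recent = defaultdict(deque)

    def try_trigger(self, name, now_ms):
        """Return True if the sound may start now, False if it should be skipped."""
        recent = self._recent[name]
        # Forget starts that are at least min_delay_ms old.
        while recent and now_ms - recent[0] >= self.min_delay_ms:
            recent.popleft()
        if len(recent) >= self.max_in_window:
            return False  # a third copy this soon is skipped
        recent.append(now_ms)
        return True

limiter = TriggerLimiter(max_in_window=2, min_delay_ms=50.0)
times_ms = [0, 0, 5, 49, 50, 51, 60]
results = [limiter.try_trigger("shot", t) for t in times_ms]
# results == [True, True, False, False, True, True, False]
```

Both numbers are starting points to tune by ear. Note that this sketch still lets the two allowed copies start a few milliseconds apart; if that turns out to be audible, dropping `max_in_window` to 1 is a simple thing to try.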