MMORPG with voicechat?

8 comments, last by GameDev.net 18 years, 5 months ago
Hello everyone. I'm not making an MMORPG, but I'm toying with the idea, and at the moment I'm thinking about voice chat. Would it be a good idea to use P2P for this, over UDP? Or would the entire thing be a bad idea because some people would listen to music in the background and so on? And what about background noise? And microphone volume? Are those things easy to take care of? Would it sound cool with echoes and such, and surround sound so you can hear where a voice comes from? Or would it be problematic from a security point of view, since the server would have to send everyone's IP to everyone with the functionality enabled? What are your thoughts? Oh, and does anyone have any brief tutorials on sound input and on how to stream sound over a network?
Quote:Would it be a good idea to use P2P for this, over UDP? Or would the entire thing be a bad idea because some people would listen to music in the background and so on? And what about background noise? And microphone volume? Are those things easy to take care of? Would it sound cool with echoes and such, and surround sound so you can hear where a voice comes from?


I think P2P would be a very bad choice. Imagine having 250 players in the same place: each speaker would have to send their data 249 times, once to every other player. Instead, send it to the server, which then forwards it to the other players, so each client uploads its voice only once. I don't think you should worry about background noise, but imagine having 100 players speaking at the same time, with 10 of them just screaming or something to be annoying.
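
To put rough numbers on the mesh-versus-relay comparison, here is a minimal Python sketch. The function names are made up for illustration, and the 15 kbps per-stream figure is only an assumed ballpark, not a measurement from any particular codec.

```python
def p2p_upload_kbps(listeners: int, stream_kbps: float) -> float:
    """Upload needed by one talker who sends a separate copy to every listener."""
    return listeners * stream_kbps


def relay_client_upload_kbps(stream_kbps: float) -> float:
    """Upload needed by one talker when a server forwards the stream."""
    return stream_kbps


def relay_server_upload_kbps(talkers: int, listeners_per_talker: int,
                             stream_kbps: float) -> float:
    """Outbound traffic the server carries when it does the fan-out itself."""
    return talkers * listeners_per_talker * stream_kbps


# 250 players in one place, one of them talking (assumed 15 kbps stream).
mesh = p2p_upload_kbps(249, 15)
relay_client = relay_client_upload_kbps(15)
relay_server = relay_server_upload_kbps(1, 249, 15)
```

Note that the relay does not make the fan-out disappear. It moves the 249 copies from the talker's home connection to the server's uplink, which is why the per-client cost drops while the server cost grows with every simultaneous talker.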

I definitely don't think it should be a "global" thing that everyone near you can hear. It might be useful in groups (like parties in WoW). In a group you are often trying to accomplish the same thing, so you will all talk about the same subject and the number of players will be more limited. Also, if someone is "spamming", he can be kicked from the group.

So it could be a good idea, but don't make everyone hear everyone else's voice.

Quote:Or would it be problematic from a security point of view, since the server would have to send everyone's IP to everyone with the functionality enabled?


I don't see how it would be a security problem. You would receive some sound from the server, or, if you use P2P, you would get the other players' IPs, but you would probably get those anyway if you use P2P for text chat.
Yes, the 250-player scenario would get problematic. But going through the server would require an insane amount of bandwidth on the server side to accomplish it.

Also, let's think about a smaller MMORPG, where you never really have more than 20-25 people around you (that is, 5 on screen, and perhaps 20 somewhere near). Do you think it could work there?
Bandwidth for voice chat will be costly. A better plan might be to implement a voice chat client inside your game client, and allow this to use an external server that could be private for groups. Then again, your players could always have TeamSpeak or similar running in the background.
I've also been thinking about this. The best thing to do would be to have the users host voice chat sessions that can each handle a small number of people, about 20-30.
That's why I'm thinking about P2P. It should be fairly easy to implement, and basically only the clients who enable it will be affected. You could easily implement a feature to block certain users, or to send only to certain users.

I think it might work, as long as you greatly decrease the volume of people you aren't talking to or listening to (it might not be the easiest thing to solve, but it's definitely not impossible).
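
The blocking, send-only-to, and volume-ducking ideas could look something like the Python sketch below. All names here (`recipients`, `playback_gain`, the `ducked_gain` default of 0.2) are hypothetical and chosen only for illustration.

```python
def recipients(nearby, blocked, only_to=None):
    """Return the nearby peers that should receive our voice packets.

    nearby  -- peers currently in range (any hashable identifiers)
    blocked -- peers we never send to
    only_to -- optional whitelist; when given, send to these peers only
    """
    allowed = [p for p in nearby if p not in blocked]
    if only_to is not None:
        allowed = [p for p in allowed if p in only_to]
    return allowed


def playback_gain(speaker, focused, ducked_gain=0.2):
    """Full volume for people we are talking with, reduced volume for the rest."""
    return 1.0 if speaker in focused else ducked_gain


nearby = ["ann", "bob", "cy"]
everyone_but_bob = recipients(nearby, blocked={"bob"})
just_cy = recipients(nearby, blocked={"bob"}, only_to={"cy", "bob"})
```

The hard part the post alludes to is deciding who counts as "focused"; this sketch simply takes that set as given.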
We do voice chat in the MMO based on our platform. We do it through the server. Because our server clusters are well connected, there really isn't a lot of extra lag introduced (unless you're in Japan, chatting with someone else in Japan, using a US server cluster). To manage bandwidth, we only send up to the 4 "most interesting" voice chat sources to you at any one time.

The bandwidth cost really isn't that bad, because most of the time, people don't chat like a telephone conversation. Even with a telecon, the total bandwidth (including UDP overhead) is on the order of 15 kbps per channel (at worst, times 4).
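
The post does not say how "most interesting" is decided, so the following Python sketch makes an assumption: group members come first, then whoever is closest. The `Source` type and `pick_sources` function are invented for illustration and are not the platform's actual API.

```python
import math
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    x: float
    y: float
    in_group: bool   # assumed signal: is this speaker in the listener's party?
    talking: bool


def pick_sources(listener_xy, sources, limit=4):
    """Pick at most `limit` talking sources to forward to one listener."""
    lx, ly = listener_xy
    active = [s for s in sources if s.talking]
    # Group members sort first (False < True), then nearest first.
    active.sort(key=lambda s: (not s.in_group,
                               math.hypot(s.x - lx, s.y - ly)))
    return active[:limit]


def worst_case_kbps(limit=4, per_channel_kbps=15):
    """Downstream budget per listener if every forwarded channel is busy."""
    return limit * per_channel_kbps


crowd = [
    Source("a", 1, 0, False, True),
    Source("b", 50, 0, True, True),
    Source("c", 2, 0, False, True),
    Source("d", 3, 0, False, False),   # silent, never forwarded
    Source("e", 4, 0, False, True),
    Source("f", 5, 0, False, True),
]
chosen = [s.name for s in pick_sources((0, 0), crowd)]
```

With the figures quoted above (15 kbps per channel, at most 4 channels), the worst case per listener works out to 60 kbps downstream, no matter how many people are talking nearby.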

You can auto-tune the voice-activation (VOX) threshold so that it won't typically pick up background music; you should also support push-to-talk, because some users just prefer it that way.
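
One simple way to auto-tune voice activation is to track a slowly moving estimate of the ambient level and only transmit frames that are clearly louder than it. The Python sketch below is an assumption about how that could be done, not a description of the platform's actual tuning; `VoiceGate`, `ratio`, and `adapt` are hypothetical names and values.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of samples in the range -1..1."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


class VoiceGate:
    def __init__(self, ratio=3.0, adapt=0.05, floor=0.01):
        self.ratio = ratio          # how far above ambient counts as speech
        self.adapt = adapt          # how quickly the ambient estimate moves
        self.noise = floor          # running estimate of the ambient level
        self.push_to_talk = False   # when True, only the key decides

    def should_send(self, frame, key_down=False):
        if self.push_to_talk:
            return key_down
        level = rms(frame)
        send = level > self.noise * self.ratio
        # Move the ambient estimate a little toward the current level, so
        # steady sound such as music ends up treated as background.
        self.noise += self.adapt * (level - self.noise)
        return send


gate = VoiceGate()
music = [0.1] * 160             # steady background at a constant level
for _ in range(50):
    gate.should_send(music)     # let the ambient estimate settle
music_passes = gate.should_send(music)
speech_passes = gate.should_send([0.5] * 160)
```

A weakness of this simplification is that very long continuous speech would also raise the ambient estimate; a real implementation would likely freeze or slow adaptation while the gate is open.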
enum Bool { True, False, FileNotFound };
FWIW, Turbine recently announced that Dungeons & Dragons Online will have integrated voice chat. Sigil Games have also mentioned in their community forums that it is a possible feature for their upcoming Vanguard: Saga of Heroes. I'm sure it won't be long after they are released that someone sniffs their packets, decodes them, and has an outline of the protocol posted online as part of a server emu project (while currently of questionable legality, those emu projects are a great way to learn how the big boys implement their protocols). Then we'll be able to see how they handle it.
Operation Flashpoint, while not an MMO, had a nice integrated voice chat feature. It had various channels built in which would filter out who the data was sent to.

The most interesting of the channels would be the "Direct" channel - effectively, it bound your voice to your character and worked three dimensionally, with range and so forth, much like talking in real life. Given that this was a realistic war game, charging at a sniper while screaming bloody murder was an interesting diversionary tactic. [grin]
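
A range-limited, positional channel like that can be approximated with a distance falloff plus a stereo pan. This Python sketch is not how Operation Flashpoint did it; the linear falloff, the 50-unit range, and the function name are all assumptions, and the coordinate system is taken to be standard math axes (x right, y up, angles counter-clockwise).

```python
import math


def direct_channel_gains(listener_xy, facing_rad, speaker_xy, max_range=50.0):
    """Return (left_gain, right_gain) for one speaker; silent when out of range."""
    dx = speaker_xy[0] - listener_xy[0]
    dy = speaker_xy[1] - listener_xy[1]
    dist = math.hypot(dx, dy)
    if dist >= max_range:
        return (0.0, 0.0)
    volume = 1.0 - dist / max_range           # linear falloff with distance
    # Angle of the speaker relative to where the listener is facing;
    # positive angles are to the listener's left.
    angle = math.atan2(dy, dx) - facing_rad
    pan = -math.sin(angle)                    # -1 = full left, +1 = full right
    theta = (pan + 1.0) * math.pi / 4.0       # constant-power pan law
    return (volume * math.cos(theta), volume * math.sin(theta))


# Listener at the origin facing +x; speaker 10 units to the left.
left_gain, right_gain = direct_channel_gains((0, 0), 0.0, (0, 10))
```

Plain stereo panning cannot tell front from back, so a speaker directly behind the listener sounds the same as one directly ahead. Real 3D audio APIs handle that with more elaborate filtering.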


A possibility (which I've mentioned in many places over the last few years) might be to use voice recognition to convert the player's speech into text and transmit that via the server to other players, eliminating mixing and high bandwidth requirements. The client would then output it as text (above the head as in UO, or in a chat box as in NWN), which would handle the inevitable 'everyone speaking at once' problem and allow review of past speech.

Maybe the text/phoneme codes could even be converted back into speech (it would help if advances in voice generation allowed better customization of multiple voice characteristics, to make more individualized voices).

I did a test more than a year ago using the MS Voice SDK as an example. It took approximately 200 MHz worth of CPU to do continuous speech recognition. No doubt better quality will take more than that, but at least it isn't too big a drain on current systems (and upcoming multi-core systems).
