Voice Synthesis

1 comment, last by jefferytitan 11 years, 11 months ago
Hi people,

I'm curious if/when anybody thinks there may be usable speech synthesis in games, i.e. speech that doesn't sound terrible or robotic. I know that there's continuing research (e.g. the link below), but I haven't heard much about results that can be used in a product or library.
http://www.coli.uni-saarland.de/courses/FLST/2009/slides/ArtSynIntro-forstudents.pdf

My position is that the almost exclusive use of canned recordings contributes to both the linear, cinematic nature of many games and the lack of believability of NPCs. I'm not expecting that NPCs would suddenly have things worth saying, but they could at least be topical (e.g. knowing your gender/name/what gear you're holding/what faction you belong to). Or they could just riff variations on a theme when idle rather than repeating the same 3 lines every minute or so.

So I guess my question is: how useful do you think it would be, how far away are we, and what approaches may work? I'd also welcome any workarounds, since (to the best of my knowledge) it doesn't exist yet.

Thanks,

JT
Synthesized speech these days is not terrible (e.g., Siri), but synthesized voice acting is a different matter: the inflections, the right pauses... I think that is still many years away.

In the meantime, if you want to give more variety to your NPCs' lines there are a few tricks you can use. One of them (I forgot where I read about it originally) is to record a base version of a sentence and also more detailed versions that will only be triggered when certain conditions are met (character gender and weapon selection are good examples).

- That way!
- He went that way! (requires a male character)
- He went down the stairs! (requires a male character who just went down some stairs where the NPC could have seen him)
etc.
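
For what it's worth, the selection logic for this trick can be sketched in a few lines of Python. This is only an illustration, not code from any particular game: the `Line` class, the `pick_line` function and the context keys (`target_gender`, `target_seen_on_stairs`) are all made-up names, and "most specific wins" is just one possible rule.

```python
from dataclasses import dataclass, field


@dataclass
class Line:
    text: str
    # Conditions that must all hold in the game context (hypothetical keys).
    requires: dict = field(default_factory=dict)


# The three example lines from above; the base line has no conditions.
LINES = [
    Line("That way!"),
    Line("He went that way!", {"target_gender": "male"}),
    Line(
        "He went down the stairs!",
        {"target_gender": "male", "target_seen_on_stairs": True},
    ),
]


def pick_line(lines, context):
    """Return the text of the most specific line whose conditions are all met."""
    eligible = [
        line
        for line in lines
        if all(context.get(key) == value for key, value in line.requires.items())
    ]
    # "Most specific" here simply means "has the most conditions".
    # The base line is always eligible, so the list is never empty
    # as long as a condition-free line was recorded.
    return max(eligible, key=lambda line: len(line.requires)).text


print(pick_line(LINES, {"target_gender": "male", "target_seen_on_stairs": True}))
print(pick_line(LINES, {"target_gender": "female"}))
```

In a real game you would probably pick randomly among equally specific lines instead of always taking the first one, so the NPC doesn't repeat itself.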

Having a few very specialized sentences can make for memorable moments when an NPC says the exact right thing for the situation, which can add a lot of color even if it doesn't happen most of the time.
I know what you mean. I'd even be happy if emotion and pacing could be marked up. Most sentences sound essentially the same if you substitute in a different word, as long as the word is said in the right tone. If I had to mark up how to say each noun/verb in a happy/bored/angry way... it's a pain, but it would still be fine. I realise that it wouldn't flow 100%, but it would sound more like a bad actor than a fake voice. The ability to add affect to each piece of dialogue would be very nice... imagine if the merchants said the same old dialogue but sounded afraid if you'd just killed someone in front of them. Once again, not ideal, but better. I have heard of research into adding emotion to voice, but I have no idea how far it has progressed.
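
To make the markup idea a bit more concrete, here is a rough Python sketch that wraps a line of dialogue in SSML (the W3C Speech Synthesis Markup Language), using its `prosody` and `break` elements. Big assumptions: emotion is not itself an SSML concept, so the `PROSODY` table mapping emotions to rate/pitch is an invented, untuned guess, `to_ssml` is a made-up helper, and how well any given synthesizer honours these attributes varies. The output is also a bare fragment without the `version`/`xmlns` attributes a full SSML document would carry.

```python
from xml.sax.saxutils import escape

# Hypothetical emotion -> prosody settings; the values are illustrative guesses.
PROSODY = {
    "neutral": {"rate": "medium", "pitch": "medium"},
    "afraid": {"rate": "fast", "pitch": "high"},
    "bored": {"rate": "slow", "pitch": "low"},
    "angry": {"rate": "fast", "pitch": "low"},
}


def to_ssml(text, emotion="neutral", pause_ms=0):
    """Wrap one line of dialogue in SSML prosody markup for the given emotion."""
    settings = PROSODY[emotion]
    rate = settings["rate"]
    pitch = settings["pitch"]
    # Escape &, < and > so the dialogue text cannot break the markup.
    body = f'<prosody rate="{rate}" pitch="{pitch}">{escape(text)}</prosody>'
    if pause_ms > 0:
        # Optional hesitation before the line is spoken.
        body = f'<break time="{pause_ms}ms"/>' + body
    return f"<speak>{body}</speak>"


# Same old merchant dialogue, but marked up as afraid.
print(to_ssml("What can I get you?", emotion="afraid", pause_ms=300))
```

The dialogue text stays the same and only the emotion tag changes, which is the merchant case above: same old line, different delivery.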

I like your idea on specific dialogue, although the problem for any indie is that it's very actor-heavy.
