A voice/synthesizer API in games?

Started by
18 comments, last by Dreddnafious Maelstrom 19 years, 10 months ago
Quote:Original post by Laki

I would not be able to listen to 98 pages of computer robot talk.....

I think you should cut the script down and record speech


Well, there is a problem. Let me give you a little insight into our plan, so you can understand the real issue :)

What we are trying to do with our game is implement very complex AI scripting, which would probably be unique for each NPC, so there will be no defaults for user behavior; almost everything will be random. NPCs will also interact with each other (family, friends): if you kill an enemy or a brother of an NPC, he will come after you, and similar things. The "MAYOR" add-on will be the AI quest assignment: he will have a list of quests that will depend on his relations, status ... and if you restart the game, you will probably get totally different quests ...
So with this complexity you can see why we would need the voice/text API. I could just put in the text, but I hate that in games.
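
For illustration, the "kill a brother and the NPC comes after you" rule could be sketched roughly like this in Python. Everything here is an assumption made up for the example (the `NPC` class, the relation labels, the `on_npc_killed` helper); it is not the game's actual design.

```python
from dataclasses import dataclass, field


@dataclass
class NPC:
    name: str
    # Maps another NPC's name to a relation label such as "brother" (hypothetical labels).
    relations: dict = field(default_factory=dict)
    hostile_to_player: bool = False


def on_npc_killed(victim, world):
    """Mark every NPC close to the victim as hostile and return their names."""
    close = {"brother", "family", "friend"}  # assumed set of "close" relations
    avengers = []
    for npc in world:
        if npc is victim:
            continue
        if npc.relations.get(victim.name) in close:
            npc.hostile_to_player = True
            avengers.append(npc.name)
    return avengers


bob = NPC("Bob")
roodolf = NPC("Roodolf", relations={"Bob": "brother"})
mctavish = NPC("McTavish", relations={"Bob": "enemy"})
print(on_npc_killed(bob, [bob, roodolf, mctavish]))
```

The same relation table could also feed the quest list, since the quests are meant to depend on relations and status.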

Does anybody know something about caffeineaddict's proposal with phonemes?
Red Drake
Would it be simpler to record common words rather than trying to use phonemes (if that's what they're called)? I seem to remember from trying to learn French that 95% of the words used come from a relatively small vocabulary, so if you recorded these for each character (with maybe a couple of variations on the most common ones) you might get a reasonable system.

So

1) common English words (and, is, go, I, etc.)
2) common words in your game (kill, attack, jump, etc.)
3) common names in your game (Bob, McTavish, Roodolf, etc.)
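
The recorded-words idea above can be sketched as a lookup from words to clips. This is only a rough Python sketch: the `CLIPS` table, the file paths and the `plan_playback` helper are made up for illustration, and a real system would still have to handle pacing and intonation between clips.

```python
import re

# Hypothetical clip library for one character: word -> recorded file.
CLIPS = {
    "i": "bob/i.wav",
    "is": "bob/is.wav",
    "go": "bob/go.wav",
    "and": "bob/and.wav",
    "attack": "bob/attack.wav",
    "mctavish": "bob/mctavish.wav",
}


def plan_playback(sentence, clips):
    """Return (clip files in spoken order, words that have no recording)."""
    words = re.findall(r"[a-z']+", sentence.lower())
    playlist = [clips[w] for w in words if w in clips]
    missing = [w for w in words if w not in clips]
    return playlist, missing


playlist, missing = plan_playback("Go and attack McTavish now", CLIPS)
print(playlist)  # clips to play back to back
print(missing)   # words that would need a recording or a text fallback
```

The `missing` list is the useful part: it shows how much of a generated script the recorded vocabulary actually covers.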
Try these links:

Festival
Open Source

IBM ViaVoice Text To Speech

Easy to use, with very configurable voices, but they still sound like a voice synthesizer.

Nuance
Best quality but very expensive and only a few voices the last time I checked.

Links
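
As a rough idea of how Festival could be used offline to pre-generate lines, here is a small Python sketch. It assumes the `text2wave` script that ships with Festival is installed and on the PATH; the helper names and file names are made up for the example.

```python
import subprocess
from pathlib import Path


def text2wave_command(text_file, wav_file):
    """Build the command line for Festival's text2wave script."""
    return ["text2wave", text_file, "-o", wav_file]


def synthesize(line, wav_file):
    """Write the line to a text file next to wav_file and render it (needs Festival installed)."""
    text_file = Path(wav_file).with_suffix(".txt")
    text_file.write_text(line + "\n", encoding="utf-8")
    subprocess.run(text2wave_command(str(text_file), wav_file), check=True)


print(text2wave_command("greeting.txt", "greeting.wav"))
```

Calling `synthesize("Welcome to the village.", "greeting.wav")` would only work on a machine with Festival installed, so it is left out of the example run.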
Quote:Original post by Uncivil

Try these links:

Festival
Open Source

IBM ViaVoice Text To Speech

Easy to use, with very configurable voices, but they still sound like a voice synthesizer.

Nuance
Best quality but very expensive and only a few voices the last time I checked.

Links


Festival looks promising, but is it free, like GPL?
I hate software that makes you publish your source code or similar. I don't understand much of this licensing stuff, and I can't find any info :(
Red Drake
Quote:Original post by Red Drake
Festival looks promising, but is it free, like GPL?

This page describes it as a variation of the X11 license. I dunno what that is exactly but it sounds much less restrictive than a GPL license. It reads like a fairly open license, I doubt you'd have any trouble with it. (IANAL etc.)

It's an interesting idea, not really usable for character voices IMHO (no emotion for a start), but it would be great for UT-style announcer text or in-game computers/devices that talk back.
Quote:Original post by OrangyTang

It's an interesting idea, not really usable for character voices IMHO (no emotion for a start), but it would be great for UT-style announcer text or in-game computers/devices that talk back.


I am sorry, but what does "IMHO" mean ;) ?
And what would you propose? Voice synthesizer, word recording (phrases) ...?
Red Drake
Quote:Original post by Red Drake

Quote:Original post by OrangyTang

It's an interesting idea, not really usable for character voices IMHO (no emotion for a start), but it would be great for UT-style announcer text or in-game computers/devices that talk back.


I am sorry, but what does "IMHO" mean ;) ?
And what would you propose? Voice synthesizer, word recording (phrases) ...?

IMHO = in my humble opinion.

Of course, if you really need generated speech, there are ways around it. Something like Baldur's Gate uses small speech snippets (maybe the first sentence) and then text for the rest. Or you could mask the speech in some way (make all of them robots, or make them communicate via intercom or similar, with static to hide the wooden speech somewhat).
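
The "voice the first sentence, show the rest as text" approach could look something like this minimal Python sketch. The function name and the simple punctuation-based sentence split are assumptions for the example, not how any particular game does it.

```python
import re


def split_voiced_and_text(dialogue):
    """Split a line into (first sentence to voice, remainder to show as text)."""
    # Split once, at the first whitespace that follows '.', '!' or '?'.
    parts = re.split(r"(?<=[.!?])\s+", dialogue.strip(), maxsplit=1)
    voiced = parts[0]
    remainder = parts[1] if len(parts) > 1 else ""
    return voiced, remainder


voiced, rest = split_voiced_and_text(
    "Welcome, traveller! The mayor has a task for you. Find him at the hall."
)
print(voiced)
print(rest)
```

With generated dialogue, only the pool of possible opening sentences would need to be recorded or synthesized.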
Quote:Original post by OrangyTang

Of course, if you really need generated speech, there are ways around it. Something like Baldur's Gate uses small speech snippets (maybe the first sentence) and then text for the rest. Or you could mask the speech in some way (make all of them robots, or make them communicate via intercom or similar, with static to hide the wooden speech somewhat).


Well, for a start, I hate the way Baldur's Gate and Neverwinter Nights handle the voice/text thing. It's one of the most annoying things in the game. I barely survived through to the end of NWN, because they do put voices at key points.

"make all of them robots, or make them communicate via intercom or similar, with static to hide the wooden speech somewhat."
We are making an RPG set in the past (swords, shields ...). It would be hard to make a robot village with short swords and bucklers ;)

Is there a way to compress (really compress) a human voice without losing much quality, and without getting HUGE files?
Red Drake
More links:

speex
I haven't used it, but they say it's "a free codec for free speech".
It uses a variant of the BSD license.

comp.speech
Everything speech related (compression, synthesis, etc)

I'm sure you can find something that will give you a ratio of 10:1 or higher. I've used some commercial codecs that get ratios between 50:1 and 100:1. The drawback is that they will only compress human voice, with telephone-quality playback. Oh, that and it'll cost you a nice chunk of change.
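
To put those ratios in perspective, here is a quick back-of-the-envelope calculation in Python. It assumes the uncompressed source is 16-bit mono PCM at 8 kHz (roughly telephone quality); those figures are an assumption for the example, not something measured with any of the codecs above.

```python
def speech_size_mb(seconds, ratio, sample_rate=8000, bytes_per_sample=2):
    """Approximate size in megabytes of mono PCM speech after compression at ratio:1."""
    raw_bytes = seconds * sample_rate * bytes_per_sample
    return raw_bytes / ratio / 1_000_000


for ratio in (1, 10, 50, 100):
    print(f"{ratio}:1 -> {speech_size_mb(3600, ratio):.2f} MB per hour of speech")
```

Under these assumptions an hour of raw speech is about 57.6 MB, so 10:1 brings it to under 6 MB and 50:1 to a little over 1 MB.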
What you can do is use Microsoft's text-to-speech API, capture the sound buffer in DirectSound, and play with the attributes to generate a more pleasant voice. This is a real pain in the butt, but technically it isn't that difficult. What IS difficult is messing with the text to get the text-to-speech engine to generate the proper phonetics in the first place. *Hint: proper spelling goes out the window immediately.
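
The "proper spelling goes out the window" part can be handled with a respelling table applied before the text reaches the speech engine. A minimal Python sketch follows; the `RESPELL` entries and the function name are invented for illustration, and real respellings would have to be tuned by ear for whichever engine is used.

```python
import re

# Hypothetical respellings; real entries would be found by listening to the engine's output.
RESPELL = {
    "mctavish": "mack tavish",
    "roodolf": "roo dolf",
    "mayor": "may er",
}


def respell_for_tts(text, table):
    """Replace whole words found in the table (case-insensitively) with their respelling."""
    def swap(match):
        word = match.group(0)
        return table.get(word.lower(), word)

    return re.sub(r"[A-Za-z']+", swap, text)


print(respell_for_tts("McTavish wants to see the mayor.", RESPELL))
```

Keeping the table separate from the script means the on-screen text can stay correctly spelled while the engine gets the mangled version.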
"Let Us Now Try Liberty"-- Frederick Bastiat

