nihonlvr

Speech Programming Question


Well, I really had no idea where to post this thread. In fact, I'm not even sure I know the right terminology to ask this question, but I will do my best. I am interested in learning the technology behind voice and speech on the computer. Specifically, I mean voice/speech analysis, such as software that judges the accuracy of someone's accent in a language. I would guess this is very similar to what is seen in programs like Rosetta Stone. What is this even called? Where would I go to learn it? The closest thing I came across was SAPI, but from what I understand it has more to do with text-to-speech. Any help or direction would be much appreciated. Thanks

You are partly correct about SAPI; it does text-to-speech, though it also includes speech recognition.

As for the rest of your woes, I haven't done anything with speech analysis, so I couldn't tell you where to begin... However, I did find some resources that may be of use to you.


These are short overviews of the general concepts behind voice recognition and the sorts of things to account for, but they don't offer anything on actually programming it:

http://www.hitl.washington.edu/scivw/EVE/I.D.2.d.VoiceRecognition.html
http://en.wikipedia.org/wiki/Voice_analysis

A look at things from the reverse direction: speech synthesis. It isn't what you wanted, but understanding it may come in handy for anything dealing with speech analysis:

http://en.wikipedia.org/wiki/Speech_synthesis

If all you want is to learn the terminology comprehensively, then this resource may be useful. It's a thesis, done by some... person... heh.

http://web.media.mit.edu/~moo/thesis/YEK_thesis.pdf

Umm, then this one is a list of speech analysis software:

http://liceu.uab.es/~joaquim/phonetics/fon_anal_acus/herram_anal_acus.html


And... I'm spent. Heh.

Guest Anonymous Poster
The Microsoft Speech SDK was available a few years ago (I haven't looked recently), and with it you can do speech recognition in C++ (several sample programs and code are included in the SDK). I was looking at using it as an input interface enhancement for a simulation client (issuing simple orders by voice instead of using a mouse-driven menu or typed input). Single words (small vocabulary) and simple sentence structures were doable (it took about 200 MHz of CPU processing load on a P4), but I wasn't impressed with the general speech-to-text (too many mistakes).

They may have improved it somewhat since the version I played with.
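
To illustrate why a small vocabulary held up better than general dictation, here is a rough Python sketch (not SAPI code, and not what the SDK samples do). It assumes some recognizer has already handed back a text string, and it snaps that string to the nearest entry in a short list of orders. The command list, the function name `match_command`, and the 0.6 cutoff are all made up for the example.

```python
import difflib

# Hypothetical order vocabulary for a simulation client (illustrative only).
COMMANDS = ["move north", "move south", "hold position", "attack target", "retreat"]


def match_command(recognized_text, commands=COMMANDS, cutoff=0.6):
    """Return the known command closest to the recognizer output, or None.

    recognized_text is assumed to be whatever text a speech recognizer
    produced; it may contain misheard words.
    """
    # Normalize case and whitespace before comparing.
    cleaned = " ".join(recognized_text.lower().split())
    # difflib scores string similarity between 0 and 1; keep only the best
    # candidate, and only if it scores at least `cutoff`.
    matches = difflib.get_close_matches(cleaned, commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for heard in ["move norse", " HOLD   Position ", "open the pod bay doors"]:
        print(repr(heard), "->", match_command(heard))
```

With only a handful of allowed phrases, a slightly misheard order can still land on the right command, while free-form dictation has nothing to fall back on. A real recognizer would normally apply this kind of constraint inside its own grammar rather than on the output text.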
