Speech Programming Question

This topic is 4307 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

Well, I really had no idea where to post this thread. In fact, I am not even sure I know the right terminology to ask this question, but I will do my best. I am interested in learning the technology behind using voice with a computer, specifically voice/speech analysis, such as software that judges how accurate someone's accent is in a language. I would guess this is very similar to what is seen in programs like Rosetta Stone. What is this even called? Where would I go to learn it? The closest thing I came across was SAPI, but from what I understand it has more to do with text-to-speech. Any help or direction would be much appreciated. Thanks

You are partly right about SAPI: it does text-to-speech, but it also covers speech recognition.

As for the rest of your woes, I haven't done anything with speech analysis, so I couldn't tell you where to begin... However, I did find some resources that may be of use to you.


These are some short resources on the general concept of voice recognition and the sort of things to account and watch for, but they don't cover how to actually program such a thing:

http://www.hitl.washington.edu/scivw/EVE/I.D.2.d.VoiceRecognition.html
http://en.wikipedia.org/wiki/Voice_analysis

A reverse sort of look at things: speech synthesis. It isn't what you wanted, but understanding it may come in handy for anything dealing with speech analysis:

http://en.wikipedia.org/wiki/Speech_synthesis

If all you want is to get familiar with the terminology, then this resource may be useful. It's a thesis, done by some... person... heh.

http://web.media.mit.edu/~moo/thesis/YEK_thesis.pdf

Umm, then this one is a list of a bunch of speech analysis software:

http://liceu.uab.es/~joaquim/phonetics/fon_anal_acus/herram_anal_acus.html


And... I'm spent. Heh.

Guest Anonymous Poster
The Microsoft Speech SDK was available a few years ago (I haven't looked recently), and with it you can do speech recognition in C++ (several sample programs with code are included in the SDK). I was looking at using it as an input interface enhancement for a simulation client (issuing simple spoken orders instead of using a mouse-driven menu or typed input). Single words (small vocabulary) and simple sentence structures were doable (taking roughly 200 MHz worth of CPU load on a P4), but I wasn't impressed with the general speech-to-text (too many mistakes).

They may have improved it somewhat since the version I played with.
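
To give a rough idea of what the "small vocabulary, simple sentence structure" approach looks like on the application side, here is a minimal Python sketch. It is not SDK code: it assumes some recognizer has already handed you a text string, and the word lists and the `parse_order` function are made up for illustration. The point is only that a tight, fixed grammar lets you throw away anything the recognizer gets wrong instead of acting on it.

```python
import re

# Hypothetical vocabulary for illustration; not taken from any speech SDK.
VERBS = {"attack", "move", "hold", "retreat"}
UNITS = {"alpha", "bravo", "charlie"}
PLACES = {"north", "south", "east", "west"}


def parse_order(recognized_text):
    """Map recognizer output to (verb, unit, place), or None if it does not fit.

    Accepts either a single verb ("hold") or the fixed pattern
    "<unit> <verb> [<place>]" (for example "bravo move north").
    """
    # Lowercase and keep only runs of letters, dropping punctuation.
    words = re.findall(r"[a-z]+", recognized_text.lower())

    # One-word order: just a verb.
    if len(words) == 1 and words[0] in VERBS:
        return (words[0], None, None)

    # Two- or three-word order: unit, verb, optional place.
    if len(words) in (2, 3) and words[0] in UNITS and words[1] in VERBS:
        place = words[2] if len(words) == 3 else None
        if place is None or place in PLACES:
            return (words[1], words[0], place)

    # Anything outside the grammar is rejected rather than guessed at.
    return None
```

Calling `parse_order("Bravo, move north!")` gives a `(verb, unit, place)` tuple, while free-form speech that does not fit the pattern comes back as `None`, so the client can simply ask the player to repeat the order.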
