Peter van Hespen

Unity [Unity Asset] Dialog Editor and integrated Speech recognition - feedback requested



Hello all,
 
I hereby present a package that implements dialogs and speech recognition together!
I've been working on this as part of a graduation project at the University of Applied Sciences Rotterdam, and I'd love to get your feedback. I hope it will make speech and dialog implementations easier for everyone.
 
Feature list:
  • Dialog editor, for plain dialogs or dialogs with user options.
  • Dialogs to implement cutscenes, with or without user options.
  • Option to use a different grammar file for each dialog (Sphinx 4).
  • Export to and load from JSON.
  • Typewriter animations.
  • Delay before the next dialog is shown.
  • Out-of-the-box support for three speech recognition systems: Google Speech, Wit.ai, and Sphinx 4.
  • Automatic selection of dialog answers based on the speech result, scored by accuracy.
  • Audio input analyzer: detects when the user starts and stops talking, cuts that segment out, and recognizes it.
  • Voice activation volume adjuster.
  • Callbacks for timers, automatic answer selection, and the current state of audio input (no audio, listening, analyzing speech).
  • Includes a working Sphinx 4 server and a text-to-language-model tool (out-of-the-box support for English, Dutch, and German).
  • Automatic conversion of grammar files to dictionaries to decrease server load.
  • Docs available in the source and here: https://hespen.net/Portfolio/UnityDialogEditor/annotated.html
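The audio input analyzer and voice activation volume in the list above come down to energy-based voice activity detection: measure the loudness of each window of microphone samples and compare it against a calibrated threshold. Here's a minimal sketch of that idea in Java (the language of the bundled Sphinx 4 server); the `VoiceActivityDetector` class and its names are my own illustration, not the package's actual code:

```java
// Minimal energy-based voice activity detector (hypothetical sketch, not the
// package's actual implementation). It computes the RMS level of a window of
// normalized audio samples and compares it to an activation threshold.
public class VoiceActivityDetector {
    private final double threshold; // RMS level above which we treat input as speech

    public VoiceActivityDetector(double threshold) {
        this.threshold = threshold;
    }

    /** Root-mean-square level of a window of samples in [-1, 1]. */
    public static double rms(double[] samples) {
        double sum = 0.0;
        for (double s : samples) sum += s * s;
        return Math.sqrt(sum / samples.length);
    }

    /** True if this window's energy exceeds the activation volume. */
    public boolean isSpeech(double[] samples) {
        return rms(samples) > threshold;
    }

    public static void main(String[] args) {
        VoiceActivityDetector vad = new VoiceActivityDetector(0.1);
        double[] silence = new double[160];        // all zeros: no input
        double[] loud = new double[160];
        java.util.Arrays.fill(loud, 0.5);          // constant 0.5 amplitude
        System.out.println(vad.isSpeech(silence)); // false
        System.out.println(vad.isSpeech(loud));    // true
    }
}
```

The "Voice Activation Volume adjuster" would then just be a way to tune `threshold` against ambient noise, which is what the microphone setup step below does.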
 
The project can be found here:
 
 
It contains a Unity Package for easy implementation, the Unity project, a Sphinx 4 Server and a Sphinx 4 Text to Language Model Tool.
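The text-to-language-model tool and the grammar-to-dictionary step in the feature list both come down to collecting the vocabulary the recognizer actually needs, so the server doesn't have to search a full language model. A hypothetical sketch of that idea (the `DictionaryBuilder` class and its names are mine, not the package's):

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the "grammar file to dictionary" idea: collect the
// unique words used in the dialog text so the recognizer only has to consider
// a small vocabulary instead of a full language model.
public class DictionaryBuilder {
    /** Returns the sorted set of unique lowercase words in the dialog text. */
    public static Set<String> uniqueWords(String dialogText) {
        Set<String> words = new TreeSet<>();
        for (String token : dialogText.toLowerCase().split("[^a-z']+")) {
            if (!token.isEmpty()) words.add(token);
        }
        return words;
    }

    public static void main(String[] args) {
        // Duplicate words and punctuation collapse into one small word list.
        System.out.println(uniqueWords("Yes please. No, thank you. Yes!"));
        // [no, please, thank, yes, you]
    }
}
```

A real tool would also map each word to its phonetic pronunciation for Sphinx, but the vocabulary-reduction step is what decreases server load.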
 
To make testing easier, I've added keys for the Google Speech API and the Wit.ai API. (The Wit.ai key is for English; change Google's language in the code.) Select the speech system you'd like to use on the main camera object.
 
I'd advise you to take a look at Sphinx, though. Just import it into your IDE, build it with Gradle, and run the Base object.
 
Remember this is not a finished product; there are still some bugs in the dialog editor, and I haven't had time to make it beautiful yet. I did implement it in a VR game, but since that belongs to a company, I can't share it.
 
I'd love to know what you guys think of it!
 
Screenshots:
 
 
 
Demo video (crappy quality, no audio):
 
 
How to use:
Enable the Microphone Setup object, run it, and toggle the button on for about five seconds while staying silent. Then toggle it off and speak; the square should turn green while you speak. (The setting is saved to prefs automatically.)
 
Stop the game and disable the Microphone Setup object. Select the speech system you'd like to use on the Main Camera object. Press Enter to start the dialog.
 
Remember: Google and Wit.ai are really slow; use Sphinx 4 for the fastest results! I did research on the implemented speech recognition systems' accuracy and speed. Sphinx is the fastest, averaging 200 ms per recognition (external server), whereas Google and Wit.ai need at least 2-5 seconds (tested with 3,200 audio files in 2 languages).
 
 
The editor:
Windows -> Nodes Editor. Right-click to create new nodes, load a JSON file, or export to JSON. A demo JSON is in the Resources folder. Middle-click to drag, scroll to zoom. You can attach the JSON to the main camera!
 
  • Keywords are used for speech recognition; they determine the matching accuracy.
  • Delay in seconds is the delay before the dialog is shown.
  • Time until next node is the delay before the next node is shown; it starts counting after the first delay has passed.
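The keywords bullet above suggests how automatic answer selection can work: score each answer option by how many of its keywords occur in the recognized speech, then pick the best-scoring one. A hypothetical sketch of such scoring (the `AnswerSelector` class and its names are my illustration, not the package's code):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: score each answer option by the fraction of its
// keywords found as whole words in the recognized speech.
public class AnswerSelector {
    /** Fraction of keywords that occur as words in the recognized text. */
    public static double score(String recognized, List<String> keywords) {
        Set<String> words = new HashSet<>(
                Arrays.asList(recognized.toLowerCase().split("\\s+")));
        long hits = keywords.stream()
                .filter(k -> words.contains(k.toLowerCase()))
                .count();
        return keywords.isEmpty() ? 0.0 : (double) hits / keywords.size();
    }

    public static void main(String[] args) {
        List<String> yesOption = Arrays.asList("yes", "sure", "okay");
        List<String> noOption = Arrays.asList("no", "never");
        String heard = "yes sure why not";
        System.out.println(score(heard, yesOption)); // 2 of 3 keywords matched
        System.out.println(score(heard, noOption));  // no whole-word matches
    }
}
```

Matching whole words rather than substrings matters here: a substring check would wrongly count "no" as present in "not".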
 
One last thing: the Google key used in this project is attached to a trial account. Once I've spent my credit, it will stop working. In that case, you can set up your own trial account for free on the Google website.


Sounds interesting. I'm dealing with the need to implement a dialog editor myself, but my modest knowledge isn't enough. Just a question: why is speech recognition required? I'd appreciate lip sync much more.

Speech recognition is not required; you can use the dialog editor without it. But I've been researching it for quite some time, and I think it will be used more often in VR worlds.

If you want to know how the dialog editor is set up, just download the code! I've made sure to comment everything heavily. It's not a finished product yet, but you can learn how it is made (I hope).

 

I'm not quite sure what you mean by "lip sync" though; could you explain that a bit further?

Lip synchronization synchronizes the lips of the character model with the speech.

Ah, that is definitely something else. That's not part of this package; it would belong with a text-to-speech solution.

Well, if I'm not wrong, in a game it would be more useful to synchronize prerecorded speech from voice actors with the model's mouth. I don't think games will be using text to speech anytime soon, at least until voice synthesis improves a lot. Note that I'm talking about games here; maybe text to speech is a valid solution in other environments.
