U Speech
LATEST VERSION 1.1 (13.09.2011)
Description:
URBI module providing text-to-speech (TTS) functionality based on Microsoft SAPI. SAPI is Microsoft's Speech API, which allows programmers to add speech recognition and synthesis to their programs. It is based on a COM interface. The module supports SAPI TTS extensible markup language (XML) tags.
More about SAPI: http://msdn.microsoft.com/en-us/library/ms723627(v=vs.85).aspx
This module provides viseme synchronisation, driven by SAPI events. You can use viseme events to generate, for example, mouth animations.
XML Tags:
SAPI text-to-speech (TTS) extensible markup language (XML) tags fall into several categories.
- Voice state control
- Direct item insertion
- Voice context control
- Voice selection
- Custom Pronunciation
Example:
The Volume tag controls the volume of a voice. The Volume tag has one required attribute: Level. The value of this attribute should be an integer between zero and one hundred.
<volume level="50">
This text should be spoken at volume level fifty.
<volume level="100">
This text should be spoken at volume level one hundred.
</volume>
</volume>
More examples can be found on this page:
http://msdn.microsoft.com/en-us/library/ms717077(v=vs.85).aspx
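The Volume tag described above can also be generated programmatically. The following is a minimal Python sketch, independent of the module itself; the helper name volume_tag is hypothetical and is not part of USpeech or SAPI. It only enforces the rule stated above, that Level is an integer between zero and one hundred, and inserts the text as-is.

```python
def volume_tag(text: str, level: int) -> str:
    """Wrap text in a SAPI <volume> tag (hypothetical helper).

    level must be an integer between 0 and 100, as required by the
    Level attribute of the Volume tag.
    """
    if isinstance(level, bool) or not isinstance(level, int):
        raise TypeError("level must be an integer")
    if not 0 <= level <= 100:
        raise ValueError("level must be between 0 and 100")
    # The text is inserted unchanged; no escaping is attempted here.
    return f'<volume level="{level}">{text}</volume>'


if __name__ == "__main__":
    print(volume_tag("This text should be spoken at volume level fifty.", 50))
```

The resulting string would then be handed to the speak function; this assumes the module forwards the markup to SAPI unchanged.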
Visemes:
SPVISEMES lists the SAPI Viseme set. This set is based on the Disney 13 Visemes. Examples given are for the SAPI English Phoneme set.

Viseme name English examples
SP_VISEME_0, // silence
SP_VISEME_1, // ae, ax, ah
SP_VISEME_2, // aa
SP_VISEME_3, // ao
SP_VISEME_4, // ey, eh, uh
SP_VISEME_5, // er
SP_VISEME_6, // y, iy, ih, ix
SP_VISEME_7, // w, uw
SP_VISEME_8, // ow
SP_VISEME_9, // aw
SP_VISEME_10, // oy
SP_VISEME_11, // ay
SP_VISEME_12, // h
SP_VISEME_13, // r
SP_VISEME_14, // l
SP_VISEME_15, // s, z
SP_VISEME_16, // sh, ch, jh, zh
SP_VISEME_17, // th, dh
SP_VISEME_18, // f, v
SP_VISEME_19, // d, t, n
SP_VISEME_20, // k, g, ng
SP_VISEME_21 // p, b, m
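When viseme IDs are processed outside of urbiscript (for instance in a logging or animation tool), the table above can be kept as a simple lookup structure. The Python sketch below copies the table verbatim; the names VISEME_PHONEMES, describe and visemes_for_phoneme are illustrative assumptions, not part of the module.

```python
# Viseme ID -> example phonemes (SAPI English phoneme set),
# copied from the SPVISEMES table above. An empty list means silence.
VISEME_PHONEMES = {
    0: [],
    1: ["ae", "ax", "ah"],
    2: ["aa"],
    3: ["ao"],
    4: ["ey", "eh", "uh"],
    5: ["er"],
    6: ["y", "iy", "ih", "ix"],
    7: ["w", "uw"],
    8: ["ow"],
    9: ["aw"],
    10: ["oy"],
    11: ["ay"],
    12: ["h"],
    13: ["r"],
    14: ["l"],
    15: ["s", "z"],
    16: ["sh", "ch", "jh", "zh"],
    17: ["th", "dh"],
    18: ["f", "v"],
    19: ["d", "t", "n"],
    20: ["k", "g", "ng"],
    21: ["p", "b", "m"],
}


def describe(viseme_id: int) -> str:
    """Return a readable description of a viseme ID."""
    phonemes = VISEME_PHONEMES[viseme_id]  # raises KeyError for unknown IDs
    return ", ".join(phonemes) if phonemes else "silence"


def visemes_for_phoneme(phoneme: str) -> list:
    """Return the viseme IDs whose example phonemes include the given one."""
    return [vid for vid, phs in VISEME_PHONEMES.items() if phoneme in phs]


if __name__ == "__main__":
    for vid in sorted(VISEME_PHONEMES):
        print(f"SP_VISEME_{vid}: {describe(vid)}")
```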
Module functions:
USpeech.new(); - initializes TTS
USpeech.speak("Hello world"); - starts speech synthesis
USpeech.getVoiceAll(); - returns all voices available in the system
USpeech.visemeTrig - flag set by the module when a new viseme occurs
USpeech.visemeId - current viseme ID
USpeech.nextVisemeId - next viseme ID
USpeech.visemeTime - execution time of the current viseme [ms]
USpeech.isSpeaking - speaking flag, 1 during speech synthesis and 0 when finished
Urbiscript example:
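loadModule("USpeech");
var speech = USpeech.new();
// React to each new viseme, e.g. to drive the robot's mouth
// (robot.setMouth stands for your own robot's function).
speech.&visemeTrig.notifyChange( closure() {
  robot.setMouth(speech.visemeId, speech.visemeTime);
});
// or just print the viseme ID and its execution time
speech.&visemeTrig.notifyChange( closure() {
  echo("viseme no. " + speech.visemeId + " exec. time " + speech.visemeTime);
});
// Start speech synthesis; the handlers above are called while it runs.
speech.speak("Hello world");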
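If the (visemeId, visemeTime) pairs printed by the example above are recorded, they can be turned into animation keyframes offline. The Python sketch below is only an illustration under stated assumptions: the function build_timeline is hypothetical, the sample events are invented for the example, and it assumes consecutive visemes play back to back with no gaps.

```python
def build_timeline(events):
    """Convert (viseme_id, duration_ms) pairs into keyframes.

    Returns a list of (start_ms, viseme_id, duration_ms) tuples, assuming
    each viseme starts exactly when the previous one ends.
    """
    timeline = []
    start_ms = 0
    for viseme_id, duration_ms in events:
        timeline.append((start_ms, viseme_id, duration_ms))
        start_ms += duration_ms
    return timeline


if __name__ == "__main__":
    # Invented sample data, not real module output.
    sample_events = [(12, 60), (4, 90), (14, 70), (8, 120), (0, 50)]
    for start_ms, viseme_id, duration_ms in build_timeline(sample_events):
        print(f"{start_ms:4d} ms  viseme {viseme_id:2d}  for {duration_ms} ms")
```

Keeping the start offset alongside each viseme lets an animation tool place mouth shapes on a timeline without replaying the speech.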