Version 16 of Speech Synthesis, or Talk to me Tcl

Updated 2006-04-04 22:57:42

It hurts me to advocate Windows, but using tcom with the free speech engine from Microsoft is almost too easy (yes, I know Microsoft and free speech sounds like an oxymoron :)

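 package require tcom

 # Create the SAPI voice object through COM
 set voice [::tcom::ref createobject Sapi.SpVoice]

 # Flag 1 makes the call asynchronous, so it returns immediately
 $voice Speak "Hello World" 1

 # Give the speech time to finish before the script exits
 after 3000
 exit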

The speech SDK is available from http://www.microsoft.com/speech/download/sdk51/ (51MB). There is a beta of .Net Speech at http://www.microsoft.com/speech/ (200MB), but I haven't tried that yet. - VPT

MG Anyone happen to have a link for more info on this? I've been using it successfully for a couple of weeks in something, but someone just found an error -

  $voice Speak "<> test <" 1

returns "0x80045042 {Unknown error}". I did a search on the Microsoft website for the code which, naturally, came up blank. Any ideas would be appreciated :)

NEM On Mac OS X there is a command-line "say" program for accessing the built-in speech synthesis capabilities of the OS:

 exec /usr/bin/say "Hello World"

GPS Rsynth is a nice public domain package that I've used for speech synthesis. I wrote a say_this.tcl script that built a GUI for tweaking the voice. I've been thinking about improving Rsynth, because it seems to be dead at the moment. Another tool that I've heard good things about is Festival[L1 ]. CMU's Sphinx[L2 ] is another tool that may be good, but I haven't heard from users of it.

See Festtcl for a Tcl interface to Festival.


Tcl'ers may be interested in the CSLU Toolkit [L3 ] developed at the Oregon Graduate Institute's Center for Spoken Language Understanding. It includes a RAD environment which supports Tcl and provides tools to do Speech Recognition as well as speech synthesis. -- aricb


JKM would like to know if MG found any solution to his "0x80045042 {Unknown error}". I'm getting the same error, but it may be for a different reason. By default, the SAPI tries to interpret the string with XML tags if the first character is '<'. Change your 1 to a 17 and you should be alright. I'm getting the same error with

  $voice Speak "c:/code/tcl/SAPI2.txt" 5

any help would be appreciated.

MG never did, I'm afraid

MG Having looked on the Microsoft website - for once, it's actually returned something sensible, searching for "sapi.spvoice" on www.microsoft.com - it seems that '5' means "speak a file", the 17 you mentioned before means "speak without parsing XML", and 1 is the default. I can replicate the "Unknown error", using '5', when the file in question doesn't exist. To quote one page of the MS website, the argument must be "a null-terminated, fully qualified path to a file". The page in question is [L4 ], and seems (as of July 10 2005, before they change the address) to be about the first page to start looking at for this method of speech.
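
One way to guard against the missing-file case is to check the path before handing it to SAPI. This is only a sketch: the proc name speakFile is made up for this page, $voice is assumed to have been created as at the top of the page, and passing the path through file nativename is an assumption about what SAPI is happiest with, not something the quoted documentation says.

```tcl
# Hypothetical helper: speak a text file asynchronously (flag 5 = 1 + 4),
# raising a readable Tcl error instead of 0x80045042 if the file is missing.
proc speakFile {voice path} {
    # Turn a relative name into a fully qualified one.
    set full [file normalize $path]
    if {![file isfile $full]} {
        error "cannot speak \"$full\": no such file"
    }
    # Assumption: hand SAPI the path in the platform's native form.
    $voice Speak [file nativename $full] 5
}
```

It would be called as speakFile $voice c:/code/tcl/SAPI2.txt.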


ET: Hey, this is pretty cool. I also downloaded the documentation and found these flags, so the 5 would be 4+1 or async and filename:

 Enum SpeechVoiceSpeakFlags
    'SpVoice flags
    SVSFDefault = 0
    SVSFlagsAsync = 1
    SVSFPurgeBeforeSpeak = 2
    SVSFIsFilename = 4 
    SVSFIsXML = 8
    SVSFIsNotXML = 16
    SVSFPersistXML = 32

    'Normalizer flags
    SVSFNLPSpeakPunc = 64

    'Masks
    SVSFNLPMask = 64
    SVSFVoiceMask = 127
    SVSFUnusedFlags = -128   
 End Enum

SVSFDefault

        Specifies that the default settings should be used. The defaults are: 
        To speak the given text string synchronously (override with SVSFlagsAsync), 
        Not to purge pending speak requests (override with SVSFPurgeBeforeSpeak), 
        To parse the text as XML only if the first character is a left-angle-bracket (override with SVSFIsXML or SVSFIsNotXML), 
        Not to persist global XML state changes across speak calls (override with SVSFPersistXML), and 
        Not to expand punctuation characters into words (override with SVSFNLPSpeakPunc). 

SVSFlagsAsync

        Specifies that the Speak call should be asynchronous. That is, it will return immediately after the speak request is queued. 

SVSFPurgeBeforeSpeak

        Purges all pending speak requests prior to this speak call. 

SVSFIsFilename

        The string passed to the Speak method is a file name rather than text. 
        As a result, the string itself is not spoken; instead,
        the file that the path points to is spoken. 

SVSFIsXML

        The input text will be parsed for XML markup. 

SVSFIsNotXML

        The input text will not be parsed for XML markup. 

SVSFPersistXML

        Global state changes in the XML markup will persist across speak calls. 

SVSFNLPSpeakPunc

        Punctuation characters should be expanded into words (e.g. "This is it." would become "This is it period"). 

SVSFNLPMask

        Flags handled by SAPI (as opposed to the text-to-speech engine) are set in this mask. 

SVSFVoiceMask

        This mask has every flag bit set. 

SVSFUnusedFlags

        This mask has every unused bit set. 
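
To illustrate how these values combine (17 = 1 + 16, 5 = 1 + 4), here is a small sketch that ORs flags together by name and decodes a number back into names. The array SVSF and the procs speakFlags and describeFlags are invented for this page and are not part of SAPI or tcom; only the numeric values come from the enum above.

```tcl
# Flag values copied from the SpeechVoiceSpeakFlags enum above.
# The array and proc names are hypothetical helpers.
array set SVSF {
    SVSFlagsAsync        1
    SVSFPurgeBeforeSpeak 2
    SVSFIsFilename       4
    SVSFIsXML            8
    SVSFIsNotXML         16
    SVSFPersistXML       32
    SVSFNLPSpeakPunc     64
}

# Combine any number of flag names into one integer with bitwise OR.
proc speakFlags {args} {
    global SVSF
    set value 0
    foreach name $args {
        set value [expr {$value | $SVSF($name)}]
    }
    return $value
}

# List the names of the flags set in an integer, lowest bit first.
proc describeFlags {value} {
    global SVSF
    set names {}
    foreach name {SVSFlagsAsync SVSFPurgeBeforeSpeak SVSFIsFilename
                  SVSFIsXML SVSFIsNotXML SVSFPersistXML SVSFNLPSpeakPunc} {
        if {$value & $SVSF($name)} {
            lappend names $name
        }
    }
    return $names
}

puts [speakFlags SVSFlagsAsync SVSFIsNotXML]    ;# 17
puts [speakFlags SVSFlagsAsync SVSFIsFilename]  ;# 5
puts [describeFlags 17]
```

With tcom loaded, the result could be passed straight to the voice object, for example $voice Speak "<> test <" [speakFlags SVSFlagsAsync SVSFIsNotXML].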

Category Speech Synthesis