Version 1 of Syllable Counting

Updated 2016-02-02 12:12:55 by WJG

WJG (02/02/16) Obtaining the number of syllables in an English word is quite tricky because spellings can be irregular. For many languages a simple vowel count would be sufficient, but in English even this throws up some inaccuracies. The following procedure shows a relatively easy approach to the problem.

* Remove initial 'y' (y is a semi-vowel and here acts as a consonant).

* Count the number of vowels (including y as a vowel).

* Reduce the count by the number of diphthongs.

* Reduce the count by silent vowel endings or modifying 'e's.

* If the total is less than 1, it must be 1 (aspirated words, e.g. psst!).
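
Each counting step above comes down to tallying how often a short letter pattern occurs in the word. The following is a minimal sketch of that idiom; `countOccurrences` is a hypothetical helper name used only for illustration and is not part of the script below.

```tcl
# countOccurrences is a hypothetical helper, not part of the original script.
# It counts every position at which a literal letter pattern starts in a word.
proc countOccurrences { pattern str } {
    # The lookahead (?=...) matches at a position without consuming any
    # characters, so regexp -all reports one (empty) match per occurrence.
    return [llength [regexp -all -inline "(?=$pattern)" $str]]
}

puts [countOccurrences e employees]    ;# 3
puts [countOccurrences ee employees]   ;# 1
```

For "employees" this gives three matches for `e` and one for `ee`, which is the kind of raw tally the vowel and diphthong steps work from.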

#!/bin/sh
# syllables.tcl \
exec tclsh "$0" "$@"

# Obtain number of syllables in an English word
# Arguments:
#        str        word
# Returns:
#        number of syllables
proc syllables { str } {
        set res 0
        # an initial y functions as a semi-vowel, i.e. as a consonant.
        set str [string trimleft $str y]
        # count total number of vowels
        foreach item {a e i o u y} {
                incr res [llength [regexp -all -inline (?=$item) $str]]
        }
        # discount diphthongs, includes reversals
        foreach item {ai ie ei io ee ou oo oi ea ue ui} {
                incr res -[llength [regexp -all -inline (?=$item) $str]]
        }
        # discount irregular word endings, typically containing e
        foreach item {ce nge me te ne ve re ye ue ze se eye} {
                incr res -[llength [regexp -all -inline (?=$item) $str]]
        }
        # any word, even if it has no vowels, has at least 1 syllable, e.g. psst!, shhh!
        if { $res <= 1 } {
                set res 1
        }
        return $res
}

# try the procedure on a few sample words
set words "
        colour allure yatch yahoo
        yeti jeeze employees footy
        early yearly psst phut
        eye lye lie hectic
        pneumatic aromatic automatic clinique"

puts "syl.\tword\n[string repeat = 30]\n"
foreach word [lsort $words] {
        puts "[syllables $word]\t$word"
}
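
When a word gets an unexpected count, it can help to see what each step contributed. The sketch below repeats the same pattern lists as the procedure above but returns the intermediate tallies as a dict; `syllableBreakdown` is a hypothetical name introduced here for illustration, and it assumes Tcl 8.5 or later for `dict` and `max()`.

```tcl
# syllableBreakdown is a hypothetical helper, not part of the original script.
# It reports the tally from each step so a surprising result can be traced.
proc syllableBreakdown { str } {
    # an initial y acts as a consonant, so drop it before counting
    set str [string trimleft $str y]
    set vowels 0
    foreach item {a e i o u y} {
        incr vowels [llength [regexp -all -inline (?=$item) $str]]
    }
    set diphthongs 0
    foreach item {ai ie ei io ee ou oo oi ea ue ui} {
        incr diphthongs [llength [regexp -all -inline (?=$item) $str]]
    }
    set endings 0
    foreach item {ce nge me te ne ve re ye ue ze se eye} {
        incr endings [llength [regexp -all -inline (?=$item) $str]]
    }
    # every word has at least one syllable
    set total [expr {max(1, $vowels - $diphthongs - $endings)}]
    return [dict create vowels $vowels diphthongs $diphthongs \
            endings $endings total $total]
}

puts [syllableBreakdown colour]
# vowels 3 diphthongs 1 endings 0 total 2
```

For "colour" the three vowels (o, o, u) are reduced by the single diphthong "ou", leaving two syllables.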