I would like to write a function IsWordPronounceable(SomeWord: String): Boolean; for the English language. I am working with SAPI Speech Recognition and I need this function. I use the Delphi compiler, but C, C#, C++, or any other language is fine. Please help; I don't know where to start.

At first, I thought adding a grammar rule would solve the problem. The scenario is to highlight the text as it is being spoken, but the engine cannot recognize words that are not pronounceable.

+4  A: 

This is not exactly easy to do. The way I would do it is with some simple statistical analysis.

Start off by downloading a dictionary of English words (or any language, really - you just need a dictionary of words that are "pronounceable"). Then, take each word in the dictionary and break it up into 3-letter blocks. So given the word "dictionary", you'd break it up into "dic", "ict", "cti", "tio", "ion", "ona", "nar", and "ary". Then add each three-letter block from all the words in the dictionary into a collection that maps the three letter block to the number of times it appears. Something like this:

"dic" -> 36365
"ict" -> 2721
"cti" -> 532

And so on. Next, normalize the counts by dividing each one by the total number of words in the dictionary. If you count each block at most once per word, that gives you a mapping from each three-letter combination to the fraction of dictionary words that contain it.
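The table-building step could be sketched roughly like this in Python. The names break_into_blocks and build_block_frequency are made up for illustration, and sample_words is only a tiny stand-in for a real dictionary file. Each block is counted once per word, so the values are fractions of words.

```python
from collections import Counter

def break_into_blocks(word, size=3):
    """Return every overlapping block of `size` letters in `word`."""
    word = word.lower()
    # Words shorter than `size` produce an empty list.
    return [word[i:i + size] for i in range(len(word) - size + 1)]

def build_block_frequency(words, size=3):
    """Map each block to the fraction of words that contain it."""
    counts = Counter()
    total = 0
    for word in words:
        total += 1
        # set() counts a block once per word, even if it repeats inside the word.
        counts.update(set(break_into_blocks(word, size)))
    if total == 0:
        return {}
    return {block: n / total for block, n in counts.items()}

# Hypothetical stand-in for a real word list loaded from a dictionary file.
sample_words = ["dictionary", "diction", "action", "nation"]
block_frequency = build_block_frequency(sample_words)
```

With a real word list you would read one word per line from the file and pass that list in instead of sample_words.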

Finally, implement your IsWordPronounceable method something like this:

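bool IsWordPronounceable(string word)
{
    // Overlapping three-letter blocks, e.g. "dic", "ict", "cti", ...
    string[] threeLetterBlocks = BreakIntoThreeLetterBlocks(word);
    foreach (string block in threeLetterBlocks)
    {
        // Assumes blockFrequency is a Dictionary<string, double>.
        // A block that never occurs in the dictionary is treated as too rare;
        // TryGetValue avoids the KeyNotFoundException the indexer would throw.
        double frequency;
        if (!blockFrequency.TryGetValue(block, out frequency) || frequency < THRESHOLD)
            return false;
    }
    return true;
}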

Obviously, there are a few parameters you'll want to tune. THRESHOLD is one; the block size is another, since blocks of 2 or 4 letters might work better than 3. It'll take a bit of experimenting to get it right, I think.
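One way to do the tuning, as a rough sketch only: try a few block sizes and thresholds and keep the pair that classifies a hand-labelled set of words best. Everything here (train, is_pronounceable, accuracy, tune, and the candidate values) is a hypothetical name or number picked for illustration, not something from SAPI or any library.

```python
from collections import Counter

def blocks(word, size):
    """Overlapping blocks of `size` letters; empty for shorter words."""
    word = word.lower()
    return [word[i:i + size] for i in range(len(word) - size + 1)]

def train(words, size):
    """Fraction of training words containing each block (`words` is a list)."""
    counts = Counter()
    for word in words:
        counts.update(set(blocks(word, size)))
    return {b: n / len(words) for b, n in counts.items()}

def is_pronounceable(word, freq, size, threshold):
    # Unseen blocks get frequency 0.0, so they fail any positive threshold.
    # A word shorter than `size` has no blocks and therefore passes.
    return all(freq.get(b, 0.0) >= threshold for b in blocks(word, size))

def accuracy(freq, size, threshold, good, bad):
    """Share of labelled words that are classified correctly."""
    hits = sum(is_pronounceable(w, freq, size, threshold) for w in good)
    hits += sum(not is_pronounceable(w, freq, size, threshold) for w in bad)
    return hits / (len(good) + len(bad))

def tune(training_words, good, bad, sizes=(2, 3, 4), thresholds=(0.001, 0.01, 0.05)):
    """Return (score, size, threshold) for the best combination tried."""
    best = None
    for size in sizes:
        freq = train(training_words, size)
        for threshold in thresholds:
            score = accuracy(freq, size, threshold, good, bad)
            if best is None or score > best[0]:
                best = (score, size, threshold)
    return best
```

The good and bad lists should be words you have labelled yourself and that are not in the training dictionary; otherwise the score will look better than it really is.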

Dean Harding
One word: "syzygy". I bet your threshold would have to be very low for your algorithm to mark this as pronounceable.
DJClayworth
Thanks for the idea. I tried it and it works, but it needs a rather large library. I'm looking for other options.
XBasic3000
A: 

Hey,

This means you can't rely on text-to-speech alone; you also need to check whether the given words are valid in the language. You would also need some kind of trained engine for the text-to-speech data, so that the data is usable by your function.

If you only want to check the correctness of the word (I mean no speech, only the validity of the word), then the answer given by codeka is quite good. You can check the word against a dictionary of the particular language.

thanks.

Paarth
A: 

This functionality is typically handled by the speech engine itself. If your goal is simply to get the text-to-speech engine to pronounce some things and spell others, speech engines other than the default may do a sufficient job. Check out Acapela for example.

To write this functionality yourself, I'd hit the low hanging fruit first.

  • check the input for numbers/unpronounceable characters, fail if found
  • check the input against a dictionary of words, pass if found
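A minimal sketch of those two checks in Python. quick_check is a made-up name, and dictionary is assumed to be a set of lower-case words you have loaded yourself. It returns None when neither check decides, so a fancier test can take over.

```python
def quick_check(word, dictionary):
    """Return False or True when a cheap check decides, otherwise None."""
    # str.isalpha() rejects digits, spaces and punctuation. Note that it also
    # rejects apostrophes and hyphens, and accepts non-English letters.
    if not word or not word.isalpha():
        return False
    # Known dictionary words are assumed to be pronounceable.
    if word.lower() in dictionary:
        return True
    return None  # undecided: fall through to a more advanced technique
```

For example, "Max2D" fails straight away because of the digit, while "PLDT" is all letters but not in the dictionary, so it comes back as undecided.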

A more advanced technique, similar to codeka's solution, would be to build a list of valid syllable patterns and then match your input against them. There may be even more complex techniques, but to go there you need to become familiar with linguistics.
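To give a feel for the syllable-pattern idea, here is a toy sketch in Python. The ONSETS, NUCLEI and CODAS lists are short, hand-picked assumptions and nowhere near a complete description of English; a real list would need actual linguistic input. A word passes if it can be split into one or more syllables of the form optional onset, vowel nucleus, optional coda.

```python
import re

# Hand-picked, deliberately incomplete lists (assumptions for illustration).
ONSETS = ("b bl br c ch cl cr d dr f fl fr g gl gr h j k l m n p ph pl pr "
          "qu r s sc sh sk sl sm sn sp st str sw t th tr v w wh y z").split()
NUCLEI = "a e i o u y ai au ea ee ei ie io oa oo ou".split()
CODAS = "b ck ct d f g k l ll m n nd ng nt p r rd rt s sh ss st t th x".split()

def alternation(items):
    """Build a non-capturing regex group, longest alternatives first."""
    return "(?:" + "|".join(sorted(items, key=len, reverse=True)) + ")"

# One syllable: optional onset, required nucleus, optional coda.
SYLLABLE = alternation(ONSETS) + "?" + alternation(NUCLEI) + alternation(CODAS) + "?"
WORD = re.compile("(?:" + SYLLABLE + ")+")

def matches_syllable_patterns(word):
    """True if the whole word can be split into syllables from the lists."""
    return WORD.fullmatch(word.lower()) is not None
```

With these lists "nation" and "hello" match, while "PLDT" and "XB3K" do not because they contain no vowel nucleus. Expect plenty of false positives and negatives until the lists are tuned.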

Rob Elliott
@Rob Elliott, I don't have a problem with text-to-speech; it says everything. But speech-to-text cannot recognize words like PLDT, XB3K, Max2D, etc., even if they are tagged like <O>Max2D</O>.
XBasic3000
+1  A: 

Just an idea (maybe crazy): I've never tried that.
Can you feed the output of the Text-To-Speech into the input of the Speech-To-Text?
Then in a perfect world, anything not recognized (or not matching) in the end is not pronounceable.

François
You have a point. But if you have an idea of how to get the phonemes from voice input, please post it. I might need it. Thanks.
XBasic3000