views:

200

answers:

4

I want to write a program that plays an audio file of a text being read aloud. While it plays, I want to highlight the syllable currently being spoken in green and the rest of the current word in red. What kind of data structure should I use to store the audio file and the information that tells the program when to switch to the next word or syllable?

A: 
anjanb
I don't want a synthetic voice; I want to use a preexisting audio file of the text in question.
Christian
+1  A: 

How about a simple data structure that records which batch of letters makes up the next syllable, together with the timestamp for switching to that syllable?

Just a quick example:

[0:00] This [0:02] is [0:05] an [0:07] ex- [0:08] am- [0:10] ple
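
A minimal sketch of how such a cue list could be held in Java, assuming the timestamps are stored in milliseconds and each cue records character offsets into the displayed text. The names `SyllableTimeline`, `Cue`, and `cueAt` are made up for illustration. The playback position is assumed to come from whatever plays the audio, for example `Clip.getMicrosecondPosition()` in `javax.sound.sampled`, divided by 1000.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical names throughout; offsets are character indexes into the displayed text.
class SyllableTimeline {

    /** Character ranges (end exclusive) of one syllable and of the word containing it. */
    static final class Cue {
        final int wordStart, wordEnd, syllableStart, syllableEnd;

        Cue(int wordStart, int wordEnd, int syllableStart, int syllableEnd) {
            this.wordStart = wordStart;
            this.wordEnd = wordEnd;
            this.syllableStart = syllableStart;
            this.syllableEnd = syllableEnd;
        }
    }

    // Start time in milliseconds -> cue that becomes active at that time.
    private final TreeMap<Long, Cue> cues = new TreeMap<>();

    void add(long startMillis, Cue cue) {
        cues.put(startMillis, cue);
    }

    /** Returns the cue active at the given position, or null before the first cue. */
    Cue cueAt(long positionMillis) {
        Map.Entry<Long, Cue> entry = cues.floorEntry(positionMillis);
        return entry == null ? null : entry.getValue();
    }

    /** Builds the timeline for the example text "This is an example". */
    static SyllableTimeline example() {
        SyllableTimeline t = new SyllableTimeline();
        t.add(0, new Cue(0, 4, 0, 4));          // This
        t.add(2000, new Cue(5, 7, 5, 7));       // is
        t.add(5000, new Cue(8, 10, 8, 10));     // an
        t.add(7000, new Cue(11, 18, 11, 13));   // ex-
        t.add(8000, new Cue(11, 18, 13, 15));   // am-
        t.add(10000, new Cue(11, 18, 15, 18));  // ple
        return t;
    }

    public static void main(String[] args) {
        String text = "This is an example";
        Cue cue = example().cueAt(8500);
        // The syllable range would be painted green, the rest of the word red.
        System.out.println("word: " + text.substring(cue.wordStart, cue.wordEnd));
        System.out.println("syllable: " + text.substring(cue.syllableStart, cue.syllableEnd));
    }
}
```

A sorted map keeps the lookup cheap: `floorEntry` finds the last cue that started at or before the current position, so a timer can poll the player and repaint only when the returned cue changes.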

Yuval A
+1  A: 

Highlighting part of a word sounds like you're getting into phonetics, the sounds that make up words. It's going to be really difficult to turn a sound file into something that will "read" a text. Your best bet is to use the text itself to drive a phonetics-based engine such as FreeTTS, which implements part of the Java Speech API.

To do this, you're going to have to take the text to be read, split it into phonetic syllables, and play them one at a time. For example, "syllable" is "syl", "la", "ble". Playback would then be: highlight "syl", say it, and move on to the next one.
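
The highlight-then-speak loop described above might look roughly like this. It is only a sketch: `SyllableReader` is a made-up name, and the two callbacks stand in for the display code and the speech engine, neither of which is shown. The speaker callback is assumed to block until the syllable has been spoken.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper; the callbacks are placeholders for real display and speech code.
class SyllableReader {

    /** Highlights each syllable, speaks it, then moves on to the next one. */
    static void read(List<String> syllables,
                     Consumer<String> highlighter,
                     Consumer<String> speaker) {
        for (String syllable : syllables) {
            highlighter.accept(syllable); // update the display first
            speaker.accept(syllable);     // assumed to return once the syllable has been spoken
        }
    }
}
```

Because the engine drives the timing, no timestamps need to be stored; the trade-off is that the voice is synthesized rather than taken from a recording.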

This is really "old-skool"; it was done the same way on the original Apple II.

jim
+2  A: 

This is a slightly left-field suggestion, but have you looked at karaoke software? It may not be seen as "serious" enough, but it sounds very similar to what you're doing. For example, Aegisub is a subtitling program that lets you create subtitles in the SSA/ASS format. It has karaoke tools for highlighting the chosen word or part of a word.

It's most commonly used for subtitling anime, but it also works for audio provided you have a suitable player. These are sadly quite rare on the Mac.

The format looks similar to the one proposed by Yuval A:

{\K132}Unmei {\K34}no {\K54}tobira
{\K60}{\K132}yukkuri {\K36}to {\K142}hirakareta

The lengths are durations (in centiseconds) rather than absolute offsets. This makes it easier to shift the start of the line without recalculating all the offsets. The two consecutive tags at the start of the second line indicate a pause: the first one has a duration but no text.
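
To show how durations turn back into absolute offsets, here is a rough Java sketch that walks one karaoke line and accumulates the times. It assumes the durations are centiseconds, handles only the plain `\k` and `\K` tags, and ignores the rest of the SSA/ASS event syntax. `KaraokeLine` and `Segment` are made-up names, and the line's start time is passed in by the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the karaoke text of a single line; not a full SSA/ASS reader.
class KaraokeLine {

    /** One timed piece of text; an empty text means a pause. */
    static final class Segment {
        final long startMillis;
        final long durationMillis;
        final String text;

        Segment(long startMillis, long durationMillis, String text) {
            this.startMillis = startMillis;
            this.durationMillis = durationMillis;
            this.text = text;
        }
    }

    // Matches {\k<digits>} or {\K<digits>} followed by the text up to the next tag.
    private static final Pattern TAG = Pattern.compile("\\{\\\\[kK](\\d+)\\}([^{]*)");

    /** Converts per-syllable durations into absolute start times. */
    static List<Segment> parse(String line, long lineStartMillis) {
        List<Segment> segments = new ArrayList<>();
        long start = lineStartMillis;
        Matcher m = TAG.matcher(line);
        while (m.find()) {
            long duration = Long.parseLong(m.group(1)) * 10; // centiseconds -> milliseconds
            segments.add(new Segment(start, duration, m.group(2)));
            start += duration; // the next segment begins where this one ends
        }
        return segments;
    }
}
```

Shifting a whole line then only means changing `lineStartMillis`; the per-syllable durations stay untouched.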

Is there a good reason this needs to be part of your Java program, or is an off-the-shelf solution possible?

Marcus Downing