Hi, I've been working on a tool to transcribe recordings of speech with Javascript. Basically I'm hooking up key events to play, pause, and loop a file read in with the audio
tag.
There are a number of advanced existing desktop apps for doing this sort of thing (such as Transcriber -- here's a screenshot). Most transcription tools have a built-in waveform that can be used to jump around the audio file, which is very helpful because the transcriber can learn to visually find and repeat or loop phrases.
I'm wondering if it's possible to emulate a subset of this functionality in the browser, with Javascript. I don't know much about signal processing, perhaps it's not even feasible.
But what I envision is Javascript reading the sound stream from the file, and periodically sampling the amplitude. If the amplitude is very low for longer than a certain threshhold of time, then that would be labled as a phrase break.
Such labeling, I think, would be very useful for transcription. I could then set up key commands to jump to the previous period of silence. So hypothetically (imagining a jQuery-based API):
var audio = $('audio#someid');
var silences = silenceFindingVoodoo(audio);
silences
will then contain a list of times, so I can hook up some way to let the user jump around through the various silence
s, and then set the currentTime
to a chosen value, and play it.
Is it even conceivable to do this sort of thing with Javascript?