I have had severe-to-profound deafness since a very early age, but luckily I can speak normally. Verbal communication has always been difficult for me because my speech recognition is impaired, even with lip-reading. I got through school and college just by reading boards, PowerPoint slides, books and the internet. I am doing fine at my current software engineering job, but lately I feel that I should put some effort into making my situation better.
Subtitles are my lifesaver in this country for understanding movies and shows on TV, and I have only been able to enjoy them for the last 7 years (I am 31 now).
I strongly feel the need to see subtitles in real life whenever I talk to someone, even a stranger. I want to develop an untrained speech-to-text converter, meaning one that needs no per-speaker training. As a start it does not even have to spell out exact words for me; cues on syllables or phonetics would be enough.
I have searched for this for a while, but most results are either text-to-speech or half-baked attempts at speech recognition for giving voice commands to a computer. I would really like some pointers on how to start this project. Specifically, I need steps such as how to deal with audio files and what kind of processing I have to do to get approximate phonetics as fast as possible.
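To make the question concrete, here is a minimal sketch of the kind of front-end processing I imagine might be a first step. It is only an assumption about where to start, not a known-good method. It cuts audio samples into short overlapping frames and labels each frame as silence, voiced, or unvoiced using short-time energy and zero-crossing rate. The function names, the 25 ms frame with 10 ms hop, and both thresholds are hypothetical values chosen for illustration. The samples are assumed to be floats in the range -1 to 1 (for example, decoded from a WAV file with Python's `wave` module).

```python
import math


def frame_signal(samples, sample_rate, frame_ms=25, hop_ms=10):
    """Split samples into overlapping fixed-length frames."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop_len)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def classify_frame(frame, energy_floor=1e-4, zcr_split=0.25):
    """Very rough cue: quiet -> silence, many sign changes -> unvoiced.

    Both thresholds are illustrative assumptions, not tuned values.
    """
    if short_time_energy(frame) < energy_floor:
        return "silence"
    return "unvoiced" if zero_crossing_rate(frame) > zcr_split else "voiced"


def sine(freq_hz, seconds, sample_rate, amplitude=0.5):
    """Synthetic test tone standing in for real recorded audio."""
    n = int(seconds * sample_rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


if __name__ == "__main__":
    rate = 16000
    # 0.1 s of silence, then a low tone, then a high tone.
    signal = [0.0] * 1600 + sine(200, 0.1, rate) + sine(6000, 0.1, rate)
    for index, frame in enumerate(frame_signal(signal, rate)):
        print(index, classify_frame(frame))
```

Real speech is far messier than these synthetic tones, so this only shows the framing and per-frame feature idea. Getting from such cues to actual phonemes is the part I need pointers on.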