After seeing this question, I got to thinking about the various challenges that blind programmers face, and how some of them are applicable even to sighted programmers. Particularly, the problem of reading source code aloud gives me pause. I have been programming for most of my life, and I frequently tutor fellow students in programming, most often in C++ or Java.

It is uniquely aggravating to try to verbally convey the essential syntax of a C++ expression. The speaker must give either an idiomatic translation into English, or a full specification of the code in verbal longhand, using explicit yet slow terms such as "opening parenthesis", "bitwise and", et cetera. Neither of these solutions is optimal.

On the one hand, an idiomatic translation is only useful to a programmer who can de-translate back into the relevant programming code—which is not usually the case when tutoring a student. In turn, education (or simply getting someone up to speed on a project) is the most common situation in which source is read aloud, and there is a very small margin for error.

On the other hand, a literal specification is aggravatingly slow. It takes far, far longer to say "pound, include, left angle bracket, iostream, right angle bracket, newline" than it does to simply type #include <iostream>. Indeed, most experienced C++ programmers would read this merely as "include iostream", but again, inexperienced programmers abound and literal specifications are sometimes necessary.

So I've had an idea for a potential solution to this problem.

In C++, there is a finite set of keywords—63—and operators—54, discounting named operators and treating compound assignment operators and prefix versus postfix auto-increment and decrement as distinct. There are just a few types of literal, a similar number of grouping symbols, and the semicolon. Unless I'm utterly mistaken, that's about it.

So would it not then be feasible to simply ascribe a concise, unique pronunciation to each of these distinct concepts (including one for whitespace, where it is required) and go from there? Programming languages are far more regular than natural languages, so the pronunciation could be standardised. Speakers of any language would be able to verbally convey C++ code, and due to the regularity and fixity of the language, speech-to-text software could be optimised to accept C++ speech with a high degree of accuracy.
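A minimal sketch of what such a pronunciation table might look like in code. Every syllable below is invented purely for illustration, the table covers only a handful of tokens, and the helper names `pronunciations` and `speak` are hypothetical; the input is assumed to be already split into tokens.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical pronunciation table: every syllable here is made up for
// illustration and is not part of any standard.
const std::map<std::string, std::string>& pronunciations() {
    static const std::map<std::string, std::string> table = {
        {"#include", "inc"}, {"int", "in"}, {"return", "ret"},
        {"<", "la"}, {">", "ga"}, {"(", "po"}, {")", "pa"},
        {"{", "bo"}, {"}", "ba"}, {";", "se"},
    };
    return table;
}

// Turn already-split tokens into one spoken string. Tokens missing from the
// table (identifiers, literals) are passed through unchanged.
std::string speak(const std::vector<std::string>& tokens) {
    std::string out;
    for (const std::string& tok : tokens) {
        if (!out.empty()) out += ' ';
        auto it = pronunciations().find(tok);
        out += (it != pronunciations().end()) ? it->second : tok;
    }
    return out;
}
```

With this table, `speak({"#include", "<", "iostream", ">"})` gives "inc la iostream ga". A real scheme would also need a rule for identifiers and literals, which this sketch simply speaks as written.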

So my question is twofold: first, is my solution feasible; and second, does anyone else have other potential solutions? I intend to take suggestions from here and use them to produce a formal paper with an example implementation of my solution.

+3  A: 

Instead of creating new "words" to describe them, for things such as "include" you could simply prefix it with "keyword" when saying it aloud. You could use words/phrases commonly known to say other parts as well. As with any new programmer, you have to literally describe everything anyway, so I don't think that requires special attention. I think creating new words is the harder method...

So, for example:

#include <iostream>

int main()
{
   if (1 < 2)
     return 1;
   else
     return 0;
}

Could be read out as:

(keyword) include iostream new-line (keyword) int main no params start block if number 1 (operator) less than number 2 new-line (keyword) return number 1 new-line (keyword) else new-line (keyword) return number 0 end block

Treat words in () as optional descriptive words, most likely to be used in more complex code. You could use the word 'literal' if you want them to actually write the descriptive word. For example

(keyword) if literal number (operator) less than literal keyword

becomes

if (number < keyword)

Other words could be given defined meanings as well, such as 'split-line' when you want them to continue on the next line without closing any currently open parentheses, etc.
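As a rough sketch, the labelling scheme could be mechanised along these lines. The word lists are a small hypothetical subset, the function names `describe` and `read_aloud` are made up for this example, and the sketch always emits the descriptive words that are treated as optional above.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <set>
#include <string>
#include <vector>

// Label one token with a descriptive word. The keyword and symbol lists are
// a small, hypothetical subset chosen for illustration.
std::string describe(const std::string& tok) {
    static const std::set<std::string> keywords = {"if", "else", "return", "int"};
    static const std::map<std::string, std::string> symbols = {
        {"<", "operator less than"},
        {">", "operator greater than"},
        {"{", "start block"},
        {"}", "end block"},
    };
    if (keywords.count(tok)) return "keyword " + tok;
    auto it = symbols.find(tok);
    if (it != symbols.end()) return it->second;
    // A token made only of digits is read as a number.
    bool digits = !tok.empty() &&
        std::all_of(tok.begin(), tok.end(),
                    [](unsigned char c) { return std::isdigit(c) != 0; });
    if (digits) return "number " + tok;
    return tok;  // identifiers are spoken as written
}

// Join the descriptions of already-split tokens with single spaces.
std::string read_aloud(const std::vector<std::string>& tokens) {
    std::string out;
    for (const std::string& tok : tokens) {
        if (!out.empty()) out += ' ';
        out += describe(tok);
    }
    return out;
}
```

For example, `read_aloud({"if", "1", "<", "2"})` gives "keyword if number 1 operator less than number 2". Words such as 'literal' or 'split-line' would need extra rules that are not shown here.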

I personally find this method quite simple to use and easy to teach. YMMV, as always.

Of course, this doesn't solve the internationalisation issue, but at worst, would result in 'new words' being used in the non-English languages, which is no worse than the proposed solution you offered.

Dan McGrath
Plus one because your solution is pragmatic and optimal without the internationalisation constraint. I don't see why more programming languages don't support localised (Chinese Python) or language-neutral (APL) programming.
Jon Purdy
Thanks. Also, oops. I typed 'internalisation', didn't I? Haha. Spell-check doesn't fix everything :)
Dan McGrath
+2  A: 

So would it not then be feasible to simply ascribe a concise, unique pronunciation to each of these distinct concepts (including one for whitespace, where it is required) and go from there? Programming languages are far more regular than natural languages, so the pronunciation could be standardised

Perhaps, but you've lost sight of your goal. The premise was that the person listening did not already know the language. If he does, we can simply say "include iostream" when we mean #include <iostream>, or "vector of int" when we mean std::vector<int>.

Your premise was that the person listening is not familiar enough with the language to understand what you read out loud unless you read out exactly what it says.

Now, inventing a whole new language just to describe the primitives that occur in your source code doesn't solve the problem. Instead, you still have to read out every syntactic token (with simpler, more "standardized" pronunciations, yes, but they still have to be read out loud), and the person listening still won't understand you, because if they don't know C++ well enough to understand "include iostream", they won't understand your standardized pronunciation either. And if you're going to teach them your pronunciation, why bother, when you could've just taught them to understand C++ syntax directly instead?

There's also the root problem that C++ code tends to consist of a lot of syntactic tokens. Take a line as simple as this:

std::vector<int> v;

I count 8 tokens. Not one of them can be omitted. If the person listening does not understand the code and syntax well enough to understand a high-level description such as "declare a vector of int, named v", then you'll have to read out all 8 tokens in some form. Even if you come up with simpler names than "namespace resolution operator" and "less than sign", you still have to list 8 token names. Which is a lot of work.
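To make the token-by-token cost concrete, here is a deliberately simplified tokenizer sketch. It is not a real C++ lexer: it only knows identifier/number runs, the two-character `::`, and single-character punctuation, and the function name `tokenize` is hypothetical.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Split a line of code into tokens using three simplified rules:
// runs of letters, digits, and underscores; the "::" operator; and
// any other non-space character on its own.
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i;
        if (std::isalnum(c) || c == '_') {
            // Consume the whole identifier or number.
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_')) {
                ++i;
            }
        } else if (src.compare(i, 2, "::") == 0) {
            i += 2;  // scope resolution is a single token
        } else {
            ++i;     // any other punctuation character
        }
        tokens.push_back(src.substr(start, i - start));
    }
    return tokens;
}
```

Calling `tokenize("std::vector<int> v;")` lists each piece a speaker would have to read out in a fully literal reading, one vector entry per token.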

In short, no, I don't think it'd work. First, it's still too cumbersome, and second, it's presuming prior knowledge on the part of the person listening, when the motivation for this was that the person listening was a student without the prior knowledge that made it possible to understand a high-level description of the code.

jalf
I guess then that there are two sides to this problem. High-level descriptions are better suited to beginners, and low-level to experienced programmers. Those eight tokens could easily correspond to only as many syllables, provided that the standard library is also indexed, as, e.g., "sa na ve la i ga li v se", where "li" is a particle denoting that what follows it is a literal (in this case an identifier) in the native speaker's language. Even "li std na li vector la i ga li v se" isn't bad.
Jon Purdy
+2  A: 

As a blind developer, programming since I was 13, I found this question really interesting. First of all, as mentioned by other people, learning a new language to be able to understand code is not a practical solution, as it would probably take longer to learn the spoken utterances than it would to learn the actual programming language.

Reading the question and answers, two further points occurred to me:

  • Firstly, you'd be surprised how important "thinking time" is. I have previously programmed in C/C++/Java and now use C# as my primary language, and I consider myself very competent. But when I did a couple of projects in Python, I found that the reduced punctuation robbed me of my "thinking time": subconsciously, I was using the punctuation to digest what I'd just heard. Fascinating... However, the situation is a bit different when it comes to identifiers, as these aren't well known to the listener. I personally find it hard to listen to code with acronym variables (RGXRatio, RGVRatio), as I don't have time to figure out what they mean. On the flip side, Hungarian notation and initial underscores make code hard to listen to, as the variable names take much longer to speak than the more important operations being performed on those variables.
  • Another thing to consider is that the length of the audio stream is an end result, not the root cause. The audio is so long because audio is a one-dimensional medium, whereas reading text is a two-dimensional medium with the ability to jump around and skip past irrelevant or familiar text. It wouldn't work for a face-to-face lecture, but what if there were keyboard commands for controlling the speech? In text documents my screen reader lets me jump to the next line, but what if this were adapted to the semantics of a programming language? Some research, such as that by T V Raman at Google, includes using different voices for syntax highlighting, and audio cues to mark metadata like capitals.

I know the original question specifically related to a lecture given to a class, but if, like me, you have to listen to entire files of source code, the structure of the code also makes a huge difference. I personally read code like a story: left to right, top to bottom. So it's very hard to trace through unfamiliar code when it's written bottom-up.

Saqib
I'm curious how you read code. Do you type it out then have it read back to you?
Matthew
Do you think that you would find a more English-like programming language, or at least a more English-like pronunciation of an existing language, to be easier to parse auditorily?
Jon Purdy
Matthew: I hear characters/words as I type them, and can then review later line by line or character by character. The screen reader also reads focused elements, and this applies to autocompletion of source code as well. Jon: A more English-like language could be good, but you risk a leaky abstraction (think AppleScript). Reading a normal programming language in a more English way may be interesting, but I think it'd be rather cumbersome (my university teacher said "x becomes 5" instead of "x=5", "x is equivalent to 10" instead of "x==10" and "add 1 to the value of x" instead of "x++").
Saqib