Let me explain. Suppose I want to teach Python to someone who only speaks Spanish. As you know, in most programming languages all keywords are in English. How complex would it be to create a program that will find all keywords in a given source code and translate them? Would I need to use a parser and stuff, or will a couple of regexes and string functions be enough?

If it depends on the source programming language, then Python and JavaScript would be the most important.

What I mean by "how complex would it be" is: would it be enough to have a list of keywords and scan the source code for keywords that are not in quotes? Or are there enough syntactical weirdnesses that something more complicated is required?

A: 

The problem you will encounter is that, unless you have strict coding standards, people will not necessarily follow a consistent pattern in how they write their code. And in any dynamic language you will have the problem that code passed to the eval function contains keywords within quotes.
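
To make the eval problem concrete, here is a minimal Python 3 sketch (the helper name `name_tokens` is made up for illustration). Python's standard tokenizer reports the whole argument of eval as one STRING token, so a keyword translator that works on tokens would never look inside it:

```python
import io
import tokenize

def name_tokens(source):
    """Return the text of every NAME token (keywords and identifiers) in source."""
    readline = io.StringIO(source).readline
    return [tok.string
            for tok in tokenize.generate_tokens(readline)
            if tok.type == tokenize.NAME]

# Written as ordinary code, the keywords are visible to the tokenizer.
print(name_tokens("1 if x else 2\n"))           # ['if', 'x', 'else']

# Wrapped in eval, the same code is a single opaque string literal.
print(name_tokens('eval("1 if x else 2")\n'))   # ['eval']
```

A translator would have to decide, string by string, whether the contents are code or text, and that cannot be done reliably in general.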

If you are trying to teach a language, you could create a DSL that has keywords in Spanish, so that you can teach in your own language, and have it processed by Python or JavaScript. You would then basically have made your own language, with the constructs you want, for teaching.
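
One way such a teaching DSL could be sketched in Python 3, assuming it is nothing more than Python with renamed keywords: translate the Spanish words back to Python at the token level, then hand the result to exec. The table `PALABRAS` and the functions `to_python` and `run_spanish` are hypothetical names chosen for this example, not part of any library:

```python
import io
import tokenize

# Hypothetical Spanish keyword table; a real teaching language would need
# an entry for every keyword it wants to expose.
PALABRAS = {"para": "for", "en": "in", "si": "if", "mientras": "while"}

def to_python(spanish_source):
    """Translate Spanish keywords to Python ones, token by token."""
    readline = io.StringIO(spanish_source).readline
    pairs = []
    for tok in tokenize.generate_tokens(readline):
        text = tok.string
        if tok.type == tokenize.NAME:
            text = PALABRAS.get(text, text)
        pairs.append((tok.type, text))
    # Given (type, text) pairs, untokenize emits valid code with odd spacing.
    return tokenize.untokenize(pairs)

def run_spanish(spanish_source):
    """Execute the translated program and return its global namespace."""
    namespace = {}
    exec(to_python(spanish_source), namespace)
    return namespace

programa = """\
total = 0
para x en range(5):
    si x % 2:
        total += x
"""
print(run_spanish(programa)["total"])   # 4
```

Note that in this sketch the Spanish words become reserved: a student could no longer name a variable `si` or `en`, and strings and comments are left alone because they are not NAME tokens.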

Once they understand how to program, they will then need to start learning languages with the "English" keywords, so that they can communicate with others, but that could come after they understand how to program, if it would make your life easier.

So, to answer your question, there is enough syntactic weirdness that it would be considerably more complicated to translate the keywords.

James Black
A: 
Phil
I agree with this. The important parts are the documentation and the discussion, not the 20 or so keywords. I mean, most keywords aren't even English words (`method` is Greek, `routine` is French (I think), `function` is Latin, `lambda` isn't even a word, just a letter spelled out). And what kind of word is `=~`??? And even *if* they are English words, they usually don't mean what they mean in English anyway. `yield` is a good example. Heck, in most programming languages `yield` doesn't even mean what it normally means in computer science, plus it means *different* things in every language.
Jörg W Mittag
A: 

cdecl is one program which converts C declarations into human language.

Kinopiko
A great utility but doesn't answer the question.
Artelius
It's an example of a program which converts a programming language into a human language.
Kinopiko
That's not the question.
Javier Badia
The question is about how hard it would be to convert a programming language into a human language, so looking at an example of something which actually does that might help.
Kinopiko
No, the question is about converting a programming language from one human language to another.
Javier Badia
The principle is the same though.
Kinopiko
No it's not. I want to get a "Spanish Python" source file, translate it to "English Python" and it should be runnable.
Javier Badia
Aw, boo hoo. 15 characters.
Kinopiko
A: 

It would be impossible to make a translation that would handle every case. Take for example this JavaScript code:

var x = Math.random() < 0.5 ? window : { location : { href : '' } };
var y = x.location.href;

The x variable can either become a reference to the window object, or a reference to the newly created object. It would only make sense to translate the members if it's the window object, otherwise you would have to translate the variable names too, which would be a mess and could easily cause problems.

Besides, it's not really useful to know a programming language in the wrong human language. All the documentation and examples out there are going to be in the original language, so they would be useless to someone who learned the translated version.

Guffa
A: 

Keep in mind that English is the de facto language for tokens in commonly used programming languages. So, for purely educational purposes, teaching a translated language can be harmful for your student(s). But if you really want to translate a computer language's tokens, you should think about the following issues:

  • You should translate the language's primitive constructs. This is easy: you have to learn and use a parser generator like yacc or ANTLR.
  • You should translate the language's APIs. This can be painful and difficult: first, modern APIs like Java's are very extensive; second, you have to translate the APIs' documentation... no more words about that.
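
As a rough illustration of the second point, built-in functions are ordinary names rather than keywords, so in Python they can be given translated aliases without touching the parser at all. The Spanish names in `ALIAS` and the helper `run_with_aliases` are invented for this sketch, and the count at the end hints at how much API surface would still be left:

```python
import builtins

# Hypothetical Spanish aliases for a handful of Python built-ins.
ALIAS = {"rango": range, "longitud": len, "suma": sum, "lista": list}

def run_with_aliases(source):
    """Execute source with the Spanish aliases available as global names."""
    namespace = dict(ALIAS)
    exec(source, namespace)
    return namespace

resultado = run_with_aliases("n = suma(rango(4)) + longitud(lista('abc'))")
print(resultado["n"])   # 9

# Public names in the builtins module alone; every one would need an alias,
# and that is before the standard library and its documentation.
publicos = [name for name in dir(builtins) if not name.startswith("_")]
print(len(publicos))
```

Aliasing only renames the entry points. Error messages, docstrings and attribute names on the returned objects all stay in English.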
JPCF
+6  A: 

If all you want is to translate keywords, then (while you definitely DO need a proper parser, as otherwise avoiding any change in strings, comments &c becomes a nightmare) the task is quite simple. For example, since you mentioned Python:

import cStringIO
import keyword
import token
import tokenize

samp = '''\
for x in range(8):
  if x%2:
    y = x
    while y>0:
      print y,
      y -= 3
    print
'''

translate = {'for': 'per', 'if': 'se', 'while': 'mentre', 'print': 'stampa'}

def toks(tokens):
  for tt, ts, src, erc, ll in tokens:
    if tt == token.NAME and keyword.iskeyword(ts):
      ts = translate.get(ts, ts)
    yield tt, ts

def main():
  rl = cStringIO.StringIO(samp).readline
  toki = toks(tokenize.generate_tokens(rl))
  print tokenize.untokenize(toki)

main()

I hope it's obvious how to generalize this to "translate" any Python source and in any language (I'm supplying only a very partial Italian keyword translation dict). This emits:

per x in range (8 ):
  se x %2 :
    y =x 
    mentre y >0 :
      stampa y ,
      y -=3 
    stampa

(strange though correct whitespace, but that could be easily enough remedied). As an Italian speaker I can tell you this is terrible to read, but that's par for the course for any "programming language translation" as you desire. Worse, NON-keywords such as range remain un-translated (as per your specs) -- of course, you don't have to constrain your translation to keywords-only (it's easy enough to remove the if that does that above;-).
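
The listing above targets Python 2 (cStringIO, the print statement). As a hedged sketch of the whitespace remedy on Python 3, one option is to skip untokenize and splice each translated keyword back into its original line using the token's column offsets. The function name `translate_keywords` is invented here, and because print is a function rather than a keyword in Python 3, the `keyword.iskeyword` test leaves it alone:

```python
import io
import keyword
import tokenize

# Same very partial Italian table as in the answer, minus 'print'.
TRANSLATE = {"for": "per", "if": "se", "while": "mentre"}

def translate_keywords(source, table):
    """Replace keywords in place, leaving all other text untouched."""
    lines = io.StringIO(source).readlines()
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # Work backwards so earlier column offsets on a line stay valid.
    for tok in reversed(tokens):
        if tok.type == tokenize.NAME and keyword.iskeyword(tok.string):
            row, col = tok.start
            end_col = tok.end[1]
            new_text = table.get(tok.string, tok.string)
            line = lines[row - 1]
            lines[row - 1] = line[:col] + new_text + line[end_col:]
    return "".join(lines)

samp = '''\
for x in range(8):
  if x%2:
    y = x
    while y>0:
      print(y, end=" ")  # if and while stay put in comments
      y -= 3
    print()
'''
print(translate_keywords(samp, TRANSLATE), end="")
```

With this approach the first line comes out as `per x in range(8):` with the original spacing intact, and the comment is untouched because the tokenizer reports it as a COMMENT token rather than as NAME tokens.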

Alex Martelli
Just curious, why the accept w/o an upvote? That's really peculiar!-)
Alex Martelli
Forgot to vote. Here, have an upvote.
Javier Badia
Heh, not that it matters (I'm maxed out for the day anyway;-) -- I was just curious about the motivation (if any). Tx, anyway!-)
Alex Martelli
A: 

While I don't have an answer to the question, I think it's an interesting one. It brings up some issues which I have been thinking about:

  • As developing countries start introducing their population to higher technologies, naturally some will be interested in learning to program. Will English-only programming languages be an impediment?

  • Let's say a programming language was developed in a non-English part of the world: the keywords were written in the native language for that area and it used the native punctuation (e.g., « » instead of " ", a comma as the decimal point (123,45), and so forth). It's a fantastic programming language, generating lots of buzz. Do you think it would see widespread adoption? Would you use it?

Most English-speaking people answer "no" to the first question. Even non-English (but educated) people answer no. But they also answer "no" to the second question, which seems to be a contradiction.

Barry Brown
Ruby's creator is Japanese, but he made it in English
hasen j
Good point. Perhaps he realized that using Japanese keywords would be a detriment to its adoption.
Barry Brown
A: 

At one point I was thinking about something like that for bash scripts, but the idea can be implemented in other languages too:

#!/bin/bash

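PrintOnScreen() {
    # Print all arguments separated by single spaces, then a newline.
    echo "$*"
}
PrintOnScreenWithoutNewline() {
    # Print all arguments followed by one space and no newline.
    printf '%s ' "$*"
}
MathAdd() {
    # Print the sum of the two integer arguments.
    echo "$(( $1 + $2 ))"
}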

Save the functions above as HumanLanguage.sh; then we can source that file from some other script:

#!/bin/bash
. HumanLanguage.sh
PrintOnScreen Hello
PrintOnScreenWithoutNewline "Some number:"
MathAdd 2 3

This will produce:

Hello
Some number: 5
maxorq
A: 

You might find Perl's Lingua::Romana::Perligata interesting -- it allows you to write your perl programs in latin. It's not quite the same as your idea, as it essentially restructures the language semantics around Latin ideas, rather than just translating the strings.

Andrew Aylett