(This question was reworded; please, give it a second chance before deleting it again!)

It would be nice if there were a program that automatically transformed Perl code into Python code, producing a Python program that is as readable and maintainable as the original and, of course, works the same way.

The most obvious solution would just invoke perl via Python utils:

#!/usr/bin/python
import subprocess
PERL_SOURCE = r'''
...the original Perl program goes here...
'''
subprocess.run(["perl", "-e", PERL_SOURCE], check=True)

However, the result is hardly Python code; it is essentially Perl code. A real converter should translate Perl constructs and idioms into easy-to-read Python, it should retain variable and subroutine names (i.e. the result should not look obfuscated), and it should not disrupt the workflow too much.

Such a conversion is obviously very hard. How hard depends on the number of Perl features and syntactic constructs that have no easy-to-read, unobfuscated Python equivalents. I believe that the large number of such features makes automatic conversion practically impossible (even though it is theoretically possible).

So, could you please name Perl idioms and syntax features that can't be expressed in Python as concisely as in the original Perl code?

Edit: some people linked to Python-to-Perl converters and deduced from this that it should be easy to write a Perl-to-Python one as well. However, I'm sure that converting to Python is in greater demand; still, that converter has not been written yet, while the reverse one already has! This only strengthens my belief that writing a good converter to Python is impossible.

+6  A: 

It is not impossible, it would just take a lot of work.

By the way, there is Perthon, a Python-to-Perl translator. It just seems like nobody is willing to make one that goes the other way.

EDIT: I think I might have found the reason why a Python-to-Perl translator is much easier to implement: Python lets you fiddle with a script's AST. See the parser module.
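
As a small illustration of that point, the sketch below uses Python's standard `ast` module (substituted here for the `parser` module named above, which recent Python versions no longer ship) to parse a statement and list the variable names in it. The sample statement and its variable names are made up for the example.

```python
import ast

# A made-up statement standing in for the script to be translated.
source = "total = price * quantity + 1"

# ast.parse returns the root node of the abstract syntax tree.
tree = ast.parse(source)

# ast.walk visits every node; Name nodes are variable references.
names = sorted({node.id for node in ast.walk(tree) if isinstance(node, ast.Name)})
print(names)
```

A translator working from Python source can start from a tree like this instead of writing its own parser.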

NullUserException
Your answer is not helpful. A helpful answer would elaborate on *why* this would take a lot of work. And your edits don't make it any clearer.
Pavel Shved
The same reason why it takes a lot of work to write a program that translates from English to Chinese.
NullUserException
+1: A Lot of work.
S.Lott
@NullUser, *why* the same? It's not clear how natural language translation relates to processing of formal languages.
Pavel Shved
@Pavel The easiest way would be to try and write a converter/translator yourself, then you'd be way better qualified to answer that question than I am :)
NullUserException
@Pavel: Natural languages should technically be *easier* to translate than formal languages, because formal languages are not required to have any fundamental similarities, while natural languages are required to, or at least in practice do.
Jon Purdy
@Jon Purdy - I'm no expert, but it seems that natural languages have more "long range" structure requirements. That is, you could translate a function one statement at a time, but text needs to be translated sentences or paragraphs at a time for even cursory success. Wouldn't that make it more difficult?
detly
@detly: It's not so much that the *structure* of natural language is complex. I think you're inadvertently referring to the difficulty of capturing the *sense* of a text, which admittedly in formal languages *is* significantly easier because the meaning is formally specified, whereas in natural language it's more ill-defined. Arguably, no translation is complete without understanding, and current computers are much better at understanding the precise than the fuzzy.
Jon Purdy
A: 

NullUserException basically summed it up - it certainly can be done; it would just be an enormous amount of effort to do so. Some language conversion utilities I've seen compile to an intermediate language (such as .NET's CIL) and then decompile that to the desired language. I have not seen any for Perl to Python. You can, however, find a Python to Perl converter here, though that's likely of little use to you unless you're trying to create your own, in which case it may provide some helpful reference.

Edit: if you just need the exact functionality in a Python script, PyPerl may be of some use to you.

Ryan Mentley
The approach of using intermediate languages and decompilation usually involves a dramatic decrease in readability and loss of the program's structure. And a Python-to-Perl converter is a completely different task; it doesn't shed any light on the reverse.
Pavel Shved
@Pavel I believe use of an intermediate language is the preferred technique. As for readability, it is the job of the intermediate-to-Python converter to generate well formed and readable Python code. As for structure, Python and Perl programs are structured differently so I don't think you would really want a Python program that was structured like Perl, if that were the case just use Perl! Basically, after you have a reasonable Perl-to-intermediate converter, you can focus all your development effort on generating a good intermediate-to-Python converter.
Fredrick Pennachi
+21  A: 

Why Perl is not Python.

  1. Perl has statements which Python more-or-less totally lacks. While you can probably contrive matching statements, the syntax will be so utterly unlike Perl as to make it difficult to call it a "translation". You'd really have to cook up some fancy Python stuff to make it as terse as the original Perl.

  2. Perl has run-time semantics which are so unlike Python as to make translation very challenging. We'll look at just one example below.

  3. Perl has data structures which are different enough from Python's that translation is hard.

  4. Perl threads don't share data by default. Only selected data elements can be shared. Python threads follow the more common "shared everything" model.

One example of #2 should be enough.

Perl:

do_something || die()

Where do_something is any statement of any kind.

To automagically translate this into Python you'd have to wrap every || die() statement in

try:
   python_version_of_do_something
except OrdinaryStatementFailure as e:
   die()
   sys.exit()

Where the more common formulation

Perl

do_something

Would become this using simple -- unthinking -- translation of the source

try:
   python_version_of_do_something
except OrdinaryStatementFailure as e:
   pass

And, of course,

Perl

do_this || do_that || die()

Is even more complex to translate into Python.

And

Perl

do_this && do_that || die()

really pushes the envelope. My Perl is rusty, so I can't recall the precise semantics of this kind of thing. But you have to totally understand the semantics to work out a Pythonic implementation.

The Python examples are not good Python. Writing good Python requires "thinking", something an automatic translator can't do.

And every Perl construct would have to be "wrapped" like that in order to get the original Perl semantics into a Pythonic form.

Now, do a similar analysis for every feature of Perl.
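
For contrast with the mechanical wrapping above, here is a minimal hand-written sketch of the `do_this && do_that || die()` chain. It assumes the translated functions signal failure with a falsy return value rather than an exception. The `die` helper and `run_chain` are hypothetical names, and `die` only approximates Perl's behaviour (it ignores `$!`, `$?` and `eval`).

```python
import sys


def die(message="Died"):
    """Hypothetical stand-in for Perl's die: report to stderr, exit non-zero."""
    print(message, file=sys.stderr)
    raise SystemExit(255)


def run_chain(do_this, do_that):
    # `and` binds tighter than `or`, just as `&&` binds tighter than `||`,
    # so this reads as (do_this() and do_that()) or die(...).
    return do_this() and do_that() or die("chain failed")


# Both steps succeed, so the value of the last step is returned.
print(run_chain(lambda: 1, lambda: 2))
```

Choosing between return values and exceptions as the failure signal is exactly the kind of decision that needs knowledge of what the original code intended.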

S.Lott
OUCH! And people claim that Python is easy to read??? :)
DVK
Straw man, anyone? Python is easy to read. It's just not easy to read Python that's been written in Perl.
Nathon
@DVK: You missed the point entirely. So I revised the answer to attempt to clarify it. **Unthinking**, **automatic** translation of Perl to Python leads to problems. With a tiny scrap of actual human **thinking**, a better, clearer Pythonic statement is possible. But the question is not about **thinking**, human translation. It's about **unthinking**, **automated** translation.
S.Lott
Even with a few tries you're not dying the perl way: Outside an eval, prints the value of LIST to STDERR and exits with the current value of $!. If $! is 0 , exits with the value of ($?>> 8). If ($?>> 8) is 0 , exits with 255 . Inside an eval(), the error message is stuffed into $@ and the eval is terminated with the undefined value. If the last element of LIST does not end in a newline, the current script line number and input line number (if any) are also printed, and a newline is supplied. If the output is empty and $@ already contains a value that value is reused after appending.
DiggyF
@Nathon, It's *still* Python, though. So if the argument is that in order to do what somebody calling a Perl function expects, Python gets ugly, it *is* Python that's the "problem". Whether this is a pseudo-problem or not, is left to the reader.
Axeman
@S.Lott - I was trying to be funny (obviously, unsuccessfully). My comment was not really related to the substance of your answer.
DVK
Sure, Axeman. Python is not good at doing what Perl programmers expect Perl functions to do. That may or may not be a shortcoming, depending on your perspective. Like any language, Python code must be written in Python in order to appear sane. The same case can be made for C or Lisp code written for the Python interpreter. My point is that syntactic Python is not the same as idiomatic Python, which is the desired output of the OP's program.
Nathon
@Nathon: Agreed. And idiomatically "correct" translation requires **understanding**. Which requires deducing the **intent** behind some random block of code. Rather hard to do for any random block of code in any language.
S.Lott
+1  A: 

The B set of modules by Malcolm Beattie would be the only sane starting point for something like this, though I'm with other answers in that this would be a difficult problem to solve. In general, translating the sense of one high-level language into another high-level language requires a high-level translator, and, for the time being, that can mean only a human.

The difficulty of this problem, for any pair of languages, is due to fundamental differences in the nature of the languages in question, such as runtime semantics and common idioms, not to mention libraries.

Jon Purdy
Libraries may be of help, rather than an obstacle. A translation of a library call may be another library call, which we can implement. While the translation of a built-in feature as a library call doesn't look that nice.
Pavel Shved
+5  A: 

Perl can experimentally be built to collect additional information (for instance, comments) during compilation of perl code and even emit the results as XML. There doesn't appear to be any documentation of this outside the source, except for: http://search.cpan.org/perldoc/perl5100delta#MAD

This should be helpful in building a translator. I'd expect you to get 80% of the way there fairly easily, 95% with great difficulty, and never much better than that. There are too many things that don't map well.

ysth
"too many things that don't map well" - could you slightly mention a couple of them? Just to be less abstract...
Pavel Shved
a few: pack and unpack, formats, lvalue substrings, closures, local()
ysth
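
To make the first item concrete: Python's closest standard tool is the `struct` module, whose format strings differ from Perl's `pack` templates. The sketch below assumes the Perl call `pack("nN", 1, 2)` (a 16-bit and a 32-bit big-endian unsigned integer) and shows one Python counterpart. A translator would need a mapping like this for every template letter, and some Perl letters have no direct `struct` code.

```python
import struct

# ">" selects big-endian; "H" is an unsigned 16-bit and "I" an unsigned 32-bit integer.
packed = struct.pack(">HI", 1, 2)

# struct.unpack reverses the operation and returns a tuple.
first, second = struct.unpack(">HI", packed)
print(packed, first, second)
```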
@ysth, thank you.
Pavel Shved
I see assertions that python has closures, but the examples then given are essentially just function pointers with no closing happening. So I'm probably wrong about that.
ysth
+4  A: 

Fundamentally, these are two different languages. Converting from one to the other while keeping the result mostly readable would mean that the software has to recognize and generate code idioms and do some static analysis.

The meaning of a program may be exactly defined by the language definition, but the programmer did not necessarily require all the details. A C programmer testing if the value a printf() returned is negative is checking for an error condition, and doesn't typically care about the exact value. if (printf("%s","...") < 0) exit(); can be translated into Perl as print "..." or die();. These statements may not mean exactly the same thing, but they'll typically be what the programmer means, and to create idiomatic C or Perl code from idiomatic Perl or C code the translator must take this into account.

Since different computer languages tend to have slightly different semantics for similar things, it's typically impossible to translate one language into another and come up with exactly the same meaning in readable form. To create readable code, the translator needs to understand what the programmer was intending to do, and that's really difficult.

In addition, it would be easier to translate from Python to Perl rather than Perl to Python. Python is intended as a straightforward language with clear standard ways to do things, while Perl is an unduly complex language with the motto "There's More Than One Way To Do It." Translating a Python expression into one of the innumerable corresponding Perl expressions is easier than figuring out what the Perl programmer meant and expressing it in Python.

David Thornley
+20  A: 

Your best Perl to Python converter is probably 23 years old, just graduated university and is looking for a job.

glowcoder
Without a useful contribution, you're turning a serious question into a novel joke.
Evan Carroll
@Evan: It's probably true, though.
Paul Nathan
+9  A: 

Just to expand on some of the other lists here, these are a few Perl constructs that are probably very clumsy in python (if possible).

  • dynamic scope (via the local keyword)
  • typeglob manipulation (multiple variables with the same name)
  • formats (they have a syntax all their own)
  • closures over mutable variables
  • pragmas
  • lvalue subroutines (mysub() = 5; type code)
  • source filters
  • context (list vs scalar, and the way that called code can inspect this with wantarray)
  • type coercion / dynamic typing
  • any program that uses string eval

The list goes on and on, and someone could try to create a mapping between all of the analogous constructs, but in the end it will fail for one simple reason.

Perl can not be statically parsed. The definitions in Perl code (particularly those in BEGIN blocks) change the way the compiler is going to interpret the remaining code. So for non-trivial programs, conversion from Perl => Python suffers from the halting problem.

There is no way to know exactly how all of the program will be compiled until the program has finished running, and it is theoretically possible to create a Perl program that compiles differently every time it is run. This means that one Perl program could map to an infinite number of Python programs, the correct one of which is only known after running the original program in the perl interpreter.

Eric Strom
Also, Python handles scope semantics differently.
Paul Nathan
Good points, also with a lack of formal semantics it is not possible to know exactly what a Perl program does let alone recreate that functionality in another language.
Amoss
XS is another tricky spot, though that blurs the line about whether or not it's really "Perl".
Michael Carman
Thank you for your answer, it was even the accepted one (before I made the question CW). Your proof of impossibility is correct, but it's not *practical*. For most non-trivial programs I saw, `BEGIN` blocks are nevertheless not very complex, and affect the rest of the code in such a way that they can be transformed. So, this point does belong to the list, but can't replace it.
Pavel Shved
@Pavel => I think you're underestimating the scope of the issue. While the thought exercise in my answer is admittedly an edge case, every single `sub ... {}` definition in a Perl program has an implicit `BEGIN` block around it. As does every `use` statement. The simple example of `BEGIN {*{(caller)."::$_"} = sub {...} for qw(list of names)}` defies any sort of static parsing (and that pattern is fairly common practice (exporting of subroutines from modules is done this way)). And each of these declarations could completely change the way the following code is parsed.
Eric Strom
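
A rough Python analogue of that export pattern shows the same obstacle for static analysis: the names exist only after code has run. In this sketch, `install`, `exports` and the generated function bodies are hypothetical.

```python
def install(names, namespace):
    """Create one function per name and bind it in the given namespace."""
    for name in names:
        # Bind the current name as a default argument so each
        # generated function remembers its own name.
        def generated(*args, _name=name):
            return (_name, args)

        generated.__name__ = name
        namespace[name] = generated


exports = {}
install(["list", "of", "names"], exports)
print(exports["of"](1, 2))
```

Nothing in the source text mentions a function called `of`; a tool that only reads the file cannot know it will exist.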
@Eric, it's unnecessary to treat `sub`-s as `BEGIN` blocks; we can convert them as `sub`-s instead. Also, Python has `import`, which can be used instead of `use Something qw(a b c)`. Python is also capable of creating class methods dynamically (see [here](http://stackoverflow.com/questions/533382/dynamic-runtime-method-creation-code-generation-in-python)), which can be used to change dynamically which subroutines can be called from a module. And if Perl code uses `Exporter`, that's an idiom that can be converted; no expansion into `BEGIN` required!
Pavel Shved
@Eric, I mean, "while each of these declarations could completely change the way the following code is parsed", they most likely do not: they just play around with symbol tables. Yes, tricky, yes, sometimes impossible, but this is just the same impediment as the rest of the list we're gathering here.
Pavel Shved
@Pavel => consider trying to parse the following statement: `my @result = foo bar 1, 2, 3;`. If `foo` has a `(@)` prototype and `bar` has a `($)` prototype, the statement will parse like this: `my @result = foo( bar(1), 2, 3)`. However, if `foo` has a `($)` prototype, it would parse like this: `(my @result = foo(bar(1))), 2, 3;`, and if they both had `(@)` prototypes: `my @result = foo(bar(1, 2, 3));`. As you can see, the prototypes alone can completely change the way the line containing their barewords gets parsed, and these parsing rules can change back and forth as more prototypes are added.
Eric Strom
... and if it turns out that the prototyped subs are added programmatically (which is not unheard of; there is no one way to export subs in Perl, you could use Exporter or write your own methods), there is no way to statically determine what they will be, and thus no way to accurately continue with the parse beyond the BEGIN block that introduced them. This problem of self-mutating parser rules is one of the primary barriers to automatic code conversion, because it gives truth to the statement "only perl can parse Perl" (little p being the interpreter binary).
Eric Strom
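
The three parses described above can be reproduced with a toy model. This is not Perl's grammar, only a sketch that treats a `($)` prototype as "take exactly one following term" and a `(@)` prototype as "take everything that follows". `parse`, `parse_list` and `parse_term` are hypothetical helpers, and the commas are assumed to be already stripped from the token list.

```python
def parse_term(tokens, pos, prototypes):
    """Parse one term starting at pos; return (term, next_pos)."""
    tok = tokens[pos]
    pos += 1
    proto = prototypes.get(tok)
    if proto == "$":
        # Unary: consumes exactly one following term.
        arg, pos = parse_term(tokens, pos, prototypes)
        return (tok, [arg]), pos
    if proto == "@":
        # List operator: consumes every remaining term.
        args, pos = parse_list(tokens, pos, prototypes)
        return (tok, args), pos
    return tok, pos  # a plain literal


def parse_list(tokens, pos, prototypes):
    """Parse terms until the tokens run out; return (items, next_pos)."""
    items = []
    while pos < len(tokens):
        term, pos = parse_term(tokens, pos, prototypes)
        items.append(term)
    return items, pos


def parse(tokens, prototypes):
    return parse_list(tokens, 0, prototypes)[0]


tokens = ["foo", "bar", 1, 2, 3]
for protos in ({"foo": "@", "bar": "$"},
               {"foo": "$", "bar": "$"},
               {"foo": "@", "bar": "@"}):
    print(protos, parse(tokens, protos))
```

The same five tokens yield three different trees depending only on the `prototypes` table, which in real Perl may not be known until the preceding code has run.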
+3  A: 
  • Python scope and namespace are different from Perl.

  • In Python, everything is an object. In Perl, everything under the hood seems to be a list/hash/scalar/reference/function. This induces different design approaches and idioms.

  • Perl has anonymous code blocks and can generate closures on the fly with some branches. I am pretty sure that is not a python feature.

I do think that a very smart chap could statically analyze the bulk of Perl and produce a program that takes small Perl programs and outputs Python programs that do the same job.

I am much more doubtful about the feasibility of large and/or gnarly Perl translation. Some of us write some really funky code at times.... :)
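
On the closure point above, a hedged note: Python functions defined inside other functions do close over the enclosing variables, and Python 3's `nonlocal` lets the inner function rebind them. A minimal sketch, with `make_counter` as a made-up name:

```python
def make_counter(start=0):
    count = start

    def increment():
        # nonlocal makes the assignment rebind the enclosing variable
        # instead of creating a new local one.
        nonlocal count
        count += 1
        return count

    return increment


counter = make_counter()
other = make_counter(10)
print(counter(), counter(), other())
```

Each call to `make_counter` produces an independent closure, so the two counters do not interfere with each other.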

Paul Nathan
AFAIK, in Python object is a dict - same in Perl, because it was copied from Python :)
Alexandr Ciornii
@Alex: not exactly, I don't *think*. My introspections into types and objects in Python suggest to me that there is a fundamental difference - although fields do get stored in the local dict.
Paul Nathan
@Alexandr Ciornii: err, what are you claiming Perl copied from Python? You might want to check the relevant dates...
ysth
A: 

This is impossible just because you can't even properly parse perl code. See Perl Cannot Be Parsed: A Formal Proof for more details.

Hynek -Pichi- Vychodil
Arguably, though, a *reasonably large subset* of Perl can be parsed.
Paul Nathan