views:

907

answers:

6

In my new job more people are using Python than Perl, and I have a very useful API that I wrote myself and I'd like to make available to my co-workers in Python.

I thought that a compiler that compiled Perl code into Python code would be really useful for such a task. Before trying to write something that parsed Perl (or at least, the subset of Perl that I've used in defining my API), I came across bridgekeeper from a consultancy.

It's almost certainly not worth the money for me to engage a consultancy to translate this API, but that's a really interesting tool.

Does anyone know of a compiler that will parse (or try to parse!) Perl5 code and compile it into Python? If there isn't such a thing, how should I start writing a simple compiler that parses my object-oriented Perl code and turns it into Python? Is there an ANTLR or YACC grammar that I can use as a starting point?

Edit: I found perl.y, which might be a starting point if I were to roll my own compiler.

+5  A: 

I never tried it and it seems unmaintained, but maybe PyPerl is an option?

How big is this API? If it really this useful then why don't you rewrite it in python. Writing an automatic converter will probably take longer then rewriting the API.

And even if you manage to automatically rewrite it, the resulting code probably won't be very pythonic anyway.

Be sure to check out the answers by weismat and eliben

jitter
or Inline::Python
Alexandr Ciornii
+10  A: 

James,

I recommend you to just rewrite the module in Python, for several reasons:

  1. Parsing Perl is DARN HARD. Unless this is an important and desirable exercise for you, you'll find yourself spending much more time on the translation than on useful work.
  2. By rewriting it, you'll have a great chance to practice Python. Learning is best done by doing, and having a task you really need done is a great boon.
  3. Finally, Python and Perl have quite different philosophies. To get a more Pythonic API, it's best to just rewrite it in Python.
Eli Bendersky
PERL is FULL of constructs that don't trivially translate to Python. PERL requires a human to read and understand the intent behind the code.
S.Lott
s/PERL/Perl/g (unless you're shoutier than I suspect)
ijw
@ijw: what does "s/PERL/Perl/g" mean?
Bastien Léonard
@Bastien: it's a Perl regex substitution command, and means "substitute PERL by Perl globally (i.e. all occurrences)"
Eli Bendersky
@S.Lott. Indeed, but the OP already knows Perl and has written the module in question, so I conjectured it's safe to assume he understands the intent of what he's written.
Eli Bendersky
+8  A: 

I think you should rewrite your code. The quality of the results of a parsing effort depends on your Perl coding style. I think the quote below sums up the theoretical side very well. From Wikipedia:Perl in Wikipedia

Perl has a Turing-complete grammar because parsing can be affected by run-time code executed during the compile phase.[25] Therefore, Perl cannot be parsed by a straight Lex/Yacc lexer/parser combination. Instead, the interpreter implements its own lexer, which coordinates with a modified GNU bison parser to resolve ambiguities in the language.

It is often said that "Only perl can parse Perl," meaning that only the Perl interpreter (perl) can parse the Perl language (Perl), but even this is not, in general, true. Because the Perl interpreter can simulate a Turing machine during its compile phase, it would need to decide the Halting Problem in order to complete parsing in every case. It's a long-standing result that the Halting Problem is undecidable, and therefore not even Perl can always parse Perl. Perl makes the unusual choice of giving the user access to its full programming power in its own compile phase. The cost in terms of theoretical purity is high, but practical inconvenience seems to be rare.

Other programs that undertake to parse Perl, such as source-code analyzers and auto-indenters, have to contend not only with ambiguous syntactic constructs but also with the undecidability of Perl parsing in the general case. Adam Kennedy's PPI project focused on parsing Perl code as a document (retaining its integrity as a document), instead of parsing Perl as executable code (which not even Perl itself can always do). It was Kennedy who first conjectured that, "parsing Perl suffers from the 'Halting Problem'."[26], and this was later proved.[27]

weismat
+6  A: 

Starting in 5.10, you can compile perl with the experimental Misc Attribute Decoration enabled and set the PERL_XMLDUMP environment variable to a filename to get an XML dump of the parse tree (including comments - very helpful for language translators). Though as the doc says, this is a work in progress.

ysth
+3  A: 

As much as it might be fun to convert it to or rewrite it in python, I wouldn't make either of those my first choice. Then you'd be stuck with a forked code base. Any modifications you make will have to be duplicated.

Write some sort of wrapper for your API that you can access from outside of Perl. One possibility is a RESTful interface. Another, if you don't want to deal with networking issues, is to create a set of command line tools that access the API (possibly passing information as JSON). Then you can write an easy python library which accesses the wrapper API using httplib2 or subprocess (depending on how you've implemented the wrapper).

You'll still have to update the Python API whenever the interface changes, but now it's only for interface changes.

jcdyer
Or I could just switch to Python. That wouldn't be so bad!
James Thompson
+3  A: 

You could try writing a parser with PPI, dump it to some intermediary form and write Python mecanically from there. Hard, but doable. Useful? Er....

Or you could port your code to Perl 6, wait to Pynie to be ready enough to allow direct call from Python to Perl6 within the same runtime! It's not that far away after all. Too bad Ponie's dead though.

wazoox