There is a Perl library I would like to access from within Python. How can I use it?

FYI, the software is NCleaner. I would like to use it from within Python to transform an HTML string into text. (Yes, I know about aaronsw's Python html2text. NCleaner is better because it removes boilerplate.)

I don't want to run the Perl program as a script and call it repeatedly, because it has an expensive initial load time and I am calling it many times.

+10  A: 

pyperl provides Perl embedding for Python, but honestly it's not the way I'd go. I second Roboto's suggestion -- write a script that runs NCleaner (either processing from stdin to stdout, or working on temporary files, whichever one is more appropriate), and run it as a subprocess.
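To keep the expensive load from happening on every call, the subprocess can be started once and left running. Here is a minimal sketch of that idea, not a tested NCleaner integration. It assumes you write a small Perl wrapper (called `ncleaner_worker.pl` here, a made-up name) that loads NCleaner once, then reads one JSON line per document from stdin and prints one JSON line back, flushing after each reply. The `{"html": ...}` / `{"text": ...}` message shape is my own assumption as well.

```python
import json
import subprocess


class PersistentCleaner:
    """Talk to one long-running worker process, one JSON line per document."""

    def __init__(self, command):
        # `command` is the argv of the (hypothetical) wrapper script, e.g.
        # ["perl", "ncleaner_worker.pl"]. It is started once and reused.
        self._proc = subprocess.Popen(
            command,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            encoding="utf-8",
            bufsize=1,  # line-buffered on the Python side
        )

    def clean(self, html):
        # json.dumps escapes embedded newlines, so each request is one line.
        self._proc.stdin.write(json.dumps({"html": html}) + "\n")
        self._proc.stdin.flush()
        reply = self._proc.stdout.readline()
        if not reply:
            raise RuntimeError("worker exited unexpectedly")
        return json.loads(reply)["text"]

    def close(self):
        # Closing stdin lets the worker see EOF and exit on its own.
        self._proc.stdin.close()
        self._proc.wait()
```

You would create `PersistentCleaner(["perl", "ncleaner_worker.pl"])` once, call `clean()` for each HTML string, and call `close()` at the end. The worker must flush its output after every reply, otherwise `readline()` will block waiting on a buffered response.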

Or, since I see from the NCleaner page that it has a C implementation, use whatever facilities Python has for binding to C code and write a Python module that wraps the NCleaner C implementation. Then in the future the answer to invoking NCleaner from Python will just be "here, use this module."
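One of those facilities is the standard-library `ctypes` module. I have not looked at what functions the NCleaner C code exports, so the sketch below only shows the general binding pattern, using `strlen` from the system C library as a stand-in. The helper name `bind_string_function` is mine, and the library lookup assumes a POSIX system.

```python
import ctypes
import ctypes.util


def bind_string_function(library_path, function_name, restype):
    """Load a shared library and declare one function taking a C string."""
    lib = ctypes.CDLL(library_path)
    func = getattr(lib, function_name)
    # Declaring argument and return types lets ctypes convert values safely.
    func.argtypes = [ctypes.c_char_p]
    func.restype = restype
    return func


# Stand-in target: strlen from libc. For NCleaner you would instead point at
# its compiled shared library and whatever function it actually exports.
libc_path = ctypes.util.find_library("c")
strlen = bind_string_function(libc_path, "strlen", ctypes.c_size_t)
```

A real wrapper module would hide these details behind a plain Python function that accepts and returns `str`, encoding to bytes on the way in and decoding on the way out.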

Footnote: Inline::Python is better code than pyperl, and I would suggest using that instead, but it only supports having Python call back to Perl when Python is invoked from Perl in the first place -- the ability to embed Perl into Python is listed as a possible future feature, but it's been so since 2001, so don't hold your breath.

hobbs