views:

987

answers:

5

I'm working on some Java <-> Perl interaction. I would like to know what the best way is to pass information from Perl to Java. (Great answers about Perl and Java here and here btw).

I'm parsing a lot of text and XML (with XML::Twig) in Perl, in a script I'm supposed to call from a Java web app. So I have all this gathered data, and I need to use it inside certain objects in Java.

What could be a good strategy to send the information from Perl to Java? Is it even possible to return an object or other compatible data structure from Perl to Java?

I guess writing to a text file and reading it from Java would make all the optimization gained by using Perl meaningless.

Performance is an important issue here.

EDIT: From what I've seen here, maybe Inline-Java would be a good option?

+2  A: 

If you've got XML already, your best option is probably to continue using it.

Tequila Jinx
Our first implementation had Java parse the XML files, but we found Perl is much faster and more effective at this. So we are trying to delegate that job to Perl, which is better at it.
Fernando
You're probably doing something very strange if Perl is parsing XML faster than Java.
Tom Hawtin - tackline
We were using XPath to navigate the XML files and obtain node values. But isn't Perl oriented to that kind of work (text parsing and the like)?
Juan Manuel
I'd ignore Tom Hawtin's thought that somehow Java will beat Perl at XML parsing. When it comes to parsing text, Perl is king - http://www.tbray.org/ongoing/When/200x/2007/10/30/WF-Results
mpeters
The reason Perl might be faster than Java at parsing XML (I wouldn't know for sure) is that Perl doesn't parse XML, it uses standard C libraries to do the parsing (expat or libxml2) and modules to access those libraries.
runrig
+3  A: 

JSON is an easy, lightweight format for passing data around.

You could add Rhino or something similar to your toolkit and may gain additional performance (not to mention a scripting engine), but this will depend on how complex you plan your project to be.

dsm
+5  A: 

If performance is important I'd recommend having a persistent Perl process running. Starting a Perl interpreter every time you want to run your code will be quite an overhead.

The easiest way to communicate is for the Java process to open a TCP connection to the Perl process, write some data and get some back.
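As a rough sketch of the Java side of that idea: the client below opens a TCP connection, writes one line, and reads one line back. The class name `PerlClient`, the host and port, and the newline-terminated request/reply protocol are all assumptions made for illustration; the persistent Perl process would need to speak the same protocol.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical client for a long-running Perl worker listening on a TCP port.
final class PerlClient {

    // Sends one line to the worker and returns its one-line reply.
    // The line-based protocol is an assumption of this sketch.
    public static String request(String host, int port, String line) throws IOException {
        try (Socket socket = new Socket(host, port);
             BufferedWriter out = new BufferedWriter(
                     new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8));
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8))) {
            out.write(line);
            out.write('\n');
            out.flush();            // make sure the request leaves the buffer
            return in.readLine();   // null if the worker closed without replying
        }
    }
}
```

This opens a new connection per request to keep the sketch short. If the connection setup cost matters, keeping one socket open and reusing it would be the natural next step.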

The format you use to send data from your Perl process back to your Java one will depend on what you're sending and how generic and re-usable you want your code to be. Sending back XML strings will be nice and generic, but sending back byte arrays created with pack in Perl and then read with a DataInputStream in Java will be much, much faster.
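To make the pack / DataInputStream pairing concrete, here is a minimal sketch of the Java reading side. It assumes a record layout chosen for this example only: each record is a 32-bit big-endian length followed by that many bytes of UTF-8 text, which on the Perl side would correspond to something like `pack("N/a*", $bytes)` applied to already-encoded bytes. The class name `PackedRecordReader` is hypothetical.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for length-prefixed records written by a Perl process.
final class PackedRecordReader {

    // Reads records until the stream ends. Each record is assumed to be a
    // 32-bit big-endian length followed by that many bytes of UTF-8 text.
    public static List<String> readAll(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        List<String> records = new ArrayList<>();
        while (true) {
            int length;
            try {
                length = in.readInt();      // big-endian, like Perl's "N"
            } catch (EOFException endOfStream) {
                return records;             // no further length prefix: done
            }
            if (length < 0) {
                throw new IOException("record length does not fit in a signed int");
            }
            byte[] body = new byte[length];
            in.readFully(body);             // throws EOFException on a truncated record
            records.add(new String(body, StandardCharsets.UTF_8));
        }
    }
}
```

The same method works on a socket's input stream or on a byte array wrapped in a `ByteArrayInputStream`. Note that Perl's `N` is unsigned while `readInt` is signed, so lengths above about 2 GB are rejected here.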

Dave Webb
You can actually use Inline::Java this way. It can create a JVM process that communicates with your Perl process. It's a custom format of course, but it lets you use native Java objects in Perl and vice versa.
mpeters
A: 

I also find it strange that your Perl code for parsing XML would be significantly faster than the equivalent code in Java. Even if it is, though, I can't imagine that the gain would ever outweigh the overhead incurred by the IPC. Even if you use a persistent Perl process, you're still going to have to send it data, presumably over a socket, then get some data back, and finally deserialize it into something usable.

Have you tried improving the performance of your XML parsing in Java? If you're using DOM or a third-party library like JDOM or Castor, try using SAX instead. Or maybe just using regular expressions instead of parsing the XML would be faster (jwz notwithstanding).
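For reference, a SAX-based version of "get the values of certain nodes" might look roughly like the sketch below. It uses only the JDK's built-in `javax.xml.parsers` and `org.xml.sax` classes. The class name `SaxTextCollector` and the idea of collecting the text of every element with a given name are assumptions for illustration, not the original poster's actual requirements, and nested elements with the same name are not handled.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: streams through the XML once, without building a tree.
final class SaxTextCollector {

    // Returns the text content of every element whose name equals elementName.
    public static List<String> collect(InputStream xml, String elementName)
            throws ParserConfigurationException, SAXException, IOException {
        List<String> values = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder current; // non-null while inside a matching element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (qName.equals(elementName)) {
                    current = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one text node, so append.
                if (current != null) {
                    current.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (current != null && qName.equals(elementName)) {
                    values.add(current.toString());
                    current = null;
                }
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(xml, handler);
        return values;
    }
}
```

Because the handler only keeps the text it was asked for, memory use stays flat regardless of document size, which is the main reason SAX tends to beat DOM on large inputs.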

In any case, I would recommend using a profiler with your Java code first to see if it can be improved.

Jason Day
There is a pure-Perl XML parsing library, but nobody who cares about speed uses it. Most Perl XML parsing involves expat or libxml2 under the hood, which is fast.
runrig
I don't see how that's going to make much difference. Even if the raw XML parsing is significantly faster, you've still got the IPC overhead.
Jason Day
+2  A: 

Inline::Java. It works really well once you get it up and running. I was working on a project a few years ago that had a Perl web application talking to a Java SOAP server for some realtime communication and it was just way too slow. We replaced it with a system using Inline::Java for the communication and it was much faster. Definitely minimize the number of points at which you pass objects back and forth, and try to keep those objects simple (we stuck with strings, numbers and arrays) to keep things from getting out of control. But I'm definitely a convert!

mpeters