I need to build a simple server that

  1. reads (potentially large) XML files
  2. processes them in memory (e.g. transforms them to a different XML structure)
  3. writes them back to disk.

Some important aspects of the program:

  • speed
  • ability to distribute the server, i.e. running several such servers, with each server handling a different volume of XML files
  • cross-platform
  • must be built on a very tight deadline

Basically, my question is:
In what programming language should I do it?

Java?

  • speed of development
  • cross platform
  • I/O throughput is high with the right configuration (add a web link here).

C++?

  • execution speed
  • cross platform (with the right libraries).
  • however development is slower.
+7  A: 

Rather than coding this in a low-level language, you might want to look into an ETL or XSLT engine. They are optimized for performance beyond what you would generally be able to produce on your own, and are generalized enough to accommodate user changes (not sure if your XML transformation is a one-time thing, or if it may change over time).
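To make the XSLT suggestion a bit more concrete, here is a minimal sketch using the JAXP API that ships with the JDK (`javax.xml.transform`). The stylesheet, the element names (`order`, `purchase`) and the class name `XsltSketch` are made up for illustration; they are not from the question. The same calls also accept files via `new StreamSource(File)` and `new StreamResult(File)`.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltSketch {
    // Hypothetical stylesheet: copies the document unchanged,
    // except that every <order> element is renamed to <purchase>.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output omit-xml-declaration='yes'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='order'>"
      + "<purchase><xsl:apply-templates select='@*|node()'/></purchase>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compile the stylesheet once; a Templates object can be reused
    // for many input files and shared between threads.
    static Templates compile(String xslt) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        return factory.newTemplates(new StreamSource(new StringReader(xslt)));
    }

    // Apply the compiled stylesheet to one XML document.
    static String transform(Templates templates, String xml) throws TransformerException {
        Transformer transformer = templates.newTransformer(); // one Transformer per run
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Templates templates = compile(STYLESHEET);
        String input = "<orders><order id=\"1\">x</order></orders>";
        System.out.println(transform(templates, input));
    }
}
```

If the transformation rules change later, only the stylesheet needs to be edited, not the Java code. Dropping a different engine such as Saxon onto the classpath normally works with the same JAXP calls, though that is worth verifying for your setup.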

Greg Harman
+1 -- using existing, carefully optimized code (especially for a task like this) is likely to beat what most people will write themselves quite thoroughly.
Jerry Coffin
An example of an available XSLT engine would probably help
Martin York
I don't have enough hands-on experience with an XSLT engine to make a solid recommendation, but by popular request: Altova and Saxonica are two free options. It looks like Intel and Microsoft also have solutions.
Greg Harman
@Martin: http://en.wikipedia.org/wiki/XML_template_engine
rwong
+1  A: 

I'm still a little foggy on your requirements, but:

You are asking the wrong question. If language really isn't an issue, you should be looking for third-party libraries that can handle large amounts of disk I/O, and libraries that perform XSLT. See which libraries exist for both languages, then pick.

Further, if performance is a key requirement, you'll need to determine whether the process will be I/O bound or CPU bound. That will dictate which libraries need to be used, as well as the general architecture. Are the XML transformations CPU intensive? Or can they easily be done with a one- or two-pass parse?
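One rough way to find out is to time the two phases separately on a representative file. The sketch below is only an assumption about how you might measure it: the class name `BoundCheckSketch` is hypothetical, and a DOM parse stands in for the real transformation, which will likely cost more.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;

public class BoundCheckSketch {
    // Returns {nanoseconds spent reading from disk, nanoseconds spent parsing in memory}.
    static long[] measure(Path xmlFile) throws Exception {
        long start = System.nanoTime();
        byte[] bytes = Files.readAllBytes(xmlFile);          // disk I/O only
        long afterRead = System.nanoTime();
        DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));     // CPU work on in-memory data
        long afterParse = System.nanoTime();
        return new long[] { afterRead - start, afterParse - afterRead };
    }

    public static void main(String[] args) throws Exception {
        Path file;
        if (args.length > 0) {
            file = Paths.get(args[0]);                       // measure a real file if given
        } else {
            // Otherwise fall back to a tiny made-up sample document.
            file = Files.createTempFile("sample", ".xml");
            Files.write(file, "<root><item/></root>".getBytes(StandardCharsets.UTF_8));
        }
        long[] nanos = measure(file);
        System.out.println("read  ms: " + nanos[0] / 1_000_000.0);
        System.out.println("parse ms: " + nanos[1] / 1_000_000.0);
    }
}
```

Treat the numbers with care: a second run on the same file is usually served from the operating system's file cache, so the read time can look far better than it will be in production, and the JVM needs a few warm-up runs before the parse time settles.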

caspin
+1, also for making something comprehensible out of the OP :-)
Péter Török
A: 

Tight deadline? Is the need for parallel operation a given? Then speed is not a problem. Just throw more servers at it until your throughput matches the demand.

If you're faster in Java, sure, go ahead with that. You might need twice the number of servers, but those can be built in days not weeks.
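As a sketch of what "more servers" could look like without any coordination between them, each server can pick its share of the files from a stable hash of the file name. Everything here (the class name `PartitionSketch`, the file names, the idea of giving each server an index) is an assumption for illustration, not something stated in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartitionSketch {
    // Maps a file name to a server index in [0, serverCount).
    // String.hashCode() is specified by the language, so every JVM agrees on the result.
    static int serverFor(String fileName, int serverCount) {
        return Math.floorMod(fileName.hashCode(), serverCount);
    }

    // Returns the files that the server with the given index should process.
    static List<String> shareOf(List<String> files, int serverIndex, int serverCount) {
        List<String> mine = new ArrayList<>();
        for (String file : files) {
            if (serverFor(file, serverCount) == serverIndex) {
                mine.add(file);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        // Hypothetical file names; in practice these would come from a directory listing.
        List<String> files = Arrays.asList("a.xml", "b.xml", "c.xml", "d.xml", "e.xml");
        int serverCount = 3;
        for (int i = 0; i < serverCount; i++) {
            System.out.println("server " + i + ": " + shareOf(files, i, serverCount));
        }
    }
}
```

Adding a server then means changing `serverCount` on every machine. This scheme balances by file count, not by file size, so a few very large files can still overload one server.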

Portability is never a requirement under tight deadlines. Just ask whoever set those deadlines whether he has made any non-reversible choices. If so, stick with those; if not, pick something and stick with that. You don't have the time to test it on different platforms, so any portability you would have would be theoretical anyway.

MSalters