In the company I work for, we do a lot of file-based transaction processing. The processing centers around the conversion of files between numerous formats to suit numerous systems in numerous companies.

The processing almost always involves an XML stage and can include a lot of text parsing, database lookups, data conversion and data validation.

Currently the programs performing all these tasks are written in C++, and they run quite quickly, all on one average server. I'm investigating the possibility of using a more "modern" language that newer graduate programmers are more likely to be familiar with. (Correct memory allocation in C++ seems to cause problems for a lot of newer programmers these days.)

Based on the brief information provided, would a language such as Python provide the required functionality and performance, as well as address the memory allocation (and various other C++-related) problems that arise?

I like the idea of not needing to compile the programs each time we make a change. I understand that the interpreted languages probably won't hit the same performance we currently get.

Our systems are Linux-based, which also restricts some options.

Any comments on the functionality and performance available with Python or suggestions of alternative languages would be much appreciated.

+5  A: 

Python would probably remove most of the low level stuff that you use in your application. Memory allocation wouldn't be an issue anymore. Also, at least my university seems to be embracing Python as a programming language because students don't have to write all of that formal stuff to get started. Your only problem would be the performance part, as Python will likely never be as fast as a compiled C++ program.

I would advise you to take a couple of weeks to get to know the programming languages that you're considering. I'd check out Ruby also. Maybe toy around with Haskell a bit?

As I understand it Python seems well equipped for dealing with everything you're talking about. XML, database lookups, validation, parsing. It is usually a safe choice, not just because of the easy and fun programming experience, but if you're stuck there's an awesome community around the language who are just happy to help out.
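To give a rough feel for what that looks like, here is a minimal sketch using only the standard library (`xml.etree.ElementTree` and `sqlite3`). The XML layout, the `accounts` table and the function names are all invented for illustration; your real formats and database will differ.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input: element and attribute names are invented for illustration.
SAMPLE_XML = """
<transactions>
  <transaction id="1" account="ACC-1" amount="19.99"/>
  <transaction id="2" account="ACC-9" amount="5.00"/>
  <transaction id="3" account="ACC-2" amount="not-a-number"/>
</transactions>
"""


def build_lookup_db():
    """Create an in-memory SQLite table standing in for the real database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (code TEXT PRIMARY KEY, owner TEXT)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [("ACC-1", "Alice"), ("ACC-2", "Bob")],
    )
    return conn


def process(xml_text, conn):
    """Return (valid_records, errors) for the transactions in xml_text."""
    valid, errors = [], []
    root = ET.fromstring(xml_text.strip())
    for tx in root.iter("transaction"):
        tx_id = tx.get("id")
        # Validation and conversion: the amount must parse as a number.
        try:
            amount = float(tx.get("amount"))
        except (TypeError, ValueError):
            errors.append((tx_id, "bad amount"))
            continue
        # Database lookup: the account code must be known.
        row = conn.execute(
            "SELECT owner FROM accounts WHERE code = ?", (tx.get("account"),)
        ).fetchone()
        if row is None:
            errors.append((tx_id, "unknown account"))
            continue
        valid.append({"id": tx_id, "owner": row[0], "amount": amount})
    return valid, errors
```

Calling `process(SAMPLE_XML, build_lookup_db())` returns one valid record and two errors. Note that nothing here allocates or frees memory by hand, which is the part you said causes trouble.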

+8  A: 

I like the idea of not needing to compile the programs each time we make a change. I understand that the interpreted languages probably won't hit the same performance we currently get.

This is the biggest issue: can you live with the performance hit? You could try using Python and extending it with your current C++ modules for the performance-heavy parts. Still, switching your entire system seems like a big effort if the only reason is the lack of C++ talent. Hiring people who know C++ seems like the cheaper option.

Poor programmers tend to be poor in all languages, so changing everything just to suit the numpties won't be a solution. I'd recommend teaching them how to be better instead; it'll pay off significantly. (And use the STL and a nice XML library. TinyXML is good.)
+4  A: 

Which is more important, quickly getting the programs to work, or getting the programs working quickly?

If you're dealing with large numbers of large files then you may be better off staying in C++ and teaching your graduate programmers what a pointer is (!)

Otherwise I'd strongly advise that you look at a scripting-based solution, because development in these, once you're up to speed, is so much faster. And a lot more fun, if we're honest, for most people at least.

If the per-record processing load is not high, you may be surprised how little performance you lose: file IO will almost certainly be handled in a compiled (C) library, so the interpreter overhead may be relatively low. Worth trying, I'd suggest.
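One way to try it: a small timing sketch along these lines, which you could adapt to a representative input file. The CSV layout and all function names are made up for the example; in CPython the `csv` reader itself is implemented in C, so only the per-record work in the loop runs as interpreted code.

```python
import csv
import os
import tempfile
import time


def make_sample_file(num_records):
    """Write a throwaway CSV file and return its path (the layout is invented)."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "w", newline="") as handle:
        writer = csv.writer(handle)
        for i in range(num_records):
            writer.writerow([i, f"ACC-{i % 100}", f"{i * 0.01:.2f}"])
    return path


def process_file(path):
    """Read every record, do light per-record work, return (count, total)."""
    count = 0
    total = 0.0
    with open(path, newline="") as handle:
        for _record_id, _account, amount in csv.reader(handle):
            total += float(amount)
            count += 1
    return count, total


def time_run(num_records=100_000):
    """Generate a file, process it, and return (count, total, seconds)."""
    path = make_sample_file(num_records)
    try:
        start = time.perf_counter()
        count, total = process_file(path)
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return count, total, elapsed
```

Run `time_run()` with a record count close to your real volumes and compare the elapsed time with what the C++ version takes on the same machine. The numbers only mean something if the per-record work resembles what you actually do.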

Of the imperative languages, Perl is an obvious option, Python is popular and Ruby has a high profile (and probably cleaner OO features than the first two). Then there is the slightly more, er, esoteric realm of the functional languages, but I'm not qualified to comment on those.

Mike Woodhouse
+3  A: 

Another alternative is to embed Python in your C++ program. You could keep much of your application the same, and make calls out to Python for the pieces that change often, or need the flexibility that a scripting language provides.

From the Python docs

The previous chapters discussed how to extend Python, that is, how to extend the functionality of Python by attaching a library of C functions to it. It is also possible to do it the other way around: enrich your C/C++ application by embedding Python in it. Embedding provides your application with the ability to implement some of the functionality of your application in Python rather than C or C++. This can be used for many purposes; one example would be to allow users to tailor the application to their needs by writing some scripts in Python. You can also use it yourself if some of the functionality can be written in Python more easily.

Rob Thomas
+2  A: 

This is the biggest issue: can you live with the performance hit?

You can improve the performance using Psyco. And after all, thanks to Moore's law, performance isn't that big an issue these days.


Or you could try storing your parsing rules in a database instead of leaving them hard-coded in your code. As Ken Downs rightly put it, minimize code, maximize data. This way you would not need to recompile every time a tiny rule changes.
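A minimal sketch of that idea, assuming the rules can be expressed as one regular expression per field. The `rules` table, its columns and the sample patterns are hypothetical, and an in-memory SQLite database stands in for whatever database you actually use.

```python
import re
import sqlite3


def open_rules_db():
    """Create an in-memory rules table (stand-in for the real database)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rules (field TEXT PRIMARY KEY, pattern TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO rules VALUES (?, ?)",
        [("account", r"ACC-\d+"), ("amount", r"\d+\.\d{2}")],
    )
    return conn


def load_rules(conn):
    """Load the rules at runtime and compile each pattern once."""
    query = "SELECT field, pattern FROM rules ORDER BY field"
    return {field: re.compile(pattern) for field, pattern in conn.execute(query)}


def validate(record, rules):
    """Return the names of the fields that fail their rule."""
    return [
        field
        for field, regex in rules.items()
        if not regex.fullmatch(record.get(field, ""))
    ]
```

Changing a rule is then an `UPDATE` on the table followed by another call to `load_rules`; the program itself is not touched.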

Mario Marinato -br-
+2  A: 

I hate to say this, but if you want something that your incoming developers are going to be familiar with, go with Java. Java is the language that most recent graduates will be most familiar with. You still have to compile, but compile times will be shorter than C++. It'll run on Linux and pretty much anywhere else. It's got a good garbage collector. It's pretty fast. And did I mention your developers will be familiar with it? No, it's not "cool" like Python, but it's a very tried-and-true language.

Honestly, I doubt that you've got a lot of incoming developers who suck with C++ but would be awesome with Python anyway. The people who use Python well tend to be fine with manual memory management. The people who are bad with memory management actually tend to be bad with all languages.

I do find it worrisome that you've got developers who are so bad with memory management that you want to switch languages. That's a sign indicating a problem, but I'm not sure that the problem is with the language.

Derek Park
-1 for Java (doesn't really help the OP much at all), but +1 for "people who are bad with memory management tend to be bad with all languages".

If you can get away with using Python, Ruby, Groovy or Perl instead of C++, you would be better off going with one of these higher-level languages. Productivity will greatly increase. If you find that you need more performance, then go with Java. Everyone should know and use at least one dynamically typed language.


You should move to Python; that language makes everything possible in networking. If you need more speed, move to C/C++.