views: 584

answers: 7

I know this question is very subjective, but I cannot phrase it much better. I would appreciate some guidance.

As a developer, I often feel how much easier things would be if I had some tools for doing a reasonably complex task on a log file, a set of source files, a data set, etc.

Clearly, when the same type of task needs to be done repetitively and when speed is critical, I can think of writing it in C++/Java.

But most of the time, it is some kind of text-processing or file-searching activity that I want to do only once, just to perform a quick check or some preliminary analysis. In such cases, I would be better off doing the task manually than writing it in C++/Java. But I could surely do it in seconds if I knew a language like Bash/Python/Perl/Ruby/Sed/Awk.

I know this whole question is subjective and there is no definitive answer, but in general, what does the developer community feel as a whole? What subset of these languages should I know so that I can do all these kinds of tasks easily and improve my productivity?

Would Perl be a good choice?
It is a superset of Sed/Awk, and it allows terse code: I can get things done in fewer lines. The result is neither readable nor easily maintainable, but I never wanted those features anyway. The only thing that bothers me is the negative publicity Perl has received lately; it has been criticized a lot by the Ruby/Python communities. Also, I am not sure whether it can replace bash scripting entirely. If not, is Perl+Bash a good combination for these kinds of tasks?

Thanks, Ajay G

+1  A: 

I find Python+Bash a very nice combo.
I usually use Python because it's very readable and maintainable, and because there is a lot of documentation available online.

By the way, I suggest you read http://www.ibm.com/developerworks/aix/library/au-python/
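One hedged sketch of the Python+Bash split: bash does the piping, Python does the structured processing. The input lines here stand in for a real log file:

```shell
printf 'alice login\nbob login\nalice logout\n' | python3 -c '
import sys, collections
counts = collections.Counter()
for line in sys.stdin:
    fields = line.split()
    if fields:
        counts[fields[0]] += 1   # tally the first field (e.g. a user name)
for name, n in counts.most_common():
    print(n, name)
'
```

A dict/Counter like this is exactly the kind of structure that gets awkward in pure bash.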

ikkebr
Good answer, but note that the "try: except:" blocks in that article are a bad idea.
Jason Orendorff
A: 

My first port of call is bash with sed to provide regular expression processing. You can do a lot with a bash for loop, grep and some regular expressions.

It's worth learning regular expressions if you don't already know them. An editor that lets you use them (like vi) is extremely useful when manipulating files (e.g. you have a set of data extracted from a logfile and need to turn it into a set of SQL statements).
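The logfile-to-SQL case is a classic regex-editing job. A minimal sketch with sed, where the line layout, table, and column names are all invented for illustration:

```shell
# Suppose each extracted line is "user_id,login_count"; wrap it in an INSERT.
printf '42,7\n43,12\n' |
  sed -E 's/^([0-9]+),([0-9]+)$/INSERT INTO logins (user_id, count) VALUES (\1, \2);/'
```

The same substitution works interactively in vi with `:%s/.../.../`.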

If it takes me more than a few minutes to figure out how to do a parsing task in bash/sed, I usually end up using perl instead. As suggested by ikkebr, python is probably as good as (or better than) perl; I just had the misfortune to learn perl first, so I am much more familiar with it. If I were to start again, I think I'd learn python instead.

bm212
+1  A: 

I would use Perl over a bash/sed/awk combination. Why?

  1. You only have the one executable, rather than spawning off multiple executables to do work.
  2. You can make use of a wide range of Perl modules to do almost anything (see CPAN for the modules available)

In fact I would recommend any scripting language over the shell/awk/sed combination, for the same reasons. I don't have a problem with sed/awk per se, but as your required solutions become more complex/lengthy, I find the more powerful scripting languages more scalable, and (to some degree) refactorable for re-use.
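To illustrate the single-executable point, here is a pipeline of four tools next to one roughly equivalent Perl invocation; the log lines and field positions are invented:

```shell
# Four processes to count requests per client:
printf '1.1.1.1 GET /a\n1.1.1.1 GET /b\n2.2.2.2 POST /c\n' |
  grep 'GET' | cut -d' ' -f1 | sort | uniq -c

# One Perl process doing the same counting (output order may differ):
printf '1.1.1.1 GET /a\n1.1.1.1 GET /b\n2.2.2.2 POST /c\n' |
  perl -lane 'next unless /GET/; $c{$F[0]}++; END { print "$c{$_} $_" for keys %c }'
```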

Brian Agnew
CPAN is awesome, and IMO is the most overlooked feature of Perl.
MiffTheFox
+1  A: 

Python is more expressive and readable than bash, but requires more setup: import os and so on. For simple tasks, bash is quicker, which is what matters most here. And don't underestimate the power of input/output redirection in bash!
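Two small illustrations of that redirection power; the inline printf data and file names stand in for real files, and process substitution needs bash rather than plain sh:

```shell
#!/usr/bin/env bash
# Process substitution: compare two streams as if they were files.
diff <(printf 'b\na\n' | sort) <(printf 'a\nb\n') && echo "same contents"

# Send both stdout and stderr of a command group into one log file:
{ echo "normal output"; echo "an error" >&2; } > /tmp/build.log 2>&1
```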

singingwolfboy
+2  A: 

I tend to do a lot of processing with ruby. It has all the functionality of perl, but I find it to be a little more readable. Both perl and ruby support the -n, -e, and -p options.

-e 'command'    one line of script. Several -e's allowed. Omit [programfile]
-n              assume 'while gets(); ... end' loop around your script
-p              assume loop like -n but print line also like sed

For example in ruby

seq 1 4 | ruby -ne 'BEGIN{ $product = 1 }; $product *= $_.to_i; END { puts $product }'
24

Which is very similar to perl

seq 1 4 | perl -ne 'BEGIN{ $product = 1 }; $product *= $_; END { print $product }'
24

In Python (3, where print is a function and reduce lives in functools), the same would look like this:

seq 1 4 | python3 -c 'import sys, functools; print(functools.reduce(lambda x, y: int(x) * int(y), sys.stdin.read().split()))'
24

While it's possible to do the above in bash/awk/sed, you'll be limited by their lack of more advanced features.
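For comparison, the same product is also a one-liner in awk, so the limitation only really bites once a task needs richer data structures or libraries:

```shell
seq 1 4 | awk 'BEGIN { p = 1 } { p *= $1 } END { print p }'
# prints 24
```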

brianegge
nicely presented
ajay
A: 

With shell scripting, all you ever need is a bit of bash/sh and a lot of awk: bash for calling your commands, and awk for processing. Several of the Unix tools below, widely used as they are, are not strictly necessary, because awk can do their jobs:

1) cut
2) sed
3) wc
4) (e)grep
5) cat
6) head 
7) etc..

and a few others whose functions overlap. In the end, your script will not be cluttered with redundant tools, nor slowed down by the extra processes they spawn.
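As a sketch of that claim, here are awk stand-ins for several of the tools above, reading from stdin (the printf data is just sample input):

```shell
printf 'a b\nc d\ne f\n' | awk '{ print $2 }'      # cut -d' ' -f2
printf 'a b\nc d\ne f\n' | awk '/c/'               # grep 'c'
printf 'a b\nc d\ne f\n' | awk 'END { print NR }'  # wc -l
printf 'a b\nc d\ne f\n' | awk 'NR <= 2'           # head -n 2
```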

Perl/Python are very useful sysadmin tools as well. Both of them do similar things and have libraries that help in your sysadmin tasks. The only significant difference is, aesthetically speaking, the appearance of your code written in them.

You can learn about Ruby if you want, but in terms of sysadmin, I would say go for Perl/Python instead.

ghostdog74
A: 

In the time it took you to write those few paragraphs, you could have already learned enough Python to make your life significantly better.

Anyone who already knows C++ or Java can become productive in Python in about 4 hours. Just read the tutorial.

Jason Orendorff