There are a few ways to make your shell scripts (e.g. Bash) execute faster.
- Use fewer external commands when Bash's built-ins can do the task for you; e.g. avoid excessive use of sed, grep, awk, etc. for simple string/text manipulation (a sketch follows this list).
- If you are manipulating relatively big files, don't use Bash's while read loop; use awk. If you are manipulating really big files, you can use grep to search for the patterns you want and then pass the matches to awk to "edit", since grep's searching algorithm is very good and fast. If you only want the front or end of a file, use head or tail. (See the second sketch after this list.)
- File manipulation tools such as sed, cut, grep, wc, etc. can usually be replaced with one awk script, or with Bash internals if the job is not complicated, so try to cut down on these tools where their functions overlap.
Unix pipes/chaining are excellent, but using too many of them, e.g.
command | grep | grep | cut | sed
makes your code slow: every pipe adds the overhead of another process. For this example, just one awk does it all:
command | awk '{do everything here}'
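To illustrate the first point, here is a minimal sketch (the file name and variables are made up) of replacing external commands with Bash parameter expansion, which avoids forking a process for every little string operation:

file="/path/to/archive.tar.gz"

# External commands: each command substitution forks extra processes
base=$(echo "$file" | sed 's/.*\///')
ext=$(echo "$file" | awk -F. '{print $NF}')

# Bash built-ins: no forks at all
base=${file##*/}    # strip the directory part -> archive.tar.gz
ext=${file##*.}     # keep only the last extension -> gz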
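For the big-file point, here is a sketch of the same job done both ways, assuming a hypothetical log file bigfile.log in which we want to sum the 3rd field of every line containing ERROR:

# Slow: a Bash while read loop iterates once per line inside the shell
total=0
while read -r _ _ n _; do
    total=$((total + n))
done < <(grep 'ERROR' bigfile.log)
echo "$total"

# Fast: let grep narrow the lines, then one awk does the arithmetic
grep 'ERROR' bigfile.log | awk '{ sum += $3 } END { print sum }'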
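And for the pipe chain above, a concrete made-up example: listing the bash users from /etc/passwd, once as a long chain and once as a single awk:

# Five processes doing overlapping work
cat /etc/passwd | grep 'bash' | grep -v 'root' | cut -d: -f1 | sed 's/$/ uses bash/'

# One awk doing the same searching, cutting and editing
awk -F: '/bash/ && !/root/ { print $1 " uses bash" }' /etc/passwd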
The closest tool that can match Perl's speed for certain tasks, e.g. string manipulation or maths, is awk. Here's a fun benchmark for this claim; there are around 9 million numbers in the file.
$ head -5 file
1
2
3
34
42
$ wc -l <file
8999987
$ time perl -nle '$sum += $_ } END { print $sum' file
290980117
real 0m13.532s
user 0m11.454s
sys 0m0.624s
$ time awk '{ sum += $1 } END { print sum }' file
290980117
real 0m9.271s
user 0m7.754s
sys 0m0.415s
$ time perl -nle '$sum += $_ } END { print $sum' file
290980117
real 0m13.158s
user 0m11.537s
sys 0m0.586s
$ time awk '{ sum += $1 } END { print sum }' file
290980117
real 0m9.028s
user 0m7.627s
sys 0m0.414s
On each run, awk is faster than Perl.
Lastly, try to learn awk beyond what it can do as one-liners.
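As a hint of what "beyond one-liners" looks like, here is a small stand-alone awk program (the file name and field layout are invented) that uses a function, arrays and an END block, the kind of job people often reach for Perl to do:

#!/usr/bin/awk -f
# usage: ./avg_by_user.awk access.log
# assumes whitespace-separated lines of the form: user bytes ...

function fmt(n) { return sprintf("%.1f", n) }

{
    count[$1]++           # hits per user
    bytes[$1] += $2       # total bytes per user
}

END {
    for (u in count)
        print u, count[u] " hits", fmt(bytes[u] / count[u]) " bytes/hit"
}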