Q: 

awk optimization

Any pointers on what must not be in an awk script? I am asking because I don't find any tool to debug an awk script. I have a script taking a lot of CPU, so I need to know whether I am doing something terribly wrong in the script.

Just as an example, I keep watching the output of a logfile via 'tail -f filename' until my script gets killed.
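
Roughly, the setup looks like this (the path and the pattern here are just placeholders for what my real script does):

    tail -f /var/log/app.log | awk '/ERROR/ { print }'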

A: 

I have never used it, but there is awkdb. Caveat Emptor.

If you are using GNU awk, recent versions have --profile, which should at least let you know what is going on at an abstract level.
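
For example (just a sketch, assuming a gawk recent enough that plain gawk accepts --profile; the script and logfile names are placeholders):

    gawk --profile=prof.out -f myscript.awk logfile
    # on exit, prof.out holds a pretty-printed copy of the program
    # with an execution count next to each statement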

deinst
Thanks deinst, I will check it out.
hari
A: 

awkdb has not been updated since 2000, and according to its web site it has some limitations as well. If you are using gawk, have a look at its man page: it lists options such as --profile, --optimize and --dump-variables that you can try. Another option is to use pgawk, as described in the man page.
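
For example (a rough sketch; the script and data file names are placeholders, and it assumes a gawk recent enough to have these flags):

    gawk -O --dump-variables -f myscript.awk bigfile
    # -O / --optimize turns on gawk's internal optimizations;
    # --dump-variables writes the global variables to awkvars.out,
    # which helps to spot mistyped variable names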

Generally, if your script is slow, either you have a REALLY large file or your algorithm is causing the problem. You should at least show the code that you think is hogging the CPU. Some of the things you should avoid doing if possible, e.g.:

  1. passing the big file to awk a second (or third, ...) time

    awk '{}' file file

  2. looping over a second file with getline inside the main loop, while you are already iterating over a file

    awk '{ while ((getline line < "file2") > 0) { } }' file

  3. storing values into arrays for a big file takes up memory. Try to clear individual elements (or delete the whole array) once they are no longer in use; see the sketch after this list.
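
To make (3) concrete, here is a rough sketch of the idea (the key field, the checkpoint interval and the file name are just placeholders; deleting a whole array with a plain "delete seen" is a gawk/common extension):

    awk '{
        seen[$1]++                    # grows with the number of distinct keys
        if (FNR % 100000 == 0) {      # placeholder checkpoint interval
            for (k in seen) print k, seen[k]   # flush partial counts
            delete seen               # drop the whole array to free memory
        }
    }' bigfile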

ghostdog74
Thanks a bunch for this info. I will look into it.
hari