Q: 

awk optimization

Any pointers on what must not be in an awk script? I am asking because I don't find any tool to debug an awk script. I have a script taking a lot of CPU, so I need to know whether I am doing something terribly wrong in the script.

Just as an example, I keep watching the output of a logfile via 'tail -f filename' until my script gets killed.
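
Roughly, the setup looks like this (the path and the pattern here are just placeholders for what my real script does):

    tail -f /var/log/app.log | awk '/ERROR/ { print }'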

A: 

I have never used it, but there is awkdb. Caveat Emptor.

If you are using GNU awk, recent versions have --profile, which should at least let you know what is going on at an abstract level.
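
For example (just a sketch, assuming a gawk recent enough that plain gawk accepts --profile; the script and logfile names are placeholders):

    gawk --profile=prof.out -f myscript.awk logfile
    # on exit, prof.out holds a pretty-printed copy of the program
    # with an execution count next to each statement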

deinst
Thanks deinst, I will check it out.
hari
A: 

awkdb has not been updated since 2000, and according to its web site it has some limitations as well. If you are using gawk, have a look at its man page: it lists options such as --profile, --optimize and --dump-variables that you can try. Another option is to use pgawk, as described in the man page.
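
For example (a rough sketch; the script and data file names are placeholders, and it assumes a gawk recent enough to have these flags):

    gawk -O --dump-variables -f myscript.awk bigfile
    # -O / --optimize turns on gawk's internal optimizations;
    # --dump-variables writes the global variables to awkvars.out,
    # which helps to spot mistyped variable names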

Generally, if your script is slow, either you have a REALLY large file or your algorithm is causing the problem. You should at least show the code that you think is hogging the CPU. Some of the things you should avoid doing if possible, e.g.:

  1. passing the big file to awk a second (or third, ...) time

    awk '{}' file file

  2. looping over a second file with getline inside the main loop, while you are already iterating over a file

    awk '{ while ((getline line < "file2") > 0) { } }' file

  3. storing values into arrays for a big file takes up memory. Try to clear individual elements (or delete the whole array) once they are no longer in use; see the sketch after this list.
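
To make (3) concrete, here is a rough sketch of the idea (the key field, the checkpoint interval and the file name are just placeholders; deleting a whole array with a plain "delete seen" is a gawk/common extension):

    awk '{
        seen[$1]++                    # grows with the number of distinct keys
        if (FNR % 100000 == 0) {      # placeholder checkpoint interval
            for (k in seen) print k, seen[k]   # flush partial counts
            delete seen               # drop the whole array to free memory
        }
    }' bigfile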

ghostdog74
Thanks a bunch for this info. I will look into it.
hari