awk

Faster way to find duplicates conditioned by time

On an AIX machine without Perl I need to filter records that are considered duplicates if they have the same id and were registered within a period of four hours. I implemented this filter using AWK and it works pretty well, but I need a much faster solution: # Generate list of duplicates awk 'BEGIN { FS="," } /OK/ { ...
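A minimal one-pass sketch of the windowing idea, assuming a comma-separated layout with the id in field 1 and an epoch-seconds timestamp in field 2 (the real field positions and the /OK/ filter are not shown in the excerpt):

```shell
printf '%s\n' 'a,1000' 'a,5000' 'a,20000' 'b,1000' |
awk -F, '
{
  # duplicate = same id seen within the last 4 hours (14400 s)
  if ($1 in last && $2 - last[$1] <= 14400)
    print $0, "DUP"
  else
    print $0
  last[$1] = $2          # remember the latest timestamp per id
}'
```

One array lookup per record usually beats any nested rescan of earlier lines, which is where this kind of filter tends to get slow.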

Is there a Unix utility to prepend timestamps to lines of text?

I ended up writing a quick little script for this in Python, but I was wondering if there was a utility you could feed text into which would prepend each line with some text -- in my specific case, a timestamp. Ideally, the usage would be something like: $ cat somefile.txt | prepend-timestamp (Before you answer sed, I tried this: $ ca...
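One dependency-free sketch is a shell read loop calling date once per line (slow on big streams; gawk's strftime() or the ts utility from moreutils do the same job faster):

```shell
printf 'hello\nworld\n' | while IFS= read -r line; do
  # prepend the current time to each line as it arrives
  printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
done
```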

sorting hashes/arrays in awk

Is there an easy way to do any of the following things in awk? Sorting an array/hash by its data Sorting a hash by its string key ...
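gawk has asort()/asorti() for exactly this; in a portable awk you can copy the keys into a numeric array and sort it yourself. A sketch sorting a hash by its string key:

```shell
printf 'b 2\nc 3\na 1\n' |
awk '
{ val[$1] = $2 }                 # build the hash
END {
  n = 0
  for (k in val) keys[++n] = k   # collect keys (iteration order is unspecified)
  for (i = 2; i <= n; i++) {     # insertion sort on the key array
    k = keys[i]
    for (j = i - 1; j >= 1 && keys[j] > k; j--) keys[j+1] = keys[j]
    keys[j+1] = k
  }
  for (i = 1; i <= n; i++) print keys[i], val[keys[i]]
}'
```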

Is there still any reason to learn AWK ?

I am constantly learning new tools, even old-fashioned ones, because I like to use the right solution for the problem. Nevertheless, I wonder if there is still any reason to learn some of them. AWK, for example, is interesting to me, but for simple text processing I can use grep / cut / sed / whatever, while for complex ones I'll go fo...

Searching/reading another file from awk based on current file's contents, is it possible?

I'm processing a huge file with (GNU) awk (other available tools: Linux shell tools and some old (>5.0) version of Perl, but I can't install modules). My problem: if some field1, field2, field3 contain X, Y, Z, I must search for a file in another directory which contains field4 and field5 on one line, and insert some data from the found...
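This shape of lookup can be done with awk's file-reading form of getline. A sketch with invented filenames and field layout (none of these names come from the question):

```shell
printf 'apple red\n' > /tmp/lookup_fruit.txt
printf 'fruit apple\n' |
awk '{
  # derive the lookup file name from a field of the current record
  file = "/tmp/lookup_" $1 ".txt"
  while ((getline line < file) > 0)        # scan the other file line by line
    if (index(line, $2)) print $0, "->", line
  close(file)                              # close, or you will run out of fds
}'
```

Calling close() per record matters when the main file is huge and maps to many different lookup files.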

Awk scripting help - Logic Issue

I'm currently writing a simple .sh script to parse an Exim log file for strings matching " o' ". Currently, when viewing output.txt, all that is there is a 0 printed on every line (606 lines). I'm guessing my logic is wrong, as awk does not throw any errors. Here is my code (updated for concatenation and counter issues). Edit: I've adopte...
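Without the full script it is hard to say where the bug is, but a common trap with a pattern like " o' " is quoting the single quote inside a single-quoted awk program; passing the string in with -v and matching with index() sidesteps both the quoting and the regex escaping. A hypothetical counting sketch:

```shell
printf "x o' y\nplain line\n" |
awk -v pat=" o' " '
index($0, pat) { n++ }       # index() does a plain substring match, no regex
END { print n + 0 }          # n+0 prints 0 instead of "" when nothing matched
'
```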

How do I print a field from a pipe-separated file?

I have a file with fields separated by pipe characters and I want to print only the second field. This attempt fails: $ cat file | awk -F| '{print $2}' awk: syntax error near line 1 awk: bailing out near line 1 bash: {print $2}: command not found Is there a way to do this? ...
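The unquoted | is being taken by the shell as a pipeline separator before awk ever sees it; quoting (or backslash-escaping) the delimiter fixes it:

```shell
printf 'a|b|c\n' | awk -F'|' '{ print $2 }'   # prints: b
```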

Can awk skip files which do not exist, race-free?

Is there a way to make awk (gawk) ignore or skip missing files? That is, files passed on the command line that no longer exist in the file system (e.g. rapidly appearing/disappearing files under /proc/[1-9]*). By default, a missing file is a fatal error :-( I would like to be able to do the equivalent of something like this: BEGIN { M...
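In gawk 4.0+ the BEGINFILE rule runs just before each input file is read, and a non-empty ERRNO there means the open failed; nextfile then skips it without a check-then-open race. A sketch with made-up paths:

```shell
printf 'hello\n' > /tmp/exists.txt
awk '
BEGINFILE { if (ERRNO != "") nextfile }   # gawk-only: skip unopenable files
{ print FILENAME ": " $0 }
' /tmp/exists.txt /tmp/no-such-file 2>/dev/null || true
```

With a non-GNU awk the missing file is still fatal, which is exactly the behaviour the question complains about.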

Regex in awk and WinGrep

So I'm looking for a pattern like this: size='0x0' in a log file - but I'm only interested in large sizes (4 digits or more). The following regex works great in EditPadPro (nice tool BTW) size='0x[0-9a-fA-F]{4,} But the same regex does not work in awk - seems like the repetition {4,} is messing it up. Same with WinGrep - any idea fr...
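Interval expressions like {4,} are on by default in gawk 4.0+; older gawks need --re-interval (or --posix). Where you can't rely on that, spelling out the first four repetitions is a portable workaround:

```shell
printf "size='0x0'\nsize='0xDEADBEEF'\n" |
awk '/0x[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]/'   # 4 or more hex digits
```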

Best Awk Commands

I find AWK really useful. Here is a one liner I put together to manipulate data. ls | awk '{ print "awk " "'"'"'" " {print $1,$2,$3} " "'"'"'" " " $1 ".old_ext > " $1 ".new_ext" }' > file.csh I used this AWK to make a script file that would rename some files and only print out selective columns. Anyone know a better way to do ...
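A sketch of the same job without generating an intermediate csh script: loop in the shell and let awk do only the column cut (the extension names are taken from the one-liner; the demo directory is invented):

```shell
mkdir -p /tmp/awkdemo
printf 'a b c d\n' > /tmp/awkdemo/one.old_ext
for f in /tmp/awkdemo/*.old_ext; do
  # keep the first three columns, write the result under the new extension
  awk '{ print $1, $2, $3 }' "$f" > "${f%.old_ext}.new_ext"
done
cat /tmp/awkdemo/one.new_ext   # prints: a b c
```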

How do you split a file based on a token?

Let's say you have a file containing texts (from 1 to N) separated by a $. How can I split the file so the end result is N files? text1 with newlines $ text2 $etc... $ textN I'm thinking something with awk or sed, but is there any available Unix app that already performs that kind of task? ...
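csplit is the existing utility for this; a short awk sketch also works everywhere, assuming the $ sits alone on its own line (the output names are invented):

```shell
printf 'text1\n$\ntext2\n$\ntext3\n' |
awk '/^\$$/ { n++; next }                      # a lone "$" line ends a part
     { print > ("/tmp/part" (n+1) ".txt") }'   # part1.txt, part2.txt, ...
```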

How can I extract lines of text from a file?

I have a directory full of files and I need to pull the headers and footers off of them. They are all variable length so using head or tail isn't going to work. Each file does have a line I can search for, but I don't want to include the line in the results. It's usually *** Start (more text here) And ends with *** Finish (more te...
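A flag-toggling sketch using the markers from the excerpt; the rule order keeps both marker lines out of the output:

```shell
printf '%s\n' '*** Start header' 'body 1' 'body 2' '*** Finish footer' |
awk '
/^\*\*\* Finish/ { flag = 0 }   # turn off before printing, so Finish is excluded
flag                            # print the line when the flag is on
/^\*\*\* Start/  { flag = 1 }   # turn on after printing, so Start is excluded
'
```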

concatenating many email files with unix utils

I would like to know if there is an easy way to print multiple emails (about 200) so that they continue on, as opposed to printing one per page. I have tried with Thunderbird and Evolution and this does not seem possible. Would concatenating the individual mail files work, or are there other Unix utilities that could do this? Would sed or ...
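Plain cat already concatenates; a loop lets you drop a visible separator (or a form feed, if the printer should break pages) between messages. A sketch with invented file names:

```shell
printf 'mail one\n' > /tmp/m1.txt
printf 'mail two\n' > /tmp/m2.txt
for f in /tmp/m1.txt /tmp/m2.txt; do
  cat "$f"
  printf '%s\n' '--------'   # separator between messages
done
```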

AWK: redirecting script output to another file with a dynamic name

Hi all, I know I can redirect awk's print output to another file from within a script, like this: awk '{print $0 >> "anotherfile" }' 2procfile (I know that's a dummy example, but it's just an example...) But what I need is to redirect output to another file which has a dynamic name, like this: awk -v MYVAR="somedynamicdata" '{print $0 ...
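Build the name in the shell (or by string concatenation inside awk) and pass it in with -v; anything after > in awk can be a string expression, including a plain variable. A sketch with a made-up naming scheme:

```shell
out="/tmp/report_$(date +%Y%m%d).txt"
printf 'a\nb\n' | awk -v out="$out" '{ print > out }'   # redirection target is a variable
wc -l < "$out"                                          # line count of the new file
```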

How can I append the name of a file to end of each line in that file?

I need to do the following for hundreds of files: Append the name of the file (which may contain spaces) to the end of each line in the file. It seems to me there should be some way to do this: sed -e 's/$/FILENAME/' * where FILENAME represents the name of the current file. Is there a sed variable representing the current filename? ...
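sed has no variable for the current filename, but awk's FILENAME is exactly that; a temp-file shuffle makes it in-place, and the quoting survives spaces in names. A sketch on one invented file:

```shell
printf 'one\ntwo\n' > '/tmp/my file.txt'
for f in '/tmp/my file.txt'; do            # in real use: for f in *
  awk '{ print $0, FILENAME }' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done
head -n 1 '/tmp/my file.txt'   # prints: one /tmp/my file.txt
```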

In sed or awk, how do I handle record separators which *may* span multiple lines?

My log file is: Wed Nov 12 blah blah blah blah cat1 Wed Nov 12 blah blah blah blah Wed Nov 12 blah blah blah blah Wed Nov 12 blah blah blah blah cat2 more blah blah even more blah blah Wed Nov 12 blah blah blah blah cat3 Wed Nov 12 blah blah blah blah cat4 I want to parse out the full multiline entries where cat is fo...
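With gawk you can set RS to a regex matching the timestamp; portably, buffer lines and flush the record whenever a new timestamp line starts (the date prefix is taken from the excerpt):

```shell
printf '%s\n' 'Wed Nov 12 aaa' 'Wed Nov 12 bbb cat2' 'more blah' 'Wed Nov 12 ccc' |
awk '
/^Wed Nov 12/ { if (buf ~ /cat/) printf "%s", buf; buf = "" }   # flush previous record
{ buf = buf $0 "\n" }                                           # accumulate current record
END { if (buf ~ /cat/) printf "%s", buf }                       # do not forget the last one
'
```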

parse csv file using gawk

How do you parse a csv file using gawk? Simply setting FS="," is not enough, as a quoted field with a comma inside will be treated as multiple fields. Example using FS="," which does not work: file contents: one,two,"three, four",five "six, seven",eight,"nine" gawk script: BEGIN { FS="," } { for (i=1; i<=NF; i++) printf "field #...
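gawk 4.0+ solves this with FPAT, which describes what a field looks like rather than what separates fields; the pattern below is the one from the gawk manual (the gawk binary itself is assumed to be installed):

```shell
printf '%s\n' 'one,two,"three, four",five' |
gawk 'BEGIN { FPAT = "([^,]+)|(\"[^\"]+\")" }   # field = bare chunk or quoted string
{ for (i = 1; i <= NF; i++) printf "field %d: %s\n", i, $i }'
```

Note the quotes stay attached to field 3; strip them with gsub() if needed. Fully general CSV (embedded newlines, doubled quotes) still wants a real CSV parser.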

How to print the Nth column of a text file with AWK using argv

Suppose I have a text file with data separated by whitespace into columns. I want to write a little shell script which takes as input a filename and a number N and prints out only that column. With awk I can do the following: awk < /tmp/in '{print $2}' > /tmp/out This code prints out the second column. But how would one wrap that in...
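Pass both through the shell: the filename as an ordinary argument and the column number via -v (the wrapper's name and usage here are invented):

```shell
# colpick.sh <file> <N> -- hypothetical wrapper around the one-liner
printf 'a b c\nd e f\n' > /tmp/cols.txt
file=/tmp/cols.txt
col=2
awk -v col="$col" '{ print $col }' "$file"   # prints b, then e
```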

Awk matching of entire record using regular expression

Using Awk I want to match the entire record using a regular expression. By default the regular expression matching is for parts of a record. The ideal solution would: Be general for all fields, regardless of the field separator used. Not treat the entire input as a single field and parse it manually using string functions. Work in a g...
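Anchoring the expression against $0 with ^ and $ makes the match cover the whole record, independent of whatever FS is in effect:

```shell
printf 'foo\nfoobar\nxfoo\n' | awk '$0 ~ /^foo$/'   # prints only: foo
```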

Shell script numbering lines in a file

I need to find a faster way to number lines in a file in a specific way using tools like awk and sed. I need the first character on each line to be numbered in this fashion: 1,2,3,1,2,3,1,2,3 etc. For example, if the input was this: line 1 line 2 line 3 line 4 line 5 line 6 line 7 The output needs to look like this: 1line 1 2line 2...
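The cycling prefix described above falls out of NR modulo 3 in a single awk pass:

```shell
printf 'line 1\nline 2\nline 3\nline 4\n' |
awk '{ printf "%d%s\n", (NR - 1) % 3 + 1, $0 }'   # prefixes cycle 1,2,3,1,...
```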