I have data that looks like this:
-1033
-
222
100
-30
-
10
What I want to do is capture all the numbers, excluding the "dash only" entries.
Why did my awk below fail?
awk '$4 != "-" {print $4}'
Your awk script says:
If the fourth field is not a dash, print it out.
However, you want to print the line if the whole line is not a dash:
awk '$0 != "-"'
The default action is to print, so no action block is needed.
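As a quick check, the one-liner can be run against the sample from the question. This is only a sketch; the variable names `data` and `filtered` are made up for the illustration.

```shell
# Sample input from the question, one value per line.
data='-1033
-
222
100
-30
-
10'

# Keep every line that is not exactly "-"; the default action prints it.
filtered=$(printf '%s\n' "$data" | awk '$0 != "-"')
printf '%s\n' "$filtered"
```

Only the two dash-only lines are dropped; negative numbers such as -30 stay, because the comparison is against the whole line.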
If you want to print groups of numbers, you can use a GNU awk (gawk) extension that allows splitting records using a regular expression:
gawk 'BEGIN { RS="(^|\n)-($|\n)" } { print "Numbers:\n" $0 }'
Now, instead of lines, it reads groups of numbers separated by a line containing only -. Setting the field separator (FS) to a newline allows you to iterate over the numbers within such a group:
gawk 'BEGIN { FS="\n"; RS="(^|\n)-($|\n)" }
{ print "Numbers:"; for(i=1;i<=NF;i++) print " *: " $i }'
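If gawk is not available, a similar grouping can be sketched in plain awk by collecting lines until a dash-only line is seen. This is a substitute technique, not the regex RS approach above, and the names `group` and `n` are chosen only for the illustration.

```shell
data='-1033
-
222
100
-30
-
10'

groups=$(printf '%s\n' "$data" | awk '
  # A dash-only line closes the current group, if it has any numbers.
  $0 == "-" { if (n) print "Numbers:" group; group = ""; n = 0; next }
  # Any other line is appended to the current group.
  { group = group " " $0; n++ }
  # Flush the last group at end of input.
  END { if (n) print "Numbers:" group }
')
printf '%s\n' "$groups"
```

Each group is printed on one line here rather than one number per line, which keeps the sketch short.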
However, I agree with the other answers. If you just want to filter out lines matching some text, grep is the better tool for that.
Assuming that your data file is actually multi-column, and that the values are in column 4, the following will work:
awk '$4 != "-" {print $4} {}'
It prints the value only where it isn't "-". The trailing {} is an empty rule and is not strictly required: awk applies its default print action only to a pattern that has no action block, so a rule that already has {print $4} never prints the whole line as well.
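A quick check of the column-4 filter on made-up data. The four-column layout below is an assumption for illustration only, and the rule is run both with and without the trailing {} for comparison.

```shell
# Hypothetical four-column input; the first three columns are filler.
rows='a b c -1033
a b c -
a b c 222'

with_empty=$(printf '%s\n' "$rows" | awk '$4 != "-" {print $4} {}')
without_empty=$(printf '%s\n' "$rows" | awk '$4 != "-" {print $4}')

printf '%s\n' "$with_empty"
printf '%s\n' "$without_empty"
```

Both variants print the same two values, since an action block attached to a pattern replaces the default print for that rule.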
If the data is actually as shown (one column only), you should be using $1 rather than $4. I wouldn't use $0, since that's the whole line and it appears you have spaces at the end of your first two lines, which would cause $0 to be "-1033 " and "- ".
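The difference between $1 and $0 can be seen on input with trailing spaces. The sketch below assumes, as suggested above, that the first two lines end in a space; the variable names are illustrative.

```shell
# First two lines carry a trailing space (an assumption about the input).
spaced=$(printf '%s\n' '-1033 ' '- ' '222')

# Comparing the first field ignores surrounding whitespace.
by_field=$(printf '%s\n' "$spaced" | awk '$1 != "-" {print $1}')

# Comparing the whole line does not, so "- " is kept.
by_line=$(printf '%s\n' "$spaced" | awk '$0 != "-"')

printf '%s\n' "$by_field"
printf '%s\n' "$by_line"
```

With $1 the dash line is removed; with $0 nothing is removed, because "- " is not equal to "-".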
But, if it were a single column, I wouldn't use awk at all but rather:
grep -v '^-$'
grep -v '^ *- *$'
The second allows for spaces on either side of the "-" character.
Why are you checking $4? It appears you should check $1 or $0, as litb said.
But awk is a heavyweight tool for this job. Try
grep -v '^-$'
to remove lines containing only a dash, or
grep -v '^ *- *$'
to remove lines containing only a dash and possibly some space characters.
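To compare the two grep patterns, here is a small sketch on made-up input that includes one dash padded with spaces. The sample values and variable names are assumptions for the illustration.

```shell
# One plain dash line and one dash line padded with spaces.
data=$(printf '%s\n' '-1033' ' - ' '222' '-' '10')

# Strict: drops only lines that are exactly "-".
strict=$(printf '%s\n' "$data" | grep -v '^-$')

# Loose: also drops dash lines with leading or trailing spaces.
loose=$(printf '%s\n' "$data" | grep -v '^ *- *$')

printf '%s\n' "$strict"
printf '%s\n' "$loose"
```

The strict pattern leaves the padded dash line in place, while the loose pattern removes both dash lines.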