Given this data

34 foo
34 bar
34 qux
62 foo1
62 qux
78 qux

I want to replace the string in the 2nd column with "" if it is "qux", resulting in:

34 foo
34 bar
34 
62 foo1
62 
78

How do you do that with sed? Note that the data is very big, with ~10^7 lines.

+2  A: 

If the lines have no trailing whitespace:

sed 's/qux$//' < file

If it must be in the second column (of potentially more than three columns):

sed 's/\([0-9][  ]*\)qux\(.*\)/\1\2/'

(Note that the brackets contain a literal tab and a space; not every sed matches tabs with '\t'.)

But awk is better for tabular data:

awk '{ if ($2 == "qux") {$2 = ""; print} else { print }; }' < file
guns
If you want to keep the blank lines sed 's/qux//' < file
ojblass
Sorry, didn't realize the numbers were a column... as you were.
ojblass
That awk solution doesn't change the file (in Cygwin at least).
paxdiablo
Never mind, I see the problem, it was the second ==, I've fixed it for ya'.
paxdiablo
@Pax: how could I do the replacement in place on the file? And how do you make awk print tab-delimited output? Right now it prints space-delimited.
neversaint
@foolishbrat Standard (POSIX) sed and awk don't do in-place edits; GNU sed has an -i option, or you can use perl or super sed (ssed). Otherwise the standard thing to do is use a temporary file.
guns
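The temporary-file approach mentioned above might look roughly like this. It is only a sketch: the demo file is created in a scratch directory, and the names "data.txt" and "$file.tmp" are made up for illustration.

```shell
# Sketch: emulate an in-place edit by writing to a temporary file first.
# The scratch directory and file names are hypothetical, for demo only.
dir=$(mktemp -d) || exit 1
file="$dir/data.txt"
printf '34 foo\n34 qux\n78 qux\n' > "$file"

# Write the edited copy next to the original, then replace the original
# only if awk succeeded.
awk '$2 == "qux" { $2 = "" } { print }' "$file" > "$file.tmp" &&
    mv "$file.tmp" "$file"

cat "$file"
```

The && keeps the original file untouched if awk fails partway through, which matters with ~10^7 lines.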
@foolishbrat, you could pass the output through <sed 's/SPACE/TAB/g'> - that's the fastest way to get a result.
paxdiablo
Or there's an OFS variable which is by default a space - set that to a tab. But you shouldn't need to, since $0 preserves the tabs (in my answer - I haven't tested guns').
paxdiablo
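For tab-delimited data, the OFS suggestion above could be sketched as follows, assuming the input really is tab-separated. The function name blank_qux is made up for this example.

```shell
# Sketch: blank the 2nd column when it is "qux", keeping tabs as delimiters.
# blank_qux is a hypothetical helper name; it reads stdin and writes stdout.
blank_qux() {
    # FS splits on tabs; OFS is used when awk rebuilds a modified line.
    awk 'BEGIN { FS = OFS = "\t" } $2 == "qux" { $2 = "" } { print }'
}

printf '34\tfoo\tx\n34\tqux\ty\n' | blank_qux
```

Lines that are not modified are printed exactly as read; only the lines where $2 is reassigned get rebuilt with OFS.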
+6  A: 

I wouldn't actually do it with sed since that's not the best tool for the job. The awk tool is my tool of choice whenever somebody mentions columns.

cat file | awk '$2 == "qux" { print $1 } $2 != "qux" { print $0 }'

or the simplest form:

cat file | awk '{ if ($2 == "qux") {$2 = ""}; print }'

If you must use sed:

cat file | sed 's/  *qux *$//'

making sure that you use the correct white space (the above uses only spaces).
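If the file may mix spaces and tabs, one way to get the white space right without typing a literal tab is to put it in a shell variable. This is a sketch only; strip_qux is a made-up name, and like the command above it only handles "qux" as the last column.

```shell
# Sketch: match spaces or tabs around a trailing "qux" without relying on \t.
# strip_qux is a hypothetical helper name; it reads stdin and writes stdout.
tab=$(printf '\t')

strip_qux() {
    # One or more blanks, then qux, then optional trailing blanks at end of line.
    sed "s/[ $tab][ $tab]*qux[ $tab]*\$//"
}

printf '34 foo\n34\tqux\n62 qux  \n78 quxx\n' | strip_qux
```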

paxdiablo
@Pax: thanks for the reply. But your approach with awk gives this instead: 34 bar3434 qux62 foo16262 qux7878 qux
neversaint
Try again. I had a $s instead of $2.
paxdiablo