
Often I need to delete all the files not in a specific svn source tree. To get the list of all their file names, I use:

svn st | grep ^? | awk '{print $2}'

This command will give me a list of file names, one name per line. Then how can I express the idea of

for (each line in $(svn st | grep ^? | awk '{print $2}'))
    rm -f line

?

A: 

There are a few ways to do this, each with its own drawbacks. One of the simpler ones is a while read loop. Another way is a for loop with IFS changed to a newline, but that's a bit uglier without much of any benefit. You may see it done, though, especially when word splitting on spaces is actually wanted.

git status | while read line; do
    echo "I'm an albatross! $line"
done
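
The IFS variant mentioned above might look like the following sketch (unquoted expansion still performs globbing, hence the `set -f`; the parentheses keep the IFS change local to a subshell):

```shell
# For-loop alternative: split command output on newlines only.
# set -f disables globbing, which would otherwise expand any
# wildcard characters appearing in the lines.
(
    IFS=$'\n'
    set -f
    for line in $(git status); do
        echo "I'm an albatross! $line"
    done
)
```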

or, using process substitution (which gets rid of a subshell which can add nasty complications)

while read line; do
    echo "I'm an albatross! $line"
done < <(git status)

There are caveats to this which may make things more complicated in specific situations. You should read help read, and Bash FAQ #1 contains lots of gory details.
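
Applied to the question's svn pipeline, a sketch might look like this (`IFS= read -r` keeps read from trimming whitespace or eating backslashes, and `--` guards against filenames starting with a dash; the `{print $2}` step still breaks on names containing spaces):

```shell
# Remove every file svn reports as unversioned ('?' status).
svn st | grep '^?' | awk '{print $2}' | while IFS= read -r file; do
    rm -f -- "$file"
done
```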

Daenyth
A: 

Use command substitution with back-ticks: http://tldp.org/LDP/abs/html/commandsub.html

In your case:

rm -f `svn st | grep '^?' | awk '{print $2}'`
Ali Sattari
This will fail for a long list of files. xargs is safer, as it invokes the command as many times as needed, sizing each argv appropriately.
gawi
+5  A: 

Use xargs:

svn st | grep '^?' | awk '{print $2}' | xargs rm -rf

Note: your command doesn't handle filenames containing whitespace, because of {print $2}. You also have to be careful with xargs, since it splits its input on whitespace. So it's safer to use the -0 option if your filenames may contain whitespace.

This is an example of a more correct xargs usage:

find . -type f | tr '\n' '\0' | xargs -0 wc

Or, using find -print0 option:

find . -type f -print0 | xargs -0 wc
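
The same tr trick can be bolted onto the original svn pipeline, which has no -print0 equivalent (a sketch; it's still limited by {print $2} splitting on spaces):

```shell
# NUL-delimit the filename list so xargs -0 handles odd names safely.
svn st | grep '^?' | awk '{print $2}' | tr '\n' '\0' | xargs -0 rm -f
```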

xargs is one of the most powerful UNIX tools. I use it every day.

gawi
The odd `'{print $2'}` parity there makes me nervous. Should it be `'{print $2}'` or `{'print $2'}`?
Jim Davis
@Jim Davis Fixed. Thanks.
gawi
+2  A: 

Why are so many people invoking grep in the pipeline and either using bash's for loops or xargs on the back end when they have bloody awk right in the middle of things?

First let's get rid of the whimsical use of grep:

svn st | awk '/^\?/ { print $2 }'

Since awk allows you to filter lines based on regular expressions, the use of grep is entirely unnecessary. The regular expressions of awk aren't that different from grep (depending on which implementation of awk and which of grep you're using) so why add a whole new process and a whole new bottleneck in the pipeline?

From there you already have two choices that would be shorter and more readable:

# option 1
svn st | awk '/^\?/ { print $2 }' | xargs rm -f

# option 2
rm -f $(svn st | awk '/^\?/ { print $2 }')

Now that second option will only work if your file list doesn't exceed the maximum command line size, so I recommend the xargs version.

Or, perhaps even better, use awk again.

svn st | awk '/^\?/ { system("rm -f " $2) }'

This is the functional equivalent of what you did above with the for loop. It's grotesquely inefficient, since it executes rm once per file, but it's at least more readable than your for loop example. It can be improved still further, however. I won't go into full details here, but will instead give you comments as clues to what the final solution would look like.

svn st | awk 'BEGIN{ /* set up an array */ }; /^\?/ { /* add $2 to the array */ }; END{ /* system("rm -f ...") safe chunks of the array */ }'

OK, so that last one is a bit of a mouthful and too much to type off as a routine one-liner. Since you have to "often" do this, however, it shouldn't be too bad to put this into a script:

#!/usr/bin/awk -f
BEGIN {
  # set up your accumulator array
}

/^\?/ {
  # add $2 to the array
}

END {
  # invoke system("rm -f") on safe chunks of the accumulator array
}

Now your command line will be svn st | myawkscript.
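
A minimal sketch of what that script's body might contain, here inlined into a shell pipeline (hypothetical: the chunk size of 100 files per rm call is an arbitrary stand-in for "safe", and filenames containing whitespace or shell metacharacters are not handled):

```shell
# Accumulate '?' filenames in awk, then remove them in chunks
# rather than issuing one system("rm") call per file.
svn st | awk '
    /^\?/ { files[n++] = $2 }
    END {
        chunk = ""; count = 0
        for (i = 0; i < n; i++) {
            chunk = chunk " " files[i]
            if (++count == 100) {      # arbitrary "safe" chunk size
                system("rm -f" chunk)
                chunk = ""; count = 0
            }
        }
        if (count > 0)
            system("rm -f" chunk)
    }'
```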

Now I'll warn you I'm not equipped to check all this (since I avoid SVN and CVS like I avoid MS-DOS -- and for much the same reason). You might have to monkey with the #! line in the script, for example, or with the regular expression you use to filter, but the general principle remains about the same. And me personally? I'd use svn st | awk '/^?/ { print $2 }' | xargs rm -f for something I do infrequently. I'd only do the full script for something I do several times a week or more.

JUST MY correct OPINION
This is wonderful!
PainterZ
It seems the END expression will feed the list of file paths to "rm -f". In that case, wouldn't it suffer from the same problem, that the argument list might exceed the maximum command line size?
PainterZ
Hence the reference to "safe chunks". You put in a loop that grabs less than the maximum command line size at a time and issue `rm -f`. `awk` is a general-purpose programming language. It has loops and conditionals and calculations and all sorts of nifty stuff beyond pattern matching.
JUST MY correct OPINION
A: 

xargs is your friend:

svn st | awk '/^\?/ { print $2 }' | xargs rm -rf
Vijay Sarathi