Looking for a solution in bash (will be part of a larger script).

Given a variable containing information of the form

diff -r efb93662e8a7 -r 53784895c0f7 diff.txt
--- diff.txt Fri Jan 23 14:48:30 2009 +0000
+++ b/diff.txt Fri Jan 23 14:49:58 2009 +0000
@@ -1,9 +0,0 @@ 
-diff -r 9741ec300459 myfile.c 
---- myfile.c Thu Aug 21 18:22:17 2008 +0000 
-+++ b/myfile.c Thu Aug 21 18:22:17 2008 +0000
-@@ -1,4 +1,4 @@
-  int myfunc() 
-  { 
--     return 1; 
-+     return 10; 
-  }

I wish to extract all of the filenames (here diff.txt and myfile.c, but future cases may contain any number) into a string of the form "edited: filename1 filename2 ... filenameN".

To clarify, I wish to extract multiple matching filenames to a string.

  • The command "$(expr "$editing" : '.*---[[:space:]]\([[:graph:]]*\)[[:space:]]')" returns the last filename correctly but not previous instances.
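A likely reason only the last filename comes back: the leading `.*` in the expr pattern is greedy, so it consumes everything up to the final "--- " before the capture group starts. A minimal sketch, assuming a made-up, flattened input string (the value of `editing` below is hypothetical, not the real diff):

```shell
#!/usr/bin/env bash
# Hypothetical flattened input with two "--- name" occurrences.
editing='x --- diff.txt Fri Jan 23 y --- myfile.c Thu Aug 21 '

# The leading .* is greedy, so it swallows everything up to the LAST "--- ".
last=$(expr "$editing" : '.*---[[:space:]]\([[:graph:]]*\)[[:space:]]')
echo "$last"   # prints: myfile.c
```

Because expr returns a single capture per call, collecting every match needs some kind of loop over the string.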

EDIT: I need to be able to identify the edited filenames (possibly including spaces), i.e. the filenames appearing after "---" and before the day ("Fri", "Thu", ...).

Thanks for your help (and to the many people who have replied thus far).

A: 

Could you perform your operation before setting $editing - then you might still have the line breaks?

Then maybe some sed would be able to extract the filenames.

Douglas Leeder
It's possible to process this using a combination of grep, sed and awk, but this would involve the creation/deletion of a file, which I am hoping to avoid. Thanks for the input.
anon
I have to agree that having line breaks would definitely make this much cleaner. (Bash variables can contain line breaks, by the way)
David Zaslavsky
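
If the variable does keep its line breaks, a line-by-line pass in pure bash might look roughly like the sketch below. The function name `extract_edited` and the `sample` variable are hypothetical, and it assumes each header line has the form "--- name Day Mon DD ..." (optionally with extra leading dashes when one diff is quoted inside another, as in the question's sample). No temporary file is needed because the loop reads from a here-string.

```shell
#!/usr/bin/env bash
# Sketch only: extract_edited is a hypothetical helper name.
# Assumes "$1" still contains line breaks and that each header line looks
# like "--- <name> <Day> <date...>", possibly with extra leading dashes.
extract_edited() {
    local line out="edited:"
    # Greedy (.*) keeps spaces inside the name; the day name ends it.
    local re='^-*--- (.*) (Mon|Tue|Wed|Thu|Fri|Sat|Sun) '
    while IFS= read -r line; do
        if [[ $line =~ $re ]]; then
            out+=" ${BASH_REMATCH[1]}"
        fi
    done <<< "$1"
    printf '%s\n' "$out"
}

sample='diff -r efb93662e8a7 -r 53784895c0f7 diff.txt
--- diff.txt Fri Jan 23 14:48:30 2009 +0000
+++ b/diff.txt Fri Jan 23 14:49:58 2009 +0000
@@ -1,9 +0,0 @@
-diff -r 9741ec300459 myfile.c
---- myfile.c Thu Aug 21 18:22:17 2008 +0000
-+++ b/myfile.c Thu Aug 21 18:22:17 2008 +0000'

extract_edited "$sample"   # prints: edited: diff.txt myfile.c
```

A name that itself contains " Fri " or similar would still confuse this, so treat it as a starting point rather than a bullet-proof parser.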
+1  A: 

I'd suggest using an external tool for it - here's one way with perl:

$(echo "$variable" | perl -e 'print "edited:"; while (<>) { while (/--- (\S+)/g) { print " $1"; } }')

I'm sure it can be done more elegantly, but I can't think of a way right now that wouldn't take a more substantial program.

David Zaslavsky
+1  A: 

Here is a simple, working solution:

txt=$(cat)
str="edited:"

# Keep every whitespace-separated word that looks like name.ext
for word in $txt; do
        if echo "$word" | grep -qi '^[a-z0-9_-]*\.[a-z]*$'; then
           str="$str $word"
        fi
done

echo "$str"

Running it:

anton@CAPTAIN-FALCON ~/Desktop
$ bash sol.sh
diff -r efb93662e8a7 -r 53784895c0f7 diff.txt --- diff.txt Fri Jan 23 14:48:30 2
009 +0000 +++ b/diff.txt Fri Jan 23 14:49:58 2009 +0000 @@ -1,9 +0,0 @@ -diff -r
 9741ec300459 myfile.c ---- myfile.c Thu Aug 21 18:22:17 2008 +0000 -+++ b/myfil
e.c Thu Aug 21 18:22:17 2008 +0000 -@@ -1,4 +1,4 @@ - int myfunc() - { -- return
 1; -+ return 10; - }
edited: diff.txt diff.txt myfile.c myfile.c

Edit: Dicking around with grep for a while resulted in the following script, but I'm starting to wonder if pure bash is the right tool for the job... It seems like there would be many corner cases where you would either miss some files or get erroneous file names.

#! /bin/bash

rawFiles=`cat | grep -ioz ' -* [a-z0-9-_\ ]*\.[a-z]*'`

for file in $rawFiles; do
   if ! echo $file | grep -q '^-*$'; then
      files="$files${file} "
   fi
done

echo "edited: $files"
Very elegant. The only time this will not work properly is when the filenames mentioned in the diff have spaces in them, but that is so infrequent I doubt it is a legitimate concern.
Sean Bright
Perfect! Thanks very much. ** I'll post a link to the script when it's all done - hopefully it'll be of help to other people too. FYI it's a *fancy* backup script.
anon
Glad you like it :)
Ah the devil is in the details... To make the code bullet-proof I need to be able to identify filenames (with spaces) and distinguish between generic filenames and the files actually being edited i.e. filenames appearing after "---"
anon
+3  A: 

A solution using only bash built-ins, with no external programs:

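res="edited:"; var="${var#* --- } --- "   # trim up to the first " --- ", append a sentinel
while test -n "$var"; do
  res="$res ${var%% *}"    # the first word after " --- " is the filename
  var="${var#* --- }"      # skip past the next " --- "
done
echo "$res"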

It iterates over all occurrences of " --- ". The trick is to prepare the string by first trimming the garbage from the start (up to the first " --- ") and appending a " --- " at the end, which keeps the logic in the while loop simple.

This relies on one of bash's most useful features: the # and % parameter expansions, which trim a matching prefix or suffix from a string.
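
For anyone unfamiliar with these expansions, here is a small illustration on a made-up string (the variable `s` is only for demonstration). A single # or % removes the shortest match and the doubled form removes the longest:

```shell
#!/usr/bin/env bash
s='a --- b --- c'

echo "${s#* --- }"    # shortest matching prefix removed: b --- c
echo "${s##* --- }"   # longest matching prefix removed:  c
echo "${s% --- *}"    # shortest matching suffix removed: a --- b
echo "${s%% --- *}"   # longest matching suffix removed:  a
```

The loop above uses `#` to step past one " --- " at a time and `%%` to cut everything after the first space.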

Colas Nahaboo
Very elegant solution, thanks. I made a small edit to allow for spaces in filenames: "...do res="$res ${var%% [[:upper:]][[:graph:]]* *}";...". The or operator '|' didn't work for me in the pattern, but this will suffice for identifying "Mon|Tue|Wed...". Thanks again.
anon
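
On the '|' operator: plain glob patterns have no alternation, but bash's extglob option adds the `@(a|b|c)` form, which may be what is needed here. A sketch under the assumption that the input has been flattened onto one line and every filename is followed by a day name; `list_edited` is a hypothetical wrapper around the loop from the answer:

```shell
#!/usr/bin/env bash
shopt -s extglob   # enables @(a|b|c) alternation in glob patterns

# list_edited is a hypothetical helper name. Input is assumed to be one
# flattened line where each name is followed by " <Day> <date...>".
list_edited() {
    local var=$1 res="edited:"
    # Nothing to do if there is no " --- " marker at all.
    case $var in *' --- '*) ;; *) echo "$res"; return ;; esac
    var="${var#* --- } --- "
    while [ -n "$var" ]; do
        # Keep everything before the first " <Day> ", spaces included.
        res="$res ${var%% @(Mon|Tue|Wed|Thu|Fri|Sat|Sun) *}"
        var="${var#* --- }"
    done
    echo "$res"
}

list_edited 'diff -r 1 -r 2 my file.c --- my file.c Thu Aug 21 18:22:17 2008 +0000 +++ b/my file.c Thu Aug 21 18:22:17 2008 +0000'
# prints: edited: my file.c
```

Matching the actual day names is a little stricter than "any capitalised word", so a filename such as "My File.c" survives intact.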