Dear community,

I hope the task below will be very easy for sed lovers. I am not a sed guru, but I need to express the following task in sed, as sed is more popular on Linux systems.

The input text stream is produced by "make depends" and looks like the following:

pgm2asc.o: pgm2asc.c ../include/config.h amiga.h list.h pgm2asc.h pnm.h \
 output.h gocr.h unicode.h ocr1.h ocr0.h otsu.h barcode.h progress.h
box.o: box.c gocr.h pnm.h ../include/config.h unicode.h list.h pgm2asc.h \
 output.h
database.o: database.c gocr.h pnm.h ../include/config.h unicode.h list.h \
 pgm2asc.h output.h
detect.o: detect.c pgm2asc.h pnm.h ../include/config.h output.h gocr.h \
 unicode.h list.h

I need to catch only C++ header files (i.e. those ending with .h), make the list unique, and print it as a space-separated list with src/ prepended as a path prefix. This is achieved by the following Perl script:

make libs-depends | perl -e 'while (<>) { while (/ ([\w\.\/]+?\.h)/g) { $a{$1} = 1; } } print join " ", map { "src/$_" } keys %a;'

The output is:

src/unicode.h src/pnm.h src/progress.h src/amiga.h src/ocr0.h src/ocr1.h src/otsu.h src/barcode.h src/gocr.h src/../include/config.h src/list.h src/pgm2asc.h src/output.h

Please help me express this in sed.

A: 

Sed probably isn't the best tool here as it's stream-oriented. You could possibly use it to convert the spaces to newlines though, pipe that through sort and uniq, then use sed again to convert the newlines back to spaces.

Typing this on my phone, though, so can't give exact commands :(

pdbartlett
Looking again at the requirements you might need grep in there as well to just select the headers, and use one of the sed commands to prepend the src prefix. By now I'd be reaching for Perl, Python, Ruby, etc. though...
pdbartlett
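For reference, the pipeline described above could be sketched roughly as follows. This is only a sketch under some assumptions: `list_headers` is a hypothetical helper name, `tr` stands in for the first sed step, `paste` is used instead of sed for the final join (it avoids a trailing space), and `LC_ALL=C` is set only to make the sort order predictable.

```shell
# Hypothetical helper: reads "make depends"-style text on stdin and prints
# the unique .h files, each prefixed with src/, on one space-separated line.
list_headers() {
    tr -s ' \\' '\n\n' |    # spaces and continuation backslashes become newlines
    grep '\.h$' |           # keep only tokens ending in .h
    LC_ALL=C sort -u |      # sort and drop duplicates
    sed 's:^:src/:' |       # prepend the path prefix
    paste -sd ' ' -         # join the lines back with single spaces
}

# Usage (assumes the make target from the question):
#   make libs-depends | list_headers
```

Unlike the Perl hash in the question, this prints the names in sorted order.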
+2  A: 

Not sed but hope this helps you:

make libs-depends | grep -io --perl-regexp "[\w\.\/]+\.h " | sort -u | sed -e 's:^:src/:' 
Anton
I would replace "sort | uniq" by "sort -u"
Nicolas Viennot
@Pafy: Old habits die hard :) Replaced it.
Anton
@Anton: Nice! Minor thing: how do I join all the lines? I believe that `sed -e 's/\n/ /g'` will not work, as I would need to accumulate all rows into a buffer before running the regex.
dma_k
@dma_k, Maybe not relevant anymore, but a quick and easy way to join lines is `tr '\n' ' '`. This replaces each newline character \n with a space.
Anton
@Anton: I found a small bug in the script: the regexp should have a leading space, `' [\w\.\/]+\.h'`, because the input might contain lines like `detect.o: detect.h\n`, so the leading space is guaranteed, but the trailing one is not. Unfortunately, I cannot adjust the sed replacement to match: `sed -e 's:^ :src/:'` does not work. Please help.
dma_k
@dma_k: As the initial whitespace is optional, you can use: `sed -e 's:^\s*:src/:'`
Anton
@Anton: I found the root of the problem: I have `grep` aliased to `grep --color=always`, which always highlights the match found. This makes the input for `sed` look like this: `00000000 1b 5b 30 31 3b 33 31 6d 1b 5b 4b 20 61 6d 69 67 |.[01;31m.[K amig|`, and of course sed cannot match the beginning of the line. From a script it works like a charm.
dma_k
@dma_k: I don't know what I was drinking, but luckily didn't manage to mislead you.
Anton
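One way to sidestep the alias problem described above when working interactively is to bypass the alias and switch colouring off explicitly. The following is a sketch only: `extract_headers` is a hypothetical name, `-o` and `--color` are assumed to be available (they are in GNU grep), and a plain basic regular expression replaces the `--perl-regexp` pattern from the answer.

```shell
# Hypothetical helper: pick out " name.h" tokens and replace the leading
# space with the src/ prefix.
extract_headers() {
    # "command" skips any alias or shell function named grep, and
    # --color=never keeps escape sequences out of the pipe.
    # Note: this simple pattern would also match the start of a name such
    # as foo.hpp; the input in the question contains no such names.
    command grep --color=never -o ' [^ ]*\.h' |
    sed 's:^ *:src/:'
}

# Usage (assumes the make target from the question):
#   make libs-depends | extract_headers | sort -u | tr '\n' ' '
```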
+1  A: 

If you really want to do this in pure sed:

make libs-depends | sed 's/ /\n/g' | sed '/\.h$/!d;s/^/src\//' | sed 'G;/^\(.*\)\n.*\1/!h;$!d;${x;s/\n/ /g}'

The first sed command breaks the output up into separate lines, the second filters out everything but *.h and prepends 'src/', the third gloms the lines together without repetition.
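To make the third stage easier to follow, here is the same pipeline written out with the last sed script spread over several commented lines. Treat it as an annotated sketch: the wrapper name `headers_pure_sed` is hypothetical, and GNU sed is assumed (`\n` in the replacement of `s`, and `.` matching a newline inside the pattern space).

```shell
# Hypothetical wrapper around the three-stage pipeline above.
headers_pure_sed() {
    sed 's/ /\n/g' |
    sed '/\.h$/!d;s/^/src\//' |
    sed '
        # Append the hold space (the lines kept so far) to the current line.
        G
        # Unless the current line shows up again further on in the pattern
        # space, copy the combined text back into the hold space.
        /^\(.*\)\n.*\1/!h
        # Print nothing until the last input line has been read.
        $!d
        # On the last line, fetch the accumulated list and join it with spaces.
        ${
            x
            s/\n/ /g
        }
    '
}

# Usage (assumes the make target from the question):
#   make libs-depends | headers_pure_sed
```

Because each new line is placed in front of the ones already held, the names come out in reverse order of first appearance, followed by a trailing space.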

Beta
I am not an expert on `sed`, but I believe you can rephrase it as a single sed call, i.e. `make libs-depends | sed -e 's/ /\n/g' -e '/\.h$/!d;s/^/src\//' -e 'G;/^\(.*\)\n.*\1/!h;$!d;${x;s/\n/ /g}'`
Grzegorz Oledzki
@GrzegorzOledzki: yes and no. The first call must be separate; I know I could combine the second and third, and maybe make things a little tighter elsewhere, but I was going for readability.
Beta
@Beta: Thanks for the solution. Why should the first sed be separate? Is it because the 2nd sed needs to "sense" the newly generated lines?
dma_k
@dma_k: yes, exactly. It *can* be done with one call, but that's ugly.
Beta
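A small experiment may help to show why the first call has to be separate. The sketch below assumes GNU sed, and `one_call` is a hypothetical name. The newlines inserted by `s` stay inside one pattern space, so the later commands see a single multi-line "line" rather than one token per cycle.

```shell
# Hypothetical helper: the first two stages merged into a single sed call.
one_call() {
    sed -e 's/ /\n/g' -e '/\.h$/!d;s/^/src\//'
}

printf 'a.o: a.c a.h b.h\n' | one_call
# With GNU sed this prints:
#   src/a.o:
#   a.c
#   a.h
#   b.h
# The whole pattern space ends in .h, so nothing is deleted, and src/ is
# prepended only once, at the very start.
```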