ansaurus

Question

Extracting multiple parts of a string using bash

Answer 1

A:

To get rid of the newline, you can just echo it again:

$ echo $(echo "1=A00^35=D^150=1^33=1"|egrep -o "35=[^/^]*\^|150=[^/^]*\^")
35=D^ 150=1^

If that's not satisfactory (I think it may give you one line for the whole input file), you can use awk:

pax> echo '
1=A00^35=D^150=1^33=1
1=a00^35=d^157=11^33=11
' | awk -vLIST=35,150 -F^ ' {
        sep = "";
        split (LIST, srch, ",");
        for (i = 1; i <= NF; i++) {
            for (idx in srch) {
                split ($i, arr, "=");
                if (arr[1] == srch[idx]) {
                    printf "%s%s=%s", sep, arr[1], arr[2];  # fixed format, so a % in the data is printed as-is
                    sep = "^";
                }
            }
        }
        if (sep != "") {
            print sep;
        }
    }'
35=D^150=1^
35=d^

pax> echo '
1=A00^35=D^150=1^33=1
1=a00^35=d^157=11^33=11
' | awk -vLIST=1,33 -F^ ' {
        sep = "";
        split (LIST, srch, ",");
        for (i = 1; i <= NF; i++) {
            for (idx in srch) {
                split ($i, arr, "=");
                if (arr[1] == srch[idx]) {
                    printf "%s%s=%s", sep, arr[1], arr[2];  # fixed format, so a % in the data is printed as-is
                    sep = "^";
                }
            }
        }
        if (sep != "") {
            print sep;
        }
    }'
1=A00^33=1^
1=a00^33=11^

This one allows you to use a single awk script and all you need to do is to provide a comma-separated list of keys to print out.

And here's the one-liner version :-)

echo '1=A00^35=D^150=1^33=1
1=a00^35=d^157=11^33=11
' | awk -vLST=1,33 -F^ '{s="";split(LST,k,",");for(i=1;i<=NF;i++){for(j in k){split($i,arr,"=");if(arr[1]==k[j]){printf "%s%s=%s",s,arr[1],arr[2];s="^";}}}if(s!=""){print s;}}'
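
If you want to call this from a script, the awk program above can be wrapped in a small shell function so that only the key list changes between calls. This is just a sketch: the function name extract_fields is made up for the example, and it assumes the same ^-separated key=value input as above. It also compares keys as strings, which is a small deviation from the script above.

```shell
# extract_fields KEYS [FILE...]
# KEYS is a comma-separated list such as "35,150".
# Reads the named files, or standard input when no file is given.
# (extract_fields is a hypothetical name used only for this sketch.)
extract_fields() {
    keys=$1
    shift
    awk -v LIST="$keys" -F'^' '
    BEGIN { n = split(LIST, srch, ",") }   # split the key list once
    {
        out = ""
        for (i = 1; i <= NF; i++) {
            split($i, arr, "=")            # arr[1] is the key of this field
            for (j = 1; j <= n; j++) {
                # appending "" forces a string comparison
                if ((arr[1] "") == (srch[j] "")) {
                    out = out $i "^"
                }
            }
        }
        if (out != "") print out           # skip lines with no match
    }' "$@"
}
```

Usage would then be something like "extract_fields 35,150 < in", with the key list generated however you like.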

paxdiablo 2010-09-07 11:57:46

Thanks for your reply Pax. I have edited the question to better depict my problem. The awk solution would be awesome, except that I will not always want 35 and 150. I am already generating the egrep regex and generating the entire awk statement seems a bit brute force.

Dave 2010-09-07 12:18:21

@Dave, see the update. The script itself doesn't change since you now just provide a list of tokens of interest. The only thing you need to generate dynamically is the `-vLIST=` bit.

paxdiablo 2010-09-07 12:28:33

Thanks a million!

Dave 2010-09-07 12:52:56

Answer 2

+1 A:

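You have two options. Option 1 is to change the field separator (IFS) and use set --. Word splitting only applies to the result of an expansion, so the string has to come from a variable:

str='1=A00^35=D^150=1^33=1'
OFS=$IFS
IFS="^ "
set -- $str  # No quotes here!!
IFS="$OFS"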

Now you have your values in $1, $2, etc.

Or you can use an array:

tmp=$(echo "1=A00^35=D^150=1^33=1" | sed -e 's:\([0-9]\+\)=: [\1]=:g' -e 's:\^ : :g')
eval value=($tmp)
echo "35=${value[35]}^150=${value[150]}"
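
For what it's worth, Option 1 can be turned into a small function that picks out the wanted keys without eval. This is only a sketch under a few assumptions: the name pick_fields is invented for the example, the record is passed as the first argument, and the wanted keys are passed as one space-separated string.

```shell
# pick_fields RECORD "KEY1 KEY2 ..."
# (pick_fields is a hypothetical helper name used only for this sketch.)
# The body runs in a subshell, so the IFS and set -f changes do not leak out.
pick_fields() (
    record=$1
    wanted=$2
    out=
    set -f              # no filename globbing during the unquoted expansions
    IFS='^'
    set -- $record      # unquoted on purpose: split the record on ^
    IFS=' '
    for pair in "$@"; do
        for key in $wanted; do
            case $pair in
                "$key="*) out="$out$pair^" ;;   # field starts with "key="
            esac
        done
    done
    printf '%s\n' "$out"
)
```

Calling pick_fields "$line" "35 150" prints the matching fields in the order they appear in the record, each followed by ^.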

Aaron Digulla 2010-09-07 12:18:10

Answer 3

A:

Given a file 'in' containing your strings:

$ for i in $(cut -d^ -f2,3 < in);do echo $i^;done
35=D^150=1^
35=D^150=2^
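
The same thing can be sketched without the for loop, which avoids word splitting on values that contain spaces. This assumes, like the command above, that the wanted tags are always the 2nd and 3rd fields. The function name fields_2_3 is made up for the example.

```shell
# fields_2_3: read ^-separated records on standard input and print
# fields 2 and 3 of each one, followed by a trailing ^.
# (fields_2_3 is a hypothetical name used only for this sketch.)
fields_2_3() {
    # -s skips lines that contain no ^ at all (for example empty lines)
    cut -s -d'^' -f2,3 | sed 's/$/^/'
}
```

It would be used as "fields_2_3 < in".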

matja 2010-09-07 12:46:21
