ansaurus

Question

[Bash] Save part of matching pattern to variable

Answer 1

+1 A:

There's probably a better way using bash only, but:

echo 'Apr 12 19:24:17 PC_NMG kernel: sd 11:0:0:0: [sdf] Attached SCSI removable disk' \
| sed 's/.*\[\(.*\)\].*/\1/'

As Jurgen points out, this also prints non-matching lines unchanged. If you don't want that, use '-n' so sed doesn't print the pattern space automatically, and the 'p' flag on the substitution to print it only when the pattern matches.

| sed -n 's/.*\[\(.*\)\].*/\1/p'
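To get the result into a shell variable, which is what the question asks for, the sed pipeline can be wrapped in command substitution. This is only a minimal sketch; the variable names line and drive are illustrative and not part of the original answer.

```shell
line='Apr 12 19:24:17 PC_NMG kernel: sd 11:0:0:0: [sdf] Attached SCSI removable disk'

# -n suppresses automatic printing; the p flag prints only when the substitution succeeds
drive=$(printf '%s\n' "$line" | sed -n 's/.*\[\(.*\)\].*/\1/p')

echo "$drive"
```

If the line contains no bracketed text, drive ends up empty, so it is worth checking it with [ -n "$drive" ] before using it.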

Stephen 2010-04-12 18:23:50

This also prints non-matching lines

Jürgen Hötzel 2010-04-12 20:49:29

@Jurgen Hotzel: Thanks, edited a fix.

Stephen 2010-04-12 21:47:33

Answer 2

+1 A:

Match against regex, replace using grouping and only print if regex matched:

sed -n "s/.*\[\(.*\)\].*/\1/p"

Jürgen Hötzel 2010-04-12 19:31:58

Answer 3

A:

The .* in the sed answers is greedy, so they return only the last bracketed value and miss the rest of the data if a line contains more than one [] pair. Use the grep+tr solution, or use awk:

$ cat file
[sss]Apr 12 19:24:17 PC_NMG kernel: sd 11:0:0:0: [sdf] Attached SCSI removable disk [tag] blah blah

$ awk -F"[" '{for(i=2;i<=NF;i++){if($i~/\]/){sub("].*","",$i)};print $i}}' file
sss
sdf
tag
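For comparison, here is a small sketch of what the greedy sed pattern does with the same line, next to one possible shape of a grep+tr pipeline. The grep+tr solution mentioned above is not shown in this thread, so the pipeline below is an assumption about what was meant. The variable names are illustrative, and grep -o is a GNU/BSD extension rather than POSIX.

```shell
line='[sss]Apr 12 19:24:17 PC_NMG kernel: sd 11:0:0:0: [sdf] Attached SCSI removable disk [tag] blah blah'

# The leading greedy .* pushes the match to the last bracket pair on the line
last_only=$(printf '%s\n' "$line" | sed -n 's/.*\[\(.*\)\].*/\1/p')

# grep -o prints each bracketed token on its own line; tr -d then strips the brackets
all_tags=$(printf '%s\n' "$line" | grep -o '\[[^]]*\]' | tr -d '[]')

echo "$last_only"
echo "$all_tags"
```

Here last_only holds just tag, while all_tags holds sss, sdf and tag on separate lines.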

ghostdog74 2010-04-13 00:30:29

Answer 4

+1 A:

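BASH_REMATCH is an array that the shell fills in when the =~ operator inside [[ ]] matches: element 0 holds the whole match, and elements 1 onward hold the parenthesized groups.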

$ line='Apr 12 19:24:17 PC_NMG kernel: sd 11:0:0:0: [sdf] Attached SCSI removable disk'
$ [[ $line =~ \[([^]]+)\] ]]; echo "${BASH_REMATCH[1]}"
sdf

If you want to put this in a loop, you can do that; here's an example:

while read -r line; do
  if [[ $line =~ \[([^]]+)\] ]] ; then
    drive="${BASH_REMATCH[1]}"
    do_something_with "$drive"
  fi
done < <(dmesg | grep -E '\[([hsv]d[^]]+)\]')

This approach puts no external calls into the loop -- so the shell doesn't need to fork and exec to start external programs such as sed or grep. As such, it is arguably significantly cleaner than other approaches offered here.
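When a line holds several bracket pairs (as in the sample line from the awk answer), the same idea can be repeated on the unmatched remainder of the string. This is a sketch rather than a definitive implementation: the names re, rest and matches are made up for illustration, and it needs bash 3.1 or later for the += array append. Keeping the regex in a variable also avoids the quoting differences between bash versions.

```shell
line='[sss]Apr 12 19:24:17 PC_NMG kernel: sd 11:0:0:0: [sdf] Attached SCSI removable disk [tag] blah blah'

re='\[([^]]+)\]'   # one bracket pair; group 1 is the text inside it
matches=()
rest=$line

while [[ $rest =~ $re ]]; do
  matches+=("${BASH_REMATCH[1]}")
  # Drop everything up to and including the text that just matched
  rest=${rest#*"${BASH_REMATCH[0]}"}
done

printf '%s\n' "${matches[@]}"
```

This prints sss, sdf and tag on separate lines without starting any external process.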

BTW, your initial approach (using grep) was not that far off; using grep -o will output only the matching substring:

$ subtext=$(grep -E -o "\[[^]]*\]" <<<"$line")

...though this includes the brackets inside the capture, and thus is not 100% correct.
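One way to finish the job is to trim the brackets afterwards with parameter expansion. This is a minimal sketch that assumes a single bracket pair per line; with several pairs, grep -o returns several lines and only the outermost brackets would be trimmed.

```shell
line='Apr 12 19:24:17 PC_NMG kernel: sd 11:0:0:0: [sdf] Attached SCSI removable disk'

# grep -o keeps the brackets, so subtext is "[sdf]" at this point
subtext=$(printf '%s\n' "$line" | grep -o '\[[^]]*\]')

# Strip one leading "[" and one trailing "]"
subtext=${subtext#\[}
subtext=${subtext%\]}

echo "$subtext"
```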

Charles Duffy 2010-04-13 00:49:53

But bash's while read loop is significantly slower than awk (and similar tools) for iterating over a big file. By the way, I get no output from your first version without the while loop. The `]` in your character ranges should not be escaped.

ghostdog74 2010-04-13 01:09:43

@ghostdog - updated, thanks. I *do* get output even as-is, but that's bash 4. I agree that the read loop is slow -- filtering once on the input side is much, much better than filtering inside your loop, and you *have* to have a loop if you're going to match more than one line.

Charles Duffy 2010-04-13 01:14:40
