I have already extracted the tag from the source document using grep, but now I can't figure out how to easily extract the attribute values from the string. I also want to avoid using any programs that would not usually be present on a standard installation.

tag='<img src="http://imgs.xkcd.com/comics/barrel_cropped_(1).jpg" title="Don'\''t we all." alt="Barrel - Part 1" />'

I need to end up with the following variables

src="http://imgs.xkcd.com/comics/barrel_cropped_(1).jpg"
title="Don't we all."
alt="Barrel - Part 1"
A: 

Looks like a job for sed.

dacracot
I couldn't figure out the man page
GameFreak
+1  A: 

I went with dacracot's suggestion of using sed, although I would have preferred it if he had given me some sample code.

src=$(echo "$tag" | sed 's/.*src=["]\(.*\)["] title=["]\(.*\)["] alt=["]\(.*\)["].*/\1/')
title=$(echo "$tag" | sed 's/.*src=["]\(.*\)["] title=["]\(.*\)["] alt=["]\(.*\)["].*/\2/')
alt=$(echo "$tag" | sed 's/.*src=["]\(.*\)["] title=["]\(.*\)["] alt=["]\(.*\)["].*/\3/')
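
The three commands above repeat one pattern and depend on the attributes appearing in the order src, title, alt. As a rough sketch, the same sed idea can be wrapped in a small helper that looks up one attribute by name. The name get_attr is made up for illustration, and the helper assumes double-quoted values with no embedded double quotes and a tag that fits on one line.

```shell
# Hypothetical helper: print the value of one double-quoted attribute.
# $1 = tag text, $2 = attribute name
get_attr() {
    # -n plus the p flag prints only when the attribute was found.
    printf '%s\n' "$1" | sed -n 's/.*[[:space:]]'"$2"'="\([^"]*\)".*/\1/p'
}

tag='<img src="http://imgs.xkcd.com/comics/barrel_cropped_(1).jpg" title="Don'\''t we all." alt="Barrel - Part 1" />'

src=$(get_attr "$tag" src)
title=$(get_attr "$tag" title)
alt=$(get_attr "$tag" alt)

printf '%s\n' "src: $src" "title: $title" "alt: $alt"
```

Because each lookup matches only its own attribute, the order of the attributes in the tag should not matter here.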

GameFreak
Charles Duffy
Sorry I didn't work out the sed script for you; I didn't have time right then.
dacracot
If you don't have time to write a good answer, then don't write one. Even if you do, be sure to come back and edit it later.
ephemient
What is your definition of good? I find it very amusing that my answer is selected with -1 votes. No, I didn't code it for him, but I sent him in the right direction to find the answer. Give a man a fish and you feed him for a day. Teach a man to fish and you feed him for a lifetime.
dacracot
+3  A: 

You can use xmlstarlet. Then, you don't even have to extract the element yourself:

$ echo "$tag" | xmlstarlet sel -t --value-of '//img/@src'
http://imgs.xkcd.com/comics/barrel_cropped_(1).jpg

You can even turn this into a function

$ get_attribute() {
  echo "$1" | xmlstarlet sel -t -o "&quot;" -v "$2" -o "&quot;"
  }
$ src=$(get_attribute "$tag" '//img/@src')

If you don't want to reparse the document several times, you can also do:

$ get_values() {
   file=${!#}                 # the last argument is the XML file
   set -- "${@:1:$#-1}"       # drop it, leaving only the var=xpath pairs
   cmd="xmlstarlet sel"
   for arg in "$@"
   do
      if [ -n "$arg" ]
      then
        var=${arg%%=*}
        expr=${arg#*=}
        cmd+=" -t -o \"$var=&quot;\" -v $expr -o \"&quot;\" -n"
      fi
   done
   eval "$cmd" "$file"
  }
$ eval $(get_values src='//img/@src' title='//img/@title' your_file.xml)
$ echo $src
http://imgs.xkcd.com/comics/barrel_cropped_(1).jpg
$ echo $title
Don't we all.

I'm sure there's a better way to remove the last argument to a shell function, but I don't know it.

Torsten Marek
Oh, then xmlstarlet might not be available on a standard installation. Sorry, I think it was a little too late when I wrote the answer...
Torsten Marek
A: 

If xmlstarlet is available on a standard installation and the sequence of src-title-alt does not change, you can use the following code as well:

tag='<img src="http://imgs.xkcd.com/comics/barrel_cropped_(1).jpg" title="Don'"'"'t we all." alt="Barrel - Part 1" />'
xmlstarlet sel -T -t -m "/img" -m "@*" -v '.' -n <<< "$tag"
IFS=$'\n'
array=( $(xmlstarlet sel -T -t -m "/img" -m "@*" -v '.' -n <<< "$tag") )
src="${array[0]}"
title="${array[1]}"
alt="${array[2]}"

printf "%s\n" "src: $src" "title: $title" "alt: $alt"
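
If xmlstarlet is not installed, plain POSIX parameter expansion can pull the values out without starting any external program, and it does not rely on the attribute order. This is only a sketch: it assumes each attribute is present, is preceded by a space, and uses double quotes with no embedded double quotes. The variable rest is just a scratch name.

```shell
tag='<img src="http://imgs.xkcd.com/comics/barrel_cropped_(1).jpg" title="Don'\''t we all." alt="Barrel - Part 1" />'

# ${tag#* src=\"} removes the shortest prefix ending in ' src="';
# ${rest%%\"*} then removes everything from the first remaining '"' onward.
rest=${tag#* src=\"}
src=${rest%%\"*}

rest=${tag#* title=\"}
title=${rest%%\"*}

rest=${tag#* alt=\"}
alt=${rest%%\"*}

printf '%s\n' "src: $src" "title: $title" "alt: $alt"
```

If an attribute is missing, the prefix removal leaves the tag unchanged and the variable ends up holding the start of the tag, so a real script would want to check for that case first.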
lmxy