I have a file that looks like this:

<a href="some-address">some-txt</a>
<a href="some-address">some-txt</a>
<a href="some-address">some-txt</a>
...

I need to download all the files referenced as "some-address". How can I do that using only bash?

+3  A: 

Why don't you use wget? It already has that feature:

wget --force-html -i yourfile.html
BatchyX
+1: Can't get simpler than this.
codaddict
A: 
cut -f 2 -d '"' file-with-addresses.txt

cut is a standard POSIX utility, so it is available on practically any system. This command splits each line using " as the delimiter and returns the second field, which is the address. To download the files with wget, pipe the result to xargs, as in Adam Rosenfield's answer:

cut -f 2 -d '"' file-with-addresses.txt | xargs wget
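
To make the field numbering concrete, here is a minimal sketch that runs cut on a single line. The sample line and the example.com address are assumptions made up for illustration; they are not from the original file.

```shell
# Hypothetical sample line in the same shape as the question's file.
line='<a href="http://example.com/a.txt">some-txt</a>'

# Splitting on " gives three fields:
#   1: <a href=
#   2: http://example.com/a.txt
#   3: >some-txt</a>
url=$(printf '%s\n' "$line" | cut -d '"' -f 2)
echo "$url"
```

Note that this only works when the address is the first quoted string on the line; an extra attribute before href would shift the field number.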
adamse
+2  A: 

Here's one way to do that using a combination of sed, xargs, and wget:

sed -n 's/.*<a href="\([^"]*\)">.*/\1/p' input-file | xargs wget
Adam Rosenfield
A couple of tweaks: you might want to change [^"]* to [^"]\+ so that the address must contain at least one character (\+ is a GNU sed extension; [^"][^"]* is the portable equivalent), and you might want to use xargs -n 1 so that wget is invoked once for each address.
Adam Liss
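
The two tweaks from the comment above could be combined roughly as follows. This is only a sketch: the temporary input file and the example.com addresses are made up for illustration, and echo is placed in front of wget so the commands are printed rather than run. Remove the echo to actually download.

```shell
# Hypothetical input file in the same shape as the question's.
input=$(mktemp)
cat > "$input" <<'EOF'
<a href="http://example.com/a.txt">some-txt</a>
<a href="">empty</a>
<a href="http://example.com/b.txt">some-txt</a>
EOF

# [^"][^"]* is the portable spelling of [^"]\+, so the empty href
# on the second line does not match and is skipped.
urls=$(sed -n 's/.*<a href="\([^"][^"]*\)">.*/\1/p' "$input")

# -n 1 makes xargs run the command once per address.
# echo turns this into a dry run that only prints the wget commands.
printf '%s\n' "$urls" | xargs -n 1 echo wget

rm -f "$input"
```

Because the leading .* in the sed pattern is greedy, only the last link on a line is extracted, so this assumes one link per line, as in the question.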