I'm building a little tool that will download files using wget, reading the URLs from different files. The same URL may be present in different files; a URL may even appear several times in one file. It would be inefficient to download a page several times (every time its URL is found in the lists).

Thus, the simple approach is to save the downloaded file and to instruct wget not to download it again if it is already there.

That would be very straightforward; however, the URLs are very long (many GET parameters) and therefore cannot be used as filenames as-is (wget gives the error 'Cannot write to... [] file name too long').

So, I need to rename the downloaded files. But for the caching mechanism to work, the renaming scheme needs to implement "one url <=> one name": if a given URL can have multiple names, the caching does not work (i.e., if I simply number the files in the order they are found, wget cannot identify which URLs have already been downloaded).

The simplest renaming scheme would be to calculate an MD5 hash of the URL string itself (and not of the file contents, which is what md5sum does on a file); that would ensure the filename is unique and that a given URL always results in the same name.

It's possible to do this in Perl, etc., but can it be done directly in bash or using a system utility (RedHat)?

+2  A: 

Sounds like you want the md5sum system utility.

# echo appends a trailing newline, but the hash is still stable per URL
URLMD5=$(/bin/echo "$URL" | /usr/bin/md5sum | /bin/cut -f1 -d" ")

If you only want to hash the filename portion of the URL, you can extract it quickly with sed:

FILENAME=$(echo "$URL" | /bin/sed -e 's#.*/##')
URLMD5=$(/bin/echo "$FILENAME" | /usr/bin/md5sum | /bin/cut -f1 -d" ")
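
Putting the pieces together, a download loop could look something like the sketch below (not part of the original answer; the cache directory and list file names are illustrative assumptions):

#!/bin/bash
# Sketch: download each URL at most once, naming the saved file
# after the MD5 hash of the URL itself.
CACHE_DIR="$HOME/wget-cache"   # illustrative location
mkdir -p "$CACHE_DIR"

cat urls1.txt urls2.txt | while read -r URL; do
    URLMD5=$(/bin/echo "$URL" | /usr/bin/md5sum | /bin/cut -f1 -d" ")
    TARGET="$CACHE_DIR/$URLMD5"
    if [ ! -f "$TARGET" ]; then
        wget -O "$TARGET" "$URL"
    fi
done

Since wget is told the output name explicitly with -O, the 'file name too long' problem never arises, and re-running the script skips anything already in the cache.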
Epsilon Prime
Well, many thanks for the rapid answer; I hadn't realized that I could simply use md5sum this way! I don't understand what you are saying about the 'filename', though: when the md5 key is calculated, there are no filenames yet...?
@bambax: Epsilon Prime is referring to the filename portion of the URL, for example: "index.html". The `sed` command strips off everything up to and including the last slash.
Dennis Williamson
@Dennis: Ok, thanks; but in that case I certainly do not want to just use the filename part of the URL, as different sets of GET parameters should result in different files being cached/retrieved.
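(For illustration only: hashing the full URL, query string included, gives a different name for each parameter set. The URLs below are made up:)

/bin/echo "http://example.com/page?a=1" | /usr/bin/md5sum   # one 32-character name
/bin/echo "http://example.com/page?a=2" | /usr/bin/md5sum   # a different name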
A: 

Newer versions of Bash (4.0 and later) provide associative arrays in addition to indexed arrays. Something like this might work for you:

declare -A myarray
myarray["url1"]="url1_content"   # already downloaded
myarray["url2"]=""               # not downloaded yet

if [ -n "${myarray[url1]}" ]; then
    echo "Cached"
fi

wget will typically rename duplicate downloads to filename.html.1, .2, etc., so you could use the associative array to record which URLs have already been downloaded and what the actual filename was.
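
A sketch of how that might look (the array lasts only for a single run of the script, so this caches within one invocation; urls.txt is an illustrative file name):

#!/bin/bash
# Sketch: remember downloaded URLs in a Bash 4+ associative array.
declare -A downloaded

while read -r URL; do
    if [ -n "${downloaded[$URL]}" ]; then
        echo "Cached: $URL (saved as ${downloaded[$URL]})"
        continue
    fi
    NAME=$(/bin/echo "$URL" | /usr/bin/md5sum | /bin/cut -f1 -d" ")
    wget -O "$NAME" "$URL"
    downloaded[$URL]="$NAME"
done < urls.txt

Note that because the array does not persist between runs, hashing the URL (as in the md5sum answer above) is what makes the cache survive across invocations.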

Kaleb Pederson