We required a script that simulates Associative arrays or Map like data structure for Shell Scripting, any body?
Now answering this question.
Following scripts simulates associative arrays in shell scripts. Its simple and very easy to understand.
Map is nothing but a never ending string that has keyValuePair saved as --name=Irfan --designation=SSE --company=My:SP:Own:SP:Company
spaces are replaced with ':SP:' for values
put() {
if [ "$#" != 3 ]; then exit 1; fi
mapName=$1; key=$2; value=`echo $3 | sed -e "s/ /:SP:/g"`
eval map="\"\$$mapName\""
map="`echo "$map" | sed -e "s/--$key=[^ ]*//g"` --$key=$value"
eval $mapName="\"$map\""
}
get() {
mapName=$1; key=$2; valueFound="false"
eval map=\$$mapName
for keyValuePair in ${map};
do
case "$keyValuePair" in
--$key=*) value=`echo "$keyValuePair" | sed -e 's/^[^=]*=//'`
valueFound="true"
esac
if [ "$valueFound" == "true" ]; then break; fi
done
value=`echo $value | sed -e "s/:SP:/ /g"`
}
put "newMap" "name" "Irfan Zulfiqar"
put "newMap" "designation" "SSE"
put "newMap" "company" "My Own Company"
get "newMap" "company"
echo $value
get "newMap" "name"
echo $value
edit: Just added another method to fetch all keys.
getKeySet() {
if [ "$#" != 1 ];
then
exit 1;
fi
mapName=$1;
eval map="\"\$$mapName\""
keySet=`
echo $map |
sed -e "s/=[^ ]*//g" -e "s/\([ ]*\)--/\1/g"
`
}
To add to Irfan's answer, here is a shorter and faster version of get() since it requires no iteration over the map contents:
get() {
mapName=$1; key=$2
map=${!mapName}
value="$(echo $map |sed -e "s/.*--${key}=\([^ ]*\).*/\1/" -e 's/:SP:/ /g' )"
}
I think that you need to step back and think about what a map, or associative array, really is. All it is is a way to store a value for a given key, and get that value back quickly and efficiently. You may also want to be able to iterate over the keys to retrieve every key value pair, or delete keys and their associated values.
Now, think about a data structure you use all the time in shell scripting, and even just in the shell without writing a script, that has these properties. Stumped? It's the filesystem.
Really, all you need to have an associative array in shell programming is a temp directory. mktemp -d
is your associative array constructor:
prefix=$(basename $0)
map=$(mktemp -dt ${prefix})
echo >${map}/key somevalue
value=$(cat ${map}/key)
If you don't feel like using echo
and cat
, you can always write some little wrappers; these ones are modelled off of Irfan's, though they just output the value rather than setting arbitrary variables like $value
:
#!/bin/sh
prefix=$(basename $0)
mapdir=$(mktemp -dt ${prefix})
trap 'rm -r ${mapdir}' EXIT
put() {
[ "$#" != 3 ] && exit 1
mapname=$1; key=$2; value=$3
[ -d "${mapdir}/${mapname}" ] || mkdir "${mapdir}/${mapname}"
echo $value >"${mapdir}/${mapname}/${key}"
}
get() {
[ "$#" != 2 ] && exit 1
mapname=$1; key=$2
cat "${mapdir}/${mapname}/${key}"
}
put "newMap" "name" "Irfan Zulfiqar"
put "newMap" "designation" "SSE"
put "newMap" "company" "My Own Company"
value=$(get "newMap" "company")
echo $value
value=$(get "newMap" "name")
echo $value
edit: This approach is actually quite a bit faster than the linear search using sed suggested by the questioner, as well as more robust (it allows keys and values to contain -, =, space, qnd ":SP:"). The fact that it uses the filesystem does not make it slow; these files are actually never guaranteed to be written to the disk unless you call sync
; for temporary files like this with a short lifetime, it's not unlikely that many of them will never be written to disk.
I did a few benchmarks of Irfan's code, Jerry's modification of Irfan's code, and my code, using the following driver program:
#!/bin/sh
mapimpl=$1
numkeys=$2
numvals=$3
. ./${mapimpl}.sh #/ <- fix broken stack overflow syntax highlighting
for (( i = 0 ; $i < $numkeys ; i += 1 ))
do
for (( j = 0 ; $j < $numvals ; j += 1 ))
do
put "newMap" "key$i" "value$j"
get "newMap" "key$i"
done
done
The results:
$ time ./driver.sh irfan 10 5 real 0m0.975s user 0m0.280s sys 0m0.691s $ time ./driver.sh brian 10 5 real 0m0.226s user 0m0.057s sys 0m0.123s $ time ./driver.sh jerry 10 5 real 0m0.706s user 0m0.228s sys 0m0.530s $ time ./driver.sh irfan 100 5 real 0m10.633s user 0m4.366s sys 0m7.127s $ time ./driver.sh brian 100 5 real 0m1.682s user 0m0.546s sys 0m1.082s $ time ./driver.sh jerry 100 5 real 0m9.315s user 0m4.565s sys 0m5.446s $ time ./driver.sh irfan 10 500 real 1m46.197s user 0m44.869s sys 1m12.282s $ time ./driver.sh brian 10 500 real 0m16.003s user 0m5.135s sys 0m10.396s $ time ./driver.sh jerry 10 500 real 1m24.414s user 0m39.696s sys 0m54.834s $ time ./driver.sh irfan 1000 5 real 4m25.145s user 3m17.286s sys 1m21.490s $ time ./driver.sh brian 1000 5 real 0m19.442s user 0m5.287s sys 0m10.751s $ time ./driver.sh jerry 1000 5 real 5m29.136s user 4m48.926s sys 0m59.336s
Another option, if you don't care about portability, is to use associative arrays that are built in to the shell. This should work in bash 4.0 (just released about a month ago, so almost no distros ship it yet), ksh, and zsh:
newmap[name]="Irfan Zulfiqar"
newmap[designation]=SSE
newmap[company]="My Own Company"
echo ${newmap[company]}
echo ${newmap[name]}
Depending on the shell, you may need to do a typeset -A newmap
or declare -A newmap
, though my testing in bash 4 seems to indicate that is optional.
You weren't specific about which shell language you can use, so maybe you could consider AWK, which has associative arrays built in.
hput () {
eval hash"$1"='$2'
}
hget () {
eval echo '${hash'"$1"'#hash}'
}
hput France Paris
hput Netherlands Amsterdam
hput Spain Madrid
echo `hget France` and `hget Netherlands` and `hget Spain`
$ sh hash.sh
Paris and Amsterdam and Madrid
I've found it true, as already mentioned, that the best performing method is to write out key/vals to a file, and then use grep/awk to retrieve them. It sounds like all sorts of unnecessary IO, but disk cache kicks in and makes it extremely efficient -- much faster than trying to store them in memory using one of the above methods (as the benchmarks show).
Here's a quick, clean method I like:
hinit() {
rm -f /tmp/hashmap.$1
}
hput() {
echo "$2 $3" >> /tmp/hashmap.$1
}
hget() {
grep "^$2 " /tmp/hashmap.$1 | awk '{ print $2 };'
}
hinit capitols
hput capitols France Paris
hput capitols Netherlands Amsterdam
hput capitols Spain Madrid
echo `hget capitols France` and `hget capitols Netherlands` and `hget capitols Spain`
If you wanted to enforce single-value per key, you could also do a little grep/sed action in hput().
Bash4 supports this natively. Do not use grep
or eval
, they are the ugliest of hacks.
For a verbose, detailed answer with example code see: http://stackoverflow.com/questions/3467959