Hello,
I am a newbie in Bash and I am doing some string manipulation.

I have the following file among other files in my directory:

jdk-6u20-solaris-i586.sh

I am doing the following to get jdk-6u20 in my script:

myvar=`ls -la | awk '{print $9}' | egrep "i586" | cut -c1-8`
echo $myvar

but now I want to convert jdk-6u20 to jdk1.6.0_20. I can't seem to figure out how to do it.

It must be as generic as possible. For example, if I had jdk-6u25, I should be able to convert it in the same way to jdk1.6.0_25, and so on.

Any suggestions?

A: 

I think that sed is the command for you.

garph0
A: 
ls -la | awk '{ if (match($9, "i586")) { gsub("jdk-6u20", "jdk1.6.0_20"); print $9 } }'

The if(match()) supersedes the egrep bit if you want to use it. You could use substr($9,1,8) instead of cut as well.

drawnonward
A: 

garph0 has a good idea with sed; you could do

myvar=`ls jdk*i586.sh | sed 's/jdk-\([0-9]\)u\([0-9]\+\).\+$/jdk1.\1.0_\2/'`
David Zaslavsky
Yeah, I know what sed does, but not how to use it :) Anyway, in my test this sample just assigns the original file name to myvar.
garph0
Odd, it works for me... maybe this is some difference between Solaris and Linux, now that I think about it. In that case try msw's answer (I would have edited mine to match if he hadn't posted)
David Zaslavsky
Hmm, I get the following when I apply your suggestion: jdk-6u20-solaris-i586.sh
kuti
Yeah, you mentioned that. I was saying that sed might work differently on Linux (which I use) vs. Solaris (which I assume you use), and that would account for the fact that this sed command works for me but not for you. msw has given you an alternate sed command that should work the same way on all platforms (Solaris and Linux).
David Zaslavsky
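A note on the portability point in the comments above: `\+` in a basic regular expression is a GNU sed extension, so a non-GNU sed (such as the one shipped with Solaris) may treat it differently, which would fit the behaviour kuti describes. A minimal sketch of the same substitution written with `[0-9][0-9]*` instead, wrapped in a helper whose name (jdk_version_name) is made up for illustration and is not from the thread:

```shell
# Hypothetical helper: jdk_version_name is an illustrative name, not from the thread.
jdk_version_name() {
    # [0-9][0-9]* is the portable BRE spelling of "one or more digits";
    # the trailing .* swallows the rest of the file name.
    printf '%s\n' "$1" |
        sed 's/^jdk-\([0-9][0-9]*\)u\([0-9][0-9]*\).*$/jdk1.\1.0_\2/'
}

jdk_version_name "jdk-6u20-solaris-i586.sh"   # prints jdk1.6.0_20
jdk_version_name "jdk-6u25-solaris-i586.sh"   # prints jdk1.6.0_25
```

A name that does not start with "jdk-<digits>u<digits>" passes through unchanged, so the caller may want to compare input and output as Roland Illig's loop does.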
A: 

You can try this snippet:

for fname in *; do
    newname=`echo "$fname" | sed 's,^jdk-\([0-9]\)u\([0-9][0-9]*\)-.*$,jdk1.\1.0_\2,'`
    if [ "$fname" != "$newname" ]; then
        echo "old $fname, new $newname"
    fi
done
Roland Illig
+2  A: 

Depending on exactly how generic you want it, and how standard your inputs will be, you can probably use AWK to do everything. By using FS="regexp" to specify field separators, you can break down the original string by whatever tokens make the most sense, and put them back together in whatever order using printf.

For example, assuming both dashes and the letter 'u' are only used to separate fields:

myvar="jdk-6u20-solaris-i586.sh"
echo $myvar | awk 'BEGIN {FS="[-u]"}; {printf "%s1.%s.0_%s",$1,$2,$3}'

Flavour according to taste.

goldPseudo
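For anyone who wants to try the field-separator idea on several names, here is a small sketch that wraps it in a function. The function name (jdk_name_awk) is invented for this example, and it assumes, as the answer does, that '-' and 'u' appear before the version only as separators:

```shell
# Hypothetical wrapper: jdk_name_awk is an illustrative name, not from the thread.
jdk_name_awk() {
    # -F'[-u]' splits on every '-' or 'u', so "jdk-6u20-solaris-i586.sh"
    # becomes the fields: jdk, 6, 20, solaris, i586.sh
    printf '%s\n' "$1" |
        awk -F'[-u]' '{ printf "%s1.%s.0_%s\n", $1, $2, $3 }'
}

jdk_name_awk "jdk-6u20-solaris-i586.sh"   # prints jdk1.6.0_20
jdk_name_awk "jdk-6u25-solaris-i586.sh"   # prints jdk1.6.0_25
```

Only the first three fields are used, so a 'u' later in the name (as in "linux") does no harm, but a 'u' or '-' before the version number would shift the fields.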
A: 

Needing awk in there is an artifact of the -l switch on ls. For pattern substitution on lines of text, sed is the long-time champion:

ls | sed -n '/^jdk/s/jdk-\([0-9][0-9]*\)u\([0-9][0-9]*\).*/jdk1.\1.0_\2/p'

This was written in "old-school" sed, which should have greater portability across platforms. The expression says:

  • with -n, don't print lines unless they match
  • on lines beginning with 'jdk' do:
  • on a line that contains "jdk-IntegerAuIntegerB", followed by anything
    • change it to "jdk1.IntegerA.0_IntegerB"
    • and print it

Your sample becomes even simpler as:

myvar=`echo *solaris-i586.sh | sed 's/-solaris-i586\.sh//'`
msw
Might want to drop the `p` flag on that substitution, otherwise it prints twice (at least for me).
David Zaslavsky
aye, oops, edited, thanks.
msw
And actually, come to think of it, the second substitution (the one I meant in the last comment) doesn't implement the transformation the OP wants. In the first one, you need to replace `$` with `.*` to get it to work (since it should drop the `-solaris-i586.sh` from the end of the filename).
David Zaslavsky
kuti
I am trying to understand how sed works here. I can see that ^jdk represents the lines beginning with jdk, and s is for searching. How about the [0-9][0-9] pieces? And why the .*/jdk1.\1.0_\2/ part? I am sorry, I am just new to both sed and bash. I am assuming these are regular expressions?
kuti
Yes, these are sed-flavored regexps. The first part tells sed to only consider lines beginning with "jdk"; that's a sed 'address'. The "[0-9]" matches a single digit and "[0-9]*" matches zero or more digits. The escaped parentheses group a subexpression for later use. Modern regexps might use "\d+", which is 'one or more digits', but I didn't know your platform. The "\1" inserts what the first pair of parens matched, and "\2" does the same for the second pair. Enjoy. http://en.wikipedia.org/wiki/Regular_expression And don't ever try to use regexps to parse HTML/XML, as it only hurts you and annoys the text.
msw
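To see the address and the two groups from the explanation above in action, here is a rough sketch that prints what each group captured next to the final result. The function name (explain_groups), the pattern variable and the README.txt input line are made up for the demonstration:

```shell
# Hypothetical demo: explain_groups and the sample input lines are illustrative only.
pattern='jdk-\([0-9][0-9]*\)u\([0-9][0-9]*\).*'

explain_groups() {
    # /^jdk/ is the address: only lines starting with "jdk" are considered.
    # \1 and \2 are replaced by whatever the first and second \( \) matched.
    # -n plus the p flag means only substituted lines are printed.
    sed -n "/^jdk/s/$pattern/group1=\1 group2=\2 result=jdk1.\1.0_\2/p"
}

printf '%s\n' "jdk-6u20-solaris-i586.sh" "README.txt" | explain_groups
# prints: group1=6 group2=20 result=jdk1.6.0_20
```

The README.txt line produces no output because it never passes the /^jdk/ address.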
+1  A: 

Using only Bash:

for file in jdk*i586*
do
    file="${file%*-solaris*}"
    file="${file/-/1.}"
    file="${file/u/.0_}"
    do_something_with "$file"
done
Dennis Williamson
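A variation on the shell-only approach, in case the platform part of the name is not always "solaris": this sketch pulls out the two numbers with prefix and suffix stripping only, which should also work in a plain POSIX sh. The function name (jdk_name_shell) is invented here, and it assumes the name always has the form jdk-<major>u<update>-<anything>:

```shell
# Hypothetical helper: jdk_name_shell is an illustrative name, not from the thread.
# The ( ... ) body runs in a subshell, so the variables stay local to the call.
jdk_name_shell() (
    rest=${1#jdk-}       # strip leading "jdk-"      -> 6u20-solaris-i586.sh
    ver=${rest%%-*}      # keep text before first "-" -> 6u20
    major=${ver%%u*}     # keep text before "u"       -> 6
    update=${ver#*u}     # keep text after "u"        -> 20
    printf 'jdk1.%s.0_%s\n' "$major" "$update"
)

jdk_name_shell "jdk-6u20-solaris-i586.sh"   # prints jdk1.6.0_20
jdk_name_shell "jdk-6u25-solaris-i586.sh"   # prints jdk1.6.0_25
```

Unlike the sed versions, this does not check that the input really matches the expected shape, so it is best fed from a glob such as jdk-*u*-*.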