Looking for a solution in bash (will be part of a script).

Given a filename in the form "someletters_12345_moreleters.ext", I want to extract the 5 digits and put them into a variable.

To emphasize the point: I have a filename with some number of characters, then a five-digit sequence with a single underscore on either side, then another set of characters. I want to take the five-digit number and put it into a variable.

I am very interested in the number of different ways that this can be accomplished. As with most things, I am sure there are a number of different ways to tackle this problem.

Thanks for your help in advance.



+11  A: 

Use cut:

echo someletters_12345_moreleters.ext | cut -d'_' -f 2

More generic:

INPUT=someletters_12345_moreleters.ext
SUBSTRING=$(echo "$INPUT" | cut -d'_' -f 2)
echo "$SUBSTRING"
FerranB
The more generic answer is exactly what I was looking for, thanks.
Berek Bryan
+3  A: 

Generic solution where the number can be anywhere in the filename, taking the first five-digit sequence found:

number=$(echo "$filename" | grep -oE '[[:digit:]]{5}' | head -n1)

Another solution, to extract an exact part of a variable by position:

number=${filename:offset:length}
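
When the prefix length varies, the offset does not have to be hard-coded. The following is only a sketch, assuming the digits start right after the first underscore and are exactly five characters long; the variable names prefix, offset and length are illustrative:

```shell
#!/usr/bin/env bash
# Hypothetical input; assumes the digits follow the first underscore.
filename="someletters_12345_moreleters.ext"

# Everything before the first underscore.
prefix=${filename%%_*}

# Zero-based index of the first digit: prefix length plus one for the underscore.
offset=$(( ${#prefix} + 1 ))
length=5

number=${filename:offset:length}
echo "$number"
```

Because the offset is derived from the string itself, the same lines should work for prefixes of any length, as long as the prefix contains no underscore.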

If your filename always has the format stuff_digits_... you can use awk:

number=$(echo "$filename" | awk -F _ '{ print $2 }')

Yet another solution, which removes everything except digits:

number=$(echo "$filename" | tr -cd '[:digit:]')
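
One caveat with deleting all non-digits: digits elsewhere in the name survive too. A small illustration, using a made-up filename that has digits outside the five-digit field:

```shell
#!/usr/bin/env bash
# Hypothetical filename with extra digits in "take2" and "mp3".
filename="track_12345_take2.mp3"

# Delete every character that is not a digit.
digits_only=$(printf '%s' "$filename" | tr -cd '[:digit:]')
echo "$digits_only"   # prints 1234523, not 12345
```

So this approach is probably best kept for names where the five-digit field is the only place digits can occur.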
Johannes Schaub - litb
Thanks for the variety of solutions, great stuff.
Berek Bryan
+1  A: 

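There's also the 'expr' command (an external utility rather than a bash builtin; the 'match' keyword is a GNU extension, so this form may not work with other implementations):

INPUT="someletters_12345_moreleters.ext"
SUBSTRING=$(expr match "$INPUT" '.*_\([[:digit:]]*\)_.*')
echo "$SUBSTRING"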
jor
A: 

Building on jor's answer (which doesn't work for me):

substring=$(expr "$filename" : '.*_\([^_]*\)_.*')
PEZ
+6  A: 

If x is constant, the following parameter expansion performs substring extraction:

b=${a:12:5}

If the underscores around the digits are the only ones in the input, you can strip prefix and suffix off in two steps:

tmp=${a#*_}
b=${tmp%_*}

If there are other underscores, it's probably feasible anyway, albeit more tricky. If anyone knows how to perform both expansions in a single expression, I'd like to know too.

Both solutions presented are pure bash, with no process spawning involved, hence very fast.
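
For input with additional underscores, one possible pure-bash approach is to split on the underscore with read instead of stripping a prefix and suffix. This is a sketch, assuming the digits are always the second underscore-separated field; the names prefix and rest are placeholders:

```shell
#!/usr/bin/env bash
# Hypothetical input with extra underscores after the digit field.
a="someletters_12345_more_leters_v2.ext"

# IFS=_ applies only to this read: first field goes to prefix,
# second to b, and everything left over (underscores included) to rest.
IFS=_ read -r prefix b rest <<< "$a"
echo "$b"
```

This also does the extraction in a single statement, though it is a builtin command rather than a single parameter expansion.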

JB
+1 for the native Bash solution
seanhodges
+2  A: 

Without any subprocesses, you can use extended globs:

shopt -s extglob
front=${input%%_+([a-zA-Z]).*}
digits=${front##+([a-zA-Z])_}

A very small variant of this will also work in ksh93.

Darron
A: 

Here's how I'd do it:

FN=someletters_12345_moreleters.ext
[[ $FN =~ _([[:digit:]]{5})_ ]] && NUM=${BASH_REMATCH[1]}

Note: the above is a regular expression and is restricted to your specific scenario of five digits surrounded by underscores. Change the regular expression if you need different matching.
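
As an example of changing the expression, here is a variant that accepts one or more digits instead of exactly five, and reports when nothing matches. The pattern and the variable name re are assumptions for illustration:

```shell
#!/usr/bin/env bash
FN="someletters_12345_moreleters.ext"

# Assumed variation: one or more digits between underscores.
# Keeping the pattern in a variable avoids quoting surprises inside [[ ]].
re='_([[:digit:]]+)_'

if [[ $FN =~ $re ]]; then
    NUM=${BASH_REMATCH[1]}
    echo "$NUM"
else
    echo "no digit field found in $FN" >&2
fi
```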

nicerobot
A: 

Just use cut -c startIndx-stopIndx, where both indexes are 1-based character positions.
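
A minimal sketch for the example filename, assuming the prefix is always 11 characters long so the digits sit at fixed positions:

```shell
filename="someletters_12345_moreleters.ext"

# cut -c counts characters from 1, so the digits occupy positions 13 through 17.
number=$(printf '%s\n' "$filename" | cut -c 13-17)
echo "$number"
```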

brown.2179