Hi all,

I would like some help with a regular expression.

I have some files like this:

  • JWE-766.1.pdf
  • JWE-766.2.pdf
  • JWE-768.1.pdf
  • JWE-770.1.pdf

I would like a regex pattern to extract the number after 'JWE-', i.e. 766.

I would also like a regex to extract the 1 and the 2 from JWE-766.1.pdf and JWE-766.2.pdf respectively.

Any help would be hugely appreciated.

Thanks guys!

+1  A: 

Unless there is more variety to the pattern than this, I would just use substring manipulation in this case.

i.e.

String s = "JWE-766.1.pdf";
// everything between the '-' and the first '.'
String firstNumber = s.substring(s.indexOf("-") + 1, s.indexOf("."));
// everything between the first '.' and the last '.'
String secondNumber = s.substring(s.indexOf(".") + 1, s.lastIndexOf("."));
BioBuckyBall
Hello there. I would have taken this approach, but it is a problem because I also have files like: JWE-11.1.pdf
Jamie
@Jamie updated my answer to reflect your detail. To me, it still feels like substring is the better choice in this case, but to each his own :)
BioBuckyBall
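For anyone following along, here is a minimal sketch of the index-based approach applied to both name lengths mentioned in this thread. The class name `SubstringDemo` and the helper `numberParts` are made up for illustration, and it assumes every name has the shape prefix, '-', digits, '.', digits, '.', extension.

```java
public class SubstringDemo {
    // Hypothetical helper: returns {first, second} for names like "JWE-766.1.pdf".
    static String[] numberParts(String fileName) {
        int dash = fileName.indexOf('-');
        int firstDot = fileName.indexOf('.', dash + 1); // first '.' after the dash
        int lastDot = fileName.lastIndexOf('.');        // the '.' before the extension
        if (dash < 0 || firstDot < 0 || lastDot <= firstDot) {
            throw new IllegalArgumentException("Unexpected file name: " + fileName);
        }
        return new String[] {
            fileName.substring(dash + 1, firstDot),
            fileName.substring(firstDot + 1, lastDot)
        };
    }

    public static void main(String[] args) {
        for (String name : new String[] {"JWE-766.1.pdf", "JWE-11.1.pdf"}) {
            String[] parts = numberParts(name);
            System.out.println(name + " -> " + parts[0] + ", " + parts[1]);
        }
    }
}
```

Because the positions are looked up rather than hard-coded, the length of the first number should not matter here.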
+2  A: 
Pattern p = Pattern.compile("^JWE-([0-9]+)\\.([0-9]+)\\.pdf$");
Matcher m = p.matcher("your string here");

if (m.find()) {
    System.out.println(m.group(1)); //first number group
    System.out.println(m.group(2)); //second number group
}

Taken from here

Also, make sure to reuse the Pattern p object if you're looping through a series of strings, so the regex is compiled only once.
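A rough sketch of what reusing the compiled Pattern in a loop could look like. The class name `ReusePatternDemo`, the method `extractAll`, and the "first:second" result format are assumptions for illustration only; the pattern itself is the one from this answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReusePatternDemo {
    // Compiled once and shared by every call, instead of once per file name.
    private static final Pattern FILE_NAME =
            Pattern.compile("^JWE-([0-9]+)\\.([0-9]+)\\.pdf$");

    // Hypothetical helper: returns "first:second" for each name that matches.
    static List<String> extractAll(List<String> fileNames) {
        List<String> results = new ArrayList<>();
        for (String name : fileNames) {
            Matcher m = FILE_NAME.matcher(name); // a new Matcher per string is cheap
            if (m.find()) {
                results.add(m.group(1) + ":" + m.group(2));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "JWE-766.1.pdf", "JWE-766.2.pdf", "JWE-768.1.pdf", "JWE-770.1.pdf");
        System.out.println(extractAll(names));
    }
}
```

Names that do not match the pattern are simply skipped in this sketch; you may prefer to log or reject them.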

sigint
Hey. Thanks very much. Alas, that pattern gives an "Invalid Escape Sequence" error. Hmm... I shall find out why :-)
Jamie
Yay! it works! Thanks! :D You guys are awesome.
Jamie
It's possible that Java is complaining about the `\\d` part. I changed it to use `[0-9]` instead.
sigint
`\\d` is correct. He was probably using your regex in its original form, before you doubled all the backslashes.
Alan Moore
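To illustrate the escaping point from the comments above, here is a small sketch. The class and method names are invented for the example. The regex `\d` has to be written as `"\\d"` inside a Java string literal, and a literal dot as `"\\."`; a single backslash before `d` or `.` in the source is what makes the compiler report an invalid escape sequence.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapeDemo {
    // Regex \d written as "\\d" in Java source.
    static final Pattern WITH_BACKSLASH_D =
            Pattern.compile("^JWE-(\\d+)\\.(\\d+)\\.pdf$");
    // Same idea with an explicit character class, so only the dots need escaping.
    static final Pattern WITH_CHAR_CLASS =
            Pattern.compile("^JWE-([0-9]+)\\.([0-9]+)\\.pdf$");

    // Hypothetical helper: returns group 1, or null when the name does not match.
    static String firstNumber(Pattern p, String fileName) {
        Matcher m = p.matcher(fileName);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String name = "JWE-766.1.pdf";
        System.out.println(firstNumber(WITH_BACKSLASH_D, name));
        System.out.println(firstNumber(WITH_CHAR_CLASS, name));
    }
}
```

For plain ASCII file names like the ones in the question, both spellings should extract the same groups.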
+1  A: 

JWE-(\d+)\.(\d+)\.pdf

should do the trick.

Of course, when you are creating the string in Java, each backslash has to be doubled:

Pattern p = Pattern.compile("JWE-(\\d+)\\.(\\d+)\\.pdf");
Matcher m = p.matcher(s); // s contains your filename
if (m.matches()) {
   String fullName = m.group(0);
   int firstIndex = Integer.parseInt(m.group(1)); // 766
   int secondIndex = Integer.parseInt(m.group(2)); // 1
}

Have fun

Elf King
Thanks :-) I marked sigint's answer as the right one so he could get some reputation. I have upvoted yours though. Thanks again! :D
Jamie
:-) I voted for his answer too, because it appeared while I was finishing mine.
Elf King
+1  A: 

You can use parentheses for capturing groups, and then use Matcher.group(int) to retrieve them after matching.

Try the pattern "^JWE-(\\d+)\\.(\\d+)\\.pdf$" (written as a Java string literal, so the backslashes are doubled) and I think group 1 should be the 766, and group 2 should be the 1.

However, as stated above, if the file names are consistent in length, straight manipulation by index will be faster.

...one minute too slow. The Elf King is quick like the wind.

DVA