Hi there.

I need a regex that matches the following (the brackets mark the sections I want to match; they are not present in the actual string!):

More Words:
    6818 [some words]       641 [even more words]

I tried it with the following:

(?<=[0-9]+\s)[a-z\s]+(?!\s{2,})

To put it in words: "Match all words, including the whitespace between them, that come one space after one or more digits and before two or more whitespaces." But it also selects all the trailing whitespace, and it sometimes strips the last letter off a word (wtf?)

+1  A: 

try

(?<=[0-9]+\s)([a-z]+\s)*[a-z]+(?!\s{2,})

@Bart: I removed the brackets.

Explanation: This will select all words that are followed by a single whitespace (if any exist), plus the last word, which is mandatory and is not followed by a whitespace.

Manu
Many regex implementations do not support look-behinds without an obvious fixed length, let alone variable-length ones, so chances are that the `[0-9]+` inside `(?<=[0-9]+\s)` is illegal.
Bart Kiers
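To illustrate the comment above, here is a minimal sketch using Python's built-in `re` module, which is one of the implementations that only accepts fixed-width look-behinds. The thread does not say which regex engine the asker uses, so Python is an assumption made purely for demonstration.

```python
import re

# Variable-length look-behind: Python's built-in re module rejects it
# at compile time with re.error.
try:
    re.compile(r"(?<=[0-9]+\s)[a-z]+")
    variable_ok = True
except re.error:
    variable_ok = False

# Fixed-width look-behind (exactly one digit plus one whitespace) compiles.
fixed = re.compile(r"(?<=[0-9]\s)[a-z]+")

print(variable_ok)                          # False
print(fixed.search("6818 some").group(0))   # some
```

Checking a single digit is enough here, because the last digit of any number is still a digit followed by the whitespace.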
Works quite well, but now I get the strange behaviour that sometimes the last letter of a word isn't selected, e.g.: [some word]s and [even more word]s
ApoY2k
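The missing last letter is most likely a backtracking effect of the negative look-ahead `(?!\s{2,})`: right after "words" the look-ahead fails because two or more whitespaces follow, so the engine gives back one letter, and after "word" the next character is "s", which satisfies the look-ahead. A small sketch in Python (an assumed engine; the look-behind is shortened to the fixed-width `(?<=[0-9]\s)` because Python's `re` rejects the variable-length form):

```python
import re

# Sample line from the question, without the explanatory brackets.
text = "    6818 some words       641 even more words"

# Fixed-width look-behind stands in for (?<=[0-9]+\s).
pattern = re.compile(r"(?<=[0-9]\s)([a-z]+\s)*[a-z]+(?!\s{2,})")

# group(0) is the whole match; the first one loses its final letter.
matches = [m.group(0) for m in pattern.finditer(text)]
print(matches)  # ['some word', 'even more words']
```

In this sample the second match is complete only because nothing follows it; with trailing whitespace at the end of the line it would lose its last letter as well.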
+1  A: 

this works for me

[0-9]+\s([a-z \s]+)\s\s
Matt Ellen
Works fine; I had that one at first too, but it also selects all the whitespace that follows the last word.
ApoY2k
@ApoY2k, Just grab what is matched in group 1.
Bart Kiers
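As a sketch of the "grab group 1" idea, the look-behind can be dropped entirely: match the digits as part of the pattern and read only the captured group. The pattern below is a hypothetical variant, not the one posted in this answer; it uses `[a-z]+(?:\s[a-z]+)*` inside the group so that the captured text never includes trailing whitespace. Python is assumed as the engine.

```python
import re

# Sample line from the question, without the explanatory brackets.
text = "    6818 some words       641 even more words"

# Digits and one whitespace are matched but not captured;
# group 1 holds words separated by single whitespaces.
pattern = re.compile(r"[0-9]+\s([a-z]+(?:\s[a-z]+)*)")

# With exactly one capturing group, findall returns that group's text.
words = pattern.findall(text)
print(words)  # ['some words', 'even more words']
```

Since no look-behind is involved, this form should also work in engines that do not support look-behinds at all.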
A: 

(?<=\d\s)([a-zA-Z]+\s)*[a-zA-Z]+

This one did the trick! Don't ask how I came up with it, I was just fiddling around... Still, you were a great help :)

To clarify this regex, short explanation:

1: (?<=             open a look-behind (this is not a capturing group)
2:  \d\s            require a digit followed by a whitespace directly before the match
3: )                close the look-behind
4: (                open group 1
5:  [a-zA-Z]+\s     match a word (one or more letters) followed by a whitespace
6: )*               close group 1 and let it repeat zero or more times
7: [a-zA-Z]+        match the last word (one or more letters)

Long story short: the regex doesn't try to match words between runs of whitespace. It starts right after a digit plus whitespace and matches words separated by single whitespaces, ending on the last letter :)

ApoY2k
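For reference, a quick check of the final pattern in Python (an assumed engine; the look-behind `(?<=\d\s)` has a fixed width of two characters, so Python's `re` accepts it). Note that `findall` would return only the contents of group 1 here, so the sketch reads the whole match via `group(0)` instead.

```python
import re

# Sample line from the question, without the explanatory brackets.
text = "    6818 some words       641 even more words"

pattern = re.compile(r"(?<=\d\s)([a-zA-Z]+\s)*[a-zA-Z]+")

# group(0) is the full match: all words up to the last letter.
matches = [m.group(0) for m in pattern.finditer(text)]
print(matches)  # ['some words', 'even more words']
```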