ansaurus

Question

My regex is matching too much. How do I make it stop?

Answer 1

+10 A:

Make .* non-greedy by adding '?' after it:

Project name:\s+(.*?)\s+J[0-9]{7}:
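To see the difference, here is a minimal Python sketch. The sample text is made up for illustration (the original input isn't shown in this thread), and it assumes the greedy version of the pattern used (.*) in place of (.*?):

```python
import re

# Hypothetical input: two job IDs on one line, so a greedy match can overshoot.
text = "Project name: my project J0000001: started J0000002: finished"

# Greedy: .* grabs as much as it can, then backs off to the LAST job ID.
greedy = re.search(r"Project name:\s+(.*)\s+J[0-9]{7}:", text)

# Non-greedy: .*? grows one character at a time and stops at the FIRST job ID.
lazy = re.search(r"Project name:\s+(.*?)\s+J[0-9]{7}:", text)

print(repr(greedy.group(1)))  # 'my project J0000001: started'
print(repr(lazy.group(1)))    # 'my project'
```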

jj33 2008-08-22 14:12:01

Answer 2

A:

I knew it was something easy. Thanks jj33

Mark Biek 2008-08-22 14:13:44

Answer 3

+2 A:

Using non-greedy quantifiers here is probably the best solution, also because it is more efficient than the greedy alternative: Greedy matches generally go as far as they can (here, until the end of the text!) and then trace back character after character to try and match the part coming afterwards.

However, consider using a negative character class instead:

Project name:\s+(\S*)\s+J[0-9]{7}:

\S means “everything except whitespace”, and this is exactly what you want.
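A quick Python sketch of how that pattern behaves. The sample strings are invented for illustration. Note that \S* only works when the project name itself contains no whitespace:

```python
import re

pattern = re.compile(r"Project name:\s+(\S*)\s+J[0-9]{7}:")

# Hypothetical inputs: one name without spaces, one with a space in it.
single = pattern.search("Project name: my_project J0000001: started J0000002: finished")
spaced = pattern.search("Project name: my project J0000001: started")

single_name = single.group(1)        # \S* stops at the first whitespace
spaced_matched = spaced is not None  # a name containing a space cannot match

print(single_name)     # my_project
print(spaced_matched)  # False
```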

Konrad Rudolph 2008-08-22 14:15:57

Answer 4

A:

I would also recommend you experiment with regular expressions using "Expresso" - it's a great (and free) utility for regex editing and testing.

One of its upsides is that its UI exposes a lot of regex functionality that people inexperienced with regex might not be familiar with, in a way that makes it easy for them to learn these new concepts.

For example, when building your regex using the UI, and choosing "*", you have the ability to check the checkbox "As few as possible" and see the resulting regex, as well as test its behavior, even if you were unfamiliar with non-greedy expressions before.

Available for download at their site: http://www.ultrapico.com/Expresso.htm

Express download: http://www.ultrapico.com/ExpressoDownload.htm

Hershi 2008-08-22 14:17:21

Answer 5

A:

Well, ".*" is a greedy selector. You make it non-greedy by using ".*?". With the latter construct, at every step where it matches text into the ".", the regex engine first attempts to match whatever may come after the ".*?". This means that if, for instance, nothing comes after the ".*?", then it matches nothing.

Here's what I used. s contains your original string. This code is .NET specific, but most flavours of regex will have something similar.

string m = Regex.Match(s, @"Project name: (?<name>.*?) J\d+").Groups["name"].Value;
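For comparison, a rough Python counterpart of the line above (a sketch, not a drop-in port). Python spells named groups (?P<name>...) rather than (?<name>...), and the sample string is hypothetical. The last lines also check the point that a ".*?" with nothing after it matches nothing:

```python
import re

# Hypothetical stand-in for the original string s.
s = "Project name: my project J0000001: started J0000002: finished"

# Same idea as the .NET call: a lazy named group followed by the job ID.
m = re.search(r"Project name: (?P<name>.*?) J\d+", s)
name = m.group("name") if m else ""

# With nothing after the lazy quantifier, it is satisfied by the empty string.
trailing = re.search(r"Project name: (.*?)", s).group(1)

print(repr(name))      # 'my project'
print(repr(trailing))  # ''
```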

Svend 2008-08-22 14:24:12

Answer 6

A:

@Hershi

I'm actually using RegexBuddy, which is definitely helpful in terms of seeing what's going on. Although Expresso does look nice.

@Konrad

Thanks for the tip on \S. That's something I didn't know about, although in my case there may be spaces in the stuff I want to capture in a group.

Mark Biek 2008-08-22 14:25:16
