ansaurus

Question

How to find a string between two other strings with regex?

Answer 1

+1 A:

You need to use a group, which is the part of the pattern enclosed in parentheses:

\[start\](.*?)\[end\]

These are numbered from 1 when you come to read them (zero being the whole matched string). (There is also the facility of named groups if you find that more intuitive.)

E.g. in C#:

Match match = new Regex(@"\[start\](.*?)\[end\]").Match("[start]blah[end]");
string value = match.Groups[1].Value;
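
The named-group facility mentioned above could look roughly like the sketch below. It is only an illustration and not part of the original answer: the group name `inner` and the class and method names are hypothetical, chosen for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupSketch
{
    // (?<inner>...) captures the text between the markers under the name "inner".
    // The verbatim string (@"...") passes the backslashes through to the regex engine.
    private static readonly Regex Pattern =
        new Regex(@"\[start\](?<inner>.*?)\[end\]");

    // Returns true and sets value to the text between [start] and [end];
    // returns false and sets value to the empty string when nothing matches.
    public static bool TryExtract(string input, out string value)
    {
        Match match = Pattern.Match(input);
        value = match.Success ? match.Groups["inner"].Value : string.Empty;
        return match.Success;
    }
}
```

Calling NamedGroupSketch.TryExtract("[start]blah[end]", out string value) returns true and sets value to blah. Reading the group by name avoids having to count parentheses when the pattern grows.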

Paul Ruane 2009-11-26 10:29:17

Sounds very useful, but in a test it still yields the start and end tags for me.

Peter 2009-11-26 10:33:17

The test was this, by the way: printfn "%s" (Regex.Match(@"[start]result[end]", @"(?:\[start\])(.*?)(?:\[end\])").Groups.[0].Value), which yields [start]result[end]

Peter 2009-11-26 10:48:02

That is because you are looking at group 0, which is the whole matched string. Instead use group 1, which is just the bit in the parentheses.

Paul Ruane 2009-11-26 10:52:47

In fact, I am being stupid: you could just use groups. I'll update my example.

Paul Ruane 2009-11-26 10:53:34

(For posterity: I initially suggested non-capturing groups (?:), but this is overkill for the specified problem.)

Paul Ruane 2009-11-26 10:55:59

Which is what I said in the beginning. Sigh.

PP 2009-11-26 10:56:47

Ah yes, so it is. Have a vote. Perhaps Peter will be kind enough to change the accepted answer to yours?

Paul Ruane 2009-11-26 11:16:40

Answer 2

+1 A:

Depends on the language. Usually you have to specify the matching group you want returned; often group zero is the whole matching expression, 1 is the first matching group, 2 is the second matching group, and so forth.
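
As a rough illustration of that numbering in .NET, here is a minimal sketch. The pattern, the sample input and the class and method names are hypothetical, made up for this example rather than taken from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupIndexSketch
{
    // Two capturing groups: group 1 is the key, group 2 is the value.
    private static readonly Regex Pattern = new Regex(@"(\w+)=(\w+)");

    // Returns the groups of the first match in the order the engine numbers them:
    // index 0 is the whole match, index 1 the first group, index 2 the second.
    // Returns an empty array when the input does not match.
    public static string[] GroupsOf(string input)
    {
        Match match = Pattern.Match(input);
        if (!match.Success) return new string[0];

        string[] result = new string[match.Groups.Count];
        for (int i = 0; i < match.Groups.Count; i++)
        {
            result[i] = match.Groups[i].Value;
        }
        return result;
    }
}
```

For the input key=value, GroupsOf returns three entries: key=value at index 0, key at index 1 and value at index 2.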

Update 1: please see http://www.regular-expressions.info/dotnet.html

Update 2: Author seemed to think he understood .NET syntax. So removing code example and letting answer stand on its own.

PP 2009-11-26 10:30:06

See the .NET tag for the library.

Peter 2009-11-26 10:30:58

I know the syntax, thanks. That is not the issue, however.

Peter 2009-11-26 10:48:40

If the syntax isn't the issue, then what is the issue? You pointed out you were using .NET which seems pretty syntax-specific.

PP 2009-11-26 10:53:54

My problem was indeed the index; I apologize. Calling me lazy, however, wasn't appropriate, and neither is pointing me to Google. Just as I should have read the syntax better, you should read this: http://meta.stackoverflow.com/questions/8724/how-to-deal-with-google-questions

Peter 2009-11-26 11:00:38

And since this answer is less insulting and more useful, +1.

Peter 2009-11-26 11:12:49

I have flagged this answer because I was called an idiot and lazy (see the edit remarks). This is not a direction I would like SO to take.

Peter 2009-11-26 11:30:33

That aside: I admit that your first answer, apart from calling me lazy, was correct and I didn't see that. Maybe because when you are offended your concentration isn't optimal. Sorry, you were right. But there's a difference between being right and being acknowledged for it; I have experienced that myself many times on SO. I learned to live with it; it's inherent in communities that are formed by PEOPLE, and (shockingly?) they make mistakes. But hopefully, they stay polite.

Peter 2009-11-26 12:21:06

Agreed; completely inappropriate and unacceptable. @PP - I can't e-mail you privately since you don't have one listed - but note: that type of behaviour **will not** be tolerated; please keep it civil, or we will be forced to suspend your account.

Marc Gravell 2009-11-26 12:26:27

Answer 3

A:

Use \[start\](.*?)\[end\]

C#

Regex regex = new Regex("\\[start\\](.*?)\\[end\\]");

VB

Dim regex As Regex = New Regex("\[start\](.*?)\[end\]")

Sandy 2009-11-26 10:38:52

Just removed the capturing groups around 'start' and 'end'.

Sandy 2009-11-26 10:39:58

Answer 4

+3 A:

Remember that Groups[0] holds the entire match, not just the captured part. If you just want the first captured group, it is Groups[1], so

string text = "[start]blahblah[end]";
Console.WriteLine(Regex.Match(text, @"\[start\](.*?)\[end\]").Groups[1].Value);

prints blahblah.

Brian Rasmussen 2009-11-26 10:50:29
