For example, this regex

(.*)<FooBar>

will match:

abcde<FooBar>

But how do I get it to match across multiple lines?

abcde
fghij<FooBar>
+1  A: 

When used from within a programming language, regular expressions act on strings, not lines. So you should be able to use the regex normally, assuming the input string contains multiple lines.

In this case, the given regex will find a match, since "<FooBar>" is present. Depending on the specifics of the regex implementation, the $1 value (captured by the "(.*)") will be either "fghij" or "abcde\nfghij". As others have said, some implementations let you control whether "." matches a newline, giving you the choice.
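As a rough illustration, here is a minimal sketch in Python (my choice of language for the example, not something the question specifies). Python's re module is just one implementation; others may behave differently by default.

```python
import re

text = "abcde\nfghij<FooBar>"

# Default: "." stops at newlines, so the group captures only the last line.
default_match = re.search(r"(.*)<FooBar>", text)

# With re.DOTALL, "." also matches "\n", so the group spans both lines.
dotall_match = re.search(r"(.*)<FooBar>", text, re.DOTALL)

print(repr(default_match.group(1)))  # 'fghij'
print(repr(dotall_match.group(1)))   # 'abcde\nfghij'
```

Both calls find a match; only the captured group differs.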

Line-based regular expression matching is mostly found in command-line tools such as egrep.

nsayer
+8  A: 

It depends on the language, but there should be a modifier that you can add to the regex pattern. In PHP it is:

/(.*)<FooBar>/s

The s at the end causes the dot to match all characters including newlines.
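Some engines also accept the same modifier inline, at the start of the pattern. A small sketch in Python, assuming its re module (the sample string is the one from the question):

```python
import re

text = "abcde\nfghij<FooBar>"

# "(?s)" at the start of the pattern turns on dot-matches-newline,
# much like the trailing /s in the PHP pattern.
inline_match = re.search(r"(?s)(.*)<FooBar>", text)

print(repr(inline_match.group(1)))  # 'abcde\nfghij'
```

The inline form is handy when you can only supply the pattern text and not a separate flags argument.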

yjerem
+2  A: 

Try this:

((.|\n)*)<FooBar>

It basically says "any character or a newline" repeated zero or more times.

levik
This depends on the language and/or tool you are using. Please let us know what you are using, e.g. Perl, PHP, CF, C#, sed, awk, etc.
Ben Doom
+2  A: 

"." normally doesn't match line breaks. Most regex engines allow you to add the s flag (also called DOTALL or SINGLELINE) to make "." match newlines as well. If that fails, you could use something like [\S\s].

MizardX
A: 

Generally, . doesn't match newlines, so try ((.|\n)*)<FooBar>

tloach
No, don't do that. If you need to match anything including line separators, use the DOTALL (a.k.a. /s or SingleLine) modifier. Not only does the (.|\n) hack make the regex less efficient, it's not even correct. At the very least, it should match \r (carriage return) as well as \n (linefeed). There are other line separator characters, too, albeit rarely used. But if you use the DOTALL flag, you don't have to worry about them.
Alan Moore
\R is the platform-independent match for newlines in Eclipse.
opyate
+1  A: 
/(.*)<FooBar>/s

The s modifier causes the dot (.) to match newline characters as well.

Bill
A: 

Note that (.|\n)* can be less efficient than, for example, [\s\S]* (if your language's regexes support such escapes), and less efficient than using the modifier that makes . also match newlines. Or you can go with POSIX-style alternatives like [[:space:][:^space:]]* (the [:^space:] negation is a Perl extension, so check that your engine supports it).
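A minimal sketch of the [\s\S]* variant, written in Python as an assumed example language. The sample string uses a Windows-style line break to show that no flag is needed.

```python
import re

text = "abcde\r\nfghij<FooBar>"

# [\s\S] means "whitespace or non-whitespace", i.e. any character at all,
# so the match can cross line breaks without any modifier.
class_match = re.search(r"([\s\S]*)<FooBar>", text)

print(repr(class_match.group(1)))  # 'abcde\r\nfghij'
```

This only shows that the pattern matches; it says nothing about speed, which you would have to measure in your own engine.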

tye
A: 

I had the same problem and solved it in what is probably not the best way, but it works. I replaced all line breaks before I did my real match:

mystring= Regex.Replace(mystring, "\r\n", "")

I am manipulating HTML so line breaks don't really matter to me in this case.

I tried all of the suggestions above with no luck. I am using .NET 3.5, FYI.
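For comparison, the same workaround sketched in Python (an assumed example language; the sample HTML string is made up). This version uses \r?\n so that bare \n line breaks are removed too, which goes slightly beyond the "\r\n" replacement above.

```python
import re

html = "<p>abcde\r\nfghij<FooBar>\nklmno</p>"

# Remove Windows (\r\n) and Unix (\n) line breaks before matching.
flattened = re.sub(r"\r?\n", "", html)

# With no line breaks left, the plain pattern captures everything before the tag.
flat_match = re.search(r"(.*)<FooBar>", flattened)

print(flat_match.group(1))  # <p>abcdefghij
```

Keep in mind that this joins the text on either side of each line break, so it only suits input where line breaks carry no meaning.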

Slee
A: 

Use RegexOptions.Singleline; it changes the meaning of . so that it matches newlines as well.

Regex.Replace(content, searchText, replaceText, RegexOptions.Singleline);
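A rough Python counterpart of that call, for readers not on .NET. The content, search_text and replace_text values are hypothetical samples, not taken from the answer.

```python
import re

content = "keep<start>line one\nline two<end>keep"
search_text = r"<start>.*<end>"  # hypothetical pattern
replace_text = "[removed]"       # hypothetical replacement

# Pass flags by keyword: the fourth positional argument of re.sub is count.
result = re.sub(search_text, replace_text, content, flags=re.DOTALL)

print(result)  # keep[removed]keep
```

Without flags=re.DOTALL the pattern would not match here, because the text between the two markers contains a newline.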

shmall