$regpattern4 = "!<media:description type='plain'> (.*) <\/media:description>!";
I am parsing an XML document. The regex above works when the description contains no line breaks, but how do I make it work when the description does contain line breaks?
You need to add the s (DOTALL) modifier:
$regpattern4 = "!<media:description type='plain'> (.*) <\/media:description>!s";
Hi,
The manual page "Pattern Modifiers" might interest you, especially the part about the s (PCRE_DOTALL) modifier:
If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded. This modifier is equivalent to Perl's /s modifier. A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.
Your regex will become something like this:
$regpattern4 = "!<media:description type='plain'> (.*) <\/media:description>!s";
Note that I added the 's' modifier after the closing delimiter.
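To see the difference the 's' modifier makes, here is a minimal sketch. The sample strings are made up for illustration, since the real feed is not shown in the question. The last part swaps in a lazy quantifier, .*?, which is not in the original pattern; it is a common way to stop a DOTALL match from running from the first opening tag to the last closing tag when the document contains several descriptions.

```php
<?php
// Hypothetical sample input; the real document is not shown in the question.
$xml = "<media:description type='plain'> First line\nsecond line </media:description>";

$withoutS = "!<media:description type='plain'> (.*) <\/media:description>!";
$withS    = "!<media:description type='plain'> (.*) <\/media:description>!s";

// Without s, the dot cannot cross the newline, so nothing matches.
$matchedWithoutS = preg_match($withoutS, $xml, $m1);

// With s, the dot also matches "\n", so the whole description is captured.
$matchedWithS = preg_match($withS, $xml, $m2);

// Assumed variation: two descriptions in one string. The lazy .*? keeps
// each capture inside its own pair of tags.
$two = "<media:description type='plain'> a\nb </media:description>\n"
     . "<media:description type='plain'> c </media:description>";
preg_match_all("!<media:description type='plain'> (.*?) <\/media:description>!s", $two, $all);

var_dump($matchedWithoutS); // int(0)
var_dump($matchedWithS);    // int(1)
var_dump($m2[1]);           // "First line\nsecond line"
var_dump($all[1]);          // ["a\nb", "c"]
```

Keep in mind that the pattern contains literal spaces around (.*), so it only matches when the description text is separated from both tags by a single space.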
Why are you using regex to parse XML? Why not use simplexml_load_string to load the XML document and "walk" through it? It will be less error-prone than complex regex statements, unless you only need to do a simple replace.
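A rough sketch of the SimpleXML approach follows. The feed structure (rss/channel/item) and the namespace URI bound to the media prefix are assumptions, because the actual document is not shown in the question; adjust both to match your XML. Line breaks inside the description need no special handling here.

```php
<?php
// Hypothetical feed; the structure and namespace URI are assumptions.
$xml = <<<XML
<rss xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <media:description type="plain">First line
second line</media:description>
    </item>
  </channel>
</rss>
XML;

$doc = simplexml_load_string($xml);
if ($doc === false) {
    die("Could not parse the XML\n");
}

// Namespaced elements are reached through children() with the namespace URI.
$mediaNs = 'http://search.yahoo.com/mrss/';

$descriptions = [];
foreach ($doc->channel->item as $item) {
    $media = $item->children($mediaNs);
    // Cast the element to string to get its text, newlines included.
    $descriptions[] = trim((string) $media->description);
}

var_dump($descriptions);
```

The namespace URI passed to children() has to be the one declared in your document for the media prefix; the prefix itself is not what SimpleXML matches on.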