I'm trying to pull the first paragraph out of Markdown-formatted documents:

    This is the first paragraph.

    This is the second paragraph.

The answer here gives me a solution that matches the first string ending in a double line break.
Perfect, except some of the texts begin with Markdown-style headers:

    ### This is an h3 header.

    This is the first paragraph.

So I need to:
- Skip any line that begins with one or more # symbols.
- Match the first string ending in a double line break.
In other words, return 'This is the first paragraph' in both of the examples above.
So far, I've tried many variations on:

    "/(?s)(?:(?!\#))((?!(\r?\n){2}).)*+/"

But I can't get it to return the proper match.
Where did I go wrong in my lookaround?
I'm doing this in PHP (preg_match()), if that makes a difference.
Thanks!
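
For reference, here is a minimal sketch of one way the two requirements above could be met with preg_match(). It is not necessarily the only or the best pattern. The function name first_paragraph() is hypothetical, and the sketch assumes that header lines start with # in the first column and that paragraphs are separated by a blank line (in PCRE, \R matches \n, \r\n, or \r).

```php
<?php
// Hypothetical helper (not from the original post): returns the first
// non-header paragraph of a Markdown string, or null if none is found.
function first_paragraph(string $markdown): ?string
{
    $pattern = '/\A'                   // anchor at the very start of the text
             . '(?:(?:#[^\r\n]*)?\R)*+' // possessively skip header lines and blank lines
             . '(?!#)'                 // what follows must not be a header line
             . '(.+?)'                 // capture the paragraph lazily (dot matches newlines via /s)
             . '(?=\R{2}|\R?\z)'       // stop before a double line break or the end of the text
             . '/s';

    return preg_match($pattern, $markdown, $m) === 1 ? $m[1] : null;
}

$plain  = "This is the first paragraph.\n\nThis is the second paragraph.";
$header = "### This is an h3 header.\n\nThis is the first paragraph.\n\nThis is the second paragraph.";

echo first_paragraph($plain), "\n";  // This is the first paragraph.
echo first_paragraph($header), "\n"; // This is the first paragraph.
```

The design choice here is to consume the header lines explicitly instead of only asserting against them. A lookahead such as (?!\#) is zero-width and is checked only at the position where a match attempt starts, so with an unanchored pattern the engine can simply begin the match one character after the last # and still return the rest of the header line.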