I'm modifying PHP Markdown (a PHP parser for the markup language used here on Stack Overflow) to implement points 1, 2 and 3 described by Jeff in this blog post. The last two were easy, but this one is proving very difficult:
- Removed support for intra-word emphasis like_this_example
In fact, in the "normal" Markdown implementation, like_this_example would be rendered with the middle word emphasized (like<em>this</em>example). This is very undesirable; I want only a standalone _example_ to become <em>example</em>.
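To make the desired behaviour concrete, here is a minimal standalone sketch. It is not PHP Markdown's code: the function name `emphasize_underscores` and the pattern are illustrative assumptions, and it deliberately ignores `*`, `**` and `__` handling. It only shows the rule that an underscore counts as an emphasis marker when it is not glued to a letter, digit or another underscore on its outer side.

```php
<?php
// Hypothetical helper, not part of PHP Markdown: wraps _text_ in <em> only
// when the underscores sit at word boundaries, so intra-word underscores
// such as like_this_example are left alone.
function emphasize_underscores(string $text): string
{
    // (?<![A-Za-z0-9_])  opening "_" must not follow a word character
    // (?=\S)             and must be followed by non-whitespace
    // (.+?)              shortest possible emphasized run
    // (?<=\S)            closing "_" must follow non-whitespace
    // (?![A-Za-z0-9_])   and must not precede a word character
    $pattern = '/(?<![A-Za-z0-9_])_(?=\S)(.+?)(?<=\S)_(?![A-Za-z0-9_])/s';
    return preg_replace($pattern, '<em>$1</em>', $text);
}

echo emphasize_underscores('like_this_example'), "\n";   // left unchanged
echo emphasize_underscores('only _example_ here'), "\n"; // only <em>example</em> here
```

The real parser tokenizes opening and closing markers separately, so a fix there would presumably mean adding similar word-boundary lookarounds to the `_` branches of its patterns rather than using one combined expression like this.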
I looked in the source code and found the regex used to do the emphasis:
var $em_relist = array(
    ''  => '(?:(?<!\*)\*(?!\*)|(?<!_)_(?!_))(?=\S|$)(?![.,:;]\s)',
    '*' => '(?<=\S|^)(?<!\*)\*(?!\*)',
    '_' => '(?<=\S|^)(?<!_)_(?!_)',
);
var $strong_relist = array(
    ''   => '(?:(?<!\*)\*\*(?!\*)|(?<!_)__(?!_))(?=\S|$)(?![.,:;]\s)',
    '**' => '(?<=\S|^)(?<!\*)\*\*(?!\*)',
    '__' => '(?<=\S|^)(?<!_)__(?!_)',
);
var $em_strong_relist = array(
    ''    => '(?:(?<!\*)\*\*\*(?!\*)|(?<!_)___(?!_))(?=\S|$)(?![.,:;]\s)',
    '***' => '(?<=\S|^)(?<!\*)\*\*\*(?!\*)',
    '___' => '(?<=\S|^)(?<!_)___(?!_)',
);
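One way to see why these patterns allow intra-word emphasis is to run two of them directly against the problem string. The sketch below assumes (I have not verified this against the parser's control flow) that the `''` entry is used to find an opening marker and the `'_'` entry to find the matching closing marker. The `/` delimiters and the variable names are added here for illustration.

```php
<?php
// Patterns copied verbatim from $em_relist; only the "/" delimiters are new.
$open  = '/(?:(?<!\*)\*(?!\*)|(?<!_)_(?!_))(?=\S|$)(?![.,:;]\s)/';
$close = '/(?<=\S|^)(?<!_)_(?!_)/';

$text = 'like_this_example';

// Find the first candidate opening marker.
preg_match($open, $text, $m, PREG_OFFSET_CAPTURE);
$openAt = $m[0][1];

// Search for a closing "_" starting just after the opening one.
preg_match($close, $text, $m, PREG_OFFSET_CAPTURE, $openAt + 1);
$closeAt = $m[0][1];

// Neither pattern checks whether a letter sits on the outer side of the
// underscore, so both match in the middle of the word.
echo "open at $openAt, close at $closeAt\n";
echo substr($text, $openAt + 1, $closeAt - $openAt - 1), "\n"; // this
```

If that reading is right, the opening pattern only looks at what follows the underscore and the closing pattern only at what precedes it, so that is where a word-boundary check appears to be missing.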
I tried opening it in RegexBuddy, but that wasn't enough; after half an hour of working on it I still don't know where to start. Any suggestions?
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.