I want to parse Markdown-style links, but I'm having some trouble matching the reference-style ones, like this one: [id]: http://example.com/ "Optional Title Here"

My regex gets the id and the url, but not the title.

Here's what I have:

/\[([a-zA-Z0-9_-]+)\]: (\S+)\s?(".*?")?/

I go through the matches and add the references to a hashtable. The id is the key, and the value is an instance of a class I made called LinkReference that just contains the URL and the title. In case the problem is not my regex but my code that adds the matches to the hashtable, here's that code too:

        Regex rx = new Regex(@"\[([a-zA-Z0-9_-]+)\]: (\S+)\s?("".*?"")?");
        MatchCollection matches = rx.Matches(InputText);
        foreach (Match match in matches)
        {
            GroupCollection groups = match.Groups;
            // An optional group that did not match does not throw; its Value is "".
            // Check Success so the title stays null when it is absent.
            string title = groups[3].Success ? groups[3].Value : null;
            LinkReferences.Add(groups[1].Value, new LinkReference(groups[2].Value, title));
        }
A:

I think you actually have two spaces between your URL and the title (it doesn't show up in Stack Overflow's rendered HTML, but I can see it in the page source).

Anyway, I believe you want to change \s? (zero or one whitespace character) to \s* (zero or more whitespace characters):

var rx = new Regex(@"\[([a-zA-Z0-9_-]+)\]: (\S+)\s*("".*?"")?");

You probably also want to allow for multiple spaces on either side of the ":" and in a couple of other places, like so:

var rx = new Regex(@"\[\s*([a-zA-Z0-9_-]+)\s*\]\s*:\s*(\S+)\s*("".*?"")?");

(it doesn't hurt to be liberal in allowing spaces, IMO)
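To tie the pieces together, here is a minimal sketch of the whole loop using the more liberal pattern. The LinkReference class below is only a hypothetical stand-in for the asker's class, and ReferenceParser, Parse, and the sample input are names made up for illustration. Two small changes from the pattern above are assumptions about what is wanted: the quotes are moved outside the capturing group so the captured title does not include them, and the title is set to null by checking Group.Success rather than by catching an exception (reading an unmatched group never throws).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the asker's LinkReference class.
public sealed class LinkReference
{
    public string Url { get; }
    public string Title { get; }   // null when the reference has no title

    public LinkReference(string url, string title)
    {
        Url = url;
        Title = title;
    }
}

public static class ReferenceParser
{
    // Same pattern as above, except the quotes sit outside the capturing
    // group, so group 3 holds the title text without them.
    private static readonly Regex Rx = new Regex(
        @"\[\s*([a-zA-Z0-9_-]+)\s*\]\s*:\s*(\S+)\s*(?:""(.*?)"")?");

    public static Dictionary<string, LinkReference> Parse(string input)
    {
        var references = new Dictionary<string, LinkReference>();
        foreach (Match match in Rx.Matches(input))
        {
            Group titleGroup = match.Groups[3];
            // An optional group that did not participate in the match has
            // Success == false and Value == ""; reading it never throws.
            string title = titleGroup.Success ? titleGroup.Value : null;
            // The indexer overwrites a repeated id, whereas Add would throw.
            references[match.Groups[1].Value] =
                new LinkReference(match.Groups[2].Value, title);
        }
        return references;
    }
}

public static class Program
{
    public static void Main()
    {
        // Note the two spaces between the URL and the title on the first line.
        string input =
            "[id]: http://example.com/  \"Optional Title Here\"\n" +
            "[plain] : http://example.org/\n";

        foreach (var pair in ReferenceParser.Parse(input))
        {
            Console.WriteLine("{0} -> {1} (title: {2})",
                pair.Key, pair.Value.Url, pair.Value.Title ?? "<none>");
        }
    }
}
```

Whether a repeated id should overwrite the earlier entry or be rejected is a design choice; swap the indexer for Add if duplicates should be treated as an error.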

Dean Harding
Ah hah! Yes this is it. Thanks :)
The.Anti.9