I'm using CodeRay for syntax highlighting, but I'm having trouble with this regular expression. The text will look like this:

<pre><code>:::ruby
def say_hello
  puts 'hello!'
end
</code></pre>

The :::ruby line tells CodeRay which language the code block should be interpreted as (but it needs to be optional). Here's what I have so far:

def coderay(text)
  text.gsub(/\<pre\>\<code\>(.+?)\<\/code\>\<\/pre\>/m) do
    CodeRay.scan($1, :ruby).div()  # language hard-coded for now
  end
end

$1 contains the code that I'm formatting (including the line that says which language to format it in), but I need to extract that first line so I can pass it as the second parameter to scan(), or pass a default if the language line wasn't found. How can I do this?

+1  A: 

In Ruby 1.9, using named groups:

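require 'coderay'

def coderay(text, default_lang = :ruby)
  text.gsub(%r!<pre><code>(?::{3}(?<lang>\w+)\s+)?(?<code>.+?)</code></pre>!m) do
    # $~ is the MatchData of the current match; $~[:lang] is nil when the
    # optional ":::lang" line is absent.
    if $~[:lang].nil?
      lang = default_lang
    else
      lang = $~[:lang].intern
    end
    CodeRay.scan($~[:code], lang).div()
  end
end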

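Note that a local assigned outside the method is not visible inside def, so default_lang has to reach coderay some other way: as a parameter with a default value, a constant, or a class or instance variable, depending on the context of coderay.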
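To see what the optional group does in isolation, here is a minimal sketch that only extracts the language and the code and leaves CodeRay out. The names extract_blocks and BLOCK_RE are hypothetical, introduced for illustration; the pattern is the one used in this answer.

```ruby
# Hypothetical helper: returns a [lang, code] pair for every <pre><code> block
# without calling CodeRay, so the behaviour of the groups is easy to inspect.
BLOCK_RE = %r!<pre><code>(?::{3}(?<lang>\w+)\s+)?(?<code>.+?)</code></pre>!m

def extract_blocks(text, default_lang = :ruby)
  # With named groups, scan yields one array per match: [lang, code].
  # lang is nil when the optional ":::lang" line is missing.
  text.scan(BLOCK_RE).map do |lang, code|
    [lang.nil? ? default_lang : lang.intern, code]
  end
end

p extract_blocks("<pre><code>:::python\nprint('hi')\n</code></pre>")
# prints [[:python, "print('hi')\n"]]
p extract_blocks("<pre><code>puts 'hi'\n</code></pre>")
# prints [[:ruby, "puts 'hi'\n"]]
```

Each pair could then be handed to CodeRay.scan(code, lang).div().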

Same, but using an inline expression to handle the optional language:

def coderay(text, default_lang = :ruby)
  text.gsub(%r!<pre><code>(?::{3}(?<lang>\w+)\s+)?(?<code>.+?)</code></pre>!m) do
    CodeRay.scan($~[:code], $~[:lang].nil? ? default_lang : $~[:lang].intern).div()
  end
end

The second option is a little messier, so you might prefer the first.
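If the inline form feels too dense, one possible middle ground is to let || supply the fallback. This is only a sketch: coderay_with and the highlighter block are hypothetical stand-ins for the CodeRay.scan(code, lang).div() call, so the snippet runs without the gem installed.

```ruby
# Sketch of a shorter fallback. The block passed in (a hypothetical stand-in
# for CodeRay.scan(code, lang).div) receives the code and the language.
def coderay_with(text, default_lang = :ruby, &highlighter)
  text.gsub(%r!<pre><code>(?::{3}(?<lang>\w+)\s+)?(?<code>.+?)</code></pre>!m) do
    m = Regexp.last_match                      # same object as $~
    # m[:lang] is nil when no ":::lang" line was present, so || picks the
    # default; to_sym works for both a String and a Symbol.
    lang = (m[:lang] || default_lang).to_sym
    highlighter.call(m[:code], lang)
  end
end

tagged = coderay_with("<pre><code>:::sql\nSELECT 1\n</code></pre>") { |code, lang| "[#{lang}]#{code}" }
plain  = coderay_with("<pre><code>x = 1</code></pre>") { |code, lang| "[#{lang}]#{code}" }
puts tagged
puts plain
```

With the real library, the block body would be CodeRay.scan(code, lang).div().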

It turns out that a group inside an optional group that didn't match is still counted in Ruby (its reference is simply nil), so unmatched numbered groups are handled the same way as unmatched named groups, contrary to what I first thought. You can therefore replace the named references ($~[:lang], $~[:code]) with positional ones ($1, $2) in the code above and it should work the same:

def coderay(text, default_lang = :ruby)
  text.gsub(%r!<pre><code>(?::{3}(?<lang>\w+)\s+)?(?<code>.+?)</code></pre>!m) do
    CodeRay.scan($2, $1.nil? ? default_lang : $1.intern).div()
  end
end

def coderay(text, default_lang = :ruby)
  text.gsub(%r!<pre><code>(?::{3}(?<lang>\w+)\s+)?(?<code>.+?)</code></pre>!m) do
    # $1 is the language (nil if absent), $2 is the code.
    if $1.nil?
      lang = default_lang
    else
      lang = $1.intern
    end
    CodeRay.scan($2, lang).div()
  end
end
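The snippets above still use the named-group syntax inside the pattern itself. A variant with plain capturing groups avoids that syntax entirely, which is what an interpreter without named-group support would need. The sketch below assumes nothing beyond gsub and numbered groups, but it has not been run on an older interpreter; coderay_numbered and the yielded block are hypothetical stand-ins for the real CodeRay call.

```ruby
# Plain (unnamed) groups: $1 is the optional language, $2 is the code.
# The block given to the method is a hypothetical stand-in for
# CodeRay.scan(code, lang).div, so the example runs without the gem.
def coderay_numbered(text, default_lang = :ruby)
  text.gsub(%r!<pre><code>(?::{3}(\w+)\s+)?(.+?)</code></pre>!m) do
    # $1 is nil when the optional ":::lang" line did not match.
    lang = $1.nil? ? default_lang : $1.intern
    yield $2, lang
  end
end

html = "before <pre><code>:::ruby\nputs 1\n</code></pre> after"
puts coderay_numbered(html) { |code, lang| "<div class=\"#{lang}\">#{code}</div>" }
```

With the real library, the block body would be CodeRay.scan(code, lang).div().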
outis
What would the 1.8 syntax look like?
Andrew