I am looking at building a Scala web app that will have lots of code snippets in many programming languages that I would like to highlight. It looks like one of the best, most popular syntax highlighters is Pygments, a Python tool. I downloaded Jython and was able to load first Jython and then Pygments from within my Scala REPL. However, all the indirection is pretty ugly and it seems rather slow (but maybe it gets faster once everything's compiled?).

My (cleaned-up) REPL session, for illustration:

scala> :cp /usr/local/Cellar/jython/2.5.1/libexec/jython.jar

scala> import org.python.util.PythonInterpreter;

scala> val interp = new PythonInterpreter()        
*sys-package-mgr*: processing new jar, '/usr/local/Cellar/scala/2.8.0/libexec/lib/jline.jar'
*sys-package-mgr*: processing new jar, '/usr/local/Cellar/scala/2.8.0/libexec/lib/scala-compiler.jar'
*sys-package-mgr*: processing new jar, '/usr/local/Cellar/scala/2.8.0/libexec/lib/scala-dbc.jar'
*sys-package-mgr*: processing new jar, '/usr/local/Cellar/scala/2.8.0/libexec/lib/scala-library.jar'
*sys-package-mgr*: processing new jar, '/usr/local/Cellar/scala/2.8.0/libexec/lib/scala-swing.jar'
*sys-package-mgr*: processing new jar, '/usr/local/Cellar/scala/2.8.0/libexec/lib/scalap.jar'
interp: org.python.util.PythonInterpreter = org.python.util.PythonInterpreter@111de95a

scala> interp.exec("import sys")

scala> interp.exec("sys.path.append('/Library/Python/2.6/site-packages')")

scala> interp.exec("from pygments import highlight")

scala> interp.exec("from pygments.lexers import PythonLexer")

scala> interp.exec("from pygments.formatters import HtmlFormatter")

scala> interp.set("code", "print \"Hello World\"")

scala> interp.exec("html = highlight(code, PythonLexer(), HtmlFormatter())")          

scala> val html = interp.get("html").toString
html: java.lang.String = 
<div class="highlight"><pre><span class="k">print</span> <span class="s">&quot;Hello World&quot;</span>
</pre></div>


scala> import scala.xml.XML

scala> val xhtml = XML.loadString(html)
xhtml: scala.xml.Elem = 
<div class="highlight"><pre><span class="k">print</span> <span class="s">&quot;Hello World&quot;</span>
</pre></div>

Assuming that I choose to use Pygments, would you suggest going the Jython route (is there a better way to call Python code than interp.exec()?) or setting up a separate, simple, Python-native web service running Pygments for my Scala code to call? Of course, if there are libraries of comparable quality and breadth of supported languages that are easier to use from Scala, I'm all ears.

+2  A: 

Pygments is a pretty nice syntax highlighter, and if you've already gone to the trouble of working out how to run it from Scala code, you can always hide the mess behind a function or two. Just because it's not very fast in your REPL session doesn't necessarily mean it will be a problem: the JVM waits a while before applying many of its optimizations, and anyway, how much code do you need to highlight? If dynamically highlighting the code is slow but the content isn't changing much, you can just cache the rendered HTML.
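
To illustrate the caching idea, here is a minimal sketch in Scala. The names `HighlightCache`, `HighlightCacheDemo`, and `fakeRender` are made up for this example. `fakeRender` is only a placeholder for the real highlighter, which in your case would be a function wrapping the `interp.set` / `interp.exec` / `interp.get` calls from your session. The cache is unbounded, so treat it as a starting point rather than a finished design.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.concurrent.TrieMap

// Caches rendered HTML keyed by (language, source code).
// `render` is whatever actually does the highlighting; it is passed in
// so the slow Jython/Pygments call stays hidden behind one function.
final class HighlightCache(render: (String, String) => String) {
  private val cache = TrieMap.empty[(String, String), String]

  // Returns the cached HTML if present, otherwise renders and stores it.
  // Under concurrent access the renderer may occasionally run more than
  // once for the same key, but only one result is kept.
  def highlight(language: String, code: String): String =
    cache.getOrElseUpdate((language, code), render(language, code))

  def size: Int = cache.size
}

object HighlightCacheDemo {
  // Counts how often the (expensive) renderer is actually invoked.
  val calls = new AtomicInteger(0)

  // Hypothetical stand-in for Pygments: escapes the code and wraps it
  // in a <pre> tag. It does no real syntax highlighting.
  def fakeRender(language: String, code: String): String = {
    calls.incrementAndGet()
    val escaped = code
      .replace("&", "&amp;")
      .replace("<", "&lt;")
      .replace(">", "&gt;")
    "<pre class=\"" + language + "\">" + escaped + "</pre>"
  }
}
```

With this in place, `new HighlightCache(HighlightCacheDemo.fakeRender)` gives you an object whose `highlight("python", source)` only pays the rendering cost the first time a given snippet is seen.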

I'm not aware of a good syntax highlighting tool in Scala or Java which you could use, but there are a number of syntax highlighters available in JavaScript that you could include in your site. One benefit of that approach is that you don't have to use any server-side resources to highlight the code; you can rely on every visitor to your site to provide the extra compute power needed to highlight the code they view.

http://alexgorbatchev.com/SyntaxHighlighter/ is one widely used JS library for syntax highlighting.

David Winslow
All good points, David. I am reluctant to use a JavaScript library for just the reason you outline: I'd like the ability to cache the rendered HTML. Maybe Pygments with Jython is a decent solution then...
pr1001
For whatever it is worth, I use Syntax Highlighter in my own blog, and I have been very happy with it.
Daniel