Syntax Highlighting - most efficient and professional way

As with anything in code, there is rarely a single "best" way. There are multiple ways of doing things, and each has benefits and drawbacks.

That said, some form of the Interpreter Pattern is probably the most common way. According to the GoF book:

The Interpreter pattern is widely used in compilers implemented with object-oriented languages, as the Smalltalk compilers are. SPECTalk uses the pattern to interpret descriptions of input file formats. The QOCA constraint-solving toolkit uses it to evaluate constraints.

It also goes on to describe its limitations in the Applicability section, where it says the pattern works best when:

the grammar is simple. For complex grammars, the class hierarchy for the grammar becomes large and unmanageable. Tools such as parser generators are a better alternative in such cases.

efficiency is not a critical concern. The most efficient interpreters are usually not implemented by interpreting parse trees directly but by first translating them into another form. For example, regular expressions are often transformed into state machines. But even then, the translator can be implemented by the Interpreter pattern, so the pattern is still applicable.
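To make the pattern concrete, here is a minimal sketch of an Interpreter-style class hierarchy for a tiny arithmetic grammar. The grammar and the class names (Expr, Number, Variable, Add, Mul) are illustrative assumptions, not taken from the GoF book or from any real highlighter.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class Expr(ABC):
    """Base class: one subclass per grammar rule (hypothetical grammar)."""

    @abstractmethod
    def interpret(self, env: dict) -> int:
        """Evaluate this node given variable bindings in env."""


@dataclass
class Number(Expr):
    value: int

    def interpret(self, env: dict) -> int:
        return self.value


@dataclass
class Variable(Expr):
    name: str

    def interpret(self, env: dict) -> int:
        return env[self.name]


@dataclass
class Add(Expr):
    left: Expr
    right: Expr

    def interpret(self, env: dict) -> int:
        return self.left.interpret(env) + self.right.interpret(env)


@dataclass
class Mul(Expr):
    left: Expr
    right: Expr

    def interpret(self, env: dict) -> int:
        return self.left.interpret(env) * self.right.interpret(env)


# (x + 2) * 3, built by hand here; a real tool would have a parser build it.
tree = Mul(Add(Variable("x"), Number(2)), Number(3))
result = tree.interpret({"x": 4})
print(result)
```

Every grammar rule needs its own class, which is why the hierarchy grows quickly for a full programming language. Walking the tree on each evaluation is also the direct interpretation that the efficiency point above cautions against.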

Understanding this, you should now see why it's better to pre-compile a reusable RegEx before performing many matches with it. If you don't, both steps (transformation and interpretation) have to run every time, rather than building the state machine once and applying it efficiently several times over.
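As a rough illustration of the compile-once idea, the sketch below builds one token pattern up front and reuses it for every line. It is written in Python rather than .NET, and the token kinds, the keyword list, and the names TOKEN_RE and highlight_line are assumptions made up for this example. Python's re module also caches recently compiled patterns internally, so the saving is smaller there than in some other runtimes; in .NET the analogous step would be constructing the Regex object once and reusing it.

```python
import re

# Compiled once at import time, then reused for every line.
# Alternatives are tried left to right, so comments and strings win
# over keywords and numbers that happen to appear inside them.
TOKEN_RE = re.compile(
    r"(?P<comment>#.*)"
    r"|(?P<string>\"[^\"\n]*\"|'[^'\n]*')"
    r"|(?P<keyword>\b(?:def|return|if|else)\b)"
    r"|(?P<number>\b\d+\b)"
)


def highlight_line(line: str) -> list[tuple[str, int, int]]:
    """Return (kind, start, end) spans for one line of source text."""
    return [(m.lastgroup, m.start(), m.end()) for m in TOKEN_RE.finditer(line)]


line = "if x: return 42  # done"
spans = highlight_line(line)
for kind, start, end in spans:
    print(kind, repr(line[start:end]))
```

Because highlight_line works on a single line, an editor could call it only for the lines currently on screen. A regex-per-line approach like this cannot track constructs that span lines, such as multi-line strings, so treat it as a sketch of the idea rather than a complete highlighter.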

Specifically for the type of interpretation you are describing, Microsoft exposes the Microsoft.VisualStudio namespace and all of its powerful features as part of the Visual Studio SDK. You can also look at System.CodeDom for dynamic code generation and compilation.

Yes, but wouldn't regexes be a lot less efficient than whatever way Microsoft is doing it? So you're saying that for Microsoft it's very easy, since they already have the parser for the language (for compiling) and can just reuse it for syntax highlighting and get it 100% right automatically? If so, can you point to code that implements it that way?

Pessimist 2010-03-10 15:01:35

@Pessimist: You may want to look at this: http://bit.ly/3w5wK3 and this: http://bit.ly/dxDrkx

klausbyskov 2010-03-10 15:15:19

@Pessimist: but please note that the compiler alone cannot be used when the highlighted code is not well-formed. Furthermore, as you have probably read on Wikipedia, the regex approach is not necessarily very efficient.

klausbyskov 2010-03-10 15:16:44

Although no code was offered, and I'd really like to see some code implementing the suggested solution (while also addressing the bigger picture outlined in the original question, such as only highlighting visible code), I understand what is being discussed here and can relate it to what I already know. I'll choose this as the accepted answer.

Pessimist 2010-03-12 15:44:27

@klausbyskov: thanks for the links!

Pessimist 2010-03-12 15:45:04
