Whenever I look at regular expressions of any complexity, my eyes start to water. Is there any existing solution for giving different colors to the different kinds of symbols in a regex expression?

Ideally I'd like different highlighting for literal characters, escape sequences, class codes, anchors, modifiers, lookaheads, etc. Obviously the syntax changes slightly across languages, but that is a wrinkle to be dealt with later.

Bonus points if this can somehow coexist with the syntax highlighting Vim does for whatever language is using the regex.

Does this exist somewhere, or should I be out there making it?

+2  A: 

This Vim plugin claims to do syntax highlighting:

http://www.vim.org/scripts/script.php?script_id=1091

I don't think it's exactly what you want, but I guess it's adaptable for your own use.

Peter Boughton
+4  A: 

Regular expressions might not be syntax-highlighted, but you can look into making them more readable by other means.

Some languages allow you to break regular expressions across multiple lines (Perl, C#, JavaScript). Once you do this, you can lay the expression out so that it is easier to read. Here's an example of what I mean.

In some languages you can also use the extended (?x) syntax, which ignores whitespace inside the pattern and allows # comments. Here's an example:

(?x:          # Find word-looking things without vowels, if they contain an "s"
   \b                       # word boundary
   [b-df-hj-np-tv-z]*       # nonvowels only (zero or more)
   s                        # there must be an "s"
   [b-df-hj-np-tv-z]*       # nonvowels only (zero or more)
   \b                       # word boundary
)
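
As a rough illustration of the same idea in one specific language, here is a minimal Python sketch using the standard `re` module's `re.VERBOSE` flag. The names `NO_VOWEL_WORD` and `find_no_vowel_words` are made up for this example, and the character class simply lists the lowercase consonants so that it matches what the comments describe.

```python
import re

# Verbose mode: whitespace in the pattern is ignored and "#" starts a comment,
# so the regex can be laid out one piece per line.
NO_VOWEL_WORD = re.compile(
    r"""
    \b                      # word boundary
    [b-df-hj-np-tv-z]*      # nonvowels only (zero or more)
    s                       # there must be an "s"
    [b-df-hj-np-tv-z]*      # nonvowels only (zero or more)
    \b                      # word boundary
    """,
    re.VERBOSE,
)


def find_no_vowel_words(text):
    """Return every lowercase, vowel-free word in text that contains an "s"."""
    return NO_VOWEL_WORD.findall(text)
```

For instance, `find_no_vowel_words("my psst rhythms and tsk")` returns `['psst', 'rhythms', 'tsk']`.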

EDIT:

As Al pointed out, you can also use string concatenation if all else fails. Here's an example:

regex = ""          # Find word-looking things without vowels, if they contain an "s"
   + "\\b"                      # word boundary (backslash escaped inside the string)
   + "[b-df-hj-np-tv-z]*"       # nonvowels only (zero or more)
   + "s"                        # there must be an "s"
   + "[b-df-hj-np-tv-z]*"       # nonvowels only (zero or more)
   + "\\b";                     # word boundary
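
The snippet above is language-neutral pseudocode. As one possible concrete form, here is a small Python sketch that relies on adjacent string literals being joined automatically inside parentheses; raw strings are used so the backslashes need no escaping. The names `PATTERN_TEXT` and `CONCATENATED_PATTERN` are invented for this example.

```python
import re

# Adjacent string literals inside parentheses are joined into one string,
# so each fragment of the regex can carry its own comment.
PATTERN_TEXT = (
    r"\b"                     # word boundary
    r"[b-df-hj-np-tv-z]*"     # nonvowels only (zero or more)
    r"s"                      # there must be an "s"
    r"[b-df-hj-np-tv-z]*"     # nonvowels only (zero or more)
    r"\b"                     # word boundary
)

CONCATENATED_PATTERN = re.compile(PATTERN_TEXT)
```

This works without any verbose-mode support in the regex engine, which is why it is a reasonable fallback when (?x) is unavailable.
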
Kimball Robinson
+1: This is a good way of doing it. When you write functions in native Vim, there's no support for verbose REs, but if you want to do it, you can use string concatenation to achieve the same thing.
Al
+1  A: 

Vim already has syntax highlighting for Perl regular expressions. Even if you don't know Perl itself, you can still write your regex in Perl (open a new buffer, set the filetype to perl and insert '/regex/'). The regex will then work in many other languages, such as PHP, JavaScript or Python, which either use the PCRE library or have copied Perl's syntax.

In a vimscript file, you can insert the following line of code to get syntax highlighting for regex:

let testvar =~ "\(foo\|bar\)"

You can play around with the regex in double-quotes until you have it working.

It is very difficult to write syntax highlighting for regexes in some languages because the regexes are written inside quoted strings (unlike Perl and JavaScript, where they are part of the syntax). To give you an idea, this syntax script for PHP does highlight regexes inside double- and single-quoted strings, but the code that highlights just the regexes is longer than most languages' entire syntax scripts.

too much php