I'm starting to write a code syntax highlighter in JavaScript, and I want to highlight text that is in quotes (both "s and 's) in a certain color. It also needs to handle one type of quote appearing in the middle of a pair of the other type without getting confused, but I'm really not sure where to even start. I don't know how I should go about finding the quotes and then finding the correct end quote.

+1  A: 

Unless you're doing this for the challenge, have a look at Google Code Prettify.

For your problem, you could read up on parsing (and lexers) at Wikipedia. It's a huge topic, and you'll find that you run into bigger problems than parsing strings.

To start, you could use regular expressions (although they rarely have the accuracy of a true lexer). A typical regular expression for matching a string is:

/"(?:[^"\\]+|\\.)*"/

And then the same for ' instead of ".
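As a rough sketch of how the two patterns might be combined, the snippet below joins the "-version and the '-version with alternation and collects every match with its position. The names `STRING_RE` and `findStrings` are made up for this illustration. The inner `+` is dropped here: it matches the same strings, but avoids heavy backtracking when a quote is never closed.

```javascript
// Double-quoted and single-quoted string patterns joined with alternation.
// Each branch: opening quote, then any run of "ordinary characters or
// backslash-escaped characters", then the same closing quote.
const STRING_RE = /"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'/g;

// Hypothetical helper: returns every quoted string with its start/end offsets.
function findStrings(source) {
  const found = [];
  for (const m of source.matchAll(STRING_RE)) {
    found.push({ text: m[0], start: m.index, end: m.index + m[0].length });
  }
  return found;
}
```

A highlighter could then wrap each `start`..`end` range in a colored element. Because the regex scans left to right, a ' inside a "…" string is consumed as part of that string and never opens a string of its own.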

Otherwise, for a character-by-character parser, you would set some kind of state to record that you're in a string once you hit ". Then, when you hit a " that is not preceded by an odd number of backslashes (an even number of backslashes would escape each other), you exit the string.
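A minimal sketch of that state-based approach might look like the following. The function name `scanStrings` and the shape of the returned objects are assumptions for illustration. Instead of counting backslashes backwards, it skips the character after each backslash while inside a string, which has the same effect: an escaped quote or an escaped backslash can never end the string.

```javascript
// Hypothetical helper: walks the source once and records each quoted span.
function scanStrings(source) {
  const spans = [];
  let quote = null; // the quote character that opened the current string, or null
  let start = -1;   // index of that opening quote

  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (quote === null) {
      // Outside a string: either kind of quote opens one.
      if (ch === '"' || ch === "'") {
        quote = ch;
        start = i;
      }
    } else if (ch === "\\") {
      // Inside a string: a backslash escapes the next character, so skip it.
      i++;
    } else if (ch === quote) {
      // Only the same kind of quote that opened the string can close it.
      spans.push({ start, end: i + 1, text: source.slice(start, i + 1) });
      quote = null;
    }
  }
  return spans; // an unterminated string is simply not reported
}
```

Since only the opening quote character is remembered, the other kind of quote inside the string is treated as ordinary text.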

Blixt
A: 

Use a stack: when you find an unmatched quote, push it; when you find its match, pop it.

Umair Ahmed
+1  A: 

You can find quotes using regular expressions, but if you're writing a syntax highlighter, the only reliable way is to step through the code, character by character, and decide what to do from there.

An example regex:

/("|')((?:\\\1|.)+?)\1/g

(matches "this" and 'this' and "thi\"s")
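To illustrate how that pattern could be applied, here is a small sketch that runs it over a piece of text and returns the quote character and the body of each match. The names `QUOTED_RE` and `quotedParts` are invented for this example.

```javascript
// The regex from above: group 1 captures the opening quote, \1 requires the
// same quote to close it, and \\\1 lets a backslash-escaped copy of that
// quote appear inside the body (group 2).
const QUOTED_RE = /("|')((?:\\\1|.)+?)\1/g;

// Hypothetical helper: lists each quoted section found in the source.
function quotedParts(source) {
  return Array.from(source.matchAll(QUOTED_RE), (m) => ({
    quote: m[1],    // which quote character delimited the string
    body: m[2],     // the text between the quotes, escapes left as written
    index: m.index, // where the opening quote sits in the source
  }));
}
```

One caveat with this pattern: because the body uses `+?`, it needs at least one character, so an empty string such as "" is not matched on its own and the match can run on to a later quote.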

J-P
A: 

I did it with a single regular expression in PHP using lookbehind assertions. JavaScript regular expressions have traditionally lacked lookbehind (backreferences such as \1 are supported), and I think lookbehind is what you need if you really want to detect unescaped backslashes.