ansaurus

Question

Granularity of Syntax Coloring in Visual Studio

Answer 1

+6 A:

A few thoughts.

First, features are "not implemented" by default. In order for a feature to be implemented, someone has to think of the feature. Then we have to design it, specify it, implement it, test it, document it, find a shipping vehicle for it, and get it out the door. If any one of those things does not happen, you don't get the feature. As far as I know, NONE of these things have happened for this feature.

Second, features are prioritized based on their net benefits -- that is, their total benefit to our customers, minus our total costs in implementing them. There are very real "opportunity costs" in play here. Every feature that we DO implement is dozens of features that we do not have budget for. So features not only have to be worth the work to make them happen, they have to be MORE beneficial than any of the thousands of features we've got on our feature request lists. That's a high bar to achieve; most features never achieve it.

To explain my third point you need to know a bit about how languages are processed. We begin by taking the source code and "lexing" it into "tokens" -- words. At this point we know whether every character is a part of a number, string, keyword, identifier, comment, preprocessor directive, and so on. Lexing is incredibly fast; we can easily re-lex a file between keystrokes.
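To make the lexing step concrete, here is a minimal sketch of a purely lexical classifier. It is an illustration only, not how Visual Studio or the C# compiler does it: the token categories, the tiny keyword set, and the `classify` function are all assumptions made for the example.

```python
import re

# Hypothetical, deliberately tiny keyword set; real C# has many more keywords.
KEYWORDS = {"public", "class", "int", "return"}

# Order matters: comments and strings are tried before identifiers and numbers,
# and the catch-all "punct" rule comes last.
TOKEN_SPEC = [
    ("comment",    r"//[^\n]*"),
    ("string",     r'"(?:\\.|[^"\\\n])*"'),
    ("number",     r"\d+(?:\.\d+)?"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("whitespace", r"\s+"),
    ("punct",      r"."),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

def classify(source):
    """Return (kind, text) pairs for every non-whitespace token in source."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "whitespace":
            continue
        # Keywords look like identifiers to the regex; reclassify them here.
        if kind == "identifier" and text in KEYWORDS:
            kind = "keyword"
        tokens.append((kind, text))
    return tokens

print(classify("public int Count = 42; // total"))
```

Note what this level of analysis cannot do: every name that is not a keyword comes out as a plain "identifier", whether it refers to a type, a field, or a method.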

We then take the series of tokens and "parse" them into an "abstract syntax tree". This determines what parts of the code are classes, expressions, local variable declarations, names, assignments, whatever. Parsing is also fast, but not as fast as lexing. We do some tricks, like skipping parsing the method bodies until someone is actually looking at them.

Finally, we take the abstract syntax tree and do semantic analysis on it; this determines whether a given name refers to a type, a local variable, a namespace, a method group, a field, and so on. We do both "top level" semantic analysis, to determine the type hierarchy of the program, and "method level" semantic analysis, to determine the type of every expression in every method. "Top level" semantic analysis is pretty fast, and any individual method analysis is pretty fast, but still, it's hard to do a full semantic analysis between keystrokes.

Obviously we need to do full semantic analysis for intellisense, but we can get away with figuring out what method you are currently typing in, and only doing the semantic analysis of the top level and of that method.

But colorization has to work on the entire file; you can't just colorize the method that the cursor happens to be in right now. Therefore, colorization has to be insanely fast, so historically we've colorized mostly based on lexical information.

Occasionally we can figure out special stuff like "is this thing probably a type?" to give it a different color. But figuring out when a given entity is, say, a method group vs, say, a field of delegate type, requires a pretty rich level of semantic analysis, a level that we at present don't perform on every keystroke.

Now, there are things we can do here. We could be smarter about understanding edits to the token stream, and only reperforming grammatical and semantic analysis on the edited portion of the tree. We're doing some research into this area now, but it's just research; it might never make it actually into the product.
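One way to picture "only redoing the analysis for the edited portion" is a cache keyed on each method's text. The sketch below is a simplified illustration under stated assumptions, not a description of the research mentioned above: `MethodAnalysisCache`, its `analyze` callback, and the idea of identifying methods by name are all hypothetical. It also ignores a real complication, namely that an edit to a top-level declaration can change the meaning of method bodies whose text did not change.

```python
class MethodAnalysisCache:
    """Re-run an expensive analysis only for methods whose body text changed."""

    def __init__(self, analyze):
        self._analyze = analyze   # hypothetical per-method analysis function
        self._cache = {}          # method name -> (body text, analysis result)
        self.analysis_runs = 0    # counts how often the analysis really ran

    def analyze_all(self, methods):
        """methods maps method name -> body text; returns name -> result."""
        results = {}
        for name, body in methods.items():
            cached = self._cache.get(name)
            if cached is None or cached[0] != body:
                # New or edited method: analyze it and remember the result.
                self.analysis_runs += 1
                cached = (body, self._analyze(body))
                self._cache[name] = cached
            results[name] = cached[1]
        # Drop entries for methods that no longer exist in the file.
        self._cache = {n: c for n, c in self._cache.items() if n in methods}
        return results


# Stand-in "analysis": count whitespace-separated words in the body.
cache = MethodAnalysisCache(analyze=lambda body: len(body.split()))
cache.analyze_all({"A": "return 1;", "B": "return x + y;"})
cache.analyze_all({"A": "return 1;", "B": "return x - y;"})  # only B is redone
print(cache.analysis_runs)  # 3
```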

Eric Lippert 2009-10-22 22:02:45

You might make a note that the AST used for an actively edited file is substantially different ("more flexible") from the AST used by a compiler. I say this because I've seen many people who think if you understand lexing and parsing then you can turn it into a smart editor in a straightforward manner, but that is not even remotely close to the truth. :o

280Z28 2009-10-22 22:25:49

@Eric: Thank you for the extensive response. I wasn't griping about VS's lack of features, I was genuinely hoping I'd missed something in the UI and there was some way to get the colorization I'm after. (continued)

I. J. Kennedy 2009-10-23 04:17:54

Having written several compilers myself, I appreciate the difficulties. On the other hand, it does seem to me that much improvement could be made in this area without the need for the full AST. I realize that, because a given symbol can be used in multiple contexts, just having a symbol table around is not enough. For example, I just wrote this line of code today: public CaseStyle CaseStyle { get; set; } The first CaseStyle is an enum; the second is the property name, so there is ambiguity with that symbol. However, what about methods? Can't method calls be detected at the lexical level?

I. J. Kennedy 2009-10-23 04:19:39

Sure, stuff that syntactically looks like a method call can be detected. (Though there are interesting syntactic ambiguities with generic methods. Is "F" a method call here? "M(F<A,B>(10))") However, we cannot tell whether, say, it is a method group call or an invocation of a delegate field without semantic analysis.
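The ambiguity in "M(F<A,B>(10))" can be seen by feeding the same characters to a parser for a language without generic-call syntax. The sketch below uses Python's standard `ast` module purely as a convenient second grammar; `describe_call` is a hypothetical helper written for this example. Python reads the text as a call to M with two comparison arguments, "F < A" and "B > (10)", whereas the C# specification, as I understand its disambiguation rule, reads it as a call to M with a single argument, the generic method call F<A,B>(10).

```python
import ast

def describe_call(expression):
    """Parse a call expression; return the callee name and argument node kinds."""
    call = ast.parse(expression, mode="eval").body
    return call.func.id, [type(arg).__name__ for arg in call.args]

# Python has no F<A,B>(...) syntax, so it sees two comparisons.
print(describe_call("M(F<A,B>(10))"))  # ('M', ['Compare', 'Compare'])
```

The same token sequence supports both readings, which is why the choice has to be made by a rule in the grammar rather than by the lexer.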

Eric Lippert 2009-10-23 14:49:56

The "CaseStyle" problem you mention has heuristics baked into the language design specifically in order to handle it. See http://blogs.msdn.com/ericlippert/archive/2009/07/06/color-color.aspx for details.

Eric Lippert 2009-10-23 14:51:53

Answer 2

+1 A:

I believe that the ReSharper plugin provides some enhanced syntax highlighting like you are talking about. There may also be other plugins that provide the same thing at lower cost; ReSharper is simply the one I use. I do agree that syntax highlighting is very useful. ReSharper also does some nice things, like greying out dead code to make it more obvious, highlighting the current line, etc.

-Daniel

Daniel Brotherston 2009-10-22 22:19:14

I have heard much heralding of ReSharper. Unfortunately, for most projects I use Express, which doesn't allow plugins.

I. J. Kennedy 2009-10-23 04:21:59
