I can't say exactly what algorithms are used by any particular implementation, but I can make some educated guesses. A trie is a very useful data structure for this problem: the IDE can maintain a large trie in memory of all of the symbols in your project, with some extra metadata at each node.
When you type a character, it walks down a path in the trie. All of the descendants of a particular trie node are possible completions. The IDE then just needs to filter those out by the ones that make sense in the current context, but it only needs to compute as many as can be displayed in the tab-completion pop-up window.
More advanced tab-completion requires a more complicated trie. For example, Visual Assist X has a feature whereby you only need to type the capital letters of CamelCase symbols -- e.g., if you type SFN, it shows you the symbol SomeFunctionName
in its tab-completion window.
Computing the trie (or other data structures) does require parsing all of your code to get a list of all of the symbols in your project. Visual Studio stores this in its IntelliSense database, an .ncb
file stored alongside your project, so that it doesn't have to reparse everything every time you close and reopen your project. The first time you open a large project (say, one you just synced form source control), VS will take the time to parse everything and generate the database.
I don't know how it handles incremental changes. As you said, when you're writing code, it's invalid syntax 90% of the time, and reparsing everything whenever you idled would put a huge tax on your CPU for very little benefit, especially if you're modifying a header file included by a large number of source files.
I suspect that it either (a) only reparses whenever you actually build your project (or possibly when you close/open it), or (b) it does some sort of local parsing where it only parses the code around where you've just edited in some limited fashion, just to get the names of the relevant symbols. Since C++ has such an outstandingly complicated grammar, it may behave oddly in the dark corners if you're using heavy template metaprogramming and the like.