How does a C/C++ compiler handle the escape character ["\"] in source code? How is the compiler's grammar written to process that character? What does the compiler do after encountering it?

+9  A: 

Most compilers are divided into stages. The first stage of the compiler front end is called the lexical analyzer or scanner. This part of the compiler reads the actual characters and groups them into tokens. It contains a state machine which decides, upon seeing an escape character, whether the backslash stands for itself (for example when it is written as \\ inside a string) or modifies the character that follows it. The scanner then passes the result, either a plain backslash or the character the escape sequence denotes (such as a tab or a newline), to the next part of the compiler (the parser). Because the state machine can group several source characters into one token, the parser never sees the individual pieces of the escape sequence.
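The state-machine step described above can be sketched in C roughly as follows. This is only an illustration, not how any particular compiler does it: `decode_char` is a hypothetical helper name, it handles just a handful of escapes, and passing unknown escapes through unchanged is an assumption made for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scanner helper: decode one (possibly escaped) character
   starting at src[*pos] inside a literal, and advance *pos past it.
   Only a few simple escapes are recognized. */
static int decode_char(const char *src, size_t *pos)
{
    assert(src != NULL && pos != NULL);

    int c = (unsigned char)src[(*pos)++];
    if (c != '\\' || src[*pos] == '\0')
        return c;                     /* ordinary character, or a lone trailing backslash */

    c = (unsigned char)src[(*pos)++]; /* the character after the backslash */
    switch (c) {
    case 'n':  return '\n';
    case 't':  return '\t';
    case '0':  return '\0';
    case '\\': return '\\';
    case '"':  return '"';
    case '\'': return '\'';
    default:   return c;              /* unknown escape: passed through (an assumption) */
    }
}
```

Called repeatedly on the six source characters `a`, `\`, `n`, `\`, `\`, `b`, this sketch yields four characters: `a`, a newline, a single backslash, and `b`.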

Yuval F
+1  A: 

An escape character together with the character that follows it (like \n) is a single character to the C compiler. The scanner presents it to the parser as a character token, so the parser needs no special syntax rules for the escape character.
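A small check of the "single character" claim, as a sketch: the function names here are made up for the example, and the sizes in the comments follow from the C rules for string literals.

```c
#include <assert.h>
#include <string.h>

/* Length of a literal containing one escape sequence:
   'a', tab, 'b' are three characters, not four. */
static size_t escaped_literal_length(void)
{
    return strlen("a\tb");
}

/* Size of the array for "\n": one newline character
   plus the terminating '\0'. */
static size_t newline_literal_size(void)
{
    return sizeof "\n";
}
```

By the time the program runs there is no backslash left in either literal; the two source characters `\t` have already become one tab character.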

qrdl
+1  A: 

It generally escapes the following character:

  • In a string literal or character literal, it means escape the next character. \a means 'alert' (flashing the terminal, beeping or whatever), \n means 'linefeed', and \xNUM gives a character by its hexadecimal code, for example.
  • If it appears as the last character on a line, immediately before the newline, whether within a string or not (and even within a single-line // comment!), it acts as a line continuation: the following newline character is ignored, and the next line is merged with the current line.
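The line-continuation case can be seen in a short C sketch. The names `SUM3`, `joined` and `continuation_in_comment` are invented for this example, and a compiler may warn about the multi-line comment.

```c
#include <assert.h>
#include <string.h>

/* Each backslash-newline pair is removed before tokenization,
   so this macro body is one logical line. */
#define SUM3(a, b, c) \
    ((a) +            \
     (b) + (c))

/* The literal continues on the next line; the result is "abcdef". */
static const char *joined = "abc\
def";

static int continuation_in_comment(void)
{
    int x = 1;
    // this comment ends with a backslash, so the next line joins it \
    x = 2;
    return x;   /* the assignment above was swallowed by the comment */
}
```

Here `continuation_in_comment()` returns 1, because `x = 2;` has become part of the comment. Note that the second line of the string literal starts in column 0; any indentation there would end up inside the string.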
Johannes Schaub - litb
Don't forget \uXXXX and \UXXXXXXXX in C99.
Jonathan Leffler
\NNN (where each N is a digit 0-7) represents a byte value in octal. \0, commonly referred to as the null terminator, is really the same as \000. I think 255 is \377.
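A quick way to check those values, as a sketch assuming 8-bit bytes (the function names are made up for the example):

```c
#include <assert.h>
#include <string.h>

/* Octal escape: up to three digits 0-7. The cast is needed because
   plain char may be signed, in which case '\377' would be -1. */
static int octal_377(void)
{
    return (unsigned char)'\377';
}

/* The same byte written as a hexadecimal escape. */
static int hex_ff(void)
{
    return (unsigned char)'\xFF';
}

/* \0 and \000 denote the same character. */
static int nul_forms_equal(void)
{
    return '\0' == '\000';
}

/* An embedded \0 ends the string as far as strlen is concerned;
   'c' is not an octal digit, so the escape stops after the 0. */
static size_t length_up_to_nul(void)
{
    return strlen("ab\0cd");
}
```

So \377 is indeed 255, the same byte as \xFF, and `length_up_to_nul()` counts only the two characters before the embedded null.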
abelenky
Aaron, that's right. There are other uses of \ too. I thought I would list the most important ones (by my own subjective measure).
Johannes Schaub - litb
+2  A: 

An interesting note on this subject is Ken Thompson's "Reflections on Trusting Trust" [PDF link].

dmckee
That is a very interesting paper which I'd not read in a while. I'd forgotten the section on boot-strapping a compiler to understand a new escape character sequence such as '\v' - so I didn't immediately see its relevance. I'm glad I checked it out.
Jonathan Leffler
Awesome paper, dmckee. Thanks for sharing.
mahesh