I'm writing a compiler in C and need to get the ASCII value of a character defined in a source code file. For normal letters this is simple, but is there any way to convert the two-character string "\n" (a backslash followed by n) to the ASCII number for '\n' in C? It needs to work for all characters.

Cheers

A: 

You will need to write your own parser/converter. The list of escape sequences can be found online in many places. Parsing C style syntax is extremely difficult, so you may also wish to check out existing free implementations such as Clang.

Tronic
Boost.Spirit Qi or Lex might also be a good option for parsing a complex language.
Tronic
+1  A: 

I'm writing a compiler in C

It's probably not a good idea to do it all in raw C. It's far better to use something like Bison to handle the initial parsing.

That said, the best way of handling backslash escapes (\n, \t and so on) is just to have a lookup table of what each escape turns into.
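A minimal sketch of such a lookup table, assuming the language being compiled uses the same single-letter escapes as C. The function name simple_escape is hypothetical, and numeric escapes such as \0 or \x41 are not covered here:

```c
#include <limits.h>

/* Hypothetical helper: maps the character that follows a backslash to
   the character it denotes, or returns -1 if it is not a simple escape.
   The escape set is an assumption (C's own single-letter escapes). */
static int simple_escape(int c)
{
    /* C99 designated initializers; every entry not listed is 0. */
    static const char table[UCHAR_MAX + 1] = {
        ['n'] = '\n', ['t'] = '\t', ['r'] = '\r', ['a'] = '\a',
        ['b'] = '\b', ['f'] = '\f', ['v'] = '\v',
        ['\\'] = '\\', ['\''] = '\'', ['"'] = '"', ['?'] = '?'
    };

    if (c < 0 || c > UCHAR_MAX || table[c] == 0)
        return -1;
    return table[c];
}
```

Calling simple_escape('n') yields '\n', while an unknown letter such as 'q' yields -1, which the caller can report as an error or pass through unchanged.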

Anon.
+3  A: 

If the string is one character long, you can just index it:

char *s = "\n";
int ascii = s[0];

However, if you are on a system where the character set used is not ASCII, the above will not give you an ASCII value. If you need to make sure your code runs on such rare machines, you can build yourself an ASCII table and use that.
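One way to build such a table is sketched below: list the printable ASCII characters in code order and return the position of the match. The name to_ascii is hypothetical, and the sketch only covers 7-bit ASCII plus the common control characters:

```c
#include <string.h>

/* Hypothetical helper: returns the ASCII code of c regardless of the
   host character set, or -1 if c has no entry in the table. */
static int to_ascii(char c)
{
    /* Printable ASCII in code order, starting at 32 (space). */
    static const char printable[] =
        " !\"#$%&'()*+,-./0123456789:;<=>?"
        "@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_"
        "`abcdefghijklmnopqrstuvwxyz{|}~";
    const char *p;

    /* Control characters are handled first; '\0' in particular must not
       reach strchr, which would match the string terminator. */
    switch (c) {
    case '\0': return 0;
    case '\a': return 7;
    case '\b': return 8;
    case '\t': return 9;
    case '\n': return 10;
    case '\v': return 11;
    case '\f': return 12;
    case '\r': return 13;
    default:   break;
    }

    p = strchr(printable, c);
    return p ? 32 + (int)(p - printable) : -1;
}
```

On an ASCII machine this returns the same number as the plain cast, so it only matters when the compiler itself runs somewhere else.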

If, on the other hand, you have two characters (a backslash followed by a letter), i.e.,

char *s = "\\n";

then you can do something like this:

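/* unescape is an illustrative name: returns the character denoted by s,
   which holds either one character or a backslash plus one letter. */
int unescape(const char *s)
{
    char c = s[0];
    if (c == '\\') {
        c = s[1]; /* assume s is long enough */
        switch (c) {
            case 'n': return '\n';
            case 't': return '\t';
            /* ... */
            default: return c; /* unknown escape: keep the character */
        }
    }
    return c; /* not an escape: the character stands for itself */
}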

The above assumes that your current compiler knows what '\n' means. If it doesn't, then you can still do it. For finding out how to do so, and a fascinating story, see Reflections on Trusting Trust by Ken Thompson.
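Since the question asks for something that works on all characters, numeric escapes need handling too. The following is a sketch only, assuming C-style octal (\101) and hexadecimal (\x41) escapes; decode_char and its len parameter are hypothetical names, and only a few single-letter escapes are listed:

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: decodes one character or escape sequence at s and
   stores the number of source characters consumed in *len. Returns the
   decoded value, or -1 for an empty string or a malformed escape. */
static int decode_char(const char *s, size_t *len)
{
    size_t i;
    int value = 0;

    if (s[0] == '\0') { *len = 0; return -1; }
    if (s[0] != '\\') { *len = 1; return (unsigned char)s[0]; }
    if (s[1] == '\0') { *len = 1; return -1; }   /* lone backslash */

    if (s[1] >= '0' && s[1] <= '7') {
        /* Octal escape: one to three octal digits. */
        for (i = 1; i <= 3 && s[i] >= '0' && s[i] <= '7'; i++)
            value = value * 8 + (s[i] - '0');
        *len = i;
        return value;
    }

    if (s[1] == 'x') {
        /* Hex escape: at least one hex digit, value limited to one byte. */
        for (i = 2; isxdigit((unsigned char)s[i]); i++) {
            int ch = (unsigned char)s[i];
            int digit = isdigit(ch) ? ch - '0' : tolower(ch) - 'a' + 10;
            value = value * 16 + digit;
            if (value > UCHAR_MAX) { *len = i + 1; return -1; }
        }
        *len = i;
        return i == 2 ? -1 : value;
    }

    *len = 2;
    switch (s[1]) {
    case 'n': return '\n';
    case 't': return '\t';
    case 'r': return '\r';
    default:  return (unsigned char)s[1];   /* e.g. \\ or \" */
    }
}
```

Returning the consumed length lets the lexer advance past the escape without parsing it a second time.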

Alok
I gather he actually has string "\\n" that he wants to convert...
Tronic
Yeah, the question wasn't clear, but I have updated my answer to cover that case too. Thanks!
Alok
A: 

You will need to implement this yourself. The reason is that what you are doing is determined by the String literal syntax of the language that you are compiling! (The fact that your compiler is implemented in C is immaterial.)

There are conventional escape sequences for String literals that span multiple languages; e.g. \n typically denotes the ASCII NewLine character. However, that doesn't mean that these conventions are appropriate for the language you are trying to compile.

Stephen C