I was experimenting with '\' characters, using '\a\b\c...' just to enumerate for myself which characters Python interprets as control characters, and to what. Here's what I found:

\a - BELL
\b - BACKSPACE
\f - FORMFEED
\n - LINEFEED
\r - RETURN
\t - TAB
\v - VERTICAL TAB

Most of the other characters I tried, '\g', '\s', etc. just evaluate to the 2-character string of a backslash and the given character. I understand this is intentional, and makes sense to me.
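The observations above can be checked with a short sketch. The dictionary name `recognized` and the variable `unknown` are made up for this illustration, and the `eval` with warnings silenced is only there because newer Python versions warn about unrecognized escapes (and may reject them in the future):

```python
import warnings

# Each recognized escape is a single character; compare it with its code.
recognized = {
    '\a': 0x07,  # BELL
    '\b': 0x08,  # BACKSPACE
    '\f': 0x0C,  # FORMFEED
    '\n': 0x0A,  # LINEFEED
    '\r': 0x0D,  # RETURN
    '\t': 0x09,  # TAB
    '\v': 0x0B,  # VERTICAL TAB
}
for char, code in recognized.items():
    assert len(char) == 1 and ord(char) == code

# An unrecognized escape keeps its backslash. The literal is evaluated from
# a raw source string with warnings silenced, since newer Python versions
# warn about unrecognized escape sequences.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    unknown = eval(r"'\g'")

# Two characters: a backslash followed by 'g'.
assert unknown == '\\g' and len(unknown) == 2
```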

But '\x' is a problem. When my script reaches this source line:

val = "\x"

I get:

ValueError: invalid \x escape

What is so special about '\x'? Why is it treated differently from the other non-escaped characters?

+2  A: 

\x is missing the hex digits that must follow it: \xnn, for example \x1B

Joscha
+4  A: 

\xhh is a hex escape: it represents the character whose hexadecimal value is hh.

unutbu
+3  A: 

\x is used to define (one-byte) hexadecimal literals in strings, for example:

'\x61'

will evaluate to 'a', because 0x61 is the hexadecimal form of 97, which represents 'a' in ASCII.
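A quick way to confirm this in an interpreter; the variable names `s` and `esc` are just placeholders for this sketch:

```python
# '\x61' is one character whose code is 0x61 (97 in decimal), i.e. 'a'.
s = '\x61'
assert s == 'a'
assert len(s) == 1
assert ord(s) == 0x61 == 97

# The same notation reaches characters with no letter escape, such as ESC.
esc = '\x1B'
assert len(esc) == 1
assert ord(esc) == 27
```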

KillianDS
+1  A: 

You're not giving the full escape sequence:

\xhh... 

The hexadecimal value hh, where hh stands for a sequence of hexadecimal digits (‘0’–‘9’, and either ‘A’–‘F’ or ‘a’–‘f’). Like the same construct in ISO C, the escape sequence continues until the first nonhexadecimal digit is seen. However, using more than two hexadecimal digits produces undefined results. (The ‘\x’ escape sequence is not allowed in POSIX awk.)

From the gawk manual (it describes awk, not Python): http://www.gnu.org/manual/gawk/html_node/Escape-Sequences.html

shrodes
+10  A: 

There is a table listing all the escape codes and their meanings in the documentation.

Escape Sequence    Meaning                        Notes
\xhh               Character with hex value hh    (4,5)

Notes:

4. Unlike in Standard C, exactly two hex digits are required.
5. In a string literal, hexadecimal and octal escapes denote the byte with the given value; it is not necessary that the byte encodes a character in the source character set. In a Unicode literal, these escapes denote a Unicode character with the given value.
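Both notes can be tried out with a small sketch. It assumes Python 3, where a `bytes` literal plays the role of the byte string in note 5 and a plain `str` literal is Unicode. The helper `compiles` is a hypothetical name introduced here; it compiles source text so that the bad literal does not stop the script itself. Python 3 reports the problem as a SyntaxError, while the Python 2 in the question raised ValueError, so both are caught:

```python
def compiles(source):
    """Return True if `source` compiles as Python code (illustrative helper)."""
    try:
        compile(source, "<demo>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False

# Note 4: exactly two hex digits are required after \x.
bare_ok = compiles(r'val = "\x"')       # no digits: rejected
one_digit_ok = compiles(r'val = "\x4"')  # one digit: rejected
two_digit_ok = compiles(r'val = "\x41"')  # two digits: accepted

# Only the first two digits belong to the escape; the rest is ordinary text.
three_digits = '\x414'  # '\x41' is 'A', followed by a literal '4'

# Note 5: in a bytes literal the escape is a raw byte value; in a str
# literal it is the Unicode character with that code point.
raw_byte = b'\xff'
unicode_char = '\xff'

assert not bare_ok and not one_digit_ok and two_digit_ok
assert three_digits == 'A4'
assert raw_byte == bytes([255])
assert unicode_char == '\u00ff'
```

Note that the str version takes two bytes once encoded as UTF-8, which is why the two literals are not interchangeable.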

Mark Byers
Ooof! Of course, thanks!
Paul McGuire