I have two string.printable mysteries in one question.

First, in Python 2.6:

>>> import string
>>> string.printable
'0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'

Look at the end of the string, and you'll find '\x0b\x0c' sticking out like a sore thumb. Why are they there? I am using a machine set to Australian settings, so there shouldn't be any accented characters or the like.

Next, try running this code:

for x in string.printable: print x,
print
for x in string.printable: print x

The first loop successfully prints all the characters separated by a space. The two odd characters turn out as the Male and Female symbols.

The second loop successfully prints all the characters EXCEPT THE LAST, each followed by a line feed. The Male symbol prints; the Female symbol is replaced with a missing character (a box).

I'm sure Python wasn't intended to be gender-biased, so what gives with the difference?

+18  A: 

There is a difference between "printable" and "can be displayed on your screen". Your terminal displays the low ASCII printer control codes 0x0B and 0x0C as the male and female symbols because that is what your font contains at those indices. Those characters are more accurately described as the Vertical Tabulator and Form Feed characters. These two characters, along with \t, \r and \n, are all printable, and do well-defined things on a printer.
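
As a rough illustration of the point above, the sketch below checks where the two characters come from. It assumes Python 3, where string.printable is assembled from the four constants shown; in Python 2.6 the whitespace constant is ordered differently, so the equality check may not hold there. The names tail and parts are just illustrative.

```python
import string

# The last six characters of string.printable are the whitespace set.
tail = string.printable[-6:]
print(repr(tail))

# Assumption: Python 3, where printable is built from these constants in this order.
parts = string.digits + string.ascii_letters + string.punctuation + string.whitespace
print(parts == string.printable)

# '\x0b' and '\x0c' are simply other spellings of '\v' (vertical tab) and '\f' (form feed).
same_escapes = ('\x0b' == '\v') and ('\x0c' == '\f')
print(same_escapes)

# Both count as whitespace, which is why they end up in string.printable.
both_whitespace = all(ch in string.whitespace for ch in '\x0b\x0c')
print(both_whitespace)
```

So the two "odd" characters are there because the whitespace set is part of string.printable, not because of any locale setting.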

Sparr
+1: Not accented characters. We used to call them "carriage-control characters" back when "printers" had "carriages".
S.Lott
Excellent. That explains why they appear in string.printable. Now, why would the terminal display different codes depending on the spacing? I wonder how the Form Feed character interacts with a preceding newline.
Oddthinking
@Oddthinking: Your terminal driver does things to map the characters sent to meaningful behaviors on your terminal. Terminal hardware sometimes requires padding characters and other malarkey to work correctly.
S.Lott
+2  A: 

From within cmd.exe:

>>> print '\x0b'
♂
>>> print '\x0c'
♀
>>> print '\f' # form feed
♀
>>> print '\v' # vertical tab
♂
>>>

Inside Emacs:

>>> print '\f\v'
^L^K
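
The ^L^K shown by Emacs is caret notation: a control character is displayed as ^ followed by the character whose code is the control code with bit 0x40 flipped. A minimal sketch of that convention, assuming Python 3 (caret_notation is a hypothetical helper, not part of any library):

```python
def caret_notation(ch):
    """Return caret notation for an ASCII control character, else ch unchanged."""
    code = ord(ch)
    # Control characters are 0x00-0x1F plus DEL (0x7F).
    if code < 0x20 or code == 0x7F:
        # Flipping bit 0x40 maps 0x0C -> 'L', 0x0B -> 'K', 0x7F -> '?'.
        return '^' + chr(code ^ 0x40)
    return ch

shown = ''.join(caret_notation(c) for c in '\f\v')
print(shown)  # ^L^K
```

The same bytes are being sent in both cases; cmd.exe and Emacs just choose different ways of drawing them.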

Here's an excerpt from the formats(5) man page:

| Sequence | Character    | Terminal Action                             |
|----------+--------------+---------------------------------------------|
| \f       | form-feed    | Moves the printing position to the initial  |
|          |              | printing position of the next logical page. |
| \v       | vertical-tab | Moves the printing position to the start of |
|          |              | the next vertical tab position. If there    |
|          |              | are no more vertical tab positions left on  |
|          |              | the page, the behavior is undefined.        |
J.F. Sebastian