PEP 263 defines how to declare the encoding of a Python source file.

Normally, the first two lines of a Python file should be:

#!/usr/bin/python
# -*- coding: <encoding name> -*-

But I have seen a lot of files starting with:

#!/usr/bin/python
# -*- encoding: <encoding name> -*-

That is, encoding instead of coding.

So what is the correct way of declaring the file encoding?

Is encoding permitted because the regex used is lazy? Or is it just another form of declaring the file encoding?

I'm asking this question because the PEP does not talk about encoding, it just talks about coding.

A: 

I suspect it is similar to Ruby - either method is okay.

This is largely because different text editors use different methods (i.e., these two) of marking the encoding.

With Ruby, the declaration is recognized as long as the first line (or the second, if there is a shebang line) contains a string that matches:

coding: encoding-name

Any whitespace and other fluff on those lines is ignored. (The separator can often be = instead of :, too.)

Matthew Schinckel
A: 

If I'm not mistaken, the original proposal for source file encodings was to use a regular expression for the first couple of lines, which would allow both.

I think the regex was something along the lines of coding: followed by something.

I found this: http://www.python.org/dev/peps/pep-0263/, which is the original proposal, but I can't seem to find the final spec stating exactly what they did.

I've certainly used encoding: to great effect, so obviously that works.

Try changing to something completely different, like duhcoding: ... to see if that works just as well.
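One way to run that experiment without creating files is sketched below. It assumes Python 3, where the standard-library function tokenize.detect_encoding applies the same declaration rules to the first two lines of a source file. The helper name detect and the cp1252 sample encoding are just illustrative choices, not anything from the PEP.

```python
import io
import tokenize


def detect(source: bytes) -> str:
    """Return the encoding that tokenize detects for the given source bytes."""
    # detect_encoding wants a readline callable that yields bytes.
    encoding, _lines = tokenize.detect_encoding(io.BytesIO(source).readline)
    return encoding


# Hypothetical two-line file headers; only the word before the colon differs.
plain = detect(b"#!/usr/bin/python\n# -*- coding: cp1252 -*-\n")
prefixed = detect(b"#!/usr/bin/python\n# -*- encoding: cp1252 -*-\n")
odd = detect(b"#!/usr/bin/python\n# -*- duhcoding: cp1252 -*-\n")
# No declaration at all: the default encoding is reported instead.
missing = detect(b"#!/usr/bin/python\nprint('hi')\n")

for name, value in [("coding", plain), ("encoding", prefixed),
                    ("duhcoding", odd), ("(none)", missing)]:
    print(f"{name:>10}: {value}")
```

If all three declarations report cp1252 while the last case falls back to the default, the prefix in front of coding really is irrelevant.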

Lasse V. Karlsen
+6  A: 

Check the docs here:

"If a comment in the first or second line of the Python script matches the regular expression coding[=:]\s*([-\w.]+), this comment is processed as an encoding declaration"

"The recommended forms of this expression are

# -*- coding: <encoding-name> -*-

which is recognized also by GNU Emacs, and

# vim:fileencoding=<encoding-name>

which is recognized by Bram Moolenaar’s VIM."

So, you can put pretty much anything before the "coding" part, but stick to "coding" (with no prefix) if you want to be 100% python-docs-recommendation-compatible.
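To see why anything may come before "coding", here is a small sketch that applies the regular expression quoted above to a few comment lines using Python's re module. The sample lines and the helper name declared_encoding are made up for illustration, with utf-8 and latin-1 standing in for <encoding-name>.

```python
import re

# Regular expression as quoted from the docs.
CODING_RE = re.compile(r"coding[=:]\s*([-\w.]+)")

# Hypothetical sample comment lines.
samples = [
    "# -*- coding: utf-8 -*-",
    "# -*- encoding: utf-8 -*-",
    "# vim:fileencoding=latin-1",
    "# This Python file uses the following encoding: utf-8",
    "# no declaration here",
]


def declared_encoding(line):
    """Return the encoding name captured from the line, or None."""
    # search() looks for the pattern anywhere in the line, so any
    # prefix in front of "coding" is simply skipped over.
    found = CODING_RE.search(line)
    return found.group(1) if found else None


results = [declared_encoding(line) for line in samples]
for line, result in zip(samples, results):
    print(f"{line!r} -> {result!r}")
```

The first four lines all yield an encoding name, because each contains the substring "coding" followed by : or =. Only the last line yields None.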

Rafał Dowgird
A: 

PEP 263:

the first or second line must match the regular expression "coding[:=]\s*([-\w.]+)"

Obviously "encoding: UTF-8" matches.

The PEP provides some examples:

      #!/usr/bin/python
      # vim: set fileencoding=<encoding name> :

 

      # This Python file uses the following encoding: utf-8
      import os, sys
vartec
A: 

The PEP263 that you reference gives the answer in the 3rd paragraph:

More precisely, the first or second line must match the regular
expression "coding[:=]\s*([-\w.]+)"

Their wording is imprecise, though: the first or second line does not have to match the regexp as a whole; it only has to contain a match for it.
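The distinction is easy to demonstrate with Python's own re module, where match() only succeeds at the start of the string and search() succeeds anywhere in it. This is only an illustration of the wording issue, not a claim about how the interpreter itself is implemented; the sample line is hypothetical.

```python
import re

# Regular expression as quoted from PEP 263.
pattern = re.compile(r"coding[:=]\s*([-\w.]+)")

line = "# -*- encoding: utf-8 -*-"

# match() requires the pattern to start at the first character of the line.
anchored = pattern.match(line)

# search() accepts the pattern anywhere inside the line.
contained = pattern.search(line)

print("match :", anchored)
print("search:", contained.group(1) if contained else None)
```

Here match() finds nothing, because the line starts with "# -*- en", while search() finds the declaration. That is the "contains" behaviour described above.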

Bluebird75