I have recently been going through some of our Windows Python 2.4 code and came across this:

self.logfile = open(self.logfile_name, "wua")

I know what w, u and a do on their own, but what happens when you combine them?

+3  A: 

The a is superfluous. wua is the same as wu, since w comes first and will thus truncate the file. If you reversed the order to auw, that would be the same as au. Demonstrated:

>>> f = open('test.txt', 'r')
>>> f.read()
'Initial contents\n'
>>> f.close()
>>> f = open('test.txt', 'wua')
>>> print >> f, 'writing'
>>> f.close()
>>> f = open('test.txt', 'r')
>>> f.read()
'writing\n'
>>> f.close()
>>> f = open('test.txt', 'auw')
>>> print >> f, 'appending'
>>> f.close()
>>> f = open('test.txt', 'r')
>>> f.read()
'writing\nappending\n'
>>> f.close()

(Reminder: both a and w open the file for writing, but the former appends while the latter truncates.)
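
For anyone who wants to rerun the truncate-versus-append comparison as a script, here is a minimal sketch. It sticks to plain w and a, since combined strings like wua are not accepted by every Python version. The helper name write_and_read_back and the temporary file are my own choices for illustration, not part of the original code.

```python
import os
import tempfile


def write_and_read_back(path, mode, text):
    """Write text using the given mode, then return the whole file."""
    f = open(path, mode)
    try:
        f.write(text)
    finally:
        f.close()
    f = open(path, "r")
    try:
        return f.read()
    finally:
        f.close()


# Hypothetical scratch file, created only for this demonstration.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    initial = write_and_read_back(path, "w", "Initial contents\n")
    after_w = write_and_read_back(path, "w", "writing\n")    # truncates first
    after_a = write_and_read_back(path, "a", "appending\n")  # keeps old data
finally:
    os.remove(path)

print(repr(after_w))
print(repr(after_a))
```

The second w call discards the initial contents, while the a call adds to whatever the file already holds.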

Stephan202
Yeah that's what I figured.
Harley
+2  A: 

I did not notice that you knew what the modifiers did. Combined they will do the following:

Combining A and W is superfluous, since both open the file for writing. With W, the existing file is overwritten. With A, all new text is appended after the existing content.

U means "open file XXX for input as a text file with universal newline interpretation".

  • W is for Write
  • A is for Append
  • U is for Universal newlines: when reading, every newline convention (\n, \r, \r\n) is translated to \n.

More here: http://codesnippets.joyent.com/posts/show/1969

alexn
+1  A: 

Under the hood, Python 2.4 passes the built-in open's arguments on to the operating system's fopen function. Python does mangle the mode string under certain conditions:

if (strcmp(mode, "U") == 0 || strcmp(mode, "rU") == 0)
    mode = "rb";

So if you pass an uppercase U or rU, it will open the file for binary reading. Judging by the GNU libc source and the MSDN page describing the Windows implementation of fopen, the 'u' option is ignored.
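
To make the exact-match behaviour of that check concrete, here is a rough Python model of the two C lines quoted above. It is only a sketch: mangle_mode is a name I made up, and the real interpreter source may do more with the mode string than this excerpt shows.

```python
def mangle_mode(mode):
    """Model of the quoted C check: rewrite only the exact strings "U" and "rU"."""
    # strcmp() compares whole strings, so plain equality is the equivalent test.
    if mode == "U" or mode == "rU":
        return "rb"
    # Every other mode string, including "wua", is passed through unchanged.
    return mode


print(mangle_mode("rU"))
print(mangle_mode("wua"))
```

Because the comparison is on the whole string, a lowercase u inside "wua" never triggers the rewrite.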

Having more than one mode designator ('r', 'w' or 'a') in the mode string has no effect: only the first one counts. This can be seen by looking at the GNU libc implementation of mode string parsing:

switch (*mode)
{
case 'r':
  omode = O_RDONLY;
  break;
case 'w':
  omode = O_WRONLY;
  oflags = O_CREAT|O_TRUNC;
  break;
case 'a':
  omode = O_WRONLY;
  oflags = O_CREAT|O_APPEND;
  break;
default:
  __set_errno (EINVAL);
  return NULL;
}

The first character of the mode string is checked against 'r', 'w' and 'a'; if it is not one of these characters, an error is raised.

Therefore when a file is opened as "wua" it will be opened for writing only, created if it doesn't exist and truncated. 'u' and 'a' will be ignored.
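
The same first-character logic can be sketched in Python to check the claim. This models only the switch statement quoted above, not the rest of the libc parser, and parse_first_mode_char is a hypothetical name of mine. The flag constants come from Python's os module.

```python
import errno
import os


def parse_first_mode_char(mode):
    """Return (omode, oflags) chosen from the first character of mode only."""
    first = mode[:1]
    if first == "r":
        return os.O_RDONLY, 0
    if first == "w":
        return os.O_WRONLY, os.O_CREAT | os.O_TRUNC
    if first == "a":
        return os.O_WRONLY, os.O_CREAT | os.O_APPEND
    # Mirrors the default branch of the switch: anything else is EINVAL.
    raise OSError(errno.EINVAL, os.strerror(errno.EINVAL))


# "wua" and "w" select the same flags; so do "auw" and "a".
print(parse_first_mode_char("wua") == parse_first_mode_char("w"))
print(parse_first_mode_char("auw") == parse_first_mode_char("a"))
```

A string such as "ua" raises the EINVAL error here, because its first character is not one of the three designators.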

brotchie