views:

260

answers:

5

I tried to implement XOR swap in python.

x,y= 10,20

x,y,x = x^y,x^y,x^y

print('%s , %s'%(x,y))

OUTPUT:

30 , 30

I am not new to python but I am unable to explain this output. It should have been 20,10.

What is going on under the hood?

+10  A: 

First, a tuple is created consisting of x^y, x^y and x^y. Then the tuple is unpacked into x, y, and x, causing both to be bound to the result of x^y.

Save yourself the headache and do it the Pythonic way:

x, y = y, x
Ignacio Vazquez-Abrams
+1  A: 

You can do this swap easily using the following:

x, y = 10, 20
x, y = y, x
print((x,y))

As for the behavior you're seeing, I'm pretty sure that's because the entire RHS is evaluated and then assigned to the LHS at the same time, and x^y is always 30 in that case.

durin42
+2  A: 

The XOR swap algorithm only makes sense when you have two pointers to mutable objects. a and b are two references to immutable ints.

EDIT (moving from comments as requested, and expanding):

Python integers are immutable. So every time you "modify" one using XOR, new storage will be allocated (or reused, e.g. for interning). This is fundamentally different from (e.g. C), where swap changes the value without allocating new memory. Put another way, XOR swap does not create new objects in either the C99 sense ("region of data storage in the execution environment, the contents of which can represent values") or the Python sense. As noted here, a true XOR swap can "exchange the values of the variables a and b without using extra space for a temporary variable."

Or empirically:

>>> x = 3
>>> y = 5
>>> print "x: ", x, ", id(x): ", id(x), "y: ", y, ", id(y): ", id(y)
x:  3 , id(x):  137452872 y:  5 , id(y):  137452848
>>> x ^= y
>>> print "x: ", x, ", id(x): ", id(x), "y: ", y, ", id(y): ", id(y)
x:  6 , id(x):  137452836 y:  5 , id(y):  137452848
>>> y ^= x
>>> print "x: ", x, ", id(x): ", id(x), "y: ", y, ", id(y): ", id(y)
x:  6 , id(x):  137452836 y:  3 , id(y):  137452872
>>> x ^= y
>>> print "x: ", x, ", id(x): ", id(x), "y: ", y, ", id(y): ", id(y)
x:  5 , id(x):  137452848 y:  3 , id(y):  137452872

In this case, we see that the interpreter (2.6.4) seems to be interning the integers, so x ends up with the memory address y originally had. But the main point is that the swap requires at least one allocation (137452836), and x and y don't retain the same memory address throughout.

In C:

int x = 3;
int y = 5;
printf("x: %d, &x: %p, y: %d, &y: %p\n", x, &x, y, &y);
x ^= y;
printf("x: %d, &x: %p, y: %d, &y: %p\n", x, &x, y, &y);
y ^= x;
printf("x: %d, &x: %p, y: %d, &y: %p\n", x, &x, y, &y);
x ^= y;
printf("x: %d, &x: %p, y: %d, &y: %p\n", x, &x, y, &y);

gives:

x: 3, &x: 0xbfd433ec, y: 5, &y: 0xbfd433e8
x: 6, &x: 0xbfd433ec, y: 5, &y: 0xbfd433e8
x: 6, &x: 0xbfd433ec, y: 3, &y: 0xbfd433e8
x: 5, &x: 0xbfd433ec, y: 3, &y: 0xbfd433e8

This is a real XOR swap, so x and y always maintain the same memory locations and there is no temporary.

Matthew Flaschen
@Matthew, as my answer shows, it makes just as much sense (i.e. not much) with references to immutable ints as it would with pointers to mutable objects. It's temporal succession vs simultaneous evaluation that's really the crux of the matter!
Alex Martelli
I think we're confusing the purpose of this implementation of the swap procedure with the XOR algorithm itself. You might choose the XOR implementation when you don't want to use extra storage, but doing it for that reason that absolutely requires mutable data. So if using XOR creates new values, even if it can be shown to work correctly - as in your answer, Alex - there's little point because you could just use the new value for a temporary instead.
Kylotan
The math works, but the underlying goal of a space-efficient swap is not satisfied.
Mike Graham
+4  A: 

While it's definitely best, as other answers say, to just x, y = y, x, if you're allergic to creating and unpacking tuples you can do it with successive x-oring... it just has to be successive, not simultaneous as you're doing!

>>> x = 1234
>>> y = 3421
>>> x ^= y
>>> y ^= x
>>> x ^= y
>>> print x
3421
>>> print y
1234

The key to the xor-swap trick is three successive xors, i.e. one after the other -- the three separate ^= statements, in this snippet. Of course, it makes no practical sense, but, if you're really keen on it, it does work, in Python like anywhere else;-).

Alex Martelli
This is ***not*** a XOR swap. XOR swap does not create new objects. You are creating three new objects.
Matthew Flaschen
@Matthew: XOR swap doesn't know what an object is. Alex's code simulates what happens in registers. Use `reg = [10, 20]` and operate on `reg[0]` and `reg[1]` if that makes you feel better :-)
John Machin
John, I am using object in the C99 sense ("object - region of data storage in the execution environment, the contents of which can represent values"). Alex is creating three Python objects, *and* three objects under that definition. This does demonstrate how XOR works, but XOR swap goes beyond that.
Matthew Flaschen
@Matthew: The point is that it works with only two names/handles/boxes/registers/whatever unlike the usual 3-person-MHTP algorithm :-)
John Machin
I disagree. The point is *"to exchange the values of the variables a and b without using extra space for a temporary variable."* (http://graphics.stanford.edu/~seander/bithacks.html#SwappingValuesXOR). These algorithms create not one, but __three__ extra objects.
Matthew Flaschen
+1 for `trick is three successive xors`.
TheMachineCharmer
@Matthew Flaschen ... *to exchange the values of the variables a and b without using extra space for a temporary variable.* using a temporary variable means we'd be using more than 2 variables at any step. If you look closely we're not using more than 2 variables at any point of time. The spirit of temporary variable is, you keep something aside for later use. True we're *creating* new objects, but the others are gone for ever. Its as good as not present there. If it makes any help think all the non-referenced int are garbage collected as soon as you remove the reference.
jeffjose
It's not a question of variables, it's a question of objects (in either C99 or Python sense). "True we're creating new objects, but the others are gone for ever." XOR swap requires modifying the original memory. Even if we didn't care about this (we do), it is not guaranteed that "all the non-referenced int are garbage collected". What the docs really say is "An implementation is allowed to postpone garbage collection or omit it altogether" (http://docs.python.org/reference/datamodel.html). So it's certainly not guaranteed that at most two objects can be in play.
Matthew Flaschen
@Matthew, in Python **every** integer operation **may** "create a new object" -- or may not do so: it's entirely up to the implementation because it's strictly a performance issue (no effect whatsoever on semantics!), the application code cannot control it. For example, small ints **may** be cached (so no new object is actually created for ints < some threshold, entirely up to the Python implementation), memory of an existing object may be reused if the implementation can prove the last reference to it is being dropped, etc. Your assurance of creation is misplaced (more...)
Alex Martelli
@Matthew, to respond to your sea-lawyering in detail: "without using extra space for a temporary variable" -- there's no extra VARIABLE used here (there may be extra OBJECTS -- which are not variable but immutable -- but definitely no variables, so we're OK!-). And: if an object is a "region of data storage in the execution environment", then a Python object isn't an object any more as soon as the last ref to it drops, because this means it isn't *in the execution environment* -- it's dead bits available to the underlying implementation **below** the environment to use (or not) as it pleases.
Alex Martelli
The old reference doesn't go away until *after* the new integer object is created on the stack, so there actually is a period in which three integers exist simultaneously. http://pastebin.com/neU3SMFq Could a Python implementation optimize the specific case? Only in theory; it'd be a uselessly narrow net-loss optimization. In real life, using *actual* Python implementations, this does use storage for three objects (five, actually).
Glenn Maynard
And just in case any beginners are reading this and drawing the wrong conclusions: *don't do this*, in any language. It's not faster (even in C); it'll result in nothing but ugly, hard-to-read code.
Glenn Maynard
Alex, since you're getting hung up on the "temporary variable" part of that quote, see this def (http://www.cs.cmu.edu/~mjs/123/lectures/Mar-03/index.html), which says, "Using the XOR swap algorithm, however, neither temporary storage nor overflow detection are needed." XOR swap *never* requires temporary variables *or* storage, and it places the new values in the original memory. Saying that Python *may* not create a new object is missing the point (I mention possible reuse in my answer). XOR swap is guaranteed not to. And Glenn, no one is saying XOR swap is a worthwhile optimization.
Matthew Flaschen
@Matthew, can you please point me to the chapter and verse im the ISO standard that says a C compiler is forbidden to allocate an object as a side effect of doing `x^=y` (e.g. for debugging purposes, or whatever else the compiler implementor wants)? That's what you're stating when you're saying "is **guaranteed** not to", and I believe you're utterly mistaken -- C leaves a lot of leeway to its implementations, roughly like Python.
Alex Martelli
First, you're claiming C doesn't support XOR swap, not that Python does. So use assembly. ;) But I think you're missing the part about "in the execution environment". Sure, the C compiler can do whatever it wants in the background, but the environment accessible to C code will only contain the two objects. In Python, it is clear it can contain more.
Matthew Flaschen
@Matthew, there's no prohibition (unless you can show me chapter and verse in the ISO standard) against the compiler making extra allocated objects accessible to C code (e.g. through identifiers reserved for the implementation), nor similar ukases against any implementation of a CPU's instruction set. So if you think you've demonstrated Python "can't do xor-swap", you must believe you've demonstrated that NO language (including machine language, which could be running on different CPU implementations!) possibly can: i.e., for all your sea-lawyering, you've demonstrated nothing at all.
Alex Martelli
It's amusing for you to complain that I'm "lawyering", then turn around and insist on pin-point cites. The "verse" you demand is "For each different entity that an identifier designates, the identifier is visible (i.e., can be used) only within a region of program text called its scope. [...] Every other identifier [besides labels] has scope determined by the placement of its declaration". No declaration, no scope, no visibility. I have no idea what point you're trying to make about XOR swap in machine language.
Matthew Flaschen
@Matthew, usability of identifiers has nothing to do with the compiler's ability to make objects accessible: e.g. an implementation can decide that `__x` is visible in any scope and returns a pointer allowing access to said objects ("reserved to the implementation for any use" per 17.4.3.2.1 in C++ std (2003), 7.1.3 per C std (1999), so that use is fine) so those objects are perfectly allowed to "in the execution environment", contra your assertions. Similarly, a CPU (seen as an implementation of its machine language) can allocate xtra objects "in the execution environment", quite lawfully.
Alex Martelli
And the point of all this is that your "proof" that Python can't do XOR swaps (because a given implementation might allocate extra objects if the values are 1234 and 3421 -- although it wouldn't if the values are 10 and 20, as per the OP's question, e.g. in CPython since many releases) is totally bogus: the "might allocate extra objects" (in execution environment) depends on the implementation (of Python, of C, of any machine language) and thus obviously cannot be used to "deduce" anything at all, contra your silly claims.
Alex Martelli
It doesn't matter what it *can* do, it matters what it *does* do. Regardless, it's a meaningless point, since XOR swaps are almost always slower than normal swaps. Anyhow, an amusing note: in GCC, variable and XOR swaps generate the *exact same code* under optimization (usually an XCHG opcode).
Glenn Maynard
+3  A: 

You need to transcribe the "algorithm" line for line.

>>> x, y = 10, 20
>>> x = x ^ y; print x, y
30 20
>>> y = x ^ y; print x, y
30 10
>>> x = x ^ y; print x, y
20 10
>>>

You need to read the remainder of the Wikipedia article, which explains that correct implementation blocks parallel operations, and that the whole idea is pretty much useless (on modern computer architectures, at least).

John Machin
+1 for the `whole idea is pretty much useless`. unless for learning purpose, this kind of trick gives a headache to anyone reading the code.
Adrien Plisson