I have a 2-D array

 xx = [['a', 1], ['b', 2], ['c', 3]]

Now I'm trying to remove duplicate entries from it. For a simple 1-D array, code like

xx=list(set(xx))

would work. But calling set on a list whose elements are themselves lists gives an error:

temp = set(xx)
TypeError: unhashable type: 'list'

One workaround would be to serialize the elements of xx, call list(set()) on the resulting array, and then deserialize all the elements back again.

Is there a better solution in Python?

+1  A: 

Convert the elements to tuples and then use set.

>>> xx=[['a',1],['b',2],['c',3],['c',3]]
>>> set(tuple(element) for element in xx)
set([('a', 1), ('b', 2), ('c', 3)])
>>> 

Tuples, unlike lists, can be hashed, which is why they can be stored in a set. Once you are done, convert the elements back to lists. Putting everything together:

>>> [list(t) for t in set(tuple(element) for element in xx)]
[['a', 1], ['b', 2], ['c', 3]]
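
One caveat with the set-based approach is that a set does not keep insertion order, so the rows may come back in a different order than they went in. If order matters, a minimal sketch along these lines keeps the first occurrence of each row (the helper name `dedupe_rows` is made up for illustration, not part of any library):

```python
def dedupe_rows(rows):
    """Return rows with duplicates removed, keeping first-seen order."""
    seen = set()      # tuples of rows already encountered
    unique = []       # result, in original order
    for row in rows:
        key = tuple(row)  # tuples are hashable, lists are not
        if key not in seen:
            seen.add(key)
            unique.append(list(row))
    return unique


xx = [['a', 1], ['b', 2], ['c', 3], ['c', 3]]
result = dedupe_rows(xx)
print(result)  # [['a', 1], ['b', 2], ['c', 3]]
```

Like the one-liner above, this assumes every inner element is itself hashable (strings, numbers, tuples); nested lists inside a row would still raise a TypeError.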
Manoj Govindan
Somehow the code failed to remove the duplicate entries. Is set() not able to detect duplicate tuples?
Neo
@Neo: This gets interesting. Can you post some sample values?
Manoj Govindan
Maybe a string and a number with the same value, or nearly equal floating-point numbers?
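
To illustrate the kind of near-duplicates meant here, a small sketch with made-up sample rows (not the asker's data): rows that look alike when printed but do not compare equal are kept as distinct entries by set().

```python
# Hypothetical sample rows: each pair looks alike but is not equal.
rows = [
    ['a', 1], ['a', '1'],          # int 1 vs. string '1'
    ['b', 0.1 + 0.2], ['b', 0.3],  # floats differing by rounding error
]

unique = set(tuple(row) for row in rows)
count = len(unique)
print(count)  # 4: nothing was removed, since no two rows are equal
```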
Tony Veijalainen
Breaking your code into a "for" loop did the trick: `for i in range(len(celeInfo)): celeInfo[i] = tuple(celeInfo[i])` followed by `celeInfo = list(set(celeInfo))`. Pardon me, I'm new to scripting; is something missing from your code?
Neo
Btw, how does one format code in comments? My previous comment looks ugly :P
Neo
@Neo: (1) If a loop worked, then a list comprehension should also work. I'd like to see `celeInfo` first before guessing what is going wrong. (2) You can use the built-in function `enumerate` to go through a list while having access to its index. For instance: `for index, value in enumerate(celeInfo): celeInfo[index] = tuple(value)`.
Manoj Govindan
@Neo: Put the backquote character "`" around your code (ASCII code 96).
Manoj Govindan
Each element of celeInfo is a URL along with a tag. Sample: [['/abi.htm', 'Abi (8)'], ['/ada.htm', 'Ada (12)']]. Could the problem be caused by the leading '/'?
Neo
@Neo: Tested it with the info you gave me and it worked just fine. `xx = [['/abi.htm', 'Abi (8)'], ['/ada.htm', 'Ada (12)'], ['/ada.htm', 'Ada (12)'], ['/ada.htm', 'Ada (12)']]` and then `[list(t) for t in set(tuple(element) for element in xx)]`.
Manoj Govindan
@Neo: This wouldn't work if `Ada (12)` was the _string representation_ of an object rather than a _string instance_.
Manoj Govindan
Nope, it is a string instance. 'Ada (12)' is a direct input to the list. So it must be something stupid that I'm doing; I'll figure that out. Thanks a ton, Manoj.
Neo