In Python 2.x, I could pass a custom comparison function to sorted() and .sort():

>>> x=['kar','htar','har','ar']
>>>
>>> sorted(x)
['ar', 'har', 'htar', 'kar']
>>> 
>>> sorted(x,cmp=customsort)
['kar', 'htar', 'har', 'ar']

This is because, in my language, consonants come in this order:

"k","kh",....,"ht",..."h",...,"a"

But in Python 3.x, it looks like I cannot pass the cmp keyword:

>>> sorted(x,cmp=customsort)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'cmp' is an invalid keyword argument for this function

Are there any alternatives, or should I write my own sorted function too?

Note: I simplified by using "k", "kh", etc. The actual characters are Unicode and even more complicated; sometimes vowels come before and after consonants. I have already written the custom comparison function, so that part is fine. The only problem is that I cannot pass my custom comparison function to sorted or .sort.

+1  A: 

Use the key argument instead. It takes a function that receives each value being sorted and returns a single value to use as its sort key.

sorted(x, key=somekeyfunc)
Ignacio Vazquez-Abrams
key only accepts a one-parameter function, while cmp takes two parameters, so they behave differently. I just tested it and got an error, because key passes only one argument: `TypeError: customsort() takes exactly 2 positional arguments (1 given)`
S.Mark
That is correct.
Ignacio Vazquez-Abrams
Thanks for taking time to answer btw.
S.Mark
+2  A: 

Use the key argument (and follow the recipe on how to convert your old cmp function to a key function).
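
As a rough sketch of that conversion: Python 3.2 and later ship functools.cmp_to_key, which wraps an old two-argument comparison function so it can be passed as key. The customsort below is only a placeholder (the asker's real function is not shown); it simply reverses the default string order, which happens to reproduce the output in the question.

```python
from functools import cmp_to_key

def customsort(a, b):
    """Placeholder old-style comparison (NOT the asker's real function).

    Returns a negative number if a sorts first, zero if equal,
    positive if b sorts first. Here: plain reverse string order.
    """
    return (b > a) - (b < a)

x = ["kar", "htar", "har", "ar"]

# cmp_to_key turns the two-argument function into a key callable.
result = sorted(x, key=cmp_to_key(customsort))
print(result)

# The same wrapper works for in-place sorting.
y = list(x)
y.sort(key=cmp_to_key(customsort))
```

The wrapper calls the comparison function once per comparison from Python, so some overhead compared with a native cmp argument is to be expected.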

Tim Pietzcker
+1, looks like the recipe gives me a workaround, but I think I am going to lose some performance by routing all the comparison operators (`<`, `>`, `==`) through a middleman. My original custom sort is written in C, and it ran at around half the speed of the default sort.
S.Mark
(Just looked at your profile) Your company is blocking access to Google and StackOverflow? How stupid can they get? But about your response: I'd be interested in the actual performance decrease. Can you `timeit` it?
Tim Pietzcker
Yeah, they block Google because they want us to use goo.ne.jp (their affiliate, I think), but I customized an NTLM proxy, built a server-side script on my hosting, and tunnel through that. I don't know why they block Stack Overflow. And sure, I will do some benchmarks.
S.Mark
I've done some benchmarks; it looks like it is around 4x slower than passing the custom C compare function directly.
S.Mark
+2  A: 

Instead of a customsort(), you need a function that translates each word into something that Python already knows how to sort. For example, you could translate each word into a list of numbers where each number represents where each letter occurs in your alphabet. Something like this:

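# Letters listed in the order they should sort
my_alphabet = ['a', 'b', 'c']

def custom_key(word):
    # Translate each letter into its position in my_alphabet;
    # Python compares the resulting lists element by element.
    numbers = []
    for letter in word:
        numbers.append(my_alphabet.index(letter))
    return numbers

x = ['cbaba', 'ababa', 'bbaa']
x.sort(key=custom_key)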

Since your language includes multi-character letters, your custom_key function will obviously need to be more complicated. That should give you the general idea though.
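
One possible way to handle multi-character letters is to split each word greedily, trying the longest letters first, and then map each letter to its rank. The alphabet below is hypothetical: it uses only the simplified letters from the question, and the position of "r" is an assumption made purely so the sample words can be tokenized.

```python
# Hypothetical alphabet in collation order; entries may be multi-character.
# The placement of "r" is an assumption for this example only.
MY_ALPHABET = ["k", "kh", "ht", "h", "a", "r"]

# Rank of each letter in the collation order.
RANK = {letter: i for i, letter in enumerate(MY_ALPHABET)}

# Try longer letters first so "kh" is not read as "k" followed by "h".
LETTERS_LONGEST_FIRST = sorted(MY_ALPHABET, key=len, reverse=True)

def custom_key(word):
    """Translate a word into a list of letter ranks."""
    ranks = []
    pos = 0
    while pos < len(word):
        for letter in LETTERS_LONGEST_FIRST:
            if word.startswith(letter, pos):
                ranks.append(RANK[letter])
                pos += len(letter)
                break
        else:
            # No letter of the alphabet matches at this position.
            raise ValueError("cannot tokenize %r at position %d" % (word, pos))
    return ranks

x = ["ar", "har", "htar", "kar"]
result = sorted(x, key=custom_key)
print(result)
```

Greedy longest-match is a simplification; a script where vowels are written before the consonant they follow would need a reordering step before ranking.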

Daniel Stutzbach
Thanks, +1. That's the ICU way, I think. But since *my* language doesn't have word separators and doesn't have standard romanization rules, it will take time to research, I think.
S.Mark
+1  A: 

I don't know if this will help, but you may check out the locale module. It looks like you can set the locale to your language and use locale.strcoll to compare strings using your language's sorting rules.
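
A minimal sketch of that idea, assuming the locale you need is installed on the system. The "C" locale is used here only as a placeholder because it is always available; with it, the result is ordinary code-point order rather than any language-specific collation.

```python
import locale
from functools import cmp_to_key

# Placeholder locale; substitute the name of your language's locale
# if your operating system provides one.
locale.setlocale(locale.LC_COLLATE, "C")

words = ["kar", "htar", "har", "ar"]

# locale.strcoll is a two-argument comparison, so it needs cmp_to_key.
by_strcoll = sorted(words, key=cmp_to_key(locale.strcoll))

# locale.strxfrm is already a one-argument key function.
by_strxfrm = sorted(words, key=locale.strxfrm)

print(by_strcoll)
```

Both calls rely on the collation rules supplied by the operating system, so they only help when the system actually ships rules for the language in question.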

Mark Tolonen
That's true for popular languages, but *my* language is not fully supported by operating systems, ICU, or unicode.org, so that's out of the question. +1 for a good suggestion, though.
S.Mark