I'm working with an SQLite database, using Python/Django. I need my results, which contain German umlauts (ä, ö, ü), to be sorted correctly (not with the umlauts at the end). After reading different articles on the web, I'm not even sure whether this is possible. So any advice/instructions on that are appreciated. I already studied the docs for create_collation etc. but I couldn't find any helpful examples for "beginners". Furthermore, if it is possible I'd like to know how to apply the necessary modifications on already existing tables!

+2  A: 

A similar question was asked here a year ago.

The answer may be overkill for you, as stated by the OP of that question. I do, however, recommend pyuca, James Tauber's Python implementation of the Unicode Collation Algorithm.

An example is right on his webpage:

from pyuca import Collator

c = Collator("allkeys.txt") 
sorted_words = sorted(words, key=c.sort_key)
Sean
lazerscience
In that case, I would recommend rebuilding your table with a numeric primary key, making sure to INSERT each of the sorted objects in the correct order.
Sean
+1  A: 

So any advice/instructions on that are appreciated. I already studied the docs for create_collation etc. but I couldn't find any helpful examples for "beginners".

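To create a collation with sqlite3, you need a function that works like C's strcmp: it takes two strings and returns a negative number, zero, or a positive number depending on whether the first sorts before, equal to, or after the second.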

import sqlite3

def stricmp(str1, str2):
    str1 = str1.lower()
    str2 = str2.lower()
    if str1 == str2:
        return 0
    elif str1 < str2:
        return -1
    else:
        return 1

db = sqlite3.connect(':memory:')
# SQLite's default NOCASE collation is ASCII-only
# Override it with a (mostly) Unicode-aware version
db.create_collation('NOCASE', stricmp)

Note that although this collation will correctly handle 'ü' == 'Ü', it will still have 'ü' > 'v', because the letters still sort in Unicode code point order after case folding. Writing a German-friendly collation function is left as an exercise to the reader. Or better, to the author of an existing Unicode library.
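For what it's worth, here is a minimal sketch of what such a German-friendly collation might look like. It is not a full Unicode Collation Algorithm implementation: it simply folds ä, ö, ü to a, o, u and ß to ss before comparing, roughly in the spirit of dictionary-style German sorting. The names german_key, german_collation, the GERMAN collation name, and the words table are all made up for this example.

```python
import sqlite3

# Hypothetical folding table: treat umlauts like their base vowels, ß like "ss".
_UMLAUT_MAP = str.maketrans({'ä': 'a', 'ö': 'o', 'ü': 'u', 'ß': 'ss'})

def german_key(s):
    # Lowercase first so that 'Ä' and 'ä' fold to the same key.
    return s.lower().translate(_UMLAUT_MAP)

def german_collation(a, b):
    ka, kb = german_key(a), german_key(b)
    if ka == kb:
        # Tie-break on the raw strings so only identical strings compare equal.
        return (a > b) - (a < b)
    return -1 if ka < kb else 1

db = sqlite3.connect(':memory:')
db.create_collation('GERMAN', german_collation)
db.execute('CREATE TABLE words (w TEXT)')
db.executemany(
    'INSERT INTO words VALUES (?)',
    [('Zebra',), ('Ärger',), ('Apfel',), ('Öl',), ('Ofen',), ('über',), ('Uhr',)],
)
rows = [r[0] for r in db.execute('SELECT w FROM words ORDER BY w COLLATE GERMAN')]
print(rows)
```

The collation has to be registered on every connection that uses it, since SQLite does not store the Python function in the database file.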

Furthermore, if it is possible I'd like to know how to apply the necessary modifications on already existing tables!

You only need to modify the DB if you have an index that uses a collation you've overridden. Drop that index and re-create it.

Note that any column with a UNIQUE (or PRIMARY KEY) constraint will have an implicit index.
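As an alternative to dropping and re-creating by hand, SQLite also has a REINDEX command that rebuilds every index using a given collation. A rough sketch, assuming a made-up people table whose name column was indexed under the built-in NOCASE before the override was registered:

```python
import sqlite3

def stricmp(a, b):
    # Same idea as the collation above: compare after Unicode-aware lowercasing.
    a, b = a.lower(), b.lower()
    return (a > b) - (a < b)

db = sqlite3.connect(':memory:')

# Index built while the built-in (ASCII-only) NOCASE is still in effect.
db.execute('CREATE TABLE people (name TEXT COLLATE NOCASE)')
db.execute('CREATE INDEX people_name ON people (name)')
db.executemany('INSERT INTO people VALUES (?)', [('Über',), ('über',), ('Zoe',)])

# Override the collation, then rebuild all indexes that depend on it.
db.create_collation('NOCASE', stricmp)
db.execute('REINDEX NOCASE')

matches = [r[0] for r in db.execute(
    "SELECT name FROM people WHERE name = 'ÜBER' ORDER BY rowid")]
print(matches)
```

REINDEX also covers the implicit indexes behind UNIQUE and PRIMARY KEY constraints, so those do not need separate handling.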

dan04