I am given either a single character or a string, and am using Python.

How do I find out if a specific character has a lowercase equivalent according to the standards (standard and special case mappings) proposed by Unicode?

And how do I find out if a string has one or more characters that have a lowercase equivalent according to the standards (standard and special case mappings) proposed by Unicode?

+5  A: 
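def haslower(unicodechar):
    # True if lowercasing changes the character, i.e. it has a distinct lowercase form.
    return unicodechar != unicodechar.lower()

def anylower(unicodestring):
    # True if at least one character in the string has a distinct lowercase form.
    return any(haslower(c) for c in unicodestring)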

This will work correctly only insofar as the Python version you're using implements the .lower() method per the Unicode standard, of course. Also, I'm assuming that you don't consider, e.g., u'a' to "have a lowercase equivalent" (it has an uppercase one, of course). If you mean something different, consider:

def changescase(uc):
    return uc != uc.lower() or uc != uc.upper()

(I've renamed the argument to uc to avoid excessive line length;-) -- if that's what you want I recommend not naming the function in terms of "lowercase equivalent" as that would be sure to confuse readers/maintainers of your code!-)
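For a quick sanity check, here is a minimal sketch of how the functions above behave on a few sample characters. The definitions are repeated so the block runs on its own. It assumes Python 3, where every str is Unicode and the u'' prefix is not needed; the sample characters are illustrative choices, and results on other Python versions may differ depending on the Unicode data they ship with.

```python
def haslower(unicodechar):
    # True if lowercasing changes the character.
    return unicodechar != unicodechar.lower()

def anylower(unicodestring):
    # True if any character in the string changes when lowercased.
    return any(haslower(c) for c in unicodestring)

def changescase(uc):
    # True if the character changes under either lowercasing or uppercasing.
    return uc != uc.lower() or uc != uc.upper()

print(haslower("A"))        # True: "A" lowercases to "a"
print(haslower("a"))        # False: already lowercase
print(haslower("\u00c9"))   # True: "É" lowercases to "é"
print(haslower("1"))        # False: digits have no case
print(anylower("stra\u00dfe"))   # False: every character is already lowercase
print(anylower("Stra\u00dfe"))   # True: the leading "S" has a lowercase form
print(changescase("\u00df"))     # True: "ß" uppercases to "SS" (a special-case mapping)
print(changescase("1"))          # False: unchanged in both directions
```

Note that "ß" shows why the two functions differ: it has no distinct lowercase form, yet it is still affected by case conversion.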

Alex Martelli
It never occurred to me that .lower (and .upper) would work for accented characters too.
Matthew Schinckel
@Alex: Thanks. I run my app in GAE, so it's Python 2.5.2. I asked another question, just in case you'd like to answer it (for others to also see). http://stackoverflow.com/questions/3536397/does-python-version-2-5-2-follow-unicode-standards-for-lower-and-upper-functi
Albert
+2  A: 

@Albert, you appear to be overly concerned with the minutiae of case conversion when you haven't yet sorted out (or explained to answerers) what you really want to do.

=== Your previous attempt at explanation (in comment on my answer to this question) ===

@John: Well, I'm actually making an API for my web service. My web service accepts a key that maps to a specific record in my database. The key is case-sensitive, and the key can be composed of any Unicode character. So in order to normalize all input, I will convert all key queries into lowercase (if they have uppercase equivalents). A consequence of that is that when I create the record keys (which my users can customize), I cannot accept any uppercase character that can be converted to a lowercase equivalent by the toLower() function. So I'm trying to make a filter for that. Any suggestions?

=== and my replying comment ===

@Albert: If your keys are case sensitive, why are you normalising them??? "record keys which users can customize" means what??? "any unicode char" vs "cannot accept any uppercase char" ??? To answer your question literally: Looks like you can't accept a character c when c.lower() != c which means that you can't accept any key if key.lower() != key. I think that you should start a NEW QUESTION, explaining exactly what you are trying to do, with examples.

... and you've certainly asked a new question (in fact 2 of them) but you haven't explained anything. This "new" question is so new that @Alex Martelli's answer is essentially the same as my comment highlighted above.

I think that you should start a NEW QUESTION, with new content, explaining exactly what you are trying to do, with examples.
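Taken literally, the check from my comment above (reject a key when key.lower() != key) could be sketched as follows. This is only a sketch of that literal reading, not a recommendation for the key design. The function name is_acceptable_key and the sample keys are hypothetical, and it assumes Python 3 strings.

```python
def is_acceptable_key(key):
    # Hypothetical helper: accept a key only if lowercasing leaves it unchanged,
    # i.e. no character in it has a distinct lowercase form.
    return key == key.lower()

print(is_acceptable_key("r\u00e9sum\u00e9-42"))  # True: "résumé-42" is already all lowercase
print(is_acceptable_key("R\u00e9sum\u00e9-42"))  # False: "R" would be changed by .lower()
```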

John Machin
All right. I'll put together exactly what I'm trying to do. Thanks!
Albert