I have a CSV file and I am running a script against it to insert values into a database. If a value is blank, I don't want to insert it. Here is what I have:

if attrs[attr] != '' and attrs[attr] != None:
    log.info('Attribute ID: %s' % attr)
    log.info('Attribute Value: %s' % attrs[attr])
    sql = insert_attr_query(attrs[attr], object_id, attr)
    cursor.execute(sql)

The value looks blank, but it doesn't equal '' or None, so wth does it equal?

+3  A: 

It's probably whitespace, i.e. a tab or a string of spaces. Try:

attrs[attr].strip()
Strawberry
whitespace grrr
KacieHouser
Thank you too I didn't see you offer the strip function.
KacieHouser
+2  A: 

Presumably it contains whitespace. You could check this by printing repr(attrs[attr]), which will put quotes round it and show tabs as "\t".

Change the code to `if attrs[attr] is not None and attrs[attr].strip() != "":`
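
As a rough sketch of that check, the test can be pulled into a small helper. The name is_blank is hypothetical (not from the original script), and the str() call is an assumption that covers non-string values, since strip() only exists on strings:

```python
def is_blank(value):
    """Return True for None, empty strings, and whitespace-only strings."""
    if value is None:
        return True
    # str() is an assumption here: it lets numbers and other non-string
    # values pass through the same whitespace test.
    return str(value).strip() == ""


# repr() makes otherwise invisible whitespace visible, e.g. '\t' or '   '.
for sample in ["", "   ", "\t", None, "abc", 1.5]:
    print(repr(sample), is_blank(sample))
```

In the loop from the question, the condition would then read `if not is_blank(attrs[attr]):`.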

Dave Kirby
ohhh clever thank you!
KacieHouser
I was thinking about finding a function to strip whitespace, but I didn't want it stripped unless there was only whitespace. This makes sense, though.
KacieHouser
Strip is a string member function, isn't it? So you'd need `if attrs[attr] is not None and attrs[attr].strip() !="":`
Craig McQueen
@craig: thanks for the catch - fixed it.
Dave Kirby
Yeah, I caught the syntax error, but some of the values sent through were floats, so I had to format the var into a string and then strip the whitespace out. Worked like a charm.
KacieHouser
A: 

You should (almost) always normalise whitespace in any text string that is intended for insertion in a database (or for many other purposes).

To normalise whitespace is to (1) strip any leading whitespace (2) strip any trailing whitespace (3) replace any internal runs (length >= 1) of whitespace by exactly 1 SPACE (U+0020).

Whitespace should not be limited to what standard Python provides, especially if you are working in Python 2.X and not using unicode objects. For example, in the default "C" locale, "\xA0" is not treated as whitespace but it's very likely to represent NO-BREAK SPACE (U+00A0).

Sample code for Python 2.X:

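def normalize_white_space_u(unicode_object):
    # split() with no arguments drops leading/trailing whitespace and
    # splits on runs of whitespace; join puts back single spaces
    return u' '.join(unicode_object.split())

def normalize_white_space_s(str_object):
    # byte strings: treat "\xA0" (likely NO-BREAK SPACE) as a space first
    return ' '.join(str_object.replace('\xA0', ' ').split())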

Generalizing the second function: replace each occurrence of a non-standard whitespace character by a single space and then do the split-join dance.
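
A minimal sketch of that generalization, written for Python 3 rather than 2.X. The function name, the `extra` parameter, and the particular characters in EXTRA_WHITESPACE are illustrative assumptions, not a standard list. In Python 3, str.split() already treats U+00A0 as whitespace, so the extras here are characters it does not split on:

```python
# Characters to treat as whitespace in addition to what str.split() handles.
# This set is only an example: ZERO WIDTH SPACE and ZERO WIDTH NO-BREAK SPACE.
EXTRA_WHITESPACE = "\u200b\ufeff"


def normalize_white_space(text, extra=EXTRA_WHITESPACE):
    # Step 1: replace each non-standard whitespace character by a single space.
    for ch in extra:
        text = text.replace(ch, " ")
    # Step 2: the split-join dance strips both ends and collapses internal runs.
    return " ".join(text.split())


print(repr(normalize_white_space("  a\t\tb\u200bc \xa0 d ")))
```

A string that contains nothing but whitespace comes back as the empty string, so the same function can feed the blank check from the question.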

John Machin