views:

1218

answers:

2

I'm using Python and its MySQLdb module to import some measurement data into a Mysql database. The amount of data that we have is quite high (currently about ~250 MB of csv files and plenty of more to come).

Currently I use cursor.execute(...) to import some metadata. This isn't problematic as there are only a few entries for these.

The problem is that when I try to use cursor.executemany() to import larger quantities of the actual measurement data, MySQLdb raises a

TypeError: not all arguments converted during string formatting

My current code is

def __insert_values(self, values):
    cursor = self.connection.cursor()
    cursor.executemany("""
        insert into values (ensg, value, sampleid)
        values (%s, %s, %s)""", values)
    cursor.close()

where values is a list of tuples containing three strings each. Any ideas what could be wrong with this?

Edit:

The values are generated by

yield (prefix + row['id'], row['value'], sample_id)

and then read into a list one thousand at a time where row is and iterator coming from csv.DictReader.

+1  A: 

The message you get indicates that inside the executemany() method, one of the conversions failed. Check your values list for a tuple longer than 3.

For a quick verification:

max(map(len, values))

If the result is higher than 3, locate your bad tuple with a filter:

[t for t in values if len(t) != 3]

or, if you need the index:

[(i,t) for i,t in enumerate(values) if len(t) != 3]
gimel
Thanks but that isn't the problem. All of the tuples have the correct length.
lhahne
Well, now check your format has the correct number of %s ...
gimel
+2  A: 

In retrospective this was a really stupid but hard to spot mistake. Values is a keyword in sql so the table name values needs quotes around it.

def __insert_values(self, values):
    cursor = self.connection.cursor()
    cursor.executemany("""
        insert into `values` (ensg, value, sampleid)
        values (%s, %s, %s)""", values)
    cursor.close()
lhahne