I am developing a program in Python that accesses a MySQL database using MySQLdb. In certain situations, I have to run an INSERT or REPLACE statement on many rows. I am currently doing it like this:
    db.execute("REPLACE INTO " + table + " (" + ",".join(cols) + ") VALUES" +
               ",".join(["(" + ",".join(["%s"] * len(cols)) + ")"] * len(data)),
               [row[col] for row in data for col in cols])
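To make the shape of that statement easier to see, here is a minimal sketch that only builds the SQL string and the flattened parameter list, without touching the database. The helper name build_multirow_replace, the table name "t", and the sample rows are made up for illustration.

```python
def build_multirow_replace(table, cols, data):
    """Return (sql, params) for a single multi-row REPLACE statement."""
    # One "(%s,%s,...)" group per row, with one placeholder per column.
    row_group = "(" + ",".join(["%s"] * len(cols)) + ")"
    sql = ("REPLACE INTO " + table + " (" + ",".join(cols) + ") VALUES" +
           ",".join([row_group] * len(data)))
    # Flatten all rows into one parameter list, in column order.
    params = [row[col] for row in data for col in cols]
    return sql, params


# Hypothetical sample data, for illustration only.
sample_cols = ["id", "name"]
sample_data = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
sample_sql, sample_params = build_multirow_replace("t", sample_cols, sample_data)
```

With this, the call would be db.execute(sample_sql, sample_params): one statement, one round trip, regardless of the number of rows.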
It works fine, but it is kind of awkward. I was wondering if I could make it easier to read, and I found out about the executemany method. I changed my code to look like this:
    db.executemany("REPLACE INTO " + table + " (" + ",".join(cols) + ") " +
                   "VALUES(" + ",".join(["%s"] * len(cols)) + ")",
                   [tuple(row[col] for col in cols) for row in data])
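For comparison, here is a minimal sketch of what gets passed to executemany: a statement with a single placeholder group, plus one tuple per row. Again, the helper name build_single_row_replace and the sample data are made up for illustration, and nothing here talks to the database.

```python
def build_single_row_replace(table, cols, data):
    """Return (sql, rows) in the form that executemany expects."""
    # A single "(%s,%s,...)" group; the driver applies it to each row.
    sql = ("REPLACE INTO " + table + " (" + ",".join(cols) + ") " +
           "VALUES(" + ",".join(["%s"] * len(cols)) + ")")
    # One tuple of parameters per row, in column order.
    rows = [tuple(row[col] for col in cols) for row in data]
    return sql, rows


# Hypothetical sample data, for illustration only.
many_cols = ["id", "name"]
many_data = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
many_sql, many_rows = build_single_row_replace("t", many_cols, many_data)
```

The call would then be db.executemany(many_sql, many_rows). In both versions the table and column names are concatenated into the SQL text rather than bound as parameters, so they have to come from a trusted source.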
It still worked, but it ran a lot slower. In my tests, for relatively small data sets (about 100-200 rows), it ran about 6 times slower. For big data sets (about 13,000 rows, the largest I expect to handle), it ran about 50 times slower. Why is executemany so much slower than the single execute call?
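For anyone who wants to reproduce the comparison, a timing harness along these lines should be enough. This is only a sketch: time_call is a made-up helper name, and the workload below is a stand-in so the snippet runs without a database.

```python
from timeit import default_timer


def time_call(func, *args):
    """Run func(*args) once and return (result, elapsed_seconds)."""
    start = default_timer()
    result = func(*args)
    return result, default_timer() - start


# Stand-in workload so the sketch runs without a database connection.
# Against a real cursor this would be, for example:
#   _, elapsed = time_call(db.executemany, sql, rows)
timed_result, timed_elapsed = time_call(sum, range(1000))
```

timeit.default_timer is used because it is available in both Python 2.7 and Python 3.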
I would really like to simplify my code, but I don't want the big drop in performance. Does anyone know of any way to make it faster?
I am using Python 2.7 and MySQLdb 1.2.3. I tried tinkering with the setinputsizes function, but that didn't seem to have any effect. I looked at the MySQLdb source code, and it looks like setinputsizes isn't supposed to do anything there.