views:

321

answers:

2

Hi there everyone,

This is my first post! I also just started programming, so please bear with me!

I am trying to load a bunch of .csv files into a database, in order to later perform various reports on the data. I started off by creating a few tables in mysql with matching field names and data types to what will be loaded into the tables. I am manipulating the filename (in order to parse out the date to use as a field in my table) and cleaning up the data with python.

So my problem right now (haha...) is that I get this error message when I attempt the 'Insert Into' query to mysql.

Traceback (most recent call last):  
File "C:\Program Files\Python\load_domains2.py", line 80, in <module>  
cur.execute(sql)
File "C:\Program Files\Python\lib\site-packages\MySQLdb\cursors.py", line 166, in execute
self.errorhandler(self, exc, value)
File "C:\Program Files\Python\lib\site-packages\MySQLdb\connections.py", line 35, in defaulterrorhandler
raise errorclass, errorvalue
OperationalError: (1054, "Unknown column 'a1200e.com' in 'field list'")

'a1200e.com' refers to a specific domain name I'm inserting into that column. My query is as follows:

sql="""INSERT INTO temporary_load
    (domain_name, session_count, search_count, click_count,
    revenue, revenue_per_min, cost_per_click, traffic_date)
    VALUES (%s, %d, %d, %d, %d, %d, %d, %s)""" %(cell[0],
                                                int(cell[1]),
                                                int(cell[2].replace (",","")),
                                                int(cell[3].replace(",","")),
                                                float(cell[4].replace("$","")),
                                                float(cell[5].replace("$","")),
                                                float(cell[6].replace("$","")),
                                                parsed_date)

    cur.execute(sql)

I am very new at all this, so I'm sure my code isn't at all efficient, but I just wanted to lay everything out so it's clear to me. What I don't understand is that I have ensured my table has correctly defined data types (corresponding to those in my query). Is there something I'm missing? I've been trying to work this out for a while, and don't know what could be wrong :/

Thanks so much!!! Val

+1  A: 

You should be using DB-API quoting instead of including the data in the SQL query directly:

sql = """INSERT INTO temporary_load
    (domain_name, session_count, search_count, click_count,
    revenue, revenue_per_min, cost_per_click, traffic_date)
    VALUES (%s, %d, %d, %d, %d, %d, %d, %s)"""
args = (cell[0],
        int(cell[1]),
        int(cell[2].replace (",","")),
        int(cell[3].replace(",","")),
        float(cell[4].replace("$","")),
        float(cell[5].replace("$","")),
        float(cell[6].replace("$","")),
        parsed_date)
cur.execute(sql, args)

This makes the DB-API module quote the values appropriately, and resolves a whole host of issues that you might get when doing it by hand (and usually incorrectly.)

Thomas Wouters
ok thanks. For some reason, my comment to another person who responded is gone...maybe you can help me?Basically, now I need to also get rid of the commas for the float types...but I am already replacing the dollar signs. How can I replace two types of symbols for one variable?
DalivDali
@DalivDali, if you ask one question per question you'll get answers -- but answering "half a question" since you're asking two questions in a single question and one already got answered makes little sense (indeed it makes little sense for you to ask 2 questions in one!!!).
Alex Martelli
+2  A: 

Thomas is, as usual, absolutely correct: feel free to let MySQLdb handle the quoting issues.

In addition to that recommendation:

  1. The csv module is your friend.
  2. MySQLdb uses the "format" parameter style as detailed in PEP 249.
    What does that mean for you?
    All parameters, whatever type, should be passed to MySQLdb as strings (like this %s). MySQLdb will make sure that the values are properly converted to SQL literals.
    By the way, MySQLdb has some good documentation.
  3. Feel free to include more detail about your source data. That may make diagnosing the problem easier.

Here's one way to insert values to a MySQL database from a .csv file:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import csv
import MySQLdb
import os

def main():
    db = MySQLdb.connect(db="mydb",passwd="mypasswd",) # connection string

    filename = 'data.csv'
    f = open(filename, "rb") # open your csv file
    reader = csv.reader(f)
    # assuming the first line of your csv file has column names
    col_names = reader.next() # first line of .csv file
    reader = csv.DictReader(f, col_names) # apply column names to row values

    to_db = [] # this list holds values you will insert to db
    for row in reader: # loop over remaining lines in .csv file
        to_db.append((row['col1'],row['col2']))
    # or if you prefer one-liners
    #to_db = [(row['col1'],row['col2']) for row in reader]
    f.close() # we're done with the file now

    cursor = db.cursor()
    cursor.executemany('''INSERT INTO mytable (col1,col2) 
                    VALUES (%s, %s)''', to_db) # note the two arguments
    cursor.close()
    db.close()

if __name__ == "__main__":
    main()
Adam Bernier