Hi All,
I am trying to use examples that are fun/interesting to me in order to learn MySQL and Python. I am new to both, as I am a finance guy by trade. I have learned quite a bit along the way, but I am at a point where I am not sure where to go.
The data files I am looking to import are contained in a zip file, hosted: here
I extracted the directory, are there are well over 1000 files. In the code below, I am simply trying to reference one of the files individually to parse it into MySQL.
I am fully aware that there has to be an easier way to go about this, but I am learning, and I am sure my code reflects it (you can see that I typed in some basic prints to see that my code was correct).
Any help you can provide will be GREATLY appreciated. Essentially, when it comes to Python, I am looking as it as a hobby and a way to get at the large amount of data on the web that I want to play with. I pasted my code below so you can see where I am coming from.
- Brock
P.S. Sorry about the code below, can't figure out a better way to post it.
The code below is my create table script
> DROP TABLE IF EXISTS `nba`.`event`;
CREATE TABLE `nba`.`event` (
`a1` varchar(45) DEFAULT NULL,
`a2` varchar(45) DEFAULT NULL,
`a3` varchar(45) DEFAULT NULL,
`a4` varchar(45) DEFAULT NULL,
`a5` varchar(45) DEFAULT NULL,
`h1` varchar(45) DEFAULT NULL,
`h2` varchar(45) DEFAULT NULL,
`h3` varchar(45) DEFAULT NULL,
`h4` varchar(45) DEFAULT NULL,
`h5` varchar(45) DEFAULT NULL,
`period` int(11) DEFAULT NULL,
`time` time DEFAULT NULL,
`team` varchar(3) DEFAULT NULL,
`etype` varchar(15) DEFAULT NULL,
`assist` varchar(45) DEFAULT NULL,
`away` varchar(45) DEFAULT NULL,
`block` varchar(45) DEFAULT NULL,
`entered` varchar(45) DEFAULT NULL,
`home` varchar(45) DEFAULT NULL,
`left` varchar(45) DEFAULT NULL,
`num` int(11) DEFAULT NULL,
`opponent` varchar(45) DEFAULT NULL,
`outof` varchar(45) DEFAULT NULL,
`player` varchar(45) DEFAULT NULL,
`points` int(11) DEFAULT NULL,
`possession` varchar(45) DEFAULT NULL,
`reason` varchar(50) DEFAULT NULL,
`result` varchar(10) DEFAULT NULL,
`steal` varchar(45) DEFAULT NULL,
`type` varchar(30) DEFAULT NULL,
`x` varchar(2) DEFAULT NULL,
`y` varchar(2) DEFAULT NULL,
`gameid` varchar(15) NOT NULL,
`seqnum` int(11) NOT NULL AUTO_INCREMENT,
`updated` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`seqnum`,`gameid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Datafiles of play by play from basketballgeek.com';
Here is my python script. I put the table above in a schema called NBA. I set my table up so that, I believe, a sequence number is created for every record entered into the db. I pass it the gameid as a string, and also assume that the timestamp will be created for every insert statement. I know there is an error, but I can't figure out what it is.
sql = """LOAD DATA INFILE '%s' INTO TABLE event FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n';""" % path print sql try: cursor.execute(sql) db.commit() except: print "ERROR" db.rollback() db.close()