views:

1645

answers:

5

I am working on porting over a database from a custom MSSQL CMS to MYSQL - Wordpress. I am using Python to read a txt file with \t delineated columns and one row per line.

I am trying to write a Python script that will read this file (fread) and [eventually] create a MYSSQL ready .sql file with insert statements.

A line in the file I'm reading looks something like:

1    John Smith    Developer  http://twiiter.com/johns   Chicago, IL

My Python script so far:

import sys

fwrite = open('d:/icm_db/wp_sql/wp.users.sql','w')

fread = open('d:/icm_db/users.txt','r')

for line in fread:
    print line;


fread.close()
fwrite.close()

How can I "implode" each line so I can access each column and do business on it?

I need to generate multiple MYSQL insert statements per line I read. So... for each line read, I'd generate something like:

INSERT INTO `wp_users` (`ID`, `user_login`, `user_name`) 
VALUES (line[0], 'line[2]', 'line[3]');
A: 

What you probably want is something like this: data=line.split("\t")
It'll give you a nice sequence object to work with.
(By the way, no need for semicolons in Python. There's one here: print line;)

As Dave pointed out, this might leave a newline in there. Call strip() on line before splitting, like so: line.strip().split("\t")

Etienne Perot
This might leave a newline in the last field.
Dave
Then what about: line.strip().split("\t")
Etienne Perot
Eats all leading and trailing whitespace, not just the newline. This may or may not be a problem.
Dave
In his case, I think not, it's just a little single-use script to do his bidding. The split() method wouldn't be appropriate for real-world uses, he'd be better off using the csv module as other the other solutions say.
Etienne Perot
A: 

The Python Standard Library has a module for CSV (comma separated value) file reading and writing that can be made to work on tab separated files like your one. It's probably overkill for this task.

othercriteria
+7  A: 

Although this is easily doable, it does become easier with the csv module.

>>> import csv
>>> reader = csv.reader(open('C:/www/stackoverflow.txt'), delimiter='\t')
>>> for row in reader:
...     print row
...
['1', 'John Smith', 'Developer', 'http://twiiter.com/johns', 'Chicago, IL']
['2', 'John Doe', 'Developer', 'http://whatever.com', 'Tallahassee, FL']

Also, as pointed out, semicolons are not needed in Python. Try to kick that habit :)

Paolo Bergantino
...and regain that habit when you start golfing in Python :|
Dave
A: 

Knowing the exact number of columns helps self document your code:

fwrite = open("d:/icm_db/wp_sql/wp.users.sql","w")

for line in open("d:/icm_db/users.txt"):
  name, title, login, location = line.strip().split("\t")

  # Double up on those single quotes to avoid nasty SQL!
  safe_name = name.replace("'","''")
  safe_login = name.replace("'","''")

  # ID field is primary key and will auto-increment
  fwrite.write( "INSERT INTO `wp_users` (`user_login`, `user_name`) " )
  fwrite.write( "VALUES ('%s','%s');\n" % (safe_name,safe_login) )
tom
A: 
fwrite = open('/home/lyrae/Desktop/E/wp.users.sql','a')
fread = open('/home/lyrae/Desktop/E/users.txt','r')

for line in fread:
    line = line.split("\t")
    fwrite.write("insert into wp_users ( ID, user_login, user_name ) values (%s, '%s', '%s')\n" % (line[0], line[1], line[2]))

fread.close()
fwrite.close()

Assuming users.txt is:

1   John Smith Developer http://twiiter.com/johns Chicago, IL
2   Billy bob Developer http://twiiter.com/johns Chicago, IL
3   John Smith Developer http://twiiter.com/johns Chicago, IL

wp.users.sql will look like:

insert into wp_users ( ID, user_login, user_name ) values (1, 'John Smith', 'Developer')
insert into wp_users ( ID, user_login, user_name ) values (2, 'Billy bob', 'Developer')
insert into wp_users ( ID, user_login, user_name ) values (3, 'John Smith', 'Developer')

Assuming only 1 tab separates the id, name, position

lyrae