views:

304

answers:

3

Hi I am trying to import various pipe delimited files using php 5.2 into a mysql database. I am importing various formats of piped data and my end goal is to try put the different data into a suitably normalised data structure but need to do some post processing on the data to put it into my model correctly.

I thought the best way to do this is to import into a table called buffer and map out the data then import into various tables. I am planning to create a table just called "buffer" with fields that represent each columns (there will be up to 80 columns) then apply some data transforms/mapping to get it to the right table.

My planned approach is to create a base class that generically reads the the pipe data into the buffer table then extend this class by having a function that contain various prepared statements to do the SQL magic, allowing me the flexibility to check the format is the same by reading the headers on the first row and changing it for one format.

My questions are:

  1. Whats the best way to do step one of reading the data from a local file saved into the table? I'm not too sure if i should use the LOAD DATA of mysql (as suggested in http://stackoverflow.com/questions/1092638/best-practice-import-csv-to-mysql-database-using-php-5-x) or just fopen then insert the data line by line.

  2. is this the best approach? How have other people approach this?

  3. Is there anything in the zen framework that may help?

Additional : I am planning to do this in a scheduled task.

+2  A: 

You don't need any PHP code to do that, IMO. Don't waste time on classes. MySQL LOAD DATA INFILE clause allows a lot of ways to import data, for 95% of your needs. Whatever delimiters, whatever columns to skip/pick. Read the manual attentively, it's worth to know what you CAN do with it. After importing the data, it can be already in a good shape if you write the query properly. The buffer table can be a temporary one. Then normalize or denormalize it and drop the initial table. Save the script in a file to reproduce the sequence of scripts if there's a mistake.

The best way is to write a SQL script, test if finally the data is in proper shape, seek for mistakes, modify, re-run the script. If there's a lot of data, do tests on a smaller set of rows.

[added] Another reason for sql-mostly approach is that if you're not fluent in SQL, but are going to work with a database, it's better to learn SQL earlier. You'll find a lot of uses for it later and will avoid the common pitfalls of programmers who know it superficially.

culebrón
A: 

There are dozens and dozens of ways. If you have local filesystem access to the MySQL instance, LOAD DATA. Otherwise you can just as easily transform each line into SQL (or a VALUES line) for periodic submittal to MySQL via PHP.

Xepoch
A: 

In the end i used dataload AND modified this http://codingpad.maryspad.com/2007/09/24/converting-csv-to-sql-using-php/ for different situations.

Joe