ansaurus

Question

Best practice: Import mySQL file in PHP; split queries

Answer 1

+1 A:

Can't you install phpMyAdmin, gzip the file (which should make it much smaller) and import it using phpMyAdmin?

EDIT: Well, if you can't use phpMyAdmin, you can use the code from phpMyAdmin. I'm not sure about this particular part, but it's generally nicely structured.

Lukáš Lalinský 2009-12-10 18:41:17

Nope, I need an automated solution to run every day.

Pekka 2009-12-10 18:59:44

frunsi 2010-01-07 15:02:49

Answer 2

A:

Can you use LOAD DATA INFILE?

If you format your db dump file using SELECT INTO OUTFILE, this should be exactly what you need. No reason to have PHP parse anything.

mluebke 2009-12-10 18:46:04

Depends on the dump format...

gahooa 2009-12-10 18:46:32

I *think* LOAD DATA INFILE is closed for my mySQL user in this case but will check.

Pekka 2009-12-10 18:59:09

I checked, LOAD DATA INFILE won't work.

Pekka 2010-01-04 12:31:07

I'm upvoting this to even it out. The answer is technically correct, even if it didn't help me, and I don't see why it should be downvoted.

Pekka 2010-01-10 13:12:16

Answer 3

A:

Already answered: http://stackoverflow.com/questions/147821/loading-sql-files-from-within-php Also:

BYK 2010-01-04 12:45:12

Thanks for pointing out the duplicate, but I can't see a solution there that fits my needs.

Pekka 2010-01-04 12:46:37

It suggests having a look at phpMyAdmin's code which makes perfect sense.

BYK 2010-01-04 12:48:07

I have tried that once, but didn't get very far, as the code is quite complex. If it's the only way, I will work my way through it, but there must be some sort of stand alone script for this somewhere.

Pekka 2010-01-04 12:49:28

I have added several links to the answer. I suggest reading them.

BYK 2010-01-04 12:53:11

I will, thanks a lot.

Pekka 2010-01-04 13:01:38

Answer 4

A:

Do these links help :
http://www.ozerov.de/bigdump.php
http://www.wanderings.net/notebook/Main/HowToImportLargeMySQLDataFiles

Zaje 2010-01-04 12:50:44

+1 for big dump

solomongaby 2010-01-04 13:14:37

Yeah, bigdump will save your sanity.

Arkh 2010-01-05 14:15:13

Answer 5

A:

Hi.. what do you think about:

system("cat xxx.sql | mysql -u username database");

ArneRie 2010-01-04 12:56:04

Can't do that - as I write in the question, I have no access to the command line. (The downvote is not mine, though).

Pekka 2010-01-04 13:00:58

I forgot to post my comment : This is a shared host, you cannot use "system" function and a lot of "somehow dangerous" functions.

Arno 2010-01-04 13:23:57

Answer 6

A:

I suppose that your data file always has the same structure...

If so, you can use a regex to transform your data: create an "update_data.php" script which contains all your commands.

When you execute it, it does the following:

1. Look for your uploaded data file
2. Transform your data into a PHP file with a regex

These two operations are done once per input file.

3. Connect to the DB
4. Include the PHP file you've just created
5. Disconnect from the DB

*Optionally, you can store the last executed query, so that you can split the execution of your load across several runs (depending on the max_execution_time value and the time needed to load the whole data).*

Here's a sample data file :

1;admin;1234;[email protected]
2;test;1244;[email protected]
3;user;10;[email protected]

and you create an array from this data :

// First part (points 1 & 2)
// Check whether your data file has already been transformed
$data = file_get_contents('datafile.txt');
preg_match_all('YOUR_REGEX_HERE', $data, $matches );
$export_data = var_export( $matches, true);
file_put_contents('php_file.php', $export_data);
// ... initialize last included line in DB (write in a file ?)

// Second part (points 3, 4 & 5)
include 'php_file.php';
// Check last inserted line into DB (read from a file ?)
// Connect to DB
// Loops through data
//   get back data from the "$matches" variable
//   insert data
//   update last included line
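
The resume idea from the note above (remember the last line loaded, continue from there on the next run) could be sketched roughly as follows. This is only an illustration under assumptions: the function name `loadBatch`, the checkpoint file, and the `$insertRow` callback are hypothetical, and the callback stands in for whatever database insert you actually use.

```php
<?php
// Hypothetical sketch: process a ';'-separated data file in batches and
// remember the last processed line in a checkpoint file, so that a later
// run can continue where this one stopped.
function loadBatch(string $dataFile, string $checkpointFile, callable $insertRow, int $maxRows = 1000): int
{
    // Line number of the last row handled by a previous run (0 if none).
    $done = is_file($checkpointFile) ? (int) file_get_contents($checkpointFile) : 0;

    $handle = fopen($dataFile, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $dataFile");
    }

    $lineNo = 0;
    $processed = 0;
    while ($processed < $maxRows && ($line = fgets($handle)) !== false) {
        $lineNo++;
        if ($lineNo <= $done) {
            continue; // already loaded by an earlier run
        }
        $line = rtrim($line, "\r\n");
        if ($line !== '') {
            $insertRow(explode(';', $line)); // the fields of one row
        }
        $processed++;
        // Persist progress after every line so a timeout loses nothing.
        file_put_contents($checkpointFile, (string) $lineNo);
    }
    fclose($handle);

    return $processed; // number of lines consumed in this run
}
```

Calling it repeatedly (for example from a daily cron job or a page refresh) with a `$maxRows` value small enough to stay under max_execution_time would work through the file piece by piece; a return value of 0 means nothing is left to load.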

Arno 2010-01-04 13:21:39

Answer 7

+7 A:

Here is a memory-friendly function that should be able to split a big file into individual queries without needing to load the whole file at once:

function SplitSQL($file, $delimiter = ';')
{
    set_time_limit(0);

    if (is_file($file) === true)
    {
        $file = fopen($file, 'r');

        if (is_resource($file) === true)
        {
            $query = array();

            while (feof($file) === false)
            {
                $query[] = fgets($file);

                if (preg_match('~' . preg_quote($delimiter, '~') . '\s*$~iS', end($query)) === 1)
                {
                    $query = trim(implode('', $query));

                    if (mysql_query($query) === false)
                    {
                        echo '<h3>ERROR: ' . $query . '</h3>' . "\n";
                    }

                    else
                    {
                        echo '<h3>SUCCESS: ' . $query . '</h3>' . "\n";
                    }

                    while (ob_get_level() > 0)
                    {
                        ob_end_flush();
                    }

                    flush();
                }

                if (is_string($query) === true)
                {
                    $query = array();
                }
            }

            return fclose($file);
        }
    }

    return false;
}

I tested it on a big phpMyAdmin SQL dump and it worked just fine.

Some test data:

CREATE TABLE IF NOT EXISTS "test" (
    "id" INTEGER PRIMARY KEY AUTOINCREMENT,
    "name" TEXT,
    "description" TEXT
);

BEGIN;
    INSERT INTO "test" ("name", "description")
    VALUES (";;;", "something for you mind; body; soul");
COMMIT;

UPDATE "test"
    SET "name" = "; "
    WHERE "id" = 1;

And the respective output:

SUCCESS: CREATE TABLE IF NOT EXISTS "test" ( "id" INTEGER PRIMARY KEY AUTOINCREMENT, "name" TEXT, "description" TEXT );
SUCCESS: BEGIN;
SUCCESS: INSERT INTO "test" ("name", "description") VALUES (";;;", "something for you mind; body; soul");
SUCCESS: COMMIT;
SUCCESS: UPDATE "test" SET "name" = "; " WHERE "id" = 1;

Alix Axel 2010-01-06 07:19:44

Any reason why this was down-voted?

Alix Axel 2010-01-09 10:56:09

Thanks Axel, this worked fine for me.

Pekka 2010-01-10 13:03:08

No problem Pekka, glad I could help.

Alix Axel 2010-01-10 18:30:08

Answer 8

+2 A:

When StackOverflow released their monthly data dump in XML format, I wrote PHP scripts to load it into a MySQL database. I imported about 2.2 gigabytes of XML in a few minutes.

My technique is to prepare() an INSERT statement with parameter placeholders for the column values. Then I use XMLReader to loop over the XML elements and execute() my prepared query, plugging in values for the parameters. I chose XMLReader because it's a streaming XML reader; it reads the XML input incrementally instead of requiring the whole file to be loaded into memory.

You could also read a CSV file one line at a time with fgetcsv().

If you're importing into InnoDB tables, I recommend starting and committing transactions explicitly, to reduce the overhead of autocommit. I commit every 1000 rows, but this is arbitrary.

I'm not going to post the code here (because of StackOverflow's licensing policy), but in pseudocode:

connect to database
open data file
PREPARE parameterized INSERT statement
begin first transaction
loop, reading lines from data file: {
    parse line into individual fields
    EXECUTE prepared query, passing data fields as parameters
    if ++counter % 1000 == 0,
        commit transaction and begin new transaction
}
commit final transaction

Writing this code in PHP is not rocket science, and it runs pretty quickly when one uses prepared statements and explicit transactions. Those features are not available in the outdated mysql PHP extension, but you can use them if you use mysqli or PDO_MySQL.
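
For readers who want something concrete, the pseudocode above might translate to PDO roughly like this. It is not the code described in this answer, only a minimal sketch under assumptions: the function name `loadCsv` and the table and column names (`users`, `id`, `name`) are hypothetical, and the input is assumed to be a two-column CSV file read with fgetcsv().

```php
<?php
// Minimal sketch of the pseudocode using PDO. Table/column names
// (users: id, name) and the function name are hypothetical.
function loadCsv(PDO $pdo, string $csvFile, int $batchSize = 1000): int
{
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $handle = fopen($csvFile, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $csvFile");
    }

    // Prepare once, execute many times with different parameters.
    $stmt = $pdo->prepare('INSERT INTO users (id, name) VALUES (?, ?)');

    $count = 0;
    $pdo->beginTransaction();
    try {
        while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            if ($fields === [null]) {
                continue; // fgetcsv() returns [null] for a blank line
            }
            $stmt->execute([$fields[0], $fields[1]]);
            if (++$count % $batchSize === 0) {
                // Commit this batch and open the next transaction.
                $pdo->commit();
                $pdo->beginTransaction();
            }
        }
        $pdo->commit(); // commit the final, possibly partial, batch
    } catch (Throwable $e) {
        if ($pdo->inTransaction()) {
            $pdo->rollBack();
        }
        throw $e;
    } finally {
        fclose($handle);
    }

    return $count; // number of rows inserted
}
```

With MySQL you would pass a PDO object created from a `mysql:` DSN; the batch size of 1000 is as arbitrary here as it is in the pseudocode.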

I also added convenient stuff like error checking, progress reporting, and support for default values when the data file doesn't include one of the fields.

I wrote my code in an abstract PHP class that I subclass for each table I need to load. Each subclass declares the columns it wants to load, and maps them to fields in the XML data file by name (or by position if the data file is CSV).

Bill Karwin 2010-01-06 18:14:49

It's a nice technique indeed, but this doesn't provide a solution to split individual queries, which IMO is the hardest problem.

Alix Axel 2010-01-07 18:48:25

I don't think it's practical to parse a SQL script, because there are too many edge cases. I recommend preparing the data dump as *data only*, using XML or CSV or some other format that you can parse easily in PHP.

Bill Karwin 2010-01-07 19:52:57

I agree with you Bill, but that doesn't seem to be a solution for Pekka (at least that's what I understand from his question).

Alix Axel 2010-01-08 01:04:54

I got a downvote. If you downvote on StackOverflow, please offer a comment to explain why.

Bill Karwin 2010-01-09 15:49:11

Thanks Bill. As I amended to the question, the export phase is already pretty much wrapped up using `mysqldump`, so while it is probably generally the better way to use an export format as you describe, my requirement in this question is to import actual SQL queries.

Pekka 2010-01-10 13:09:41

Answer 9

A:

If it's only 2-3 MB you can just run through them and do a mysql_query on each one; it should not take too long. If they are separated by a comma, here is the code:

$file = file_get_contents('file.txt');
$queries = explode(",", $file);
foreach($queries as $query){
    mysql_query($query);
}

Matt 2010-01-06 18:32:57

file_get_contents reads the whole file into memory. Also, exploding on a comma is just plain wrong. So is exploding on a semicolon, so don't do that either.

hobodave 2010-01-09 10:43:25

Answer 10

+2 A:

Single page PHPMyAdmin - Adminer - Just one PHP script file. check : http://www.adminer.org/en/

Zaje 2010-01-07 20:38:12

It's no solution for my automated scenario but great to know. Thanks for the link. +1

Pekka 2010-01-10 13:21:23

Anytime dude, welcome.

Zaje 2010-01-10 13:33:23

Dude! This stuff rules supreme!!! I wish I knew it existed before, it would have saved me tons of time!

Ghostrider 2010-06-10 23:49:57

Answer 11

A:

I ran into the same problem. I solved it using a regular expression:

function splitQueryText($query) {
    // the regex needs a trailing semicolon
    $query = trim($query);

    if (substr($query, -1) != ";")
        $query .= ";";

    // i spent 3 days figuring out this line
    preg_match_all("/(?>[^;']|(''|(?>'([^']|\\')*[^\\\]')))+;/ixU", $query, $matches, PREG_SET_ORDER);

    // must be an array: appending with [] to a string is a fatal error in newer PHP versions
    $querySplit = array();

    foreach ($matches as $match) {
        // get rid of the trailing semicolon
        $querySplit[] = substr($match[0], 0, -1);
    }

    return $querySplit;
}

$queryList = splitQueryText($inputText);

foreach ($queryList as $query) {
    $result = mysql_query($query);
}
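
As an alternative to a single regular expression, the same split can be done with a small character scanner that tracks whether it is currently inside a quoted string. The sketch below is only an illustration under assumptions: the name `readStatements` is hypothetical, it reads from an already opened stream line by line, and it does not understand SQL comments or DELIMITER commands.

```php
<?php
// Hypothetical sketch: yield one statement at a time from an open stream,
// splitting on ';' only outside quoted strings. SQL comments and
// DELIMITER commands are NOT handled.
function readStatements($handle): Generator
{
    $statement = '';
    $quote = null;      // the quote character we are inside of, or null
    $escaped = false;   // true if the previous character was a backslash

    while (($line = fgets($handle)) !== false) {
        $length = strlen($line);
        for ($i = 0; $i < $length; $i++) {
            $char = $line[$i];

            if ($quote !== null) {
                // Inside a string: only an unescaped matching quote ends it.
                if ($escaped) {
                    $escaped = false;
                } elseif ($char === '\\') {
                    $escaped = true;
                } elseif ($char === $quote) {
                    $quote = null;
                }
            } elseif ($char === "'" || $char === '"' || $char === '`') {
                $quote = $char;
            } elseif ($char === ';') {
                if (trim($statement) !== '') {
                    yield trim($statement);
                }
                $statement = '';
                continue; // do not keep the delimiter itself
            }

            $statement .= $char;
        }
    }

    // A final statement without a trailing semicolon.
    if (trim($statement) !== '') {
        yield trim($statement);
    }
}
```

Because it is a generator, each statement can be passed to the database as soon as it is complete, so only the current statement is held in memory. Semicolons and line breaks inside string values stay part of the statement.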

stealthdragon 2010-01-08 01:20:50

Answer 12

A:

You can use phpMyAdmin to import the file. Even if it is huge, just upload it to the directory configured as UploadDir and choose it from the phpMyAdmin import page. When file processing gets close to the PHP limits, phpMyAdmin interrupts the import and shows the import page again, with predefined values indicating where to continue the import.

Michal Čihař 2010-01-08 15:13:56

Answer 13

+1 A:

Export

The first step is getting the input in a sane format for parsing when you export it. From your question it appears that you have control over the exporting of this data, but not the importing.

~: mysqldump test --opt --skip-extended-insert | grep -v '^--' | grep . > test.sql

This dumps the test database excluding all comment lines and blank lines into test.sql. It also disables extended inserts, meaning there is one INSERT statement per line. This will help limit the memory usage during the import, but at a cost of import speed.

Import

The import script is as simple as this:

<?php

$mysqli = new mysqli('localhost', 'hobodave', 'p4ssw3rd', 'test');
$handle = fopen('test.sql', 'rb');
if ($handle) {
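    while (!feof($handle)) {
        // This assumes you don't have a row that is > 1MB (1000000)
        // which is unlikely given the size of your DB
        // Note that it has a DIRECT effect on your script's memory
        // usage.
        $buffer = stream_get_line($handle, 1000000, ";\n");
        if ($buffer === false) {
            break; // read error or nothing left to read
        }
        if (trim($buffer) === '') {
            continue; // skip the empty read after the last statement
        }
        $mysqli->query($buffer);
    }
    fclose($handle);
}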
echo "Peak MB: ",memory_get_peak_usage(true)/1024/1024;

This will utilize an absurdly low amount of memory as shown below:

daves-macbookpro:~ hobodave$ du -hs test.sql 
 15M    test.sql
daves-macbookpro:~ hobodave$ time php import.php 
Peak MB: 1.75
real    2m55.619s
user    0m4.998s
sys 0m4.588s

What that says is you processed a 15MB mysqldump with a peak RAM usage of 1.75 MB in just under 3 minutes.

Alternate Export

If you have a high enough memory_limit and this is too slow, you can try this using the following export:

~: mysqldump test --opt | grep -v '^--' | grep . > test.sql

This will allow extended inserts, which insert multiple rows in a single query. Here are the statistics for the same database:

daves-macbookpro:~ hobodave$ du -hs test.sql 
 11M    test.sql
daves-macbookpro:~ hobodave$ time php import.php 
Peak MB: 3.75
real    0m23.878s
user    0m0.110s
sys 0m0.101s

Notice that it uses over 2x the RAM at 3.75 MB, but takes about 1/7th as long. I suggest trying both methods and seeing which suits your needs.

Edit:

I was unable to get a newline to appear literally in any mysqldump output using any of CHAR, VARCHAR, BINARY, VARBINARY, and BLOB field types. If you do have BLOB/BINARY fields though then please use the following just in case:

~: mysqldump5 test --hex-blob --opt | grep -v '^--' | grep . > test.sql

hobodave 2010-01-09 10:28:38

Cheers Hobodave. I tried your solution first and it basically worked, but it dropped a number of records from a number of tables. On cursory inspection, this was because those records contained actual line breaks. While this is probably easy to fix, the bounty's running out and I feel compelled out of fairness to pick the solution that worked for me out of the box, which in this case was Axel's. Thanks for your time, and if you want to change your answer to take line-breakey content into account, I'll be happy to test run it for you (I can't dump the SQL because it contains confidential info).

Pekka 2010-01-10 13:07:02

@Pekka: what field type had a linebreak in it? I tried using TEXT and VARCHAR columns and my dump looks like: `INSERT INTO newline VALUES (1,'Four score, \nand seven years\nago');`

hobodave 2010-01-11 00:54:06

I can't reproduce it with a BLOB field either.

hobodave 2010-01-11 01:03:17

That's odd. I'll take a look at the imported data, the record number it stops at is always the same.

Pekka 2010-01-11 08:57:00
