views: 181
answers: 3

How can I implement recursive MySQL queries? I have been trying to look this up, but the resources I have found are not very helpful.

I am trying to implement logic similar to the following:

public function initiateInserts()
{
    //Open Large CSV File(min 100K rows) for parsing.
    $this->fin = fopen($file,'r') or die('Cannot open file');

    //Parsing Large CSV file to get data and initiate insertion into schema.
    $query = "";
    while (($data=fgetcsv($this->fin,5000,";"))!==FALSE)
    {
        $query = $query + "INSERT INTO dt_table (id, code, connectid, connectcode) 
                 VALUES (" + $data[0] + ", " + $data[1] + ", " + $data[2] + ", " + $data[3] + ")";
    }
     $stmt = $this->prepare($query);
     // Execute the statement
     $stmt->execute();
     $this->checkForErrors($stmt);
}

@Author: Numenor

Error Message: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '0' at line 1

This approach is what inspired me to look for a recursive MySQL query approach.

Here is the approach I was using earlier:

Current Code:

public function initiateInserts()
{
    //Open Large CSV File(min 100K rows) for parsing.
    $this->fin = fopen($file,'r') or die('Cannot open file');

    //Parsing Large CSV file to get data and initiate insertion into schema.
    while (($data=fgetcsv($this->fin,5000,";"))!==FALSE)
    {
        $query = "INSERT INTO dt_table (id, code, connectid, connectcode) 
                 VALUES (:id, :code, :connectid, :connectcode)";

        $stmt = $this->prepare($query);
        // Then, for each line : bind the parameters
        $stmt->bindValue(':id', $data[0], PDO::PARAM_INT);
        $stmt->bindValue(':code', $data[1], PDO::PARAM_INT);
        $stmt->bindValue(':connectid', $data[2], PDO::PARAM_INT);
        $stmt->bindValue(':connectcode', $data[3], PDO::PARAM_INT);

        // Execute the statement
        $stmt->execute();
        $this->checkForErrors($stmt);
    }
}

Updated Code:

public function initiateInserts()
{
    //Open Large CSV File(min 100K rows) for parsing.
    $this->fin = fopen($file,'r') or die('Cannot open file');

    //Prepare insertion query to insert data into schema.
    $query = "INSERT INTO dt_table (id, code, connectid, connectcode) 
                 VALUES (:id, :code, :connectid, :connectcode)";

    $stmt = $this->prepare($query);
    // Then, for each line : bind the parameters
    $stmt->bindValue(':id', $data[0], PDO::PARAM_INT);
    $stmt->bindValue(':code', $data[1], PDO::PARAM_INT);
    $stmt->bindValue(':connectid', $data[2], PDO::PARAM_INT);
    $stmt->bindValue(':connectcode', $data[3], PDO::PARAM_INT);

    //Loop through the CSV file and execute the prepared insert, but this is not
    //working and no data is being populated into the database.
    while (($data=fgetcsv($this->fin,5000,";"))!==FALSE)
    {
        // Execute the statement
        list($id, $code, $connid, $conncode)=$data;

        $stmt->execute();
        $this->checkForErrors($stmt);
    }
}

This is my main question, for which I am looking for suggestions!

+2  A: 
  • There's nothing recursive in that code snippet.
  • The wrong operator is used to concatenate the strings, it's . (dot) not +
  • You'd have to use something like mysqli::multi_query() to execute more than one statement with a single function call and the statements would have to be separated by a delimiter character (by default a semicolon)
  • Since you're already using prepare() and execute() why not simply make it a parametrized prepared statement and then assign the values in each iteration of the loop and execute the statement? (Exactly what is $this and what type of object does $this->prepare() return?)
  • edit and btw: $this->prepare() indicates that your class extends a database class. It also holds a file descriptor, $this->fin. This has a certain code smell. My guess is that your class uses/has a database/datasink object and a file/datasource, but is not itself a database+readfile class. Only extend a class if your derived class *is a* more specific kind of the base class.

edit: a simple example

class Foo {
  protected $pdo;
  public function __construct(PDO $pdo) {
    $this->pdo = $pdo;
  }

  public function initiateInserts($file)
  {
    $query = '
      INSERT INTO
        dt_table_tmp
        (id, code, connectid, connectcode)
      VALUES
        (:id, :code, :connid, :conncode)
    ';
    $stmt = $this->pdo->prepare($query);
    $stmt->bindParam(':id', $id);
    $stmt->bindParam(':code', $code);
    $stmt->bindParam(':connid', $connid);
    $stmt->bindParam(':conncode', $conncode);

    $fin = fopen($file, 'r') or die('Cannot open file');
    while ( false!==($data=fgetcsv($fin,5000,";")) ) {
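      // bindParam() bound $id, $code, $connid and $conncode by reference, so the list() assignment below is all execute() needs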
      list($id, $code, $connid, $conncode)=$data;
      $stmt->execute();
    }
    fclose($fin);
  }
}

$pdo = new PDO("mysql:host=localhost;dbname=test", 'localonly', 'localonly'); 
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// set up a demo table and some test data
$pdo->exec('CREATE TEMPORARY TABLE dt_table_tmp (id int, code int, connectid int, connectcode int)');
$sourcepath = 'sample.data.tmp';
$fh = fopen($sourcepath, 'wb') or die('!fopen(w)');
for($i=0; $i<10000; $i++) {
  fputcsv($fh, array($i, $i%4, $i%100, $i%3), ';');
}
fclose($fh); unset($fh);
// test script
$foo = new Foo($pdo);
$foo->initiateInserts($sourcepath);
VolkerK
Agreed with your point. I was initially using bind parameters: first I prepared the query and then bound the values at run time. The thing is, currently I am hitting the database for each row in the CSV file, and my CSV file has 100K rows, so I am hitting the database 100K times. I want to reduce the number of hits, so I am trying to build up one query covering all the rows and then execute it in one hit to the database, but I am not sure how to do it, and the approaches I have tried have not yet been successful. Any suggestions?
Rachel
Keep your first approach (unless you've done it wrong ;-)). The statement is analyzed only once by the MySQL server, which keeps the (optimizer/query planner) data associated with the statement for as long as the prepared statement is valid. Only the parameters have to be sent in each iteration, using a compact binary protocol. Are you using mysqli or PDO as the underlying/base class?
VolkerK
I am using PDO as the underlying base class, and I am under the impression that for each row in the CSV file I am preparing the statement and then executing it, so I am hitting the database for every row, which is of course not good. I have updated my question with the first approach I was using. My main motive is to reduce database hits, so I would certainly appreciate suggestions on how this can be achieved. I have also posted a link to my main question in the description.
Rachel
Don't prepare the statement in each iteration. Prepare it once, only assign the parameters in each iteration and execute the statement. see http://docs.php.net/pdo.prepared-statements
VolkerK
Now a very stupid question: do we first assign the parameters and then prepare the query, or is it the other way round? I will post my approach in the question itself, which should give some more information.
Rachel
If you look at the code above, I am preparing inside the while loop, so is this not the way it should be done? I am not sure how I can prepare the query outside the while loop, since I am reading the CSV file inside the while loop using fgetcsv. I hope I have explained my scenario properly.
Rachel
You cannot assign parameters unless you already have a prepared statement ;-) Please read the pdo.prepared-statements introduction.
VolkerK
I was very sure mine would be a stupid question. Yes, I am doing that, but before that I am trying the approach you suggested to see how it works.
Rachel
In the sample code we are calling $stmt->execute() in the while loop; will this not hit the database multiple times?
Rachel
Yes, but as said before, the whole statement is not transmitted again and again; only the parameters are, using a compact protocol. The MySQL server also doesn't have to evaluate/plan/optimize the statement again and again: it analyzes the statement once, when it is prepared, and then keeps the query plan. So yes, there might be a little overhead if you're using a TCP connection to the server, and maybe a little latency because the packets have to be delivered and acknowledged. But it's faster than sending whole statements. And the overhead gets even smaller if you use a unix socket (file).
VolkerK
oh, and btw: concatenating strings in memory has a (though maybe small) price tag, too.
VolkerK
Would it be a better solution if we could somehow prepare a query for all the entries of the CSV file and then run just one execute that inserts all the data into the database? I am not sure how this can be achieved, but I am still wondering whether it is a valid option. Would that be feasible?
Rachel
I am not that good at bargaining, but what would be an ideal, or let's say a `good`, solution for this scenario?
Rachel
If you "prepare" a statement _containing_ all the values you want to insert, then you don't need a prepared statement. It would completely defy the purpose of prepared statements. And you probably would have to deal with max\_allowed\_packet again. The default value is around 1MB which you can hit pretty easily when dealing with 100k records. It would also mean that you had 100k string concatenations of one single ever growing string in your php process.
VolkerK
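
For reference, the server's current limit can be read from PHP roughly like this (a sketch using the $pdo connection from the example above):

// Sketch: read the server's max_allowed_packet setting
$row = $pdo->query("SHOW VARIABLES LIKE 'max_allowed_packet'")->fetch(PDO::FETCH_ASSOC);
echo $row['Variable_name'], ' = ', $row['Value'], " bytes\n";
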
I am just trying to weigh the performance degradation from a database hit for every execute statement against the issues you mentioned; what would be an appropriate trade-off estimate for this?
Rachel
Also, let's say we do it without prepared statements and do something like `$query = $query + "INSERT INTO dt_table (id, code, connectid, connectcode) VALUES (" + $data[0] + ", " + $data[1] + ", " + $data[2] + ", " + $data[3] + ")"; } ` and then run the inserts. Would that be a good idea? I am just trying to figure out the best approach for this.
Rachel
Just try it. Use 100k records, then use a million. Try with the MySQL server on the same machine, then put it one or more (network) hops away. Try it with an idle server, then put some (additional) load on it. And so on and on - whatever is feasible for your scenario. You can speculate all day long ...or try it and measure it ;-)
VolkerK
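
A crude way to do that from PHP (a sketch reusing the Foo class, $pdo and $sourcepath from the example above):

// Sketch: time the whole import once and print the elapsed seconds.
$start = microtime(true);
$foo = new Foo($pdo);
$foo->initiateInserts($sourcepath);
printf("import took %.2f seconds\n", microtime(true) - $start);
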
I am now preparing the query and binding the values in the while loop and executing it outside the while loop, and I hope it works fine :)
Rachel
Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 6554129 bytes) in
Rachel
So your query string was ~6MB long at this time. Obviously that was too much for your php configuration and it would be too long for the default value of max_allowed_packet.
VolkerK
Yes. This approach would certainly not work, as my PHP configuration does not allow that much memory.
Rachel
A very strange thing is happening: nothing is being populated into the database when I execute the statement in the while loop... very strange... I am updating the question with the latest sample code to test.
Rachel
++Rachel for persistence. ++(the others) for their free course on prepared statements.
reinierpost
Try setting the error mode to PDO::ERRMODE_EXCEPTION like in the example and add some debug output to the method, e.g. `echo '#';` before the while loop and `echo '.';` within the loop.
VolkerK
I have tried it and it does work. Now, in my scenario I need to look at a type field in the CSV file and, depending on its value, `A (add), D (delete) or U (update)`, implement add, update or delete logic against the database. Currently I have if/else logic for this, and I will need it in the while loop, because for every row in the CSV file I have to use the proper query depending on its type. So I will need the prepare statement in the while loop itself, and the approach of concatenating queries together into one large query seems to be the only valid
Rachel
Continued: option, but again, as mentioned, with that approach I am running out of memory. So I guess I will have to both prepare and execute the query in the while loop, resulting in multiple database hits. I am still pondering other options that could minimize my database hits.
Rachel
Just prepare three statements before the while loop and then within the loop decide which one to use.
VolkerK
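
A rough sketch of that idea, assuming the first CSV column holds the type (A/U/D) and the remaining columns match the insert; the UPDATE and DELETE statements here are made up for illustration:

// Sketch: one prepared statement per operation, chosen per row inside the loop.
$insert = $pdo->prepare('INSERT INTO dt_table (id, code, connectid, connectcode)
                         VALUES (:id, :code, :connid, :conncode)');
$update = $pdo->prepare('UPDATE dt_table SET code = :code, connectid = :connid, connectcode = :conncode
                         WHERE id = :id');
$delete = $pdo->prepare('DELETE FROM dt_table WHERE id = :id');

$fin = fopen($file, 'r') or die('Cannot open file');
while ( false !== ($data = fgetcsv($fin, 5000, ';')) ) {
    list($type, $id, $code, $connid, $conncode) = $data;
    $params = array(':id' => $id, ':code' => $code, ':connid' => $connid, ':conncode' => $conncode);
    switch ($type) {
        case 'A': $insert->execute($params); break;
        case 'U': $update->execute($params); break;
        case 'D': $delete->execute(array(':id' => $id)); break;
    }
}
fclose($fin);
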
+1  A: 

a few tips about speeding up mysql data import

  • check if your data really needs to be parsed; sometimes LOAD DATA works just fine for CSV
  • if possible, create an SQL file first via PHP and then execute it with the mysql command line client
  • use multivalue inserts
  • disable keys before inserting

multivalue insert statement is something like

INSERT INTO users(name, age) VALUES
     ("Sam", 13), 
     ("Joe", 14),
     ("Bill", 33);

this is much faster than three distinct insert statements.
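
In PHP, a chunked multi-value insert could be built roughly like this (a sketch only; the batch size of 500 and the insertBatch() helper are made up, and max_allowed_packet still caps how big each batch may get):

// Sketch: collect CSV rows and flush them as one multi-value INSERT per batch.
function insertBatch(PDO $pdo, array $rows) {
    // one "(?,?,?,?)" group per row, all values bound as positional parameters
    $placeholders = implode(',', array_fill(0, count($rows), '(?,?,?,?)'));
    $params = array();
    foreach ($rows as $row) {
        $params = array_merge($params, array_slice($row, 0, 4));
    }
    $stmt = $pdo->prepare("INSERT INTO dt_table (id, code, connectid, connectcode) VALUES $placeholders");
    $stmt->execute($params);
}

$fin = fopen($file, 'r') or die('Cannot open file');
$rows = array();
while ( false !== ($data = fgetcsv($fin, 5000, ';')) ) {
    $rows[] = $data;
    if (count($rows) >= 500) {          // flush a full batch
        insertBatch($pdo, $rows);
        $rows = array();
    }
}
if ($rows) {                            // flush the remainder
    insertBatch($pdo, $rows);
}
fclose($fin);
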

Disabling keys is important to prevent indexing each time you're executing an INSERT:

 ALTER TABLE whatever DISABLE KEYS;
 INSERT INTO whatever .....  
 INSERT INTO whatever .....  
 INSERT INTO whatever .....  
 ALTER TABLE whatever ENABLE KEYS;

further reading http://dev.mysql.com/doc/refman/5.1/en/insert-speed.html

stereofrog
Can you elaborate on using multivalue inserts and on disabling keys before inserting? I am not clearly understanding the information provided. Thanks in advance.
Rachel
If you want to go along with `load data` take a closer look at the paragraph starting with "For security reasons" in the documentation. Still a good idea though... +1
VolkerK
@Rachel, see update
stereofrog
I cannot use the LOAD DATA functionality due to some official constraints, but it certainly seems to be a good alternative.
Rachel
Also keep in mind that if you're using multivalue inserts you have to take the max_allowed_packet value into consideration, esp. when dealing with lots of records. See http://dev.mysql.com/doc/refman/5.1/en/packet-too-large.html
VolkerK
A: 

Inspired by this question, I would say you should do something similar. If you really have that much data, then a bulk import is the most appropriate approach, and you already have the data in a file.

Have a look at the LOAD DATA INFILE command.

The LOAD DATA INFILE statement reads rows from a text file into a table at a very high speed. The file name must be given as a literal string.

If you are interested in the speed differences then read Speed of INSERT Statements.

E.g. you can do this:

$query = "LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name
          FIELDS TERMINATED BY ';' 
          LINES TERMINATED BY '\r\n'
          IGNORE 1 LINES;
         "

This will also ignore the first line assuming that it only indicates the columns.
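
Executed through PDO, that could look roughly like this (a sketch; without the LOCAL keyword the file has to be readable by the MySQL server itself):

// Sketch: run the LOAD DATA statement and report how many rows were loaded.
$rowCount = $pdo->exec($query);
echo "Loaded $rowCount rows\n";
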

Felix Kling
I am not supposed to use LOAD DATA INFILE because of security reasons!
Rachel