views:

51

answers:

2

I've got my database set up with three tables - code, tags, and code_tags for tagging posts. This will be the SQL query processed when a post is submitted. Each tag is sliced up by PHP and individually inserted using these queries.

INSERT IGNORE INTO tags (tag) VALUES ('$tags[1]');
SELECT tags.id FROM tags WHERE tag = '$tags[1]' ORDER BY id DESC LIMIT 1;
INSERT INTO code_tags (code_id, tag_id) VALUES ($codeid, WHAT_GOES_HERE?)

The WHAT_GOES_HERE? value at the end is what I need to know. It needs to be the ID of the tag that the second query fetched. How can I put that ID into the third query?

I hope I explained that correctly. I'll rephrase if necessary.

Edit: Thanks for your help so far but I'm still struggling a bit in regards to what was pointed out - if it's already there I can't get the inserted ID...?

+1  A: 

If I understand what you're attempting to achieve correctly, the second query is un-necessary - use mysql_insert_id to obtain the ID of previously inserted row, which is I presume what you need for "WHAT_GOES_HERE".

middaparka
Ooh, nice one - I never knew about that. Thanks middaparka!
Jack Webb-Heller
But keep in mind that this will only return a value _if_ a record has been inserted. You probably use INSERT _IGNORE_ and a unique index on tags.tag to avoid duplicate tags. If you "add" an existing tag (again) and therefore the record is ignored mysql\_insert_id() doesn't return the id of the previously inserted tag record.
VolkerK
Ah... good point. Just realised that myself. Is there any better way of doing it?
Jack Webb-Heller
+1  A: 

If you use INSERT IGNORE and a new record is ignored (because of a unique key violation) mysql_insert_id() and LAST_INSERT_ID() don't have a meaningful value.

But you can use INSERT ... ON DUPLICATE KEY UPDATE and LAST_INSERT_ID(expr) to set the data you expect LAST_INSERT_ID() to return in case of a doublet.

Step-by-step:
Let's assume you have a table tags like

CREATE TABLE tags (
  id int auto_increment,
  tag varchar(32),
  dummy int NOT NULL DEFAULT 0, /* for demo purposes only */
  primary key(id),
  unique key(tag)
)

Inserting a tag twice results in a duplicate key violation because of unique key(tag). That's probably the reason why you've used INSERT IGNORE. In that case MySQL ignores the violation but the new record is ignored as well. The problem is that you want the id of the record having tag='xyz' regardless of whether it has been newly created or it was already in the database. But right now mysql_insert_id()/LAST_INSERT_ID() can oly provide the id of a new record, not an ignored one.

With INSERT ...ON DUPLICATE you can react on such duplicate key violations. If the new record can be inserted (no violation) it behaves like a "normal" INSERT. But in case of a duplicate key violation the part after ON DUPLICATE KEY is executed like an UPDATE statement for the record with that particular index value already existing in the table. E.g. (with an empty table tags)

INSERT INTO tags (tag) VALUES ('tag A') ON DUPLICATE KEY UPDATE dummy=dummy+1

This will simply insert the record as if there was no ON DUPLICATE ... clause. id gets the next auto-increment value, dummy the default value of 0 and tag='tag A'. Let's assume the newly create auto_increment value was 1. The resulting record stored in MySQL is (id=1, tag='tag A', dummy=0) and LAST_INSERT_ID() will return 1 right after this query. So far so good.
Now if you insert the same record again with the same query a violation occurs because of the first record (id=1, 'tag=tag A', dummy=0). For this already exisitng record the UPDATE statement after ON DUPLICATE KEY is executed, i.e. the record becomes (id=1, tag='tag A', dummy=1). But since no new record has been created there was also no new auto_increment value and LAST_INSERT_ID() becomes meaningless. So still the same problem as with INSERT IGNORE.
But there is a "special" construct that allows you to set the value LAST_INSERT_ID() is supposed to return after the ON DUPLICATE KEY UPDATE statement has been executed.

id=LAST_INSERT_ID(id)

Looks strange but it really only sets the value LAST_INSERT_ID() will return.
If you use the statement

  INSERT INTO
    tags
    (tag)
  VALUES
    ('xyz')
  ON DUPLICATE KEY UPDATE
    id=LAST_INSERT_ID(id)

LAST_INSERT_ID() will always return the id of the record having tag='xyz' no matter if it was added by the INSERT part or "updated" by the ON DUPLICATE KEY part.
I.e. if your next query is

INSERT INTO
  code_tags
  (code_id, tag_id)
VALUES
  (4711, LAST_INSERT_ID())

the tags.id for the tag 'xyz' is used.

The self-contained example script uses PDO and prepared statements. It should do more or less what you want to achieve.

$pdo = new PDO("mysql:host=localhost;dbname=test", 'localonly', 'localonly'); 
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// set up temporary table and demo data
$pdo->exec('CREATE TEMPORARY TABLE tmpTags (id int auto_increment, tag varchar(32), primary key(id), unique key(tag))');
$pdo->exec('CREATE TEMPORARY TABLE tmpCode_tags (code_id int, tag_id int)');
$pdo->exec("INSERT INTO tmpTags (tag) VALUES ('tagA'), ('tagB')");

// prepare the statements
// set id=LAST_INSERT_ID(id), so LAST_INSERT_ID() gets a value even if the record is "ignored"
$stmtTags = $pdo->prepare('
  INSERT INTO
    tmpTags
    (tag)
  VALUES
    (:tag)
  ON DUPLICATE KEY UPDATE
    id=LAST_INSERT_ID(id)
');
$stmtTags->bindParam(':tag', $tag);

$stmtCodeTags = $pdo->prepare('INSERT INTO tmpCode_tags (code_id, tag_id) VALUES (:codeid, LAST_INSERT_ID())');
$stmtCodeTags->bindParam(':codeid', $codeid);

// and some new records we want to insert
$testdata = array(
  array('codeid'=>1, 'tags'=>'tagA tagC'), // tagA is already in the table "tags", tagC is a "new" tag
  array('codeid'=>2, 'tags'=>'tagC tagD tagE') // tagC will already be inserted, tagD and atgE are "new"
);

// process (test)data
foreach($testdata as $data) {
  // the parameter :codeid of $stmtCodeTags is bound to $codeid; assign it the "current" value
  $codeid = $data['codeid'];
  // split the tags
  $tags = explode(' ', $data['tags']);
  foreach($tags as $tag) {
    // the parameter :tag is bound to $tag
    // nothing more to do than to execute the statement
    $stmtTags->execute();
    // the parameter :codeid is bound to $codeid which was set to $codeid=$data['codeid']
    // again nothing more to do than to execute the statement
    $stmtCodeTags->execute();
  }
}
unset($stmtTags);
unset($stmtCodeTags);


// let's see what we've got
$query = '
  SELECT
    ct.code_id, t.tag
  FROM
    tmpCode_tags as ct
  JOIN
    tmpTags as t
  ON
    ct.tag_id=t.id
'; 
foreach( $pdo->query($query, PDO::FETCH_NUM) as $row ) {
  echo join(', ', $row), "\n";
}

prints

1, tagA
1, tagC
2, tagC
2, tagD
2, tagE

edit2: In case the PDO-part of the script and the prepared statements are intimidating, here's the same thing using the old php-mysql module. But I urge you to use parametrized prepared statements. Doesn't have to be PDO but I happen to like it. E.g. the mysqli module provides prepared statements as well, the old mysql module doesn't.

$mysql = mysql_connect('localhost', 'localonly', 'localonly') or die(mysql_error());
mysql_select_db('test', $mysql) or die(mysql_error());

// set up temporary table and demo data
mysql_query('CREATE TEMPORARY TABLE tmpTags (id int auto_increment, tag varchar(32), primary key(id), unique key(tag))', $mysql) or die(mysql_error());
mysql_query('CREATE TEMPORARY TABLE tmpCode_tags (code_id int, tag_id int)', $mysql) or die(mysql_error());
mysql_query("INSERT INTO tmpTags (tag) VALUES ('tagA'), ('tagB')", $mysql) or die(mysql_error());


// and some new records we want to insert
$testdata = array(
  array('codeid'=>1, 'tags'=>'tagA tagC'), // tagA is already in the table "tags", tagC is a "new" tag
  array('codeid'=>2, 'tags'=>'tagC tagD tagE') // tagC will already be inserted, tagD and atgE are "new"
);

// "prepare" the statements.
// This is nothing like the server-side prepared statements mysqli and pdo offer.
// we have to insert the parameters into the query string, i.e. the parameters must
// be escaped so that they cannot mess up the statement.
// see mysql_real_escape_string() for string literals within the sql statement.
$qsTags = "
  INSERT INTO
    tmpTags
    (tag)
  VALUES
    ('%s')
  ON DUPLICATE KEY UPDATE
    id=LAST_INSERT_ID(id)  
";

$qsCodeTags = "
  INSERT INTO
    tmpCode_tags
    (code_id, tag_id)
  VALUES
    ('%s', LAST_INSERT_ID())
";

foreach($testdata as $data) {
  // in this example codeid is a simple number
  // let's treat it as a string literal in the statement anyway
  $codeid = mysql_real_escape_string($data['codeid'], $mysql);
  $tags = explode(' ', $data['tags']);
  foreach($tags as $tag) {
    // now $tag is certainly a string parameter
    $tag = mysql_real_escape_string($tag, $mysql);
    $query = sprintf($qsTags, $tag);
    mysql_query($query, $mysql) or die(mysql_error());

    $query = sprintf($qsCodeTags, $codeid);
    mysql_query($query, $mysql) or die(mysql_error());
  }
}

// let's see what we've got
$query = '
  SELECT
    ct.code_id, t.tag
  FROM
    tmpCode_tags as ct
  JOIN
    tmpTags as t
  ON
    ct.tag_id=t.id
'; 
$result = mysql_query($query, $mysql) or die(mysql_error());
while ( false!==($row=mysql_fetch_row($result)) ) {
  echo join(', ', $row), "\n";
}
VolkerK
Intense answer! Thanks.
Jack Webb-Heller
Perhaps a little too intense, sorry, I'm still a bit of a novice and that code is a bit intimidating. What would I 'update' with ON DUPLICATE KEY UPDATE? Perhaps you could explain the context a little more?
Jack Webb-Heller
answer expanded
VolkerK
Possibly the greatest answer I have ever seen on StackOverflow... VolkerK, you are amazing.
Jack Webb-Heller
I've put it into practise and now everything works great - thank you so much!
Jack Webb-Heller