views:

67

answers:

3

I want to store the last X records in a MySQL database in an efficient way. So when the 4th record is stored, the first should be deleted.

The way I do it now is to first run a query to get the items, then decide what to do, and then insert/delete.

There has to be a better way to do this. Any suggestions?

Edit

I think I should add that the records stored do not have a unique number. They have a composite pair, for example article_id and user_id.

Then I want to make a table with the last X items for user_x.

Just selecting the articles from the table, grouped by user and sorted by time, is not an option for me. The table I would sort and group on has millions of records and gets hit a lot for no reason. So making an in-between table with the last X records is far more efficient.

PS. I am not using this for articles and users.

A: 

Keep the primary keys of the first three items in a file or other persistent storage, then do an UPDATE statement instead of a delete/insert?

If you want to keep it all in the DB, then I would add a unique sequence number to all records in the in-between table (let's call it seqid), then still perform a query, but instead of fetching whole rows, only get the sequence numbers, e.g.

SELECT seqid FROM inbetweentable WHERE article_id=? AND user_id=?

Order that in your programming language (let's say PHP) and then do an update:

UPDATE inbetweentable SET seqid=BIGGESTID+1, ... WHERE seqid=SMALLESTID 

(BIGGESTID and SMALLESTID come from your PHP code)
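The update-in-place idea above can be sketched end to end. Here is a minimal illustration using Python's sqlite3 in place of MySQL/PHP; the table and column names are the ones from this answer, while the `add_entry` helper and the `keep=3` window are assumptions for the sketch:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE inbetweentable (
    seqid INTEGER PRIMARY KEY,
    article_id INTEGER, user_id INTEGER, data TEXT)""")

def add_entry(con, article_id, user_id, data, keep=3):
    # Fetch only the sequence numbers for this pair, not whole rows.
    ids = [r[0] for r in con.execute(
        "SELECT seqid FROM inbetweentable WHERE article_id=? AND user_id=? "
        "ORDER BY seqid", (article_id, user_id))]
    if len(ids) < keep:
        con.execute(
            "INSERT INTO inbetweentable (article_id, user_id, data) VALUES (?,?,?)",
            (article_id, user_id, data))
    else:
        # Recycle the oldest slot: it becomes the newest record.
        smallest, biggest = ids[0], ids[-1]
        con.execute("UPDATE inbetweentable SET seqid=?, data=? WHERE seqid=?",
                    (biggest + 1, data, smallest))
    con.commit()

for i in range(5):
    add_entry(con, 1, 1, f"rev{i}")
rows = list(con.execute("SELECT data FROM inbetweentable ORDER BY seqid"))
```

After five calls only the newest three revisions survive, and no row was ever deleted: once the window is full, every "insert" is really an UPDATE of the oldest slot.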

Edit: To return only one value (a comma-separated string) from the SQL statement, do:

 SELECT GROUP_CONCAT(seqid ORDER BY seqid) AS idsCsv FROM inbetweentable WHERE article_id=? AND user_id=?

and parse it in PHP. This spares the looping code on both the MySQL and PHP sides and should be faster, e.g.

<?php
// Get the single row returned by the GROUP_CONCAT query
$row = $result->fetch_assoc(); // assuming a mysqli result in $result
$seqIds = explode(',', $row['idsCsv']);
Hannes de Jager
Thank you for answering the question. However, I do not really understand it. Can you be a bit more specific? I want to keep everything in the database, and preferably in the same table. I do not understand what you mean by 'pairs for your persistent storage'.
Saif Bechan
I updated the question. I think you got the wrong idea; the table is not just X records long. Please check the question for updates.
Saif Bechan
Ok, given your new info, I've updated my answer
Hannes de Jager
Ok, can you tell me how this is different from what I am doing now? In the first part of the question I think I describe the exact same thing: getting some results in PHP and then doing an update/insert.
Saif Bechan
The way you explained it, it sounded like you were fetching whole rows instead of just the ids, which means more data over the network. Also see my update with GROUP_CONCAT.
Hannes de Jager
Taking my example, you could probably convert that logic into an ON INSERT trigger on the inbetweentable table, so that on the PHP side you just run "INSERT INTO inbetweentable SET article_id=?, user_id=?, ...;". The trigger would then use the sequence numbers in a similar way to delete the oldest row. Using a trigger will be faster because you are only sending data to MySQL once.
Hannes de Jager
Do you have any idea how to make such a trigger? The trigger looks like the best option in my opinion.
Saif Bechan
Ok, so it turns out a trigger won't work, because MySQL can't update another row of the table the trigger was invoked for. So I created a sample stored procedure answer for you. See my new answer.
Hannes de Jager
A: 

I think you should do the following in a transaction:

  1. Insert new records
  2. Get ids of the excess records
  3. Delete records with the ids from step 2, if any

It would be great if you could merge steps 2 and 3 into a single query, but that does not seem to be possible, since you would need to sort on and delete from the same table, which MySQL does not allow.

Here are some hints:

CREATE TEMPORARY TABLE items (
  item_id int unsigned not null auto_increment,
  data_ varchar(5) not null,
  primary key(item_id)
);

INSERT INTO items (data_)
  VALUES ('data1'), ('data2'), ('data3'), ('data4'), ('data5'), ('data6');

# select ids of the excess items
SELECT item_id
  FROM items, (select @cnt:=0) as cnt
  WHERE IF((@cnt:=@cnt+1)<=3, 0, 1)
  ORDER BY item_id DESC;

The last query will return:

+-------+
|item_id|
+-------+
|   3   |
|   2   |
|   1   |
+-------+
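The insert-then-trim transaction (steps 1-3 above) can be sketched as a whole. Here is a minimal illustration using Python's sqlite3; the MySQL `@cnt` user-variable trick doesn't port, so a LIMIT/OFFSET subquery stands in, and the `insert_and_trim` helper and `keep=3` window are assumptions for the sketch:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE items (
    item_id INTEGER PRIMARY KEY AUTOINCREMENT,
    data_ TEXT NOT NULL)""")

def insert_and_trim(con, value, keep=3):
    with con:  # one transaction for steps 1-3
        # Step 1: insert the new record.
        con.execute("INSERT INTO items (data_) VALUES (?)", (value,))
        # Step 2: get the ids of everything past the newest `keep` rows.
        excess = [r[0] for r in con.execute(
            "SELECT item_id FROM items ORDER BY item_id DESC LIMIT -1 OFFSET ?",
            (keep,))]
        # Step 3: delete them, if any.
        if excess:
            con.executemany("DELETE FROM items WHERE item_id=?",
                            [(i,) for i in excess])

for v in ("data1", "data2", "data3", "data4", "data5", "data6"):
    insert_and_trim(con, v)
remaining = [r[0] for r in con.execute("SELECT data_ FROM items ORDER BY item_id")]
```

After the six inserts only the last three rows remain, mirroring the excess-id result table shown above.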
newtover
Do you think this method is inferior (performance-wise) to the simple approach I had: just getting the values in PHP and deciding in PHP whether to insert, or delete and insert? I have my doubts about whether your method will be efficient on large data sets.
Saif Bechan
+1  A: 

Implement it in a stored procedure (the table is named ibt, which stands for in-between table):

delimiter ;
DROP TABLE IF EXISTS `ibt`;
CREATE TABLE `ibt` (
  `seqid` int(10) unsigned NOT NULL auto_increment,
  `article_id` varchar(10) NOT NULL default '',
  `user_id` varchar(10) NOT NULL default '',  
   anotherVar VARCHAR(10),
  PRIMARY KEY  (`article_id`,`user_id`),
  KEY `seqid` (`seqid`)
) ENGINE=MEMORY AUTO_INCREMENT=7 DEFAULT CHARSET=latin1;


drop procedure if exists addEntry;
delimiter $$
create procedure addEntry(_article_id INT, _user_id INT, _anotherVar VARCHAR(10))
begin
  DECLARE done INT DEFAULT 0;
  declare seq INT;    
  declare seqNew INT DEFAULT 1;  
  declare Cnt INT DEFAULT 0;  

  declare cur CURSOR for
      SELECT seqid
      from ibt 
      where user_id=_user_id   
      order by seqid desc;  
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

  START TRANSACTION;

  open cur;  
  REPEAT  
    FETCH cur INTO seq;    
    IF NOT done THEN        
      SET Cnt = Cnt+1;       
      IF Cnt >= 3 THEN
         DELETE FROM `ibt` where seqid = seq;
      END IF;      
      IF Cnt = 1 THEN            
         SET seqNew = seq + 1;
      END IF;      
    END IF;
  UNTIL done END REPEAT;

  INSERT into `ibt` 
  SET article_id=_article_id, 
        user_id=_user_id,  
      seqid=seqNew,      
      anotherVar=_anotherVar;

  close cur;  

  COMMIT;
end $$

delimiter ;

call addEntry(1, 1, 'a');
call addEntry(2, 1, 'b');
call addEntry(3, 1, 'c');
call addEntry(4, 1, 'd');

You can run the SQL above as a unit to test. I used HeidiSQL.

Once you have the stored procedure in your DB, you can do "CALL addEntry(...)" from your PHP code.
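What one call to the procedure does can be paraphrased in plain code. Here is a rough model of addEntry's per-user effect using Python's sqlite3 (the table and column names are from this answer; the cursor loop is paraphrased as a list, and the `keep=3` window is an assumption):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE ibt (
    seqid INTEGER, article_id INTEGER, user_id INTEGER, anotherVar TEXT)""")

def add_entry(con, article_id, user_id, another_var, keep=3):
    with con:  # the procedure's START TRANSACTION ... COMMIT
        # The cursor's SELECT: this user's sequence numbers, newest first.
        ids = [r[0] for r in con.execute(
            "SELECT seqid FROM ibt WHERE user_id=? ORDER BY seqid DESC",
            (user_id,))]
        # The Cnt >= 3 branch: drop the row that falls off the window.
        if len(ids) >= keep:
            con.execute("DELETE FROM ibt WHERE user_id=? AND seqid=?",
                        (user_id, ids[keep - 1]))
        # The Cnt = 1 branch: the new row gets the next sequence number.
        seq_new = ids[0] + 1 if ids else 1
        con.execute(
            "INSERT INTO ibt (seqid, article_id, user_id, anotherVar) "
            "VALUES (?,?,?,?)", (seq_new, article_id, user_id, another_var))

# Same four calls as the test script above.
for art, var in ((1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')):
    add_entry(con, art, 1, var)
kept = [r[0] for r in con.execute(
    "SELECT anotherVar FROM ibt WHERE user_id=1 ORDER BY seqid")]
```

After the fourth call, row 'a' has been dropped and only the last three entries for that user remain, which is exactly the window the procedure maintains.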

Hannes de Jager
+1 Thank you, this looks like the solution I was looking for. I was not aware that such an operation was so hard to achieve. There is a lot going on, with the counting, declaring, etc. Do you think this solution will perform better than the first one, where you just get the items in PHP and choose what to do with them? That causes a little more traffic between PHP and MySQL.
Saif Bechan
It's more syntactically verbose than a lot going on; that's the nature of SQL. I believe it will perform better. Besides the traffic, there is also the fact that MySQL must parse the two SQL statements from PHP and work out query plans for them each time. With this solution, most of the extra work is done the moment you create the stored procedure. It's pre-compiled.
Hannes de Jager
Great! Thank you for your time sir.
Saif Bechan
My pleasure! I'd be interested to see stats on the performance improvements. Lastly, come to think of it, it may be possible to put this stored procedure logic in a trigger on the original table instead of the in-between table (where it could not work), so that when you insert a row in the original table, the trigger does everything for you. No extra PHP. But that's just a thought at this point.
Hannes de Jager