views:

67

answers:

3

I want to store the last X records in a MySQL database in an efficient way. So when the 4th record is stored, the first should be deleted.

The way I do it now is to first run a query to get the items, then decide what to do, and then insert/delete.

There has to be a better way to do this. Any suggestions?

Edit

I think I should add that the records stored do not have a unique number. They have a composite pair, for example article_id and user_id.

Then I want to make a table with the last X items for user_x.

Just selecting the articles from the table, grouped by user and sorted by time, is not an option for me. The table I would sort and group on has millions of records and gets hit a lot for no reason. So making an in-between table with the last X records is far more efficient.

PS. I am not using this for articles and users.

A: 

Keep the primary keys of the first three items in a file or other persistent storage, then do an UPDATE statement instead of a delete/insert?

If you want to keep it all in the DB, then I would add a unique sequence number to all records in the in-between table (let's call it seqid), then still perform a query, but instead of fetching whole rows, only get the sequence numbers, e.g.

SELECT seqid FROM inbetweentable WHERE article_id=? AND user_id=?

Order that in your programming language (let's say PHP) and then do an update:

UPDATE inbetweentable SET seqid=BIGGESTID+1, ... WHERE seqid=SMALLESTID 

(BIGGESTID and SMALLESTID come from your PHP code)
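The update-in-place idea above can be sketched end to end. Here is a minimal illustration using Python's sqlite3 in place of MySQL/PHP; the table and column names are the ones from this answer, while the `add_entry` helper and the `keep=3` window are assumptions for the sketch:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE inbetweentable (
    seqid INTEGER PRIMARY KEY,
    article_id INTEGER, user_id INTEGER, data TEXT)""")

def add_entry(con, article_id, user_id, data, keep=3):
    # Fetch only the sequence numbers for this pair, not whole rows.
    ids = [r[0] for r in con.execute(
        "SELECT seqid FROM inbetweentable WHERE article_id=? AND user_id=? "
        "ORDER BY seqid", (article_id, user_id))]
    if len(ids) < keep:
        con.execute(
            "INSERT INTO inbetweentable (article_id, user_id, data) VALUES (?,?,?)",
            (article_id, user_id, data))
    else:
        # Recycle the oldest slot: it becomes the newest record.
        smallest, biggest = ids[0], ids[-1]
        con.execute("UPDATE inbetweentable SET seqid=?, data=? WHERE seqid=?",
                    (biggest + 1, data, smallest))
    con.commit()

for i in range(5):
    add_entry(con, 1, 1, f"rev{i}")
rows = list(con.execute("SELECT data FROM inbetweentable ORDER BY seqid"))
```

After five calls only the newest three revisions survive, and no row was ever deleted: once the window is full, every "insert" is really an UPDATE of the oldest slot.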

Edit: To return only one value (a comma-separated string) from the SQL statement, do:

 SELECT GROUP_CONCAT(seqid ORDER BY seqid) AS idsCsv FROM inbetweentable WHERE article_id=? AND user_id=?

and parse it in PHP. This spares the looping code on both the MySQL and PHP sides and should be faster, e.g.

<?php
// Get the single row returned by the GROUP_CONCAT query
$row = $result->fetch_assoc(); // assuming a mysqli result in $result
$seqIds = explode(',', $row['idsCsv']);
Hannes de Jager
Thank you for answering the question. However, I do not really understand it. Can you be a bit more specific? I want to keep everything in the database, and preferably in the same table. I do not understand what you mean by 'pairs for your persistent storage'.
Saif Bechan
I updated the question. I think you got the wrong idea; the table is not just X records long. Please check the question for updates.
Saif Bechan
Ok, given your new info, I've updated my answer
Hannes de Jager
Ok, can you tell me how this is different from what I am doing now? In the first part of the question I think I describe the exact same thing: getting some results in PHP and then doing an update/insert.
Saif Bechan
The way you explained it, it sounded like you were fetching whole rows instead of just the ids, which means more data over the network. Also see my update with GROUP_CONCAT.
Hannes de Jager
Taking my example, you could probably convert that logic into an ON INSERT trigger on the inbetweentable table, so that on the PHP side you just run "INSERT INTO inbetweentable SET article_id=?, user_id=?, ...;". The trigger would then use the sequence numbers in a similar way to delete the oldest row. Using a trigger will be faster because you are only sending data to MySQL once.
Hannes de Jager
Do you have any idea how to make such a trigger? The trigger looks like the best option in my opinion.
Saif Bechan
Ok, so it turns out a trigger won't work, because MySQL can't update another row of the table the trigger was invoked for. So I created a sample stored procedure answer for you. See my new answer.
Hannes de Jager
A: 

I think you should do the following in a transaction:

  1. Insert new records
  2. Get ids of the excess records
  3. Delete records with the ids from step 2, if any

It would be great if you could merge steps 2 and 3 into a single query, but that does not seem to be possible, since you would need to sort on and delete from the same table, which MySQL does not allow.

Here are some hints:

CREATE TEMPORARY TABLE items (
  item_id int unsigned not null auto_increment,
  data_ varchar(5) not null,
  primary key(item_id)
);

INSERT INTO items (data_)
  VALUES ('data1'), ('data2'), ('data3'), ('data4'), ('data5'), ('data6');

# select ids of the excess items
SELECT item_id
  FROM items, (select @cnt:=0) as cnt
  WHERE IF((@cnt:=@cnt+1)<=3, 0, 1)
  ORDER BY item_id DESC;

The last query will return:

+-------+
|item_id|
+-------+
|   3   |
|   2   |
|   1   |
+-------+
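The insert-then-trim transaction (steps 1-3 above) can be sketched as a whole. Here is a minimal illustration using Python's sqlite3; the MySQL `@cnt` user-variable trick doesn't port, so a LIMIT/OFFSET subquery stands in, and the `insert_and_trim` helper and `keep=3` window are assumptions for the sketch:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE items (
    item_id INTEGER PRIMARY KEY AUTOINCREMENT,
    data_ TEXT NOT NULL)""")

def insert_and_trim(con, value, keep=3):
    with con:  # one transaction for steps 1-3
        # Step 1: insert the new record.
        con.execute("INSERT INTO items (data_) VALUES (?)", (value,))
        # Step 2: get the ids of everything past the newest `keep` rows.
        excess = [r[0] for r in con.execute(
            "SELECT item_id FROM items ORDER BY item_id DESC LIMIT -1 OFFSET ?",
            (keep,))]
        # Step 3: delete them, if any.
        if excess:
            con.executemany("DELETE FROM items WHERE item_id=?",
                            [(i,) for i in excess])

for v in ("data1", "data2", "data3", "data4", "data5", "data6"):
    insert_and_trim(con, v)
remaining = [r[0] for r in con.execute("SELECT data_ FROM items ORDER BY item_id")]
```

After the six inserts only the last three rows remain, mirroring the excess-id result table shown above.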
newtover
Do you think this method is inferior (performance-wise) to the simple approach I had: just getting the values in PHP and deciding in PHP whether to insert, or delete and insert? I have my doubts about whether your method will be efficient on large data sets.
Saif Bechan
+1  A: 

Implement it in a stored procedure (the table is named ibt, which stands for in-between table):

delimiter ;
DROP TABLE IF EXISTS `ibt`;
CREATE TABLE `ibt` (
  `seqid` int(10) unsigned NOT NULL auto_increment,
  `article_id` varchar(10) NOT NULL default '',
  `user_id` varchar(10) NOT NULL default '',  
   anotherVar VARCHAR(10),
  PRIMARY KEY  (`article_id`,`user_id`),
  KEY `seqid` (`seqid`)
) ENGINE=MEMORY AUTO_INCREMENT=7 DEFAULT CHARSET=latin1;


drop procedure if exists addEntry;
delimiter $$
create procedure addEntry(_article_id INT, _user_id INT, _anotherVar VARCHAR(10))
begin
  DECLARE done INT DEFAULT 0;
  declare seq INT;    
  declare seqNew INT DEFAULT 1;  
  declare Cnt INT DEFAULT 0;  

  declare cur CURSOR for
      SELECT seqid
      from ibt 
      where user_id=_user_id   
      order by seqid desc;  
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

  START TRANSACTION;

  open cur;  
  REPEAT  
    FETCH cur INTO seq;    
    IF NOT done THEN        
      SET Cnt = Cnt+1;       
      IF Cnt >= 3 THEN
         DELETE FROM `ibt` where seqid = seq;
      END IF;      
      IF Cnt = 1 THEN            
         SET seqNew = seq + 1;
      END IF;      
    END IF;
  UNTIL done END REPEAT;

  INSERT into `ibt` 
  SET article_id=_article_id, 
        user_id=_user_id,  
      seqid=seqNew,      
      anotherVar=_anotherVar;

  close cur;  

  COMMIT;
end $$

delimiter ;

call addEntry(1, 1, 'a');
call addEntry(2, 1, 'b');
call addEntry(3, 1, 'c');
call addEntry(4, 1, 'd');

You can run the SQL above as a unit to test. I used HeidiSQL.

Once you have the stored procedure in your DB, you can do "CALL addEntry(...)" from your PHP code.
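What one call to the procedure does can be paraphrased in plain code. Here is a rough model of addEntry's per-user effect using Python's sqlite3 (the table and column names are from this answer; the cursor loop is paraphrased as a list, and the `keep=3` window is an assumption):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE ibt (
    seqid INTEGER, article_id INTEGER, user_id INTEGER, anotherVar TEXT)""")

def add_entry(con, article_id, user_id, another_var, keep=3):
    with con:  # the procedure's START TRANSACTION ... COMMIT
        # The cursor's SELECT: this user's sequence numbers, newest first.
        ids = [r[0] for r in con.execute(
            "SELECT seqid FROM ibt WHERE user_id=? ORDER BY seqid DESC",
            (user_id,))]
        # The Cnt >= 3 branch: drop the row that falls off the window.
        if len(ids) >= keep:
            con.execute("DELETE FROM ibt WHERE user_id=? AND seqid=?",
                        (user_id, ids[keep - 1]))
        # The Cnt = 1 branch: the new row gets the next sequence number.
        seq_new = ids[0] + 1 if ids else 1
        con.execute(
            "INSERT INTO ibt (seqid, article_id, user_id, anotherVar) "
            "VALUES (?,?,?,?)", (seq_new, article_id, user_id, another_var))

# Same four calls as the test script above.
for art, var in ((1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')):
    add_entry(con, art, 1, var)
kept = [r[0] for r in con.execute(
    "SELECT anotherVar FROM ibt WHERE user_id=1 ORDER BY seqid")]
```

After the fourth call, row 'a' has been dropped and only the last three entries for that user remain, which is exactly the window the procedure maintains.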

Hannes de Jager
+1 Thank you, this looks like the solution I was looking for. I was not aware that such an operation was so hard to achieve. There is a lot going on, with the counting, declaring, etc. Do you think this solution will perform better than the first one, where you just get the items in PHP and choose what to do with them? That causes a little more traffic between PHP and MySQL.
Saif Bechan
It's more syntactically verbose than a lot going on; that's the nature of SQL. I believe it will perform better. Besides the traffic, there is also the fact that MySQL must parse the two SQL statements from PHP and work out query plans for them each time. With this solution, most of the extra work is done the moment you create the stored procedure. It's pre-compiled.
Hannes de Jager
Great! Thank you for your time sir.
Saif Bechan
My pleasure! I'd be interested to see stats on the performance improvements. Lastly, come to think of it, it may be possible to put this stored procedure logic in a trigger on the original table instead of the in-between table (where it could not work), so that when you insert a row in the original table, the trigger does everything for you. No extra PHP. But that's just a thought at this point.
Hannes de Jager