views:

135

answers:

2

Hi All, I have a Patient information table with ~50 million records. I need to check some samples for each year which may be in any order. Here are the sample date available in database "20090722", "20080817", ... "19980301". Also i have a primary-key column called "PID". My requirement is to get 2 or 3 samples for each year with a query.

I tried to get 2 samples for each year using sub-queries, i am not succeeded.

Any one in this forum have idea on this kind of requirement, If so please help me.

Guys i tried the following query in sql server and it worked find. But i need the query in MYSQL. Please help me out.

select pid,studydate  
  FROM (SELECT ROW_NUMBER() OVER ( PARTITION BY studydate ORDER BY pid DESC ) AS     
      'RowNumber', pid,studydate 
      FROM patient
  ) pt
  WHERE RowNumber <= 2
A: 

If I understand you correctly you could do something like this:

select year(datecolumn) as Year, 
  (select id from PatiendRecords pr2 where pr2.id>=min(pr.id)+rand()*max(pr.id) LIMIT 1), 
  (select id from PatiendRecords pr2 where pr2.id>=min(pr.id)+rand()*max(pr.id) LIMIT 1), 
  (select id from PatiendRecords pr2 where pr2.id>=min(pr.id)+rand()*max(pr.id) LIMIT 1) 
from PatiendRecords pr
group by year(datecolumn);

EDIT

delimiter //

CREATE PROCEDURE RandomRecordsPerYear(n INT)
BEGIN
  CREATE TEMPORARY TABLE lookup
  (id INT) ENGINE = MEMORY;

  SET @x = 0;
  REPEAT SET @x = @x + 1; 
     INSERT INTO lookup (id)  
     SELECT (SELECT id FROM PatientRecords pr WHERE pr2.id>=min(pr.id)+rand()*max(pr.id) LIMIT 1) AS Id FROM PatientRecords pr GROUP BY year(created_at);
  UNTIL @x >= n END REPEAT;

  SELECT * FROM PatientRecords s JOIN lookup l ON l.id=pr.id;
  DROP TABLE lookup;
END
//

call RandomRecordsPerYear(3)//

PS. I find it pretty cool that you have 50 million patient records in a MySQL database. DS.

Jonas Elfström
Hi Jonas,You almost did it. Actually i need the entire row not IDs. Could you please tell me the changes i need to made in the query to get 'n' rows of each year. - Thank you
Anil Kumar.C
I couldn't think of a way to do it in one query.
Jonas Elfström
A: 
SELECT  md.*
FROM    (
        SELECT  @r := @r + 1 AS y
        FROM    (
                @r := 0
                ) vars
        CROSS JOIN
                mytable
        LIMIT 200
        ) years
JOIN    mytable md
ON      md.datecol >= CAST('1900-01-01' AS DATETIME) + INTERVAL y YEARS
        AND md.datecol < CAST('1900-01-01' AS DATETIME) + INTERVAL y + 1 YEARS
        AND md.id <=
        COALESCE(
        (
        SELECT  id
        FROM    mytable mi
        WHERE   mi.datecol >= CAST('1900-01-01' AS DATETIME) + INTERVAL y YEARS
                AND mi.datecol < CAST('1900-01-01' AS DATETIME) + INTERVAL y + 1 YEARS
        ORDER BY
                id
        LIMIT 2
        ), 0xFFFFFFFF)
Quassnoi