views:

98

answers:

2

I have generated a dataset that contains data spanning thirty days. Im trying to issolate new data elements that have appeared in the last 2 days but not in the previous 28 days before that.

  1. I run a PHP script that generates the test data. (PHP and MYSQL return the same time when tested)

  2. I run the following query against it.

  3. Results are returned accuretly for aproximetly half an hour. Then despite the fact I believe there to be matching records none are returned when running this query.

Is there any obvious mistake I'm making in the SQL that would cause this apparent 'drift' to occur?

About The Data:

The script generates a 'race' per day. It populates the ranking tables with ranking of the 10 'jokeys'. For the purposes of testing the script generates races from the previous 2 days with 2 new 'jokeys' in the top 10. The remaining 30 days the races are identical.

Results Expected:

The names of two jokeys who have recently ranked in a race (in the last two days and have not ranked in the previous 28).

The SQL:

SELECT *, FROM_UNIXTIME(`race_timestamp`) as ts FROM `rankings`
WHERE `race_venue` = UNHEX(MD5('someplace'))
AND `jokey` IN
  (
SELECT `jokey`
FROM `rankings`
 WHERE `race_timestamp`
BETWEEN # Get results for races between now and two days ago
UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 2 DAY)) # timestamp two days ago
 AND
UNIX_TIMESTAMP() # time stamp now
   )
AND 
`jokey` NOT IN
(SELECT `jokey`
 FROM `rankings`
 WHERE `race_timestamp`
 BETWEEN  # Get results between 2 and 30 days ago
 UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 30 DAY)) # time stamp 30 days ago
 AND
 UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 2 DAY)) # time stamp 2 days ago
 )
GROUP BY jockey;

Hope someone can help! Ben

+1  A: 

If you want to do this by date, rather than by the exact minute and second, you might change:

UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 30 DAY))

to something like:

DATE(DATE_SUB(NOW(), INTERVAL 30 DAY))
Scott Saunders
Thanks Scott! I just about managed to get to the answer to the question when talking through the scenario with a colleage. I hadn't found the implemenation yet though.Cheers!Ben
Ben Waine
A: 

Well there are some small errors in your SQL. You switch between jokey and jockey I'm going to just guess you mean jockey since this seems race-related. Not sure if this is in your actual code but it probably wouldn't run at all if it was. Also, you have no reason to use GROUP BY jockey as there are no aggregate functions being used.

Try this:

SELECT *, FROM_UNIXTIME(race_timestamp) AS ts FROM rankings
WHERE
  race_venue = UNHEX(MD5('someplace'))
  AND jockey IN (
    SELECT jockey FROM rankings
    WHERE race_timestamp
      BETWEEN UNIX_TIMESTAMP(DATE_SUB(DATE(NOW()), INTERVAL 2 DAY))
      AND UNIX_TIMESTAMP()
  )
  AND jockey NOT IN (
    SELECT jockey FROM rankings
    WHERE race_timestamp
      BETWEEN UNIX_TIMESTAMP(DATE_SUB(DATE(NOW()), INTERVAL 30 DAY))
      AND UNIX_TIMESTAMP(DATE_SUB(DATE(NOW()), INTERVAL 2 DAY))
  )

Other than that, there are no actual errors I can detect in your logic. This query should return all jockeys for the requested venue who are ranked within the past 2 days and weren't ranked within 2-30 days ago.

Dustin Fineout
Sorry - This isnt the actual code. I did subsitiute the col names before submitting the code. The issue is definitly the fact that NOW() means NOW rather than the ifxed time frame of a day.Thanks for your answer.
Ben Waine
See fix, this will do it.
Dustin Fineout