Following up on my previous question.

I have a table named activity with 2 columns:

`when` as datetime // last time I saw a user
guid as varchar  // a unique identifier for each user

I have a query that returns how many new users I have for a given date and timespan:

 SET @till_date='2009-11-01'; # for today, use CURDATE() + INTERVAL 1 DAY
 SET @duration=1; # this can be changed to 7, 31, etc.
 SELECT COUNT(guid) AS total_new_users, DATE_SUB(@till_date, INTERVAL @duration DAY) AS since_date, @till_date AS till_date
 FROM `activity` WHERE 1
  AND `when` BETWEEN DATE_SUB(@till_date, INTERVAL @duration DAY) AND @till_date
  AND guid NOT IN
  (
   SELECT guid
   FROM `activity`
   WHERE `when` < DATE_SUB(@till_date, INTERVAL @duration DAY)
  )

I want to be able to build a query that will return a list of new users per date, for a given time span. It will be used to build a graph of new users per day for a month, week, etc.

Something like this:

total_new_users  |  since_date  |  till_date
----------------------------------------------
   10            |  2009-11-20  |  2009-11-21
   10            |  2009-11-21  |  2009-11-22
   10            |  2009-11-22  |  2009-11-23

I thought about using UNION with a LOOP, but my SQL knowledge is pretty basic. Can anyone help?


The result (thanks to @pilcrow):

SET @till_date='2009-11-15';
SET @since_date='2009-11-01';
SELECT first AS yyyymmdd, 
   COUNT('x') AS new_user_count
FROM (SELECT guid,
             DATE(MIN(`when`)) AS first
      FROM activity
      GROUP BY 1) first_appearance
WHERE first BETWEEN @since_date AND @till_date
GROUP BY 1
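
For anyone who wants the three-column layout from the table sketched earlier (total_new_users, since_date, till_date), here is a minimal sketch of the same first-appearance idea. It runs on Python's built-in sqlite3 module rather than MySQL, purely so that it is self-contained. The sample rows, the `first_seen` alias and the `new_users_by_day` helper are made up for illustration, and `DATE(x, '+1 day')` is SQLite's way of adding a day (in MySQL that would be DATE_ADD).

```python
import sqlite3

def new_users_by_day(conn, since_date, till_date):
    """Return (total_new_users, since_date, till_date) rows, one per day."""
    sql = """
        SELECT COUNT(*)                    AS total_new_users,
               first_seen                  AS since_date,
               DATE(first_seen, '+1 day')  AS till_date
        FROM (SELECT guid, DATE(MIN("when")) AS first_seen
              FROM activity
              GROUP BY guid) AS first_appearance
        WHERE first_seen BETWEEN ? AND ?
        GROUP BY first_seen
        ORDER BY first_seen
    """
    return conn.execute(sql, (since_date, till_date)).fetchall()

# Hypothetical sample data: guid "a" was first seen before the window.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE activity ("when" TEXT, guid TEXT)')
conn.executemany(
    "INSERT INTO activity VALUES (?, ?)",
    [
        ("2009-10-30 09:00:00", "a"),
        ("2009-11-01 10:00:00", "a"),  # returning user, not new
        ("2009-11-01 11:00:00", "b"),  # b is new on 2009-11-01
        ("2009-11-02 08:00:00", "b"),  # returning user
        ("2009-11-02 12:00:00", "c"),  # c is new on 2009-11-02
        ("2009-11-02 13:00:00", "d"),  # d is new on 2009-11-02
    ],
)

rows = new_users_by_day(conn, "2009-11-01", "2009-11-15")
for row in rows:
    print(row)
```

Days with no new users simply do not appear in the result, so a graph built from it would need to treat missing dates as zero.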
A: 
SELECT
    COUNT(DISTINCT guid) AS total_new_users,
    DATE(`when`) AS since_date,
    DATE(DATE_ADD(`when`, INTERVAL 1 DAY)) AS till_date
FROM
    activity
GROUP BY
    DATE(`when`)
Suroot
@Suroot: your query returns the unique guids per day, but a "new user" is one who shows up on that day and not on any previous day. This is why I have the complicated query in my question.
Am
@Suroot, see Am's linked question -- the dataset is more clearly articulated there.
pilcrow
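
The difference discussed in the comments above (unique guids per day versus guids seen for the first time that day) can be checked on a tiny dataset. This is only a sketch: it uses Python's built-in sqlite3 module instead of MySQL so that it runs anywhere, and the four sample rows and the `first_seen` alias are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE activity ("when" TEXT, guid TEXT)')
conn.executemany(
    "INSERT INTO activity VALUES (?, ?)",
    [
        ("2009-11-01 09:00:00", "a"),
        ("2009-11-01 10:00:00", "b"),
        ("2009-11-02 09:00:00", "a"),  # a comes back the next day
        ("2009-11-02 11:00:00", "c"),
    ],
)

# Unique guids active on each day: a returning user is counted again.
active_per_day = conn.execute("""
    SELECT DATE("when"), COUNT(DISTINCT guid)
    FROM activity
    GROUP BY DATE("when")
    ORDER BY DATE("when")
""").fetchall()

# Guids whose earliest activity falls on each day: each guid counts once.
new_per_day = conn.execute("""
    SELECT first_seen, COUNT(*)
    FROM (SELECT guid, DATE(MIN("when")) AS first_seen
          FROM activity
          GROUP BY guid) AS first_appearance
    GROUP BY first_seen
    ORDER BY first_seen
""").fetchall()

print(active_per_day)
print(new_per_day)
```

On 2009-11-02 the first query counts both a and c, while the second counts only c, because a's first appearance was the day before.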
+2  A: 

Presuming *SET SESSION sql_mode = 'ANSI_QUOTES'*:

SELECT first AS yyyymmdd, 
       COUNT('x') AS new_user_count
FROM (SELECT guid,
             DATE(MIN("when")) AS first
      FROM activity
      GROUP BY 1) first_appearance
WHERE first BETWEEN @since_date AND @till_date
GROUP BY 1

You may of course substitute your DATE_SUB() for @since_date, if you like.
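
If the query is driven from application code, the same window can be computed there and passed in as parameters. The following is a rough sketch under that assumption: it uses Python's standard library with SQLite standing in for MySQL, and the `window` helper, the `first_seen` alias and the sample rows are hypothetical.

```python
import sqlite3
from datetime import date, timedelta

def window(till_date, duration_days):
    """Mirror DATE_SUB(@till_date, INTERVAL @duration DAY): return (since, till) as ISO strings."""
    till = date.fromisoformat(till_date)
    since = till - timedelta(days=duration_days)
    return since.isoformat(), till.isoformat()

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE activity ("when" TEXT, guid TEXT)')
conn.executemany(
    "INSERT INTO activity VALUES (?, ?)",
    [
        ("2009-10-31 23:00:00", "a"),  # just before the window
        ("2009-11-01 00:30:00", "b"),  # first day of the window
        ("2009-11-15 18:00:00", "c"),  # last day of the window
        ("2009-11-16 01:00:00", "d"),  # just after the window
    ],
)

since_date, till_date = window("2009-11-15", 14)

# Same first-appearance query, with the window bound as parameters.
rows = conn.execute("""
    SELECT first_seen AS yyyymmdd, COUNT(*) AS new_user_count
    FROM (SELECT guid, DATE(MIN("when")) AS first_seen
          FROM activity
          GROUP BY guid) AS first_appearance
    WHERE first_seen BETWEEN ? AND ?
    GROUP BY first_seen
    ORDER BY first_seen
""", (since_date, till_date)).fetchall()

print(since_date, till_date, rows)
```

Because the comparison is made on the date part of the first appearance, BETWEEN includes activity from any time on both the since and till dates.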

pilcrow
simply beautiful!
Am