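I am trying to find the number of active users for each day.
A user is active when they have made more than 10 requests per week for 4 consecutive weeks.
For example, on Oct 31, 2014, a user is active if they made more than 10 requests in total in each of these weeks:
> Oct 24 - Oct 30, 2014 AND
> Oct 17 - Oct 23, 2014 AND
> Oct 10 - Oct 16, 2014 AND
> Oct 3 - Oct 9, 2014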
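To make the windows concrete, here is a small sketch (assuming PostgreSQL) that lists the four 7-day blocks preceding a given day. The names `ref_day`, `weeks_back`, `window_start`, and `window_end` are made up for this illustration and are not part of my schema.

```sql
-- Sketch only: list the four inclusive 7-day windows that end the day before ref_day.
-- ref_day, weeks_back, window_start and window_end are illustrative names.
SELECT n                           AS weeks_back,
       (ref_day - 7 * n)::date     AS window_start,  -- first day of the window (inclusive)
       (ref_day - 7 * n + 6)::date AS window_end     -- last day of the window (inclusive)
FROM  (SELECT date '2014-10-31' AS ref_day) d
CROSS  JOIN generate_series(1, 4) AS n
ORDER  BY n;

-- weeks_back | window_start | window_end
--          1 | 2014-10-24   | 2014-10-30
--          2 | 2014-10-17   | 2014-10-23
--          3 | 2014-10-10   | 2014-10-16
--          4 | 2014-10-03   | 2014-10-09
```

Subtracting an integer from a `date` in PostgreSQL yields a `date`, so the windows do not overlap at their boundaries.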
I have a table of requests:
- CREATE TABLE requests (
- id text PRIMARY KEY,  -- id of the request
- amount bigint,  -- sum of requests made by accounts_id to recipient_id,
- -- aggregated on a daily basis based on "date"
- accounts_id text,  -- id of the user
- recipient_id text,  -- id of the recipient
- date timestamp  -- date that the request was made in YYYY-MM-DD
- );
Sample values:
- INSERT INTO requests
- VALUES
- ('1',19,'a1','b1','2014-10-05 00:00:00'),
- ('2','a2','b2','2014-10-06 00:00:00'),
- ('3',85,'a3','b3','2014-10-07 00:00:00'),
- ('4',11,'b4','2014-10-13 00:00:00'),
- ('5',2,'b5','2014-10-14 00:00:00'),
- ('6',50,'2014-10-15 00:00:00'),
- ('7',787323,'b6','2014-10-17 00:00:00'),
- ('8',33,'b8','2014-10-18 00:00:00'),
- ('9',14,'b9','2014-10-19 00:00:00'),
- ('10','a4','b10',
- ('11',1628,'b11','2014-10-25 00:00:00'),
- ('13',101,'2014-10-25 00:00:00');
Example output:
- Date | # Active users
- -----------+---------------
- 10-01-2014 | 600
- 10-02-2014 | 703
- 10-03-2014 | 891
Here is my attempt at finding the number of active users for a specific date (e.g. 10-01-2014):
- SELECT count(*)
- FROM
- (SELECT accounts_id
- FROM requests
- WHERE "date" BETWEEN '2014-10-01'::date - interval '2 weeks' AND '2014-10-01'::date - interval '1 week'
- GROUP BY accounts_id HAVING sum(amount) > 10) week_1
- JOIN
- (SELECT accounts_id
- FROM requests
- WHERE "date" BETWEEN '2014-10-01'::date - interval '3 weeks' AND '2014-10-01'::date - interval '2 week'
- GROUP BY accounts_id HAVING sum(amount) > 10) week_2 ON week_1.accounts_id = week_2.accounts_id
- JOIN
- (SELECT accounts_id
- FROM requests
- WHERE "date" BETWEEN '2014-10-01'::date - interval '4 weeks' AND '2014-10-01'::date - interval '3 week'
- GROUP BY accounts_id HAVING sum(amount) > 10) week_3 ON week_2.accounts_id = week_3.accounts_id
- JOIN
- (SELECT accounts_id
- FROM requests
- WHERE "date" BETWEEN '2014-10-01'::date - interval '5 weeks' AND '2014-10-01'::date - interval '4 week'
- GROUP BY accounts_id HAVING sum(amount) > 10) week_4 ON week_3.accounts_id = week_4.accounts_id
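That query only returns the number for a single day, but I need it for every day. I think the idea is to join against a series of dates, so I tried something like this: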
- SELECT week_1."Date_series",count(*)
- FROM
- (SELECT to_char(DAY::date,'YYYY-MM-DD') AS "Date_series",accounts_id
- FROM generate_series('2014-10-01'::date,CURRENT_DATE,'1 day') DAY,requests
- WHERE to_char(DAY::date,'YYYY-MM-DD')::date BETWEEN requests.date::date - interval '2 weeks' AND requests.date::date - interval '1 week'
- GROUP BY "Date_series",accounts_id HAVING sum(amount) > 10) week_1
- JOIN
- (SELECT to_char(DAY::date,'YYYY-MM-DD') AS "Date_series",accounts_id
- FROM generate_series('2014-10-01'::date,CURRENT_DATE,'1 day') DAY,requests
- WHERE to_char(DAY::date,'YYYY-MM-DD')::date BETWEEN requests.date::date - interval '3 weeks' AND requests.date::date - interval '2 week'
- GROUP BY "Date_series",accounts_id HAVING sum(amount) > 10) week_2 ON week_1.accounts_id = week_2.accounts_id
- AND week_1."Date_series" = week_2."Date_series"
- JOIN
- (SELECT to_char(DAY::date,'YYYY-MM-DD') AS "Date_series",accounts_id
- FROM generate_series('2014-10-01'::date,CURRENT_DATE,'1 day') DAY,requests
- WHERE to_char(DAY::date,'YYYY-MM-DD')::date BETWEEN requests.date::date - interval '4 weeks' AND requests.date::date - interval '3 week'
- GROUP BY "Date_series",accounts_id HAVING sum(amount) > 10) week_3 ON week_2.accounts_id = week_3.accounts_id
- AND week_2."Date_series" = week_3."Date_series"
- JOIN
- (SELECT to_char(DAY::date,'YYYY-MM-DD') AS "Date_series",accounts_id
- FROM generate_series('2014-10-01'::date,CURRENT_DATE,'1 day') DAY,requests
- WHERE to_char(DAY::date,'YYYY-MM-DD')::date BETWEEN requests.date::date - interval '5 weeks' AND requests.date::date - interval '4 week'
- GROUP BY "Date_series",accounts_id HAVING sum(amount) > 10) week_4 ON week_3.accounts_id = week_4.accounts_id
- AND week_3."Date_series" = week_4."Date_series"
- GROUP BY week_1."Date_series"
However, I don't think I'm getting the right answer, and I'm not sure why. Any tips, guidance, or pointers are much appreciated!
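For comparison, here is a sketch of one possible formulation of the per-day count, assuming PostgreSQL and assuming the weeks are the four 7-day blocks ending the day before each date, as in the Oct 31 example. It is not a verified solution. The CTE names (`days`, `weekly`, `active`) and the columns `ref_day`, `weeks_back`, `total`, and `active_users` are hypothetical names introduced only for this sketch.

```sql
-- Sketch only: count users with more than 10 requests in each of the 4 weeks before every day.
-- All CTE and column names other than those of the requests table are illustrative.
WITH days AS (
   -- one row per calendar day to report on
   SELECT gs::date AS ref_day
   FROM   generate_series(date '2014-10-01', CURRENT_DATE, interval '1 day') AS gs
), weekly AS (
   -- total amount per day, user and week bucket (0 = most recent week, 3 = oldest)
   SELECT d.ref_day,
          r.accounts_id,
          (d.ref_day - r.date::date - 1) / 7 AS weeks_back,
          sum(r.amount)                      AS total
   FROM   days d
   JOIN   requests r ON r.date >= d.ref_day - 28   -- 28 days back, inclusive
                    AND r.date <  d.ref_day        -- up to, but not including, ref_day
   GROUP  BY 1, 2, 3
), active AS (
   -- a user is active on ref_day only if all 4 week buckets exceed 10
   SELECT w.ref_day, w.accounts_id
   FROM   weekly w
   WHERE  w.total > 10
   GROUP  BY w.ref_day, w.accounts_id
   HAVING count(*) = 4
)
SELECT d.ref_day, count(a.accounts_id) AS active_users
FROM   days d
LEFT   JOIN active a ON a.ref_day = d.ref_day
GROUP  BY d.ref_day
ORDER  BY d.ref_day;
```

The sketch compares each request date against the reporting day (`r.date` relative to `d.ref_day`), and the `LEFT JOIN` keeps days with zero active users in the output.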