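What is the best way to create columns that count the occurrences of values in a table? The table needs to be grouped by one column.
I have seen: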
SELECT sum(CASE WHEN question1 = 0 THEN 1 ELSE 0 END) AS zero
     , sum(CASE WHEN question1 = 1 THEN 1 ELSE 0 END) AS one
     , sum(CASE WHEN question1 = 2 THEN 1 ELSE 0 END) AS two
     , category
FROM   reviews
GROUP  BY category;
where question1 can have a value of 0, 1, or 2.
I have also seen a version of this using count(CASE WHEN question1 = 0 THEN 1 END).
Solution
Postgres 9.4 has a new, cleaner aggregate FILTER option:
SELECT category
     , count(*) FILTER (WHERE question1 = 0) AS zero
     , count(*) FILTER (WHERE question1 = 1) AS one
     , count(*) FILTER (WHERE question1 = 2) AS two
FROM   reviews
GROUP  BY 1;
Details on the new FILTER clause:
> How can I simplify this game statistics query?
If you want it short:
SELECT category
     , count(question1 = 0 OR NULL) AS zero
     , count(question1 = 1 OR NULL) AS one
     , count(question1 = 2 OR NULL) AS two
FROM   reviews
GROUP  BY 1;
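The short form works because count(expression) only counts rows where the expression is not NULL, and under SQL's three-valued logic `true OR NULL` is true while `false OR NULL` is NULL. A minimal sketch to check this, using an inline VALUES list as made-up sample data (the alias `t(q)` is hypothetical and stands in for `reviews.question1`):

```sql
-- true  OR NULL -> true  (non-null, so count() includes the row)
-- false OR NULL -> NULL  (so count() skips the row)
SELECT (true OR NULL) AS t_or_null, (false OR NULL) AS f_or_null;

-- Hypothetical sample data: two 0s, one 1, three 2s
SELECT count(q = 0 OR NULL) AS zero   -- 2
     , count(q = 1 OR NULL) AS one    -- 1
     , count(q = 2 OR NULL) AS two    -- 3
FROM  (VALUES (0), (0), (1), (2), (2), (2)) AS t(q);
```

Rows where the column itself is NULL are skipped as well, since `NULL = 0` is NULL and `NULL OR NULL` is NULL.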
Overview of possible variants:
> For absolute performance, is SUM faster or COUNT?
Proper crosstab query
crosstab() yields the best performance and is shorter for longer lists of options:
SELECT * FROM crosstab(
     'SELECT category, question1, count(*)::int AS ct
      FROM   reviews
      GROUP  BY 1, 2
      ORDER  BY 1, 2'
   , 'VALUES (0), (1), (2)'
   ) AS ct (category text, zero int, one int, two int);
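To try the crosstab() query, the function has to be available first: it is provided by the additional module tablefunc, which is installed once per database. The setup below is only a sketch; the temporary table and its column types (`category text`, `question1 int`) are assumptions made for illustration, not the real schema of `reviews`.

```sql
-- crosstab() comes from the additional module tablefunc (install once per database)
CREATE EXTENSION IF NOT EXISTS tablefunc;

-- Hypothetical sample table; a temp table with this name hides any real
-- "reviews" table for the rest of the session, so use a scratch database
CREATE TEMP TABLE reviews (category text, question1 int);

INSERT INTO reviews (category, question1)
VALUES ('a', 0), ('a', 0), ('a', 2), ('b', 1);
```

With this data, category 'a' has no rows for answer 1. Unlike the count() variants, crosstab() reports such missing combinations as NULL rather than 0, so wrap the output columns in COALESCE if zeros are needed.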
Detailed explanation: