基本上,分区是以这种方式创建的:
CREATE TABLE foo_0 (CHECK (id % 30 = 0)) INHERITS (foo); CREATE TABLE foo_1 (CHECK (id % 30 = 1)) INHERITS (foo); CREATE TABLE foo_2 (CHECK (id % 30 = 2)) INHERITS (foo); CREATE TABLE foo_3 (CHECK (id % 30 = 3)) INHERITS (foo); . . .
我为整个数据库运行了ANALYZE,特别是,我通过运行以下方法收集了该表的id列的额外统计信息:
ALTER TABLE foo ALTER COLUMN id SET STATISTICS 10000;
但是,当我运行在id列上过滤的查询时,计划程序会显示它仍在扫描所有分区. constraint_exclusion设置为partition,因此不是问题.
EXPLAIN ANALYZE SELECT * FROM foo WHERE (id = 2); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------- Result (cost=0.00..8106617.40 rows=3620981 width=54) (actual time=30.544..215.540 rows=171477 loops=1) -> Append (cost=0.00..8106617.40 rows=3620981 width=54) (actual time=30.539..106.446 rows=171477 loops=1) -> Seq Scan on foo (cost=0.00..0.00 rows=1 width=203) (actual time=0.002..0.002 rows=0 loops=1) Filter: (id = 2) -> Bitmap Heap Scan on foo_0 foo (cost=3293.44..281055.75 rows=122479 width=52) (actual time=0.020..0.020 rows=0 loops=1) Recheck Cond: (id = 2) -> Bitmap Index Scan on foo_0_idx_1 (cost=0.00..3262.82 rows=122479 width=0) (actual time=0.018..0.018 rows=0 loops=1) Index Cond: (id = 2) -> Bitmap Heap Scan on foo_1 foo (cost=3312.59..274769.09 rows=122968 width=56) (actual time=0.012..0.012 rows=0 loops=1) Recheck Cond: (id = 2) -> Bitmap Index Scan on foo_1_idx_1 (cost=0.00..3281.85 rows=122968 width=0) (actual time=0.010..0.010 rows=0 loops=1) Index Cond: (id = 2) -> Bitmap Heap Scan on foo_2 foo (cost=3280.30..272541.10 rows=121903 width=56) (actual time=30.504..77.033 rows=171477 loops=1) Recheck Cond: (id = 2) -> Bitmap Index Scan on foo_2_idx_1 (cost=0.00..3249.82 rows=121903 width=0) (actual time=29.825..29.825 rows=171477 loops=1) Index Cond: (id = 2) . . .
我能做些什么让刨床有更好的计划?我是否需要运行ALTER TABLE foo ALTER COLUMN id SET STATISTICS 10000;对于所有分区呢?
编辑
在使用Erwin建议的查询更改后,计划程序仅扫描正确的分区,但执行时间实际上比完整扫描(至少是索引)更差.
EXPLAIN ANALYZE select * from foo where (id % 30 = 2) and (id = 2); QUERY PLAN QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------- Result (cost=0.00..8106617.40 rows=3620981 width=54) (actual time=32.611..224.934 rows=171477 loops=1) -> Append (cost=0.00..8106617.40 rows=3620981 width=54) (actual time=32.606..116.565 rows=171477 loops=1) -> Seq Scan on foo (cost=0.00..0.00 rows=1 width=203) (actual time=0.002..0.002 rows=0 loops=1) Filter: (id = 2) -> Bitmap Heap Scan on foo_0 foo (cost=3293.44..281055.75 rows=122479 width=52) (actual time=0.046..0.046 rows=0 loops=1) Recheck Cond: (id = 2) -> Bitmap Index Scan on foo_0_idx_1 (cost=0.00..3262.82 rows=122479 width=0) (actual time=0.044..0.044 rows=0 loops=1) Index Cond: (id = 2) -> Bitmap Heap Scan on foo_1 foo (cost=3312.59..274769.09 rows=122968 width=56) (actual time=0.021..0.021 rows=0 loops=1) Recheck Cond: (id = 2) -> Bitmap Index Scan on foo_1_idx_1 (cost=0.00..3281.85 rows=122968 width=0) (actual time=0.020..0.020 rows=0 loops=1) Index Cond: (id = 2) -> Bitmap Heap Scan on foo_2 foo (cost=3280.30..272541.10 rows=121903 width=56) (actual time=32.536..86.730 rows=171477 loops=1) Recheck Cond: (id = 2) -> Bitmap Index Scan on foo_2_idx_1 (cost=0.00..3249.82 rows=121903 width=0) (actual time=31.842..31.842 rows=171477 loops=1) Index Cond: (id = 2) -> Bitmap Heap Scan on foo_3 foo (cost=3475.87..285574.05 rows=129032 width=52) (actual time=0.035..0.035 rows=0 loops=1) Recheck Cond: (id = 2) -> Bitmap Index Scan on foo_3_idx_1 (cost=0.00..3443.61 rows=129032 width=0) (actual time=0.031..0.031 rows=0 loops=1) . . . -> Bitmap Heap Scan on foo_29 foo (cost=3401.84..276569.90 rows=126245 width=56) (actual time=0.019..0.019 rows=0 loops=1) Recheck Cond: (id = 2) -> Bitmap Index Scan on foo_29_idx_1 (cost=0.00..3370.28 rows=126245 width=0) (actual time=0.018..0.018 rows=0 loops=1) Index Cond: (id = 2) Total runtime: 238.790 ms
与:
EXPLAIN ANALYZE select * from foo where (id % 30 = 2) and (id = 2); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------ Result (cost=0.00..273120.30 rows=611 width=56) (actual time=31.519..257.051 rows=171477 loops=1) -> Append (cost=0.00..273120.30 rows=611 width=56) (actual time=31.516..153.356 rows=171477 loops=1) -> Seq Scan on foo (cost=0.00..0.00 rows=1 width=203) (actual time=0.002..0.002 rows=0 loops=1) Filter: ((id = 2) AND ((id % 30) = 2)) -> Bitmap Heap Scan on foo_2 foo (cost=3249.97..273120.30 rows=610 width=56) (actual time=31.512..124.177 rows=171477 loops=1) Recheck Cond: (id = 2) Filter: ((id % 30) = 2) -> Bitmap Index Scan on foo_2_idx_1 (cost=0.00..3249.82 rows=121903 width=0) (actual time=30.816..30.816 rows=171477 loops=1) Index Cond: (id = 2) Total runtime: 270.384 ms
解决方法
With constraint exclusion enabled,the planner will examine the
constraints of each partition and try to prove that the partition need
not be scanned because it could not contain any rows meeting the
query’sWHERE
clause. When the planner can prove this,it excludes
the partition from the query plan.
大胆强调我的.规划者不理解复杂的表达方式.
当然,这也必须得到满足:
Ensure that the 07001 configuration parameter is not
disabled inpostgresql.conf
. If it is,queries will not be optimized as desired.
代替
SELECT * FROM foo WHERE (id = 2);
尝试:
SELECT * FROM foo WHERE id % 30 = 2 AND id = 2;
和:
The default (and recommended) setting of 07001 is
actually neitheron
noroff
,but an intermediate setting called
partition
,which causes the technique to be applied only to queries
that are likely to be working on partitioned tables. The on setting
causes the planner to examineCHECK
constraints in all queries,even
simple ones that are unlikely to benefit.
您可以尝试使用constraint_exclusion = on来查看计划程序是否捕获而没有多余的逐字条件.但是你必须权衡这个设置的成本和收益.
替代方案是分区的简单条件,如outlined by @harmic所示.