二、Partitioning_table
2.1 关于分区表
借助表的继承特性Postgresql实现了分区表功能,虽然相比Oracle、MysqL的分区表来说其实现过程比较麻烦,但是这种方式同样能达到分区的效果,而且对大表的查询优化效果很明显。
Postgresql的分区表概念与其它数据库的分区表概念相同,都是将逻辑上的一个大表分割成物理上的多个子块。分区带来的不仅仅是访问性能上的提升,而且还可以使管理与维护变得更加方便、简单。
表分区的优点:
分区减小了每个块上的索引大小,使之前由于整个数据集上的索引过大而无法完全放入内存中导致读写操作产生大量的磁盘访问的问题得到改善;
批量加载或删除可通过添加或删除分区来完成。使用ALTER TABLE NO INHERIT和DROP TABLE远远要比批量操作快得多,此类命令避免了由于批量DELETE而引发的VACUUM额外开销;
可以将很少用到的数据移到便宜的、速度慢的存储介质上去。
通常建议当表的大小超过了服务器物理内存的时候创建分区表,但是究竟哪个表在分区策略中受益最终取决于应用程序。
目前,Postgresql没有单独的分区表功能,而是通过表继承的方式来实现该功能,每个分区被创建为一个唯一父表的子表,父表中通常不存放数据,它的存在只是为了代表整个数据集。
Postgresql目前支持的分区类型:Range Partitioning(范围分区)、List Partitioning(列表分区)
范围分区:表被一个或多个键字字段分区成“范围”,范围中不重叠的数值分布到不同的分区里。
列表分区:明确地列出每个分区里应该出现哪些键字值。
组合分区:将现有分区类型组合使用(先范围再列表)。
2.2 分区表实例
2.2.1 范围分区实例
创建主表:
part=#createtablepeople(idintnotnull,namevarchar(20)notnull,logdatedatenotnull); CREATETABLE
创建分区表:
part=#createtablepeople_y2013m12(check(logdate>=DATE'2013-12-01'ANDlogdate<DATE'2014-01-01'))inherits(people); CREATETABLE part=#createtablepeople_y2014m01(check(logdate>=DATE'2014-01-01'ANDlogdate<DATE'2014-02-01'))inherits(people); CREATETABLE part=#createtablepeople_y2014m02(check(logdate>=DATE'2014-02-01'ANDlogdate<DATE'2014-03-01'))inherits(people); CREATETABLE
为分区表创建索引:
part=#createindexpeople_y2013m12_logdateonpeople_y2013m12(logdate); CREATEINDEX part=#createindexpeople_y2014m01_logdateonpeople_y2014m01(logdate); CREATEINDEX part=#createindexpeople_y2014m02_logdateonpeople_y2014m02(logdate); CREATEINDEX
建立触发器函数:
part=#createorreplacefunctionpeople_insert_trigger() part-#returnstriggeras$$ part$#begin part$#if(new.logdate>=DATE'2013-12-01'and part$#new.logdate<DATE'2014-01-01')then part$#insertintopeople_y2013m12values(new.*); part$#elsif(new.logdate>=DATE'2014-01-01'and part$#new.logdate<DATE'2014-02-01')then part$#insertintopeople_y2014m01values(new.*); part$#elsif(new.logdate>=DATE'2014-02-01'and part$#new.logdate<DATE'2014-03-01')then part$#insertintopeople_y2014m02values(new.*); part$#else part$#RAISEEXCEPTION'Dateoutofrange.Fixthepeople_insert_trigger()function!'; part$#endif; part$#returnnull; part$#end; part$#$$ part-#languageplpgsql; CREATEFUNCTION
创建触发器:
part=#CREATETRIGGERinsert_people_trigger part-#BEFOREINSERTONpeople part-#FOREACHROWEXECUTEPROCEDUREpeople_insert_trigger(); CREATETRIGGER
查看主表信息:
part=#\d+people Table"public.people" Column|Type|Modifiers|Storage|Statstarget|Description ---------+-----------------------+-----------+----------+--------------+------------- id|integer|notnull|plain|| name|charactervarying(20)|notnull|extended|| logdate|date|notnull|plain|| Triggers: insert_people_triggerBEFOREINSERTONpeopleFOREACHROWEXECUTEPROCEDUREpeople_insert_trigger() Childtables:people_y2013m12,people_y2014m01,people_y2014m02 HasOIDs:no
测试:
part=#insertintopeoplevalues(1,'lian1','2013-12-10'); INSERT00 part=#insertintopeoplevalues(2,'lian2','2014-01-10'); INSERT00 part=#insertintopeoplevalues(3,'lian3','2014-02-10'); INSERT00 part=#insertintopeoplevalues(4,'lian4','2014-03-10'); ERROR:Dateoutofrange.Fixthepeople_insert_trigger()function! STATEMENT:insertintopeoplevalues(4,'2014-03-10'); ERROR:Dateoutofrange.Fixthepeople_insert_trigger()function!
查看数据及数据在分区上的具体分布:
part=#select*frompeople; id|name|logdate ----+-------+------------ 1|lian1|2013-12-10 2|lian2|2014-01-10 3|lian3|2014-02-10 (3rows) part=#SELECTp.relname,c.*FROMpeoplec,pg_classpWHEREc.tableoid=p.oid; relname|id|name|logdate -----------------+----+-------+------------ people_y2013m12|1|lian1|2013-12-10 people_y2014m01|2|lian2|2014-01-10 people_y2014m02|3|lian3|2014-02-10 (3rows)
2.2.2 列表分区实例
list=#createtablecustomer(idint,namevarchar(20),cityvarchar(30)); CREATETABLE
list=#createtablecustomer_sd(check(cityin('jinan','qingdao','weifang','zibo')))inherits(customer); CREATETABLE list=#createtablecustomer_sx(check(cityin('xian','xianyang','weinan','baoji')))inherits(customer); CREATETABLE
查看主表结构信息:
list=#\d+customer Table"public.customer" Column|Type|Modifiers|Storage|Statstarget|Description --------+-----------------------+-----------+----------+--------------+------------- id|integer||plain|| name|charactervarying(20)||extended|| city|charactervarying(30)||extended|| Childtables:customer_sd,customer_sx HasOIDs:no
list=#createindexcustomer_sd_cityoncustomer_sd(city); CREATEINDEX list=#createindexcustomer_sx_cityoncustomer_sx(city); CREATEINDEX
创建触发器函数:
list=#createorreplacefunctioncustomer_insert_trigger() list-#returnstriggeras$$ list$#begin list$#if(new.cityin('jinan','zibo'))then list$#insertintocustomer_sdvalues(new.*); list$#elsif(new.cityin('xian','baoji'))then list$#insertintocustomer_sxvalues(new.*); list$#else list$#RAISEEXCEPTION'Cityoutoflist.Fixthecustomer_insert_trigger()function!'; list$#endif; list$#returnnull; list$#end; list$#$$ list-#languageplpgsql; CREATEFUNCTION
list=#CREATETRIGGERinsert_customer_trigger list-#BEFOREINSERTONcustomer list-#FOREACHROWEXECUTEPROCEDUREcustomer_insert_trigger(); CREATETRIGGER
查看主表结构:
list=#\d+customer Table"public.customer" Column|Type|Modifiers|Storage|Statstarget|Description --------+-----------------------+-----------+----------+--------------+------------- id|integer||plain|| name|charactervarying(20)||extended|| city|charactervarying(30)||extended|| Triggers: insert_customer_triggerBEFOREINSERTONcustomerFOREACHROWEXECUTEPROCEDUREcustomer_insert_trigger() Childtables:customer_sd,'宋体';font-size:13px;white-space:normal;">插入测试数据:list=#insertintocustomervalues(1,'xian'); INSERT00 list=#insertintocustomervalues(2,'qingdao'); INSERT00 list=#insertintocustomervalues(3,'jinan'); INSERT00 list=#insertintocustomervalues(4,'weinan'); INSERT00 list=#insertintocustomervalues(5,'lian5','xianyang'); INSERT00 list=#insertintocustomervalues(6,'lian6','beijing'); ERROR:Cityoutoflist.Fixthecustomer_insert_trigger()function!查看数据在分区上的分布:
list=#SELECTp.relname,c.*FROMcustomerc,pg_classpWHEREc.tableoid=p.oid; relname|id|name|city -------------+----+-------+---------- customer_sd|2|lian2|qingdao customer_sd|3|lian3|jinan customer_sx|1|lian1|xian customer_sx|4|lian4|weinan customer_sx|5|lian5|xianyang (5rows)
2.3 分区管理
2.3.1 删除分区
直接删除:
part=#droptablepeople_y2013m12; DROPTABLE part=#select*frompeople; id|name|logdate ----+-------+------------ 2|lian2|2014-01-10 3|lian3|2014-02-10 (2rows)
part=#\d+people Table"public.people" Column|Type|Modifiers|Storage|Statstarget|Description ---------+-----------------------+-----------+----------+--------------+------------- id|integer|notnull|plain|| name|charactervarying(20)|notnull|extended|| logdate|date|notnull|plain|| Triggers: insert_people_triggerBEFOREINSERTONpeopleFOREACHROWEXECUTEPROCEDUREpeople_insert_trigger() Childtables:people_y2014m01,'宋体';font-size:13px;white-space:normal;">或不删除分区而是仅取消继承关系:part=#altertablepeople_y2014m02noinheritpeople; ALTERTABLE {将成为一个普通表存在,数据依然存在} part=#select*frompeople_y2014m02; id|name|logdate ----+-------+------------ 3|lian3|2014-02-10 (1row) part=#\d+people Table"public.people" Column|Type|Modifiers|Storage|Statstarget|Description ---------+-----------------------+-----------+----------+--------------+------------- id|integer|notnull|plain|| name|charactervarying(20)|notnull|extended|| logdate|date|notnull|plain|| Triggers: insert_people_triggerBEFOREINSERTONpeopleFOREACHROWEXECUTEPROCEDUREpeople_insert_trigger() Childtables:people_y2014m01 HasOIDs:no取消继承关系的分区同样可以恢复继承关系(或清空数据后再重新继承):
part=#altertablepeople_y2014m02inheritpeople; ALTERTABLE part=#\d+people Table"public.people" Column|Type|Modifiers|Storage|Statstarget|Description ---------+-----------------------+-----------+----------+--------------+------------- id|integer|notnull|plain|| name|charactervarying(20)|notnull|extended|| logdate|date|notnull|plain|| Triggers: insert_people_triggerBEFOREINSERTONpeopleFOREACHROWEXECUTEPROCEDUREpeople_insert_trigger() Childtables:people_y2014m01,people_y2014m02 HasOIDs:no part=#select*frompeople; id|name|logdate ----+-------+------------ 2|lian2|2014-01-10 3|lian3|2014-02-10 (2rows)2.3.2 添加分区
为people重新添加分区people_y2013m12:
part=#createtablepeople_y2013m12(check(logdate>=DATE'2013-12-01'ANDlogdate<DATE'2014-01-01'))inherits(people); CREATETABLE part=#createindexpeople_y2013m12_logdateonpeople_y2013m12(logdate); CREATEINDEX part=#\d+people Table"public.people" Column|Type|Modifiers|Storage|Statstarget|Description ---------+-----------------------+-----------+----------+--------------+------------- id|integer|notnull|plain|| name|charactervarying(20)|notnull|extended|| logdate|date|notnull|plain|| Triggers: insert_people_triggerBEFOREINSERTONpeopleFOREACHROWEXECUTEPROCEDUREpeople_insert_trigger() Childtables:people_y2013m12,people_y2014m02 HasOIDs:no {people_y2013m12已加入进继承关系中}注意:若添加一个在触发器函数中未定义过范围的新分区则需要同时修改触发器函数。
建议:一般创建触发器函数时一个好习惯是将条件写得更未来一些(如多写一年或几年的时间),这样可以避免以后在需要创建新分区的时候需要重新创建触发器函数。
另一种添加新分区的方式
{创建一个与主表类似的表 → 为新表增加约束 → 建立继承关系}
语法如下:
createtablet_name(likeparent_nameincludingdefaultsincludingconstraints); altertablet_nameaddconstraintconstr_namecheck(logdate>=DATE'……'ANDlogdate<DATE'……'); altertablet_nameinheritparent_name;2.3.3 分区表查询优化
通过调整数据库系统参数constraint_exclusion,控制是否检查表约束达到优化查询的效果。
constraint_exclusion = on/off/partition
on:检查所有表中的约束来优化查询
off:不检查表中的约束
partition:只检查继承子表和UNION ALL子句中涉及的约束(默认)
partition(默认):
part=#showconstraint_exclusion; constraint_exclusion ---------------------- partition (1row) part=#explainselect*frompeoplewherelogdate>='2014-01-01'; QUERYPLAN ---------------------------------------------------------------------------------------------- Append(cost=0.00..39.76rows=567width=66) ->SeqScanonpeople(cost=0.00..0.00rows=1width=66) Filter:(logdate>='2014-01-01'::date) ->BitmapHeapScanonpeople_y2014m01(cost=6.34..19.88rows=283width=66) RecheckCond:(logdate>='2014-01-01'::date) ->BitmapIndexScanonpeople_y2014m01_logdate(cost=0.00..6.27rows=283width=0) IndexCond:(logdate>='2014-01-01'::date) ->BitmapHeapScanonpeople_y2014m02(cost=6.34..19.88rows=283width=66) RecheckCond:(logdate>='2014-01-01'::date) ->BitmapIndexScanonpeople_y2014m02_logdate(cost=0.00..6.27rows=283width=0) IndexCond:(logdate>='2014-01-01'::date) (11rows){只扫描符合条件的分区}
on:
part=#setconstraint_exclusion=on; SET part=#explainselect*frompeoplewherelogdate>='2014-01-01'; QUERYPLAN ---------------------------------------------------------------------------------------------- Append(cost=0.00..39.76rows=567width=66) ->SeqScanonpeople(cost=0.00..0.00rows=1width=66) Filter:(logdate>='2014-01-01'::date) ->BitmapHeapScanonpeople_y2014m01(cost=6.34..19.88rows=283width=66) RecheckCond:(logdate>='2014-01-01'::date) ->BitmapIndexScanonpeople_y2014m01_logdate(cost=0.00..6.27rows=283width=0) IndexCond:(logdate>='2014-01-01'::date) ->BitmapHeapScanonpeople_y2014m02(cost=6.34..19.88rows=283width=66) RecheckCond:(logdate>='2014-01-01'::date) ->BitmapIndexScanonpeople_y2014m02_logdate(cost=0.00..6.27rows=283width=0) IndexCond:(logdate>='2014-01-01'::date) (11rows)off:
part=#setconstraint_exclusion=off; SET part=#explainselect*frompeoplewherelogdate>='2014-01-01'; QUERYPLAN ---------------------------------------------------------------------------------------------- Append(cost=0.00..59.64rows=850width=66) ->SeqScanonpeople(cost=0.00..0.00rows=1width=66) Filter:(logdate>='2014-01-01'::date) ->BitmapHeapScanonpeople_y2014m01(cost=6.34..19.88rows=283width=66) RecheckCond:(logdate>='2014-01-01'::date) ->BitmapIndexScanonpeople_y2014m01_logdate(cost=0.00..6.27rows=283width=0) IndexCond:(logdate>='2014-01-01'::date) ->BitmapHeapScanonpeople_y2014m02(cost=6.34..19.88rows=283width=66) RecheckCond:(logdate>='2014-01-01'::date) ->BitmapIndexScanonpeople_y2014m02_logdate(cost=0.00..6.27rows=283width=0) IndexCond:(logdate>='2014-01-01'::date) ->BitmapHeapScanonpeople_y2013m12(cost=6.34..19.88rows=283width=66) RecheckCond:(logdate>='2014-01-01'::date) ->BitmapIndexScanonpeople_y2013m12_logdate(cost=0.00..6.27rows=283width=0) IndexCond:(logdate>='2014-01-01'::date) (15rows){对所有分区进行扫描}
2.4 通过rule重定向实现分区规则
实现分区规则不一定要用触发器,也可通过rule的方式来实现。
rule=#createtablepeople(idintnotnull,'宋体';border-width:1px 1px 1px 5px;border-style:solid;">rule=#createtablepeople_y2013m12(check(logdate>=DATE'2013-12-01'ANDlogdate<DATE'2014-01-01'))inherits(people); CREATETABLE rule=#createtablepeople_y2014m01(check(logdate>=DATE'2014-01-01'ANDlogdate<DATE'2014-02-01'))inherits(people); CREATETABLE rule=#createtablepeople_y2014m02(check(logdate>=DATE'2014-02-01'ANDlogdate<DATE'2014-03-01'))inherits(people); CREATETABLE
在主表上创建规则:
rule=#createrulepeople_insert_y2013m12as rule-#oninserttopeoplewhere rule-#(logdate>=DATE'2013-12-01'ANDlogdate<DATE'2014-01-01') rule-#doinstead rule-#insertintopeople_y2013m12values(new.*); CREATERULE rule=#createrulepeople_insert_y2014m01as rule-#oninserttopeoplewhere rule-#(logdate>=DATE'2014-01-01'ANDlogdate<DATE'2014-02-01') rule-#doinstead rule-#insertintopeople_y2014m01values(new.*); CREATERULE rule=#createrulepeople_insert_y2014m02as rule-#oninserttopeoplewhere rule-#(logdate>=DATE'2014-02-01'ANDlogdate<DATE'2014-03-01') rule-#doinstead rule-#insertintopeople_y2014m02values(new.*); CREATERULErule=#\d+people Table"public.people" Column|Type|Modifiers|Storage|Statstarget|Description ---------+-----------------------+-----------+----------+--------------+------------- id|integer|notnull|plain|| name|charactervarying(20)|notnull|extended|| logdate|date|notnull|plain|| Rules: people_insert_y2013m12AS ONINSERTTOpeople WHEREnew.logdate>='2013-12-01'::dateANDnew.logdate<'2014-01-01'::dateDOINSTEADINSERTINTOpeople_y2013m12(id,name,logdate) VALUES(new.id,new.name,new.logdate) people_insert_y2014m01AS ONINSERTTOpeople WHEREnew.logdate>='2014-01-01'::dateANDnew.logdate<'2014-02-01'::dateDOINSTEADINSERTINTOpeople_y2014m01(id,new.logdate) people_insert_y2014m02AS ONINSERTTOpeople WHEREnew.logdate>='2014-02-01'::dateANDnew.logdate<'2014-03-01'::dateDOINSTEADINSERTINTOpeople_y2014m02(id,new.logdate) Childtables:people_y2013m12,'宋体';border-width:1px 1px 1px 5px;border-style:solid;">rule=#insertintopeoplevalues(1,'2013-12-05'); INSERT00 rule=#insertintopeoplevalues(2,'2014-01-05'); INSERT00 rule=#insertintopeoplevalues(3,'2014-02-05'); INSERT00 rule=#insertintopeoplevalues(4,'2014-03-05'); INSERT01 rule=#SELECTp.relname,pg_classpWHEREc.tableoid=p.oid; relname|id|name|logdate -----------------+----+-------+------------ people|4|lian4|2014-03-05 people_y2013m12|1|lian1|2013-12-05 people_y2014m01|2|lian2|2014-01-05 people_y2014m02|3|lian3|2014-02-05 (4rows) {不符合条件的数据被插入到了主表中}
rule方式的优缺点:
批量插入(非copy)时,rule方式比trigger方式效率高。因为rule方式的额外开销是基于表而非基于行;
copy会忽略rule。若要使用copy插入数据并且分区是通过rule方式实现,则需要直接copy到对应的分区表中。而trigger方式则不存在这样的问题;
rule方式中不在定义范围内的插入不会报错,而是直接将数据插入到主表中。
2.5 注意事项
分区表
a=#SELECTp.relname,c.*FROMtc,pg_classpWHEREc.tableoid=p.oid; relname|id|logdate ------------+----+------------ t_y2014m01|1|2014-01-10 t_y2014m02|2|2014-02-10 (2rows) a=#updatetsetlogdate='2014-02-05'wherelogdate='2014-01-10'; ERROR:newrowforrelation"t_y2014m01"violatescheckconstraint"t_y2014m01_logdate_check" DETAIL:Failingrowcontains(1,2014-02-05). STATEMENT:updatetsetlogdate='2014-02-05'wherelogdate='2014-01-10'; ERROR:newrowforrelation"t_y2014m01"violatescheckconstraint"t_y2014m01_logdate_check" DETAIL:Failingrowcontains(1,2014-02-05). a=#updatetsetlogdate='2014-01-05'wherelogdate='2014-01-10'; UPDATE1 a=#SELECTp.relname,pg_classpWHEREc.tableoid=p.oid; relname|id|logdate ------------+----+------------ t_y2014m01|1|2014-01-05 t_y2014m02|2|2014-02-10 (2rows) a=#updatetsetlogdate='2014-01-05'wherelogdate='2014-01-10'; UPDATE1 a=#SELECTp.relname,pg_classpWHEREc.tableoid=p.oid; relname|id|logdate ------------+----+------------ t_y2014m01|1|2014-01-05 t_y2014m02|2|2014-02-10 (2rows) a=#updatet_y2014m01setlogdate='2014-02-05'wherelogdate='2014-01-05'; ERROR:newrowforrelation"t_y2014m01"violatescheckconstraint"t_y2014m01_logdate_check" DETAIL:Failingrowcontains(1,2014-02-05). STATEMENT:updatet_y2014m01setlogdate='2014-02-05'wherelogdate='2014-01-05'; ERROR:newrowforrelation"t_y2014m01"violatescheckconstraint"t_y2014m01_logdate_check" DETAIL:Failingrowcontains(1,2014-02-05).
在对表进行VACUUM或ANALYZE操作时,需要在每一个分区表上操作,并非直接对主表操作即可。
约束排除
只当查询的where子句中包含约束时有效;
CHECK中应避免跨数据类型的比较,因为规划器无法做出正确的判断;
由于规划器会对主表的所有分区表上的约束都会进行一次检查,所以过多的分区会需要大量的约束排除检查时间,因此要避免在分区多达成百上千的情况下使用。
Postgresql从继承到分区(一)
http://my.oschina.net/lianshunke/blog/205296
Postgresql从继承到分区(二)
http://my.oschina.net/lianshunke/blog/205296
Postgresql从继承到分区(三)
原文链接:https://www.f2er.com/postgresql/195671.html