1. Environment:
PostgreSQL 9.3.5
Bucardo 5.3.0
Servers A and B are kept in sync with Bucardo.
2. Status monitoring
-- Stop the database on server B. sync_adm still shows as Good, but this is misleading.
[postgres@his-db02 ~]$ bucardo status
PID of Bucardo MCP: 25822
 Name       State    Last good    Time     Last I/D    Last bad    Time
==========+========+============+========+===========+===========+=======
 sync_adm | Good   | 11:18:12   | 9m 30s | 0/0       | none      |

-- Check the status of sync_adm in detail
[postgres@his-db02 ~]$ bucardo status sync_adm
======================================================================
Last good                : Jan 23,2015 11:18:12 (time to run: 1s)
Rows deleted/inserted    : 0 / 0
Sync name                : sync_adm
Current state            : Good
Source relgroup/database : herd_adm / source_db_adm
Tables in sync           : 1
Status                   : Stalled
Check time               : None
Overdue time             : 00:00:00
Expired time             : 00:00:00
Stayalive/Kidsalive      : Yes / Yes
Rebuild index            : No
Autokick                 : Yes
Onetimecopy              : No
Post-copy analyze        : Yes
Last error:              :
======================================================================

The detailed view shows that Status is Stalled even though Current state is still Good: synchronization between the two sides has broken. Bucardo has to be restarted at this point, and after the restart the sync resumes automatically. For a plain master/slave setup this is a weakness compared with PostgreSQL's built-in streaming replication. The advantage of Bucardo is that it replicates at table granularity, which is finer than the built-in streaming replication.
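
Because the summary view keeps reporting Good, a monitoring check has to read the Status field of the detailed view instead. The following is a minimal sketch in Python, assuming the "Key : Value" layout shown in the output above. The helper names parse_sync_status, is_stalled and check_sync are hypothetical and not part of Bucardo; only the `bucardo status <sync>` command itself comes from this article.

```python
import re
import subprocess

# One "Key : Value" line; the key ends at the first whitespace-preceded colon.
FIELD_RE = re.compile(r"^(\S.*?)\s+:\s*(.*)$")


def parse_sync_status(text):
    """Turn the 'Key : Value' lines of `bucardo status <sync>` into a dict."""
    fields = {}
    for raw in text.splitlines():
        match = FIELD_RE.match(raw.strip())
        if match:
            # "Last error:" carries its own trailing colon; drop it.
            key = match.group(1).rstrip(":").strip()
            fields[key] = match.group(2).strip()
    return fields


def is_stalled(fields):
    """True when the Status field reads Stalled, whatever Current state says."""
    return fields.get("Status", "").lower() == "stalled"


def check_sync(sync_name):
    """Run `bucardo status <sync_name>` and report whether the sync is stalled.

    Assumes the bucardo executable is on PATH for the current user.
    """
    result = subprocess.run(
        ["bucardo", "status", sync_name],
        capture_output=True, text=True, check=True,
    )
    return is_stalled(parse_sync_status(result.stdout))


# Excerpt of the output shown in this article, used to exercise the parser.
SAMPLE = """\
======================================================================
Last good                : Jan 23,2015 11:18:12 (time to run: 1s)
Sync name                : sync_adm
Current state            : Good
Status                   : Stalled
Last error:              :
======================================================================
"""

fields = parse_sync_status(SAMPLE)
print(fields["Current state"], fields["Status"], is_stalled(fields))
```

What to do once a stall is detected (alerting, or restarting Bucardo as described above) is left to the operator; the sketch only detects the condition.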
3. Log
(25862) [Fri Jan 23 10:57:17 2015] KID (sync_adm) Totals: deletes=1 inserts=1 conflicts=0
(25862) [Fri Jan 23 10:57:51 2015] KID (sync_adm) Delta count for source_db_adm.public.test_ken : 1
(25862) [Fri Jan 23 10:57:51 2015] KID (sync_adm) Totals: deletes=1 inserts=1 conflicts=0
(25822) [Fri Jan 23 11:17:49 2015] MCP Ping Failed for database target_db_adm,trying to reconnect
(25862) [Fri Jan 23 11:17:49 2015] KID (sync_adm) Kid has died,error is: Ping Failed for database "target_db_adm" Line: 5413
(25862) [Fri Jan 23 11:17:49 2015] KID (sync_adm) Ping Failed for database target_db_adm
(25862) [Fri Jan 23 11:17:49 2015] KID (sync_adm) Kid 25862 exiting at cleanup_kid. Sync "sync_adm" public.test_ken Reason: Ping Failed for database "target_db_adm" Line: 5413
(25822) [Fri Jan 23 11:17:49 2015] MCP Starting check_sync_health
(25822) [Fri Jan 23 11:17:49 2015] MCP Database target_db_adm Failed ping
(25822) [Fri Jan 23 11:17:49 2015] MCP Warning: Killed (line 44): DBI connect('dbname=admin;port=5432;host=192.168.2.90','postgres',...) Failed: could not connect to server: Connection refused Is the server running on host "192.168.2.90" and accepting TCP/IP connections on port 5432? at Bucardo.pm line 5644
(25822) [Fri Jan 23 11:17:49 2015] MCP Database target_db_adm is unreachable,marking as stalled
(25822) [Fri Jan 23 11:17:49 2015] MCP Marked sync sync_adm as stalled
(29392) [Fri Jan 23 11:17:50 2015] KID (sync_adm) New kid,sync "sync_adm" alive=1 Parent=25848 PID=29392 kicked=1
(29392) [Fri Jan 23 11:17:50 2015] KID (sync_adm) Kid has died,error is: DBI connect('dbname=admin;port=5432;host=192.168.2.90',...) Failed: could not connect to server: Connection refused Is the server running on host "192.168.2.90" and accepting TCP/IP connections on port 5432? at Bucardo.pm line 5644 Line: 3213
(29392) [Fri Jan 23 11:17:50 2015] KID (sync_adm) Missing target_db_adm database handle
(29392) [Fri Jan 23 11:17:50 2015] KID (sync_adm) Kid 29392 exiting at cleanup_kid. Sync "sync_adm" Reason: DBI connect('dbname=admin;port=5432;host=192.168.2.90',...) Failed: could not connect to server: Connection refused Is the server running on host "192.168.2.90" and accepting TCP/IP connections on port 5432? at Bucardo.pm line 5644 Line: 3213
(25822) [Fri Jan 23 11:17:50 2015] MCP Starting check_sync_health
(25822) [Fri Jan 23 11:17:50 2015] MCP Skipping stalled sync sync_adm
(29519) [Fri Jan 23 11:18:01 2015] KID (sync_adm) New kid,sync "sync_adm" alive=1 Parent=25848 PID=29519 kicked=1
(29519) [Fri Jan 23 11:18:01 2015] KID (sync_adm) Kid has died,...) Failed: could not connect to server: Connection refused Is the server running on host "192.168.2.90" and accepting TCP/IP connections on port 5432? at Bucardo.pm line 5644 Line: 3213
(29519) [Fri Jan 23 11:18:01 2015] KID (sync_adm) Missing target_db_adm database handle
(29519) [Fri Jan 23 11:18:01 2015] KID (sync_adm) Kid 29519 exiting at cleanup_kid. Sync "sync_adm" Reason: DBI connect('dbname=admin;port=5432;host=192.168.2.90',...) Failed: could not connect to server: Connection refused Is the server running on host "192.168.2.90" and accepting TCP/IP connections on port 5432? at Bucardo.pm line 5644 Line: 3213
(25822) [Fri Jan 23 11:18:01 2015] MCP Starting check_sync_health
(25822) [Fri Jan 23 11:18:01 2015] MCP Skipping stalled sync sync_adm
(29652) [Fri Jan 23 11:18:12 2015] KID (sync_adm) New kid,sync "sync_adm" alive=1 Parent=25848 PID=29652 kicked=1
(25822) [Fri Jan 23 11:18:49 2015] MCP Database "target_db_adm" Local epoch: 1421983129.73297 DB epoch: 1421983089.30327
(25822) [Fri Jan 23 11:18:49 2015] MCP Database "target_db_adm" Local time: Fri Jan 23 11:18:49 2015 DB time: 2015-01-23 11:18:09.303267+08
(25822) [Fri Jan 23 11:18:49 2015] MCP Database "target_db_adm" Local timezone: CST (+0800) DB timezone: PRC
(25822) [Fri Jan 23 11:18:49 2015] MCP Database "target_db_adm" Postgres version: 90305
(25822) [Fri Jan 23 11:18:49 2015] MCP Database "target_db_adm" Database port: 5432
(25822) [Fri Jan 23 11:18:49 2015] MCP Database "target_db_adm" Database host: 192.168.2.90
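
The log records the stall explicitly in the "MCP Marked sync sync_adm as stalled" entry, so the log can be scanned for it as a second signal. The following is a rough sketch in Python, assuming each entry starts with "(PID) [timestamp]" as in the excerpt above. The function name find_stall_events is hypothetical, and the location of the log file depends on the installation, so the function takes any iterable of lines rather than a path.

```python
import re

# "(PID) [timestamp] message", as seen in the log excerpt in this article.
ENTRY_RE = re.compile(r"^\((?P<pid>\d+)\) \[(?P<ts>[^\]]+)\] (?P<msg>.*)$")
# The MCP entry written when a sync is marked as stalled.
STALLED_RE = re.compile(r"MCP Marked sync (?P<sync>\S+) as stalled")


def find_stall_events(lines):
    """Return (timestamp, sync name) for every 'Marked sync ... as stalled' entry."""
    events = []
    for line in lines:
        entry = ENTRY_RE.match(line.strip())
        if not entry:
            continue  # continuation text or anything that is not a log entry
        hit = STALLED_RE.search(entry.group("msg"))
        if hit:
            events.append((entry.group("ts"), hit.group("sync")))
    return events


# Three entries copied from the log above.
SAMPLE_LOG = [
    "(25822) [Fri Jan 23 11:17:49 2015] MCP Database target_db_adm is unreachable,marking as stalled",
    "(25822) [Fri Jan 23 11:17:49 2015] MCP Marked sync sync_adm as stalled",
    "(25822) [Fri Jan 23 11:17:50 2015] MCP Skipping stalled sync sync_adm",
]

for timestamp, sync in find_stall_events(SAMPLE_LOG):
    print(timestamp, sync)
```

An open file handle can be passed directly, since iterating over it yields lines. The timestamp is kept as a string to avoid locale-dependent date parsing.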