我们有两张桌子Tbl1&订阅服务器上的Tbl2. Tbl1正在从Publisher Server A复制,它有两个触发器 – 插入和更新.触发器是将数据插入并更新到Tbl2中.
现在,我们必须从Tbl2中清除(大约9亿条记录),这个记录总计达到了1000万条.以下是一个月到一分钟的数据分布.
>一个月 – 14986826行
>一天 – 483446行
>一小时 – 20143行
>一分钟 – 335排
我在找什么;
清除数据的最快方法,没有任何生产问题,数据一致性,也可能没有停机时间.所以,我想按照以下步骤,但卡住:(
脚步:
> BCP从现有表Tbl2中输出所需数据(大约1亿条记录,可能需要大约30分钟).
>我们假设我开始在1Fab2018上午10:00开始活动,它在1Fab2018晚上10:30结束.到活动完成时,表Tbl2将获得变为delta的新记录
>在数据库中创建名为Tbl3的新表
> BCP将导出的数据导入新创建的表Tbl3(约1亿条记录,可能需要约30分钟)
>停止复制作业
>完成BCP-in后,使用tsql脚本插入新的增量数据.
>挑战是 – 如何处理delta“更新”声明?
>启动复制
附加问题:
处理场景的最佳方法是什么?
解决方法
一个简单的试验台,没有复制,显示了一般原则:
首先,我们将为我们的测试创建一个数据库:
USE master; IF (SELECT 1 FROM sys.databases d WHERE d.name = 'SwitchTest') IS NOT NULL BEGIN ALTER DATABASE SwitchTest SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DROP DATABASE SwitchTest; END CREATE DATABASE SwitchTest; ALTER DATABASE SwitchTest SET RECOVERY FULL; BACKUP DATABASE SwitchTest TO DISK = 'NUL:'; GO
在这里,我们创建了几个表,触发器将行从“A”移动到“B”,近似于您的设置.
USE SwitchTest; GO CREATE TABLE dbo.A ( i int NOT NULL CONSTRAINT PK_A PRIMARY KEY CLUSTERED IDENTITY(1,1),d varchar(300) NOT NULL,rowdate datetime NOT NULL ) ON [PRIMARY] WITH (DATA_COMPRESSION = PAGE); CREATE TABLE dbo.B ( i int NOT NULL CONSTRAINT PK_B PRIMARY KEY CLUSTERED,rowdate datetime NOT NULL ) ON [PRIMARY] WITH (DATA_COMPRESSION = PAGE); GO CREATE TRIGGER t_a ON dbo.A AFTER INSERT,UPDATE AS BEGIN SET NOCOUNT ON; DELETE FROM dbo.B FROM dbo.B b INNER JOIN deleted d ON b.i = d.i INSERT INTO dbo.B (i,d,rowdate) SELECT i.i,i.d,i.rowdate FROM inserted i; END GO
在这里,我们在“A”中插入1,000,000行,由于触发器,这些行也将插入“B”.
;WITH src AS ( SELECT i.n FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9))i(n) ) INSERT INTO dbo.A (d,rowdate) SELECT d = CRYPT_GEN_RANDOM(300),DATEADD(SECOND,s6.n + (s5.n * 100000) + (s4.n * 10000) + (s3.n * 1000) + (s2.n * 100) + (s1.n * 10),'2017-01-01T00:00:00.000') FROM src s1 CROSS JOIN src s2 CROSS JOIN src s3 CROSS JOIN src s4 CROSS JOIN src s5 CROSS JOIN src s6;
清除事务日志,以避免耗尽空间.不要在生产中运行它,因为它将事务日志数据发送到“NUL”设备.
BACKUP LOG SwitchTest TO DISK = 'NUL:'; GO
此代码创建一个事务,以确保在迁移行时不会写入任何受影响的表:
BEGIN TRANSACTION EXEC sys.sp_getapplock @Resource = N'TableSwitcher',@LockMode = 'Exclusive',@LockOwner = 'Transaction',@LockTimeout = '1000',@DbPrincipal = N'dbo'; BEGIN TRY -- create a table to hold the rows we want to keep CREATE TABLE dbo.C ( i int NOT NULL CONSTRAINT PK_C PRIMARY KEY CLUSTERED,rowdate datetime NOT NULL ) ON [PRIMARY] WITH (DATA_COMPRESSION = PAGE); --copy the rows we want to keep into "C" INSERT INTO dbo.C (i,rowdate) SELECT b.i,b.d,b.rowdate FROM dbo.B WHERE b.rowdate >= '2017-01-11T10:00:00'; --truncate the entire "B" table TRUNCATE TABLE dbo.B; --"switch" table "C" into "B" ALTER TABLE dbo.C SWITCH TO dbo.B; --drop table "C",since we no longer need it DROP TABLE dbo.C; --shows the count of rows in "B" which were retained. SELECT COUNT(1) FROM dbo.B WHERE b.rowdate >= '2017-01-11T10:00:00'; --look for rows in "B" that should no longer exist. SELECT COUNT(1) FROM dbo.B WHERE b.rowdate < '2017-01-11T10:00:00'; --release the applock and commit the transaction EXEC sys.sp_releaseapplock @Resource = N'TableSwitcher',@DbPrincipal = N'dbo'; COMMIT TRANSACTION; END TRY BEGIN CATCH DECLARE @message nvarchar(1000) = ERROR_MESSAGE(); DECLARE @severity int = ERROR_SEVERITY(); DECLARE @state int = ERROR_STATE(); RAISERROR (@message,@severity,@state); EXEC sys.sp_releaseapplock @Resource = N'TableSwitcher',@DbPrincipal = N'dbo'; ROLLBACK TRANSACTION; END CATCH GO
sp_getapplock和sp_releaseapplock会阻止此代码的多个实例同时运行.如果您通过GUI重新使用此代码,这将非常有用.
(请注意,只有当访问资源的每个进程明确地实现相同的手动资源锁定逻辑时,应用程序锁才有效 – 没有任何魔法可以“锁定”表,就像sql Server自动锁定行,页面等一样.插入/更新操作.)
现在,我们测试将行插入“A”的过程,以确保它们被触发器插入“B”.
INSERT INTO dbo.A (d,rowdate) VALUES ('testRow',GETDATE()); SELECT * FROM dbo.B WHERE B.d = 'testRow'
+---------+---------+-------------------------+ | i | d | rowdate | +---------+---------+-------------------------+ | 1000001 | testRow | 2018-04-13 03:49:53.343 | +---------+---------+-------------------------+