sql-server – 使用SQL Server 2016系统版本化的时态表为慢速变化维度查询策略

使用 system-versioned temporal table(sql Server 2016中的新增功能)时,使用此功能处理大型关系数据仓库中的缓慢变化维度时,查询创作和性能影响是什么？

例如,假设我有一个带有Postal Code列的100,000行Customer维度和一个带有CustomerID外键列的数十亿行Sales事实表.并假设我想查询“按客户邮政编码计算的2014年总销售额”.简化的DDL就像这样(为清楚起见,省略了许多列)：

CREATE TABLE Customer
(
    CustomerID int identity (1,1) NOT NULL PRIMARY KEY CLUSTERED,PostalCode varchar(50) NOT NULL,SysStartTime datetime2 GENERATED ALWAYS AS ROW START NOT NULL,SysEndTime datetime2 GENERATED ALWAYS AS ROW END NOT NULL,PERIOD FOR SYSTEM_TIME (SysStartTime,SysEndTime) 
)
WITH (SYSTEM_VERSIONING = ON);

CREATE TABLE Sale
(
    SaleId int identity(1,SaleDateTime datetime2 NOT NULL,CustomerId int NOT NULL FOREIGN KEY REFERENCES Customer(CustomerID),SaleAmount decimal(10,2) NOT NULL
);

有趣的是,客户可能在一年内搬家,因此同一客户可能有不同的邮政编码.甚至远程可能客户搬走然后搬回去,这意味着同一个客户可能会有相同邮政编码的多个历史记录！无论客户的邮政编码如何随时间变化,我对“邮政编码销售”的查询都应该能够计算出正确的结果.

我理解如何使用时态表来单独查询客户维度(例如SELECT * FROM Customer FOR SYSTEM_TIME FROM’2014-1-1’to’2015-1-1′)但我不确定如何最准确和有效加入事实表.

这是我应该如何查询它？

SELECT c.PostalCode,sum(s.SaleAmount) SaleAmount
FROM Customer c FOR SYSTEM_TIME FROM '2014-1-1' TO '2015-1-1'
    JOIN Sale s ON s.CustomerId = c.CustomerId
WHERE s.SaleDateTime >= '2014-1-1' AND s.SaleDateTime < '2015-1-1'
    AND c.SysStartTime >= s.SaleDateTime
    AND c.SysEndTime < s.SaleDateTime
GROUP BY c.PostalCode

在进行这样的查询时,我应该注意哪些性能考虑因素？

解决方法

我认为,在您的情况下,需要一个派生表来隔离每个客户的查询数量的邮政编码突变：

SELECT c.postalcode,sum(s.SaleAmount) SaleAmount,count(postcode_mutations.customerid) as CntCustomerChangedPostCode   
FROM dbo.Sale s
JOIN dbo.Customer c on s.customerid = c.customerid

LEFT JOIN (
SELECT 
    CustomerID
FROM [dbo].[Customer]
FOR SYSTEM_TIME FROM '20140101' TO '20150101'
GROUP BY CustomerID
HAVING COUNT(DISTINCT PostalCode) > 1
) postcode_mutations on s.customerid = postcode_mutations.customerid

WHERE s.SaleDateTime >= '2014-1-1' AND s.SaleDateTime < '2015-1-1'
GROUP BY c.PostalCode

upd：由于查询应该为DWH / Analytics方案提供服务,因此列存储索引是一个要检查的选项.我还为made some benchmarks previously换了1000万行表.

原文链接：https://www.f2er.com/mssql/79879.html

sql-server – 使用SQL Server 2016系统版本化的时态表为慢速变化维度查询策略

解决方法

猜你在找的MsSQL相关文章