我有一个sql语句,它将行插入到TRACKING_NUMBER列上具有聚簇索引的表中.
例如.:
INSERT INTO TABL_NAME (TRACKING_NUMBER,COLB,COLC) SELECT TRACKING_NUMBER,COL_B,COL_C FROM STAGING_TABLE
我的问题是 – 在SELECT语句中为聚集索引列使用ORDER BY子句是否有帮助,或者ORDER BY子句所需的额外排序是否可以取消任何获得的增益?
解决方法
由于其他答案已经表明sql Server可能会或可能不会明确确保在插入之前按聚簇索引顺序对行进行排序.
这取决于计划中的聚簇索引运算符是否具有DMLRequestSort属性集(这又取决于插入的估计行数).
如果您发现sql Server因为某种原因低估了这一点,您可能会从SELECT查询中添加显式ORDER BY以最大限度地减少页面拆分并从INSERT操作中产生碎片中获益
例:
use tempdb; GO CREATE TABLE T(N INT PRIMARY KEY,Filler char(2000)) CREATE TABLE T2(N INT PRIMARY KEY,Filler char(2000)) GO DECLARE @T TABLE (U UNIQUEIDENTIFIER PRIMARY KEY DEFAULT NEWID(),N int) INSERT INTO @T(N) SELECT number FROM master..spt_values WHERE type = 'P' AND number BETWEEN 0 AND 499 /*Estimated row count wrong as inserting from table variable*/ INSERT INTO T(N) SELECT T1.N*1000 + T2.N FROM @T T1,@T T2 /*Same operation using explicit sort*/ INSERT INTO T2(N) SELECT T1.N*1000 + T2.N FROM @T T1,@T T2 ORDER BY T1.N*1000 + T2.N SELECT avg_fragmentation_in_percent,fragment_count,page_count,avg_page_space_used_in_percent,record_count FROM sys.dm_db_index_physical_stats(2,OBJECT_ID('T'),NULL,'DETAILED') ; SELECT avg_fragmentation_in_percent,OBJECT_ID('T2'),'DETAILED') ;
表明T是大规模碎片
avg_fragmentation_in_percent fragment_count page_count avg_page_space_used_in_percent record_count ---------------------------- -------------------- -------------------- ------------------------------ -------------------- 99.3116118225536 92535 92535 67.1668272794663 250000 99.5 200 200 74.2868173956017 92535 0 1 1 32.0978502594514 200
但对于T2碎片很少
avg_fragmentation_in_percent fragment_count page_count avg_page_space_used_in_percent record_count ---------------------------- -------------------- -------------------- ------------------------------ -------------------- 0.376 262 62500 99.456387447492 250000 2.1551724137931 232 232 43.2438349394613 62500 0 1 1 37.2374598468001 232
相反,有时您可能希望在知道数据已经预先排序并希望避免不必要的排序时强制sql Server低估行数.一个值得注意的例子是使用newsequentialid聚簇索引键将大量行插入到表中. In versions of SQL Server prior to Denali SQL Server adds an unnecessary and potentially expensive sort operation.这可以通过以下方式避免
DECLARE @var INT =2147483647 INSERT INTO Foo SELECT TOP (@var) * FROM Bar
然后,sql Server将估计将插入100行,而不管Bar的大小是否低于将排序添加到计划的阈值.但是,正如下面的评论中所指出的那样,这确实意味着插件将无法利用最少的日志记录.