通常,聚簇索引选项在测试中表现更好,因为只需要维护一个结构,因为不需要书签查找.
本文未涉及的一个可能有趣的案例是堆上的非聚簇索引与聚簇索引上的非聚簇索引之间的比较.在那个例子中,我原本期望堆甚至可以在NCI叶级别上执行得更好sql Server有一个RID可以直接跟随而不需要遍历聚簇索引.
是否有人知道在这个领域进行了类似的正式测试,如果是的话,结果是什么?
解决方法
> 790万条代表余额信息的记录.
>一个身份字段,从1到790万
>一个数字字段,对大约500k组中的记录进行分组.
第一个名为heap的表在字段组上获得了非聚集索引.第二个名为clust的表在名为key的顺序字段上获得了聚簇索引,在字段组上获得了非聚集索引
测试在具有2个超线程内核,4Gb内存和64位窗口7的I5 M540处理器上运行.
Microsoft sql Server 2008 R2 (RTM) - 10.50.1600.1 (X64) Apr 2 2010 15:48:46 Developer Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)
Update on 9 Mar 2011: I did a second more extensive benchmark by running the following .net code and logging Duration,cpu,Reads,Writes and RowCounts in sql Server Profiler. (The CommandText used will be mentioned in the results.)
NOTE: cpu and Duration are expressed in milliseconds
- 1000 queries
- zero cpu queries are eliminated from the results
- 0 rows affected are eliminated from the results
int[] idList = new int[] { 6816588,7086702,6498815 ... }; // 1000 values here. using (var conn = new sqlConnection(@"Data Source=myserver;Initial Catalog=mydb;Integrated Security=SSPI;")) { conn.Open(); using (var cmd = new sqlCommand()) { cmd.Connection = conn; cmd.CommandType = CommandType.Text; cmd.CommandText = "select * from heap where common_key between @id and @id+1000"; cmd.Parameters.Add("@id",sqlDbType.Int); cmd.Prepare(); foreach (int id in idList) { cmd.Parameters[0].Value = id; using (var reader = cmd.ExecuteReader()) { int count = 0; while (reader.Read()) { count++; } Console.WriteLine(String.Format("key: {0} => {1} rows",id,count)); } } } }
End of Update on 9 Mar 2011.
SELECT性能
为了检查performanc数字,我在堆表上执行了以下一次查询,在clust表上执行了一次:
select * from heap/clust where group between 5678910 and 5679410 select * from heap/clust where group between 6234567 and 6234967 select * from heap/clust where group between 6455429 and 6455729 select * from heap/clust where group between 6655429 and 6655729 select * from heap/clust where group between 6955429 and 6955729 select * from heap/clust where group between 7195542 and 7155729
这个基准测试的结果是针对堆的:
rows reads cpu Elapsed ----- ----- ----- -------- 1503 1510 31ms 309ms 401 405 15ms 283ms 2700 2709 0ms 472ms 0 3 0ms 30ms 2953 2962 32ms 257ms 0 0 0ms 0ms
Update on 9 Mar 2011:
cmd.CommandText = "select * from heap where group between @id and @id+1000";
- 721 Rows have > 0 cpu and affect more than 0 rows
Counter Minimum Maximum Average Weighted --------- ------- ---------- ------- --------- RowCounts 1001 69788 6368 - cpu 15 374 37 0.00754 Reads 1069 91459 7682 1.20155 Writes 0 0 0 0.00000 Duration 0.3716 282.4850 10.3672 0.00180
End of Update on 9 Mar 2011.
对于表格clust结果是:
rows reads cpu Elapsed ----- ----- ----- -------- 1503 4827 31ms 327ms 401 1241 0ms 242ms 2700 8372 0ms 410ms 0 3 0ms 0ms 2953 9060 47ms 213ms 0 0 0ms 0ms
Update on 9 Mar 2011:
cmd.CommandText = "select * from clust where group between @id and @id+1000";
- 721 Rows have > 0 cpu and affect more than 0 rows
Counter Minimum Maximum Average Weighted --------- ------- ---------- ------- --------- RowCounts 1001 69788 6056 - cpu 15 468 38 0.00782 Reads 3194 227018 20457 3.37618 Writes 0 0 0 0.0 Duration 0.3949 159.6223 11.5699 0.00214
End of Update on 9 Mar 2011.
SELECT WITH JOIN性能
cmd.CommandText =“select * from heap / clust h join keys k on h.group = k.group where h.group between @id and @id 1000”;
这个基准测试的结果是针对堆的:
873行具有> 0 cpu并影响超过0行
Counter Minimum Maximum Average Weighted --------- ------- ---------- ------- --------- RowCounts 1009 4170 1683 - cpu 15 47 18 0.01175 Reads 2145 5518 2867 1.79246 Writes 0 0 0 0.00000 Duration 0.8215 131.9583 1.9095 0.00123
这个基准测试的结果是针对clust的:
865行具有> 0 cpu并影响超过0行
Counter Minimum Maximum Average Weighted --------- ------- ---------- ------- --------- RowCounts 1000 4143 1685 - cpu 15 47 18 0.01193 Reads 5320 18690 8237 4.97813 Writes 0 0 0 0.00000 Duration 0.9699 20.3217 1.7934 0.00109
更新表现
第二批查询是更新语句:
update heap/clust set amount = amount + 0 where group between 5678910 and 5679410 update heap/clust set amount = amount + 0 where group between 6234567 and 6234967 update heap/clust set amount = amount + 0 where group between 6455429 and 6455729 update heap/clust set amount = amount + 0 where group between 6655429 and 6655729 update heap/clust set amount = amount + 0 where group between 6955429 and 6955729 update heap/clust set amount = amount + 0 where group between 7195542 and 7155729
堆的这个基准测试的结果:
rows reads cpu Elapsed ----- ----- ----- -------- 1503 3013 31ms 175ms 401 806 0ms 22ms 2700 5409 47ms 100ms 0 3 0ms 0ms 2953 5915 31ms 88ms 0 0 0ms 0ms
Update on 9 Mar 2011:
cmd.CommandText = "update heap set amount = amount + @id where group between @id and @id+1000";
- 811 Rows have > 0 cpu and affect more than 0 rows
Counter Minimum Maximum Average Weighted --------- ------- ---------- ------- --------- RowCounts 1001 69788 5598 811 cpu 15 873 56 0.01199 Reads 2080 167593 11809 2.11217 Writes 0 1687 121 0.02170 Duration 0.6705 514.5347 17.2041 0.00344
End of Update on 9 Mar 2011.
clust的基准测试结果如下:
rows reads cpu Elapsed ----- ----- ----- -------- 1503 9126 16ms 35ms 401 2444 0ms 4ms 2700 16385 31ms 54ms 0 3 0ms 0ms 2953 17919 31ms 35ms 0 0 0ms 0ms
Update on 9 Mar 2011:
cmd.CommandText = "update clust set amount = amount + @id where group between @id and @id+1000";
- 853 Rows have > 0 cpu and affect more than 0 rows
Counter Minimum Maximum Average Weighted --------- ------- ---------- ------- --------- RowCounts 1001 69788 5420 - cpu 15 594 50 0.01073 Reads 6226 432237 33597 6.20450 Writes 0 1730 110 0.01971 Duration 0.9134 193.7685 8.2919 0.00155
End of Update on 9 Mar 2011.
删除基准
delete heap/clust where group between 5678910 and 5679410 delete heap/clust where group between 6234567 and 6234967 delete heap/clust where group between 6455429 and 6455729 delete heap/clust where group between 6655429 and 6655729 delete heap/clust where group between 6955429 and 6955729 delete heap/clust where group between 7195542 and 7155729
这个堆基准测试的结果:
rows reads cpu Elapsed ----- ----- ----- -------- 1503 10630 62ms 179ms 401 2838 0ms 26ms 2700 19077 47ms 87ms 0 4 0ms 0ms 2953 20865 62ms 196ms 0 4 0ms 9ms
Update on 9 Mar 2011:
cmd.CommandText = "delete heap where group between @id and @id+1000";
- 724 Rows have > 0 cpu and affect more than 0 rows
Counter Minimum Maximum Average Weighted --------- ------- ---------- ------- --------- RowCounts 192 69788 4781 - cpu 15 499 45 0.01247 Reads 841 307958 20987 4.37880 Writes 2 1819 127 0.02648 Duration 0.3775 1534.3383 17.2412 0.00349
End of Update on 9 Mar 2011.
这个clust基准的结果:
rows reads cpu Elapsed ----- ----- ----- -------- 1503 9228 16ms 55ms 401 3681 0ms 50ms 2700 24644 46ms 79ms 0 3 0ms 0ms 2953 26955 47ms 92ms 0 3 0ms 0ms
Update on 9 Mar 2011:
cmd.CommandText = "delete clust where group between @id and @id+1000";
- 751 Rows have > 0 cpu and affect more than 0 rows
Counter Minimum Maximum Average Weighted --------- ------- ---------- ------- --------- RowCounts 144 69788 4648 - cpu 15 764 56 0.01538 Reads 989 458467 30207 6.48490 Writes 2 1830 127 0.02694 Duration 0.2938 2512.1968 24.3714 0.00555
End of Update on 9 Mar 2011.
INSERT基准测试
基准测试的最后一部分是插入语句的执行.
插入堆/ clust(…)
值(…),
(…),
(……)
这个堆基准测试的结果:
rows reads cpu Elapsed ----- ----- ----- -------- 6 38 0ms 31ms
Update on 9 Mar 2011:
string str = @"insert into heap (group,currency,year,period,domain_id,mtdAmount,ytdAmount,amount,ytd_restated,restated,auditDate,auditUser) values"; for (int x = 0; x < 999; x++) { str += string.Format(@"(@id + {0},'EUR',2012,2,100,1000 + @id,1000,current_timestamp,'test'),",x); } str += string.Format(@"(@id,'CAD','test') ",1000); cmd.CommandText = str;
- 912 statements have > 0 cpu
Counter Minimum Maximum Average Weighted --------- ------- ---------- ------- --------- RowCounts 1000 1000 1000 - cpu 15 2138 25 0.02500 Reads 5212 7069 6328 6.32837 Writes 16 34 22 0.02222 Duration 1.6336 293.2132 4.4009 0.00440
End of Update on 9 Mar 2011.
这个clust基准的结果:
rows reads cpu Elapsed ----- ----- ----- -------- 6 50 0ms 18ms
Update on 9 Mar 2011:
string str = @"insert into clust (group,1000); cmd.CommandText = str;
- 946 statements have > 0 cpu
Counter Minimum Maximum Average Weighted --------- ------- ---------- ------- --------- RowCounts 1000 1000 1000 - cpu 15 2403 21 0.02157 Reads 6810 8997 8412 8.41223 Writes 16 25 19 0.01942 Duration 1.5375 268.2571 6.1463 0.00614
End of Update on 9 Mar 2011.
结论
虽然使用集群& amp;访问表时会有更多的逻辑读取.非聚集索引(使用非聚簇索引时)性能结果为:
> SELECT语句具有可比性
>使用聚簇索引时,UPDATE语句更快
>使用聚簇索引,DELETE语句更快
>使用聚簇索引时,INSERT语句更快
当然,我的基准测试对于特定类型的表以及非常有限的查询集非常有限,但我认为基于这些信息我们已经可以开始说,在表上创建聚簇索引几乎总是更好.
Update on 9 Mar 2011:
正如我们从增加的结果中看到的那样,有限测试的结论在每种情况下都不正确.
结果现在表明,受益于聚簇索引的唯一语句是update语句.使用聚簇索引的表上的其他语句约慢30%.
一些额外的图表,其中我绘制了堆与clust的每个查询的加权持续时间.
正如您所看到的,insert语句的性能配置文件非常有趣.尖峰是由一些数据点引起的,这些数据点需要更长的时间才能完成.
End of Update on 9 Mar 2011.