So my code currently looks like this:

    public boolean in(TransactionType... types) {
        if (types == null || types.length == 0) return false;
        for (int i = 0; i < types.length; ++i)
            if (types[i] != null && types[i] == this) return true;
        return false;
    }

I changed it to this:

    public boolean in(TransactionType... types) {
        if (types == null || types.length == 0) return false;
        for (int i = 0; i < types.length; ++i)
            if (types[i] == this) return true;
        return false;
    }

(TransactionType is an enum with roughly 30 values.)

The results shocked me. In all of my tests, the second version was an order of magnitude faster. I expected it might be 2x faster, but not an order of magnitude. Why the difference? Is the null check really that slow, or is something strange going on with the extra array access?

My benchmark code looks like this:
    public class App {

        public enum TransactionType {
            A(1,"A","A"), B(3,"B","B"), C(5,"C","C"), D(6,"D","D"), E(7,"E","E"), F(8,"F","F"),
            G(9,"G","G"), H(10,"H","H"), I(11,"I","I"), J(12,"J","J"), K(13,"K","K"), L(14,"L","L"),
            M(15,"M","M"), N(16,"N","N"), O(17,"O","O"), P(18,"P","P"), Q(19,"Q","Q"), R(20,"R","R"),
            S(21,"S","S"), T(22,"T","T"), U(25,"U","U"), V(26,"V","V"), W(27,"W","W"), X(28,"X","X"),
            Y(29,"Y","Y"), Z(30,"Z","Z"), AA(31,"AA","AA"), AB(32,"AB","AB"), AC(33,"AC","AC"),
            AD(35,"AD","AD"), AE(36,"AE","AE"), AF(37,"AF","AF"), AG(38,"AG","AG"), AH(39,"AH","AH"),
            AI(40,"AI","AI"), AJ(41,"AJ","AJ"), AK(42,"AK","AK"), AL(43,"AL","AL"), AM(44,"AM","AM"),
            AN(45,"AN","AN"), AO(46,"AO","AO"), AP(47,"AP","AP");

            public final static TransactionType[] aArray = { O,Z,N,Y,AB };
            public final static TransactionType[] bArray = { J,P,AA,L,Q,M,K,AE,AK,AF,AD,AG,AH };
            public final static TransactionType[] cArray = { S,U,V };
            public final static TransactionType[] dArray = { A,B,D,G,C,E,T,R,I,F,H,AC,AI,AJ,AL,AM,AN,AO };

            private int id;
            private String abbrev;
            private String name;

            private TransactionType(int id, String abbrev, String name) {
                this.id = id;
                this.abbrev = abbrev;
                this.name = name;
            }

            // New version: no per-element null check.
            public boolean in(TransactionType... types) {
                if (types == null || types.length == 0) return false;
                for (int i = 0; i < types.length; ++i)
                    if (types[i] == this) return true;
                return false;
            }

            // Old version: null check on every element.
            public boolean inOld(TransactionType... types) {
                if (types == null || types.length == 0) return false;
                for (int i = 0; i < types.length; ++i) {
                    if (types[i] != null && types[i] == this) return true;
                }
                return false;
            }
        }

        public static void main(String[] args) {
            for (int i = 0; i < 10; ++i) bench2();
            for (int i = 0; i < 10; ++i) bench1();
        }

        // Counts how many passes over all enum values inOld() completes in one second.
        private static void bench1() {
            final TransactionType[] values = TransactionType.values();
            long runs = 0;
            long currTime = System.currentTimeMillis();
            while (System.currentTimeMillis() - currTime < 1000) {
                for (TransactionType value : values) {
                    value.inOld(TransactionType.dArray);
                }
                ++runs;
            }
            System.out.println("old " + runs);
        }

        // Same measurement for in().
        private static void bench2() {
            final TransactionType[] values = TransactionType.values();
            long runs = 0;
            long currTime = System.currentTimeMillis();
            while (System.currentTimeMillis() - currTime < 1000) {
                for (TransactionType value : values) {
                    value.in(TransactionType.dArray);
                }
                ++runs;
            }
            System.out.println("new " + runs);
        }
    }
Here are the results of the benchmark run:

    new 20164901
    new 20084651
    new 45739657
    new 45735251
    new 45757756
    new 45726575
    new 45413016
    new 45649661
    new 45325360
    new 45380665
    old 2021652
    old 2022286
    old 2246888
    old 2237484
    old 2246172
    old 2268073
    old 2271554
    old 2259544
    old 2272642
    old 2268579

This is using Oracle JDK 1.7.0.67.
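Solution

The null check does not accomplish anything, and I am also surprised that it makes such a difference. But I believe your comments have essentially answered your own question.

@Cogman wrote: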
…iterating over an array involves very little branching and is a highly local operation (meaning it is likely to take a lot of advantage of the cpus cache). The type of branching is also highly predictable and optimized for in most modern cpus…
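To back up the claim that the null check accomplishes nothing: inside an instance method `this` is never null, so `types[i] != null && types[i] == this` can only be true when `types[i] == this` is true. The following is a minimal sketch of that check, not the code from the question. The `Kind` enum, the `InEquivalence` class, and the sample arrays are hypothetical stand-ins for `TransactionType`.

```java
class InEquivalence {

    // Hypothetical, cut-down stand-in for the TransactionType enum.
    enum Kind {
        A, B, C;

        // Version without the per-element null check.
        boolean in(Kind... types) {
            if (types == null || types.length == 0) return false;
            for (int i = 0; i < types.length; ++i)
                if (types[i] == this) return true;
            return false;
        }

        // Version with the redundant per-element null check.
        boolean inOld(Kind... types) {
            if (types == null || types.length == 0) return false;
            for (int i = 0; i < types.length; ++i)
                if (types[i] != null && types[i] == this) return true;
            return false;
        }
    }

    // Returns true if in() and inOld() agree for every value on every sample array,
    // including a null array, an empty array, and arrays that contain null elements.
    static boolean allAgree() {
        Kind[][] samples = {
            null,
            {},
            { null },
            { Kind.A, null, Kind.C },
            { null, null, Kind.B },
            Kind.values()
        };
        for (Kind value : Kind.values())
            for (Kind[] sample : samples)
                if (value.in(sample) != value.inOld(sample)) return false;
        return true;
    }

    public static void main(String[] args) {
        System.out.println("in and inOld agree: " + allAgree());
    }
}
```

Since the two methods return the same answers, any timing difference has to come from how the code is compiled and executed, not from a difference in behaviour.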
If you compile your class and use javap (with the -c option) to print the disassembled bytecode of the two methods, you will see:

    public boolean in(App$TransactionType...);
      Code:
         0: aload_1
         1: ifnull        9
         4: aload_1
         5: arraylength
         6: ifne          11
         9: iconst_0
        10: ireturn
        11: iconst_0
        12: istore_2
        13: iload_2
        14: aload_1
        15: arraylength
        16: if_icmpge     34
        19: aload_1
        20: iload_2
        21: aaload
        22: aload_0
        23: if_acmpne     28
        26: iconst_1
        27: ireturn
        28: iinc          2, 1
        31: goto          13
        34: iconst_0
        35: ireturn

and:

    public boolean inOld(App$TransactionType...);
      Code:
         0: aload_1
         1: ifnull        9
         4: aload_1
         5: arraylength
         6: ifne          11
         9: iconst_0
        10: ireturn
        11: iconst_0
        12: istore_2
        13: iload_2
        14: aload_1
        15: arraylength
        16: if_icmpge     40
        19: aload_1
        20: iload_2
        21: aaload
        22: ifnull        34
        25: aload_1
        26: iload_2
        27: aaload
        28: aload_0
        29: if_acmpne     34
        32: iconst_1
        33: ireturn
        34: iinc          2, 1
        37: goto          13
        40: iconst_0
        41: ireturn
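Your loop was already tight before; now it is extremely tight. The old version loads the array element twice per iteration (two aaload instructions) and takes an extra branch (ifnull), while the new version loads it once.

I would have expected Java to JIT-compile both methods to exactly the same code. Your timing numbers say otherwise.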
Some random numbers:

    1.6.33 32b: 646100 vs 727173
    1.6.33 64b: 1667665 vs 2668513
    1.7.67 32b: 661003 vs 716417
    1.7.07 64b: 1663926 vs 32493989
    1.7.60 64b: 1700574 vs 32368506
    1.8.20 64b: 1648382 vs 32222823

All of the 64-bit JVMs execute both implementations faster than the 32-bit versions do.
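If you want to re-check numbers like these on your own JVM, one variation worth trying is a loop that does a fixed amount of work and actually uses the return value of each call. The benchmark in the question discards the result of `in(...)` and `inOld(...)`, which may give the JIT more freedom in what it keeps. What follows is only a rough sketch under those assumptions, not a rigorous benchmark. `Kind`, `TARGETS`, `InTiming`, and `countHits` are hypothetical names, and a dedicated harness such as JMH would give more trustworthy figures.

```java
class InTiming {

    // Hypothetical, cut-down stand-in for the TransactionType enum.
    enum Kind {
        A, B, C, D, E, F, G, H;

        boolean in(Kind... types) {
            if (types == null || types.length == 0) return false;
            for (int i = 0; i < types.length; ++i)
                if (types[i] == this) return true;
            return false;
        }

        boolean inOld(Kind... types) {
            if (types == null || types.length == 0) return false;
            for (int i = 0; i < types.length; ++i)
                if (types[i] != null && types[i] == this) return true;
            return false;
        }
    }

    // Hypothetical lookup array, playing the role of dArray in the question.
    static final Kind[] TARGETS = { Kind.B, Kind.D, Kind.G };

    // Runs `rounds` passes over all enum values and returns how many lookups matched.
    // Returning the count means the result of every call is actually used.
    static long countHits(boolean useOld, int rounds) {
        Kind[] values = Kind.values();
        long hits = 0;
        for (int r = 0; r < rounds; ++r) {
            for (Kind value : values) {
                boolean found = useOld ? value.inOld(TARGETS) : value.in(TARGETS);
                if (found) ++hits;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int rounds = 5_000_000;
        // Several passes, so that later ones run after the JIT has warmed up.
        for (int pass = 0; pass < 5; ++pass) {
            for (boolean useOld : new boolean[] { false, true }) {
                long start = System.nanoTime();
                long hits = countHits(useOld, rounds);
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                System.out.println((useOld ? "old " : "new ") + elapsedMs + " ms, hits=" + hits);
            }
        }
    }
}
```

Both variants must report the same hit count. If the timings still differ by an order of magnitude with the results consumed, the difference is more likely to lie in the code the JIT generates for the loop itself.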