完善资料让更多小伙伴认识你,还能领取20积分哦, 立即完善>
摘要: 在stackoverflow上有一个非常有名的问题:为什么处理有序数组要比非有序数组快?,可见分支预测对代码运行效率有非常大的影响。要提高代码执行效率,一个重要的原则就是尽量避免CPU把流水线清空,那么提高分支预测的成功率就非常重要。分支预测在stackoverflow上有一个非常有名的问题:为什么处理有序数组要比非有序数组快?,可见分支预测对代码运行效率有非常大的影响。现代CPU都支持分支预测(branch prediction)和指令流水线(instruction pipeline),这两个结合可以极大提高CPU效率。对于像简单的if跳转,CPU是可以比较好地做分支预测的。但是对于switch跳转,CPU则没有太多的办法。switch本质上是据索引,从地址数组里取地址再跳转。要提高代码执行效率,一个重要的原则就是尽量避免CPU把流水线清空,那么提高分支预测的成功率就非常重要。那么对于代码里,如果某个switch分支概率很高,是否可以考虑代码层面帮CPU把判断提前,来提高代码执行效率呢?Dubbo里ChannelEventRunnable的switch判断在ChannelEventRunnable里有一个switch来判断channel state,然后做对应的逻辑:查看一个channel建立起来之后,超过99.9%情况它的state都是ChannelState.RECEIVED,那么可以考虑把这个判断提前。benchmark验证下面通过jmh来验证下:public class TestBenchMarks {public enum ChannelState { CONNECTED, DISCONNECTED, SENT, RECEIVED, CAUGHT}@State(Scope.Benchmark)public static class ExecutionPlan { @Param({ "1000000" }) public int size; public ChannelState[] states = null; @Setup public void setUp() { ChannelState[] values = ChannelState.values(); states = new ChannelState[size]; Random random = new Random(new Date().getTime()); for (int i = 0; i < size; i++) { int nextInt = random.nextInt(1000000); if (nextInt > 100) { states = ChannelState.RECEIVED; } else { states = values[nextInt % values.length]; } } }}@Fork(value = 5)@Benchmark@BenchmarkMode(Mode.Throughput)public void benchSiwtch(ExecutionPlan plan, Blackhole bh) { int result = 0; for (int i = 0; i < plan.size; ++i) { switch (plan.states) { case CONNECTED: result += ChannelState.CONNECTED.ordinal(); break; case DISCONNECTED: result += ChannelState.DISCONNECTED.ordinal(); break; case SENT: result += ChannelState.SENT.ordinal(); break; case RECEIVED: result += ChannelState.RECEIVED.ordinal(); break; case CAUGHT: result += ChannelState.CAUGHT.ordinal(); break; } } bh.consume(result);}@Fork(value = 5)@Benchmark@BenchmarkMode(Mode.Throughput)public void benchIfAndSwitch(ExecutionPlan plan, Blackhole bh) { int result = 0; for (int i = 0; i < plan.size; ++i) { ChannelState state = plan.states; if (state == ChannelState.RECEIVED) { result += ChannelState.RECEIVED.ordinal(); } else { switch (state) { case CONNECTED: result += ChannelState.CONNECTED.ordinal(); break; case SENT: result += ChannelState.SENT.ordinal(); break; case DISCONNECTED: result += ChannelState.DISCONNECTED.ordinal(); break; case CAUGHT: result += ChannelState.CAUGHT.ordinal(); break; } } } bh.consume(result);}
}
Result "io.github.hengyunabc.jmh.TestBenchMarks.benchSiwtch": 576.745 ±(99.9%) 6.806 ops/s [Average] (min, avg, max) = (490.348, 576.745, 618.360), stdev = 20.066 CI (99.9%): 569.939, 583.550 Run complete. Total time: 00:06:48Benchmark (size) Mode Cnt Score Error Units TestBenchMarks.benchIfAndSwitch 1000000 thrpt 100 1535.867 ± 61.212 ops/s TestBenchMarks.benchSiwtch 1000000 thrpt 100 576.745 ± 6.806 ops/s可以看到提前if判断的确提高了代码效率,这种技巧可以放在性能要求严格的地方。 Benchmark代码:https://github.com/hengyunabc/jmh-demo总结
|
|
|
|
只有小组成员才能发言,加入小组>>
小黑屋| 手机版| Archiver| 电子发烧友 ( 湘ICP备2023018690号 )
GMT+8, 2025-1-24 10:38 , Processed in 0.461661 second(s), Total 68, Slave 49 queries .
Powered by 电子发烧友网
© 2015 bbs.elecfans.com
关注我们的微信
下载发烧友APP
电子发烧友观察
版权所有 © 湖南华秋数字科技有限公司
电子发烧友 (电路图) 湘公网安备 43011202000918 号 电信与信息服务业务经营许可证:合字B2-20210191 工商网监 湘ICP备2023018690号