Sunday, September the 18th, marks a month since the Go 1.8 cycle officially opened. I’m passionate about the performance of Go programs, and of the compiler itself. This post is a brief look at the state of play, roughly halfway into the development cycle for Go 1.8 [1].
Note: these results are of course preliminary and represent only a point in time, not the performance of the final Go 1.8 release.
Compile times
Nothing much to report here. Using the methodology from my previous Go 1.7 benchmarks, there is a 3.22%–5.11% improvement in full compile time compared to Go 1.7.
Performance improvements
Intel amd64
Better code generation, together with minor changes to the runtime and standard library, yields some small improvements on amd64 [2], but nothing to write home about yet.
name                     old time/op    new time/op    delta
BinaryTree17-4           3.07s ± 2%     3.06s ± 2%     ~        (p=0.661 n=10+9)
Fannkuch11-4             3.23s ± 1%     3.22s ± 0%     -0.43%   (p=0.008 n=9+10)
FmtFprintfEmpty-4        64.4ns ± 0%    61.8ns ± 4%    -4.17%   (p=0.005 n=9+10)
FmtFprintfString-4       162ns ± 0%     162ns ± 0%     ~        (p=0.065 n=10+9)
FmtFprintfInt-4          142ns ± 0%     142ns ± 0%     ~        (p=0.137 n=8+10)
FmtFprintfIntInt-4       220ns ± 0%     217ns ± 0%     -1.18%   (p=0.000 n=9+10)
FmtFprintfPrefixedInt-4  224ns ± 0%     224ns ± 1%     ~        (p=0.206 n=9+9)
FmtFprintfFloat-4        313ns ± 0%     312ns ± 0%     -0.26%   (p=0.001 n=10+9)
FmtManyArgs-4            906ns ± 0%     894ns ± 0%     -1.32%   (p=0.000 n=7+6)
GobDecode-4              8.88ms ± 1%    8.81ms ± 0%    -0.81%   (p=0.003 n=10+10)
GobEncode-4              7.93ms ± 1%    7.88ms ± 0%    -0.66%   (p=0.008 n=9+10)
Gzip-4                   272ms ± 1%     277ms ± 0%     +1.95%   (p=0.000 n=10+9)
Gunzip-4                 47.4ms ± 0%    47.4ms ± 0%    ~        (p=0.720 n=9+10)
HTTPClientServer-4       201µs ± 4%     202µs ± 2%     ~        (p=0.631 n=10+10)
JSONEncode-4             19.3ms ± 0%    19.3ms ± 0%    ~        (p=0.063 n=10+10)
JSONDecode-4             61.0ms ± 0%    61.2ms ± 0%    +0.33%   (p=0.000 n=10+8)
Mandelbrot200-4          5.20ms ± 0%    5.20ms ± 0%    ~        (p=0.475 n=10+7)
GoParse-4                3.95ms ± 1%    3.97ms ± 1%    +0.65%   (p=0.003 n=9+9)
RegexpMatchEasy0_32-4    88.4ns ± 0%    88.7ns ± 0%    +0.34%   (p=0.001 n=10+9)
RegexpMatchEasy0_1K-4    1.14µs ± 0%    1.14µs ± 0%    ~        (p=0.369 n=9+6)
RegexpMatchEasy1_32-4    82.6ns ± 0%    82.0ns ± 0%    -0.70%   (p=0.000 n=9+10)
RegexpMatchEasy1_1K-4    469ns ± 0%     463ns ± 0%     -1.23%   (p=0.000 n=6+9)
RegexpMatchMedium_32-4   138ns ± 1%     136ns ± 0%     -1.38%   (p=0.000 n=10+9)
RegexpMatchMedium_1K-4   43.6µs ± 1%    42.0µs ± 0%    -3.74%   (p=0.000 n=9+9)
RegexpMatchHard_32-4     2.25µs ± 1%    2.23µs ± 0%    -0.57%   (p=0.000 n=8+8)
RegexpMatchHard_1K-4     68.8µs ± 0%    68.6µs ± 0%    -0.37%   (p=0.000 n=8+8)
Revcomp-4                477ms ± 1%     472ms ± 0%     -1.03%   (p=0.000 n=8+8)
Template-4               76.1ms ± 0%    76.4ms ± 0%    +0.35%   (p=0.000 n=9+9)
TimeParse-4              367ns ± 0%     366ns ± 0%     -0.16%   (p=0.003 n=10+8)
TimeFormat-4             386ns ± 0%     384ns ± 0%     -0.58%   (p=0.000 n=9+9)

name                     old speed      new speed      delta
GobDecode-4              86.4MB/s ± 1%  87.1MB/s ± 0%  +0.81%   (p=0.003 n=10+10)
GobEncode-4              96.7MB/s ± 1%  97.4MB/s ± 0%  +0.66%   (p=0.007 n=9+10)
Gzip-4                   71.4MB/s ± 1%  70.0MB/s ± 0%  -1.91%   (p=0.000 n=10+9)
Gunzip-4                 409MB/s ± 0%   410MB/s ± 0%   ~        (p=0.703 n=9+10)
JSONEncode-4             101MB/s ± 0%   100MB/s ± 0%   ~        (p=0.084 n=10+10)
JSONDecode-4             31.8MB/s ± 0%  31.7MB/s ± 0%  -0.33%   (p=0.000 n=10+8)
GoParse-4                14.7MB/s ± 1%  14.6MB/s ± 1%  -0.67%   (p=0.002 n=9+9)
RegexpMatchEasy0_32-4    362MB/s ± 0%   361MB/s ± 0%   -0.36%   (p=0.000 n=10+9)
RegexpMatchEasy0_1K-4    898MB/s ± 0%   898MB/s ± 0%   ~        (p=0.762 n=9+8)
RegexpMatchEasy1_32-4    387MB/s ± 0%   390MB/s ± 0%   +0.70%   (p=0.000 n=9+10)
RegexpMatchEasy1_1K-4    2.18GB/s ± 0%  2.21GB/s ± 0%  +1.20%   (p=0.000 n=9+9)
RegexpMatchMedium_32-4   7.23MB/s ± 1%  7.32MB/s ± 0%  +1.19%   (p=0.000 n=10+9)
RegexpMatchMedium_1K-4   23.5MB/s ± 1%  24.4MB/s ± 0%  +3.88%   (p=0.000 n=9+9)
RegexpMatchHard_32-4     14.2MB/s ± 1%  14.3MB/s ± 0%  +0.58%   (p=0.000 n=8+8)
RegexpMatchHard_1K-4     14.9MB/s ± 0%  14.9MB/s ± 0%  +0.34%   (p=0.000 n=8+7)
Revcomp-4                533MB/s ± 1%   539MB/s ± 0%   +1.04%   (p=0.000 n=8+8)
Template-4               25.5MB/s ± 0%  25.4MB/s ± 0%  -0.36%   (p=0.000 n=9+9)
ARM
The major improvement that landed recently in the development branch is the conversion of the remaining architecture backends to use the compiler’s SSA form. This has brought a substantial improvement in generated code for non-Intel architectures, like ARM [3].
name                     old time/op    new time/op    delta
BinaryTree17-4           33.8s ± 1%     27.7s ± 0%     -18.06%  (p=0.000 n=10+10)
Fannkuch11-4             42.0s ± 0%     19.3s ± 0%     -54.10%  (p=0.000 n=10+10)
FmtFprintfEmpty-4        670ns ± 1%     581ns ± 1%     -13.30%  (p=0.000 n=10+10)
FmtFprintfString-4       2.04µs ± 1%    1.65µs ± 0%    -19.09%  (p=0.000 n=10+10)
FmtFprintfInt-4          1.71µs ± 0%    1.21µs ± 0%    -29.39%  (p=0.000 n=10+9)
FmtFprintfIntInt-4       2.69µs ± 1%    1.94µs ± 0%    -27.77%  (p=0.000 n=10+10)
FmtFprintfPrefixedInt-4  2.70µs ± 0%    1.85µs ± 0%    -31.41%  (p=0.000 n=10+9)
FmtFprintfFloat-4        5.15µs ± 0%    3.65µs ± 0%    -29.01%  (p=0.000 n=9+10)
FmtManyArgs-4            11.3µs ± 0%    8.5µs ± 0%     -24.79%  (p=0.000 n=10+9)
GobDecode-4              112ms ± 0%     77ms ± 1%      -31.04%  (p=0.000 n=9+9)
GobEncode-4              88.5ms ± 1%    77.2ms ± 1%    -12.78%  (p=0.000 n=10+10)
Gzip-4                   4.79s ± 0%     3.34s ± 0%     -30.18%  (p=0.000 n=9+9)
Gunzip-4                 702ms ± 0%     463ms ± 0%     -34.05%  (p=0.000 n=10+10)
HTTPClientServer-4       645µs ± 3%     571µs ± 3%     -11.45%  (p=0.000 n=10+10)
JSONEncode-4             227ms ± 0%     186ms ± 0%     -18.16%  (p=0.000 n=10+10)
JSONDecode-4             845ms ± 0%     618ms ± 0%     -26.81%  (p=0.000 n=10+10)
Mandelbrot200-4          59.3ms ± 0%    40.0ms ± 0%    -32.47%  (p=0.000 n=10+10)
GoParse-4                45.0ms ± 0%    37.0ms ± 0%    -17.68%  (p=0.000 n=9+9)
RegexpMatchEasy0_32-4    974ns ± 0%     878ns ± 0%     -9.81%   (p=0.000 n=10+9)
RegexpMatchEasy0_1K-4    4.60µs ± 0%    4.48µs ± 0%    -2.57%   (p=0.000 n=10+10)
RegexpMatchEasy1_32-4    1.02µs ± 0%    0.94µs ± 0%    -8.08%   (p=0.000 n=8+10)
RegexpMatchEasy1_1K-4    6.92µs ± 0%    6.08µs ± 0%    -12.10%  (p=0.000 n=10+10)
RegexpMatchMedium_32-4   1.61µs ± 0%    1.27µs ± 0%    -20.98%  (p=0.000 n=9+6)
RegexpMatchMedium_1K-4   447µs ± 0%     317µs ± 0%     -29.05%  (p=0.000 n=10+9)
RegexpMatchHard_32-4     24.9µs ± 0%    18.4µs ± 0%    -25.89%  (p=0.000 n=10+10)
RegexpMatchHard_1K-4     740µs ± 0%     552µs ± 0%     -25.36%  (p=0.000 n=10+10)
Revcomp-4                81.0ms ± 1%    65.2ms ± 0%    -19.53%  (p=0.000 n=9+9)
Template-4               1.17s ± 0%     0.81s ± 0%     -31.28%  (p=0.000 n=9+9)
TimeParse-4              5.52µs ± 0%    3.79µs ± 0%    -31.42%  (p=0.000 n=10+9)
TimeFormat-4             10.6µs ± 0%    8.5µs ± 0%     -19.14%  (p=0.000 n=10+10)

name                     old speed      new speed      delta
GobDecode-4              6.86MB/s ± 0%  9.95MB/s ± 1%  +45.00%  (p=0.000 n=9+9)
GobEncode-4              8.67MB/s ± 1%  9.94MB/s ± 1%  +14.69%  (p=0.000 n=10+10)
Gzip-4                   4.05MB/s ± 0%  5.81MB/s ± 0%  +43.32%  (p=0.000 n=10+9)
Gunzip-4                 27.6MB/s ± 0%  41.9MB/s ± 0%  +51.63%  (p=0.000 n=10+10)
JSONEncode-4             8.53MB/s ± 0%  10.43MB/s ± 0% +22.20%  (p=0.000 n=10+10)
JSONDecode-4             2.30MB/s ± 0%  3.14MB/s ± 0%  +36.39%  (p=0.000 n=9+10)
GoParse-4                1.29MB/s ± 0%  1.56MB/s ± 0%  +20.93%  (p=0.000 n=9+10)
RegexpMatchEasy0_32-4    32.8MB/s ± 0%  36.4MB/s ± 0%  +10.87%  (p=0.000 n=10+10)
RegexpMatchEasy0_1K-4    222MB/s ± 0%   228MB/s ± 0%   +2.64%   (p=0.000 n=10+10)
RegexpMatchEasy1_32-4    31.3MB/s ± 0%  34.0MB/s ± 0%  +8.75%   (p=0.000 n=9+10)
RegexpMatchEasy1_1K-4    148MB/s ± 0%   168MB/s ± 0%   +13.76%  (p=0.000 n=10+10)
RegexpMatchMedium_32-4   620kB/s ± 0%   790kB/s ± 0%   +27.42%  (p=0.000 n=10+8)
RegexpMatchMedium_1K-4   2.29MB/s ± 0%  3.23MB/s ± 0%  +41.05%  (p=0.000 n=10+10)
RegexpMatchHard_32-4     1.29MB/s ± 0%  1.74MB/s ± 0%  +34.88%  (p=0.000 n=9+10)
RegexpMatchHard_1K-4     1.38MB/s ± 0%  1.85MB/s ± 0%  +34.06%  (p=0.000 n=10+10)
Revcomp-4                31.4MB/s ± 1%  39.0MB/s ± 0%  +24.26%  (p=0.000 n=9+9)
Template-4               1.65MB/s ± 0%  2.41MB/s ± 0%  +45.71%  (p=0.000 n=10+9)
Notes:
1. Although the Go 1.8 development cycle opened 18 days late, the feature freeze for this cycle will still occur on the 1st of November, in order to keep to the 6 month cadence.
2. Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz, 3.13.0-95-generic #142-Ubuntu
3. Freescale i.MX6, 3.14.77-1-ARCH