runtime: fix 32B backward copy on ppc64x

The test to enter the 32B copy loop always fails, and execution falls
back to a single 8B/iteration copy loop for copies of more than 7 bytes.
Likewise, the 32B loop has SRC/DST args mixed, and fails to truncate
DWORDS after completing.

Fix these, and unroll the 8B/iteration loop as it will only execute
1-3 times if reached.

POWER10 benchmarks:

name                             old speed      new speed       delta
MemmoveOverlap/32                5.28GB/s ± 0%  10.37GB/s ± 0%   +96.22%
MemmoveOverlap/64                5.97GB/s ± 0%  18.15GB/s ± 0%  +203.95%
MemmoveOverlap/128               7.67GB/s ± 0%  24.35GB/s ± 0%  +217.41%
MemmoveOverlap/256               14.1GB/s ± 0%   25.0GB/s ± 0%   +77.48%
MemmoveOverlap/512               14.2GB/s ± 0%   30.9GB/s ± 0%  +118.19%
MemmoveOverlap/1024              12.3GB/s ± 0%   36.4GB/s ± 0%  +194.75%
MemmoveOverlap/2048              13.7GB/s ± 0%   48.8GB/s ± 0%  +255.24%
MemmoveOverlap/4096              14.1GB/s ± 0%   43.4GB/s ± 0%  +208.80%
MemmoveUnalignedDstOverlap/32    5.07GB/s ± 0%   3.78GB/s ± 0%   -25.33%
MemmoveUnalignedDstOverlap/64    6.00GB/s ± 0%   9.59GB/s ± 0%   +59.78%
MemmoveUnalignedDstOverlap/128   7.66GB/s ± 0%  13.51GB/s ± 0%   +76.42%
MemmoveUnalignedDstOverlap/256   13.4GB/s ± 0%   24.3GB/s ± 0%   +80.92%
MemmoveUnalignedDstOverlap/512   13.9GB/s ± 0%   30.3GB/s ± 0%  +118.29%
MemmoveUnalignedDstOverlap/1024  12.3GB/s ± 0%   37.3GB/s ± 0%  +203.07%
MemmoveUnalignedDstOverlap/2048  13.7GB/s ± 0%   45.9GB/s ± 0%  +235.39%
MemmoveUnalignedDstOverlap/4096  13.9GB/s ± 0%   41.2GB/s ± 0%  +196.34%
MemmoveUnalignedSrcOverlap/32    5.13GB/s ± 0%   5.18GB/s ± 0%    +0.98%
MemmoveUnalignedSrcOverlap/64    6.26GB/s ± 0%   9.53GB/s ± 0%   +52.29%
MemmoveUnalignedSrcOverlap/128   7.94GB/s ± 0%  18.40GB/s ± 0%  +131.76%
MemmoveUnalignedSrcOverlap/256   14.1GB/s ± 0%   25.5GB/s ± 0%   +81.40%
MemmoveUnalignedSrcOverlap/512   14.2GB/s ± 0%   30.9GB/s ± 0%  +116.76%
MemmoveUnalignedSrcOverlap/1024  12.4GB/s ± 0%   46.4GB/s ± 0%  +275.22%
MemmoveUnalignedSrcOverlap/2048  13.7GB/s ± 0%   48.7GB/s ± 0%  +255.16%
MemmoveUnalignedSrcOverlap/4096  14.0GB/s ± 0%   43.2GB/s ± 0%  +208.89%

Change-Id: I9fc6956ff454a2856d56077d1014388fb74c1f52
Reviewed-on: https://go-review.googlesource.com/c/go/+/384074
Trust: Paul Murphy <murp@ibm.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>