    cmd/compile: clean up and optimize s390x multiplication rules · 6e876f19
    Michael Munday authored
    Some of the existing optimizations aren't triggered because they
    are handled by the generic rules, so this CL removes them. Also,
    some constraints were copied without much thought from the amd64
    rules and don't make sense on s390x, so we remove those
    constraints.
    
    Finally, add a 'multiply by the sum of two powers of two'
    optimization. This makes sense on s390x as shifts are low latency
    and can also sometimes be optimized further (especially if we add
    support for RISBG instructions).
    
    name                   old time/op  new time/op  delta
    IntMulByConst/3-8      1.70ns ±11%  1.10ns ± 5%  -35.26%  (p=0.000 n=10+10)
    IntMulByConst/5-8      1.64ns ± 7%  1.10ns ± 4%  -32.94%  (p=0.000 n=10+9)
    IntMulByConst/12-8     1.65ns ± 6%  1.20ns ± 4%  -27.16%  (p=0.000 n=10+9)
    IntMulByConst/120-8    1.66ns ± 4%  1.22ns ±13%  -26.43%  (p=0.000 n=10+10)
    IntMulByConst/-120-8   1.65ns ± 7%  1.19ns ± 4%  -28.06%  (p=0.000 n=9+10)
    IntMulByConst/65537-8  0.86ns ± 9%  1.12ns ±12%  +30.41%  (p=0.000 n=10+10)
    IntMulByConst/65538-8  1.65ns ± 5%  1.23ns ± 5%  -25.11%  (p=0.000 n=10+10)
    
    Change-Id: Ib196e6bff1e97febfd266134d0a2b2a62897989f
    Reviewed-on: https://go-review.googlesource.com/c/go/+/248937
    
    
    Run-TryBot: Michael Munday <mike.munday@ibm.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: Keith Randall <khr@golang.org>