Skip to content
Snippets Groups Projects
  • Tobias Klauser's avatar
    029a5af6
    internal/bytealg: use word-wise comparison for Compare on arm · 029a5af6
    Tobias Klauser authored
    Use word-wise comparison for aligned buffers, otherwise fall back to the
    current byte-wise comparison.
    
    name                           old time/op    new time/op    delta
    BytesCompare/1-4                 41.3ns ± 0%    36.4ns ± 1%   -11.73%  (p=0.008 n=5+5)
    BytesCompare/2-4                 39.5ns ± 0%    39.5ns ± 1%      ~     (p=0.960 n=5+5)
    BytesCompare/4-4                 45.3ns ± 0%    41.0ns ± 1%    -9.40%  (p=0.008 n=5+5)
    BytesCompare/8-4                 64.8ns ± 1%    44.7ns ± 0%   -31.12%  (p=0.008 n=5+5)
    BytesCompare/16-4                86.3ns ± 0%    55.1ns ± 0%   -36.21%  (p=0.008 n=5+5)
    BytesCompare/32-4                 135ns ± 0%      70ns ± 1%   -47.73%  (p=0.008 n=5+5)
    BytesCompare/64-4                 231ns ± 1%      99ns ± 0%   -57.27%  (p=0.016 n=5+4)
    BytesCompare/128-4                424ns ± 0%     147ns ± 0%   -65.31%  (p=0.000 n=4+5)
    BytesCompare/256-4                810ns ± 0%     243ns ± 0%   -69.96%  (p=0.008 n=5+5)
    BytesCompare/512-4               1.59µs ± 0%    0.44µs ± 0%   -72.43%  (p=0.008 n=5+5)
    BytesCompare/1024-4              3.14µs ± 1%    0.83µs ± 1%   -73.56%  (p=0.008 n=5+5)
    BytesCompare/2048-4              6.23µs ± 0%    1.61µs ± 1%   -74.21%  (p=0.008 n=5+5)
    CompareBytesEqual-4              79.4ns ± 0%    52.2ns ± 0%   -34.23%  (p=0.008 n=5+5)
    CompareBytesToNil-4              31.0ns ± 0%    30.3ns ± 0%    -2.32%  (p=0.008 n=5+5)
    CompareBytesEmpty-4              25.7ns ± 0%    25.7ns ± 0%      ~     (p=0.556 n=4+5)
    CompareBytesIdentical-4          25.7ns ± 0%    25.7ns ± 0%      ~     (p=1.000 n=5+5)
    CompareBytesSameLength-4         49.1ns ± 0%    48.5ns ± 0%    -1.26%  (p=0.008 n=5+5)
    CompareBytesDifferentLength-4    49.8ns ± 1%    49.3ns ± 0%    -1.08%  (p=0.008 n=5+5)
    CompareBytesBigUnaligned-4       5.71ms ± 1%    5.68ms ± 1%      ~     (p=0.222 n=5+5)
    CompareBytesBig-4                4.95ms ± 0%    2.28ms ± 1%   -53.81%  (p=0.008 n=5+5)
    CompareBytesBigIdentical-4       27.2ns ± 1%    27.3ns ± 1%      ~     (p=0.310 n=5+5)
    
    name                           old speed      new speed      delta
    CompareBytesBigUnaligned-4      184MB/s ± 1%   185MB/s ± 1%      ~     (p=0.222 n=5+5)
    CompareBytesBig-4               212MB/s ± 0%   459MB/s ± 1%  +116.51%  (p=0.008 n=5+5)
    CompareBytesBigIdentical-4     38.5TB/s ± 0%  38.4TB/s ± 1%      ~     (p=0.421 n=5+5)
    
    Also, this reduces time for TestCompareBytes by about 20 sec on a
    linux-arm builder via gomote.
    
    Updates #29001
    
    Change-Id: I25f148739b9ccb7cb1fc97b3d8763549b0a66c16
    Reviewed-on: https://go-review.googlesource.com/c/go/+/165338
    
    
    Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
    029a5af6
    History
    internal/bytealg: use word-wise comparison for Compare on arm
    Tobias Klauser authored
    Use word-wise comparison for aligned buffers, otherwise fall back to the
    current byte-wise comparison.
    
    name                           old time/op    new time/op    delta
    BytesCompare/1-4                 41.3ns ± 0%    36.4ns ± 1%   -11.73%  (p=0.008 n=5+5)
    BytesCompare/2-4                 39.5ns ± 0%    39.5ns ± 1%      ~     (p=0.960 n=5+5)
    BytesCompare/4-4                 45.3ns ± 0%    41.0ns ± 1%    -9.40%  (p=0.008 n=5+5)
    BytesCompare/8-4                 64.8ns ± 1%    44.7ns ± 0%   -31.12%  (p=0.008 n=5+5)
    BytesCompare/16-4                86.3ns ± 0%    55.1ns ± 0%   -36.21%  (p=0.008 n=5+5)
    BytesCompare/32-4                 135ns ± 0%      70ns ± 1%   -47.73%  (p=0.008 n=5+5)
    BytesCompare/64-4                 231ns ± 1%      99ns ± 0%   -57.27%  (p=0.016 n=5+4)
    BytesCompare/128-4                424ns ± 0%     147ns ± 0%   -65.31%  (p=0.000 n=4+5)
    BytesCompare/256-4                810ns ± 0%     243ns ± 0%   -69.96%  (p=0.008 n=5+5)
    BytesCompare/512-4               1.59µs ± 0%    0.44µs ± 0%   -72.43%  (p=0.008 n=5+5)
    BytesCompare/1024-4              3.14µs ± 1%    0.83µs ± 1%   -73.56%  (p=0.008 n=5+5)
    BytesCompare/2048-4              6.23µs ± 0%    1.61µs ± 1%   -74.21%  (p=0.008 n=5+5)
    CompareBytesEqual-4              79.4ns ± 0%    52.2ns ± 0%   -34.23%  (p=0.008 n=5+5)
    CompareBytesToNil-4              31.0ns ± 0%    30.3ns ± 0%    -2.32%  (p=0.008 n=5+5)
    CompareBytesEmpty-4              25.7ns ± 0%    25.7ns ± 0%      ~     (p=0.556 n=4+5)
    CompareBytesIdentical-4          25.7ns ± 0%    25.7ns ± 0%      ~     (p=1.000 n=5+5)
    CompareBytesSameLength-4         49.1ns ± 0%    48.5ns ± 0%    -1.26%  (p=0.008 n=5+5)
    CompareBytesDifferentLength-4    49.8ns ± 1%    49.3ns ± 0%    -1.08%  (p=0.008 n=5+5)
    CompareBytesBigUnaligned-4       5.71ms ± 1%    5.68ms ± 1%      ~     (p=0.222 n=5+5)
    CompareBytesBig-4                4.95ms ± 0%    2.28ms ± 1%   -53.81%  (p=0.008 n=5+5)
    CompareBytesBigIdentical-4       27.2ns ± 1%    27.3ns ± 1%      ~     (p=0.310 n=5+5)
    
    name                           old speed      new speed      delta
    CompareBytesBigUnaligned-4      184MB/s ± 1%   185MB/s ± 1%      ~     (p=0.222 n=5+5)
    CompareBytesBig-4               212MB/s ± 0%   459MB/s ± 1%  +116.51%  (p=0.008 n=5+5)
    CompareBytesBigIdentical-4     38.5TB/s ± 0%  38.4TB/s ± 1%      ~     (p=0.421 n=5+5)
    
    Also, this reduces time for TestCompareBytes by about 20 sec on a
    linux-arm builder via gomote.
    
    Updates #29001
    
    Change-Id: I25f148739b9ccb7cb1fc97b3d8763549b0a66c16
    Reviewed-on: https://go-review.googlesource.com/c/go/+/165338
    
    
    Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
Code owners
Assign users and groups as approvers for specific file changes. Learn more.