  • zlib adler_simd.c · 17bbb3d7
    Noel Gordon authored
    Add an SSSE3 implementation of the adler32 checksum, suitable
    for both large workloads and the small workloads commonly seen
    during PNG image decoding. Add a NEON implementation.
    
    For small inputs, speed is comparable to the serial adler32
    computation. Near 64 bytes of input data, the SIMD code paths
    begin to be faster than the serial path: 3x faster at 256 bytes
    of input data, rising to ~8x faster for 1M of input data (~4x
    on ARMv8 NEON).
    
    For the PNG 140 image corpus, PNG decoding speed is ~8% faster
    on average on the desktop machines tested, and ~2% faster on an
    ARMv8 Pixel C Android (N) tablet; see
    https://crbug.com/762564#c41
    
    Update x86.{c,h} to detect SSSE3 support at runtime and use it
    to enable the adler32_simd code path, and update inflate.c to
    call x86_check_features(). Update the name mangler file names.h
    for the new symbols, and add a FIXME about simd.patch.
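
    A minimal sketch of the kind of runtime check described above,
    assuming a GCC or Clang toolchain that provides <cpuid.h>. The
    helper name have_ssse3() is hypothetical and is not the code in
    x86.c; it only shows that SSSE3 is reported in bit 9 of ECX for
    CPUID leaf 1. On non-x86 targets the sketch reports no support.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header (assumed available) */
#endif

/* Hypothetical helper: returns 1 if the CPU reports SSSE3, else 0. */
int have_ssse3(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 if the requested leaf is unsupported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;

    /* CPUID leaf 1, ECX bit 9 is the SSSE3 feature flag. */
    return (int)((ecx >> 9) & 1u);
#else
    return 0;   /* not an x86 target: no SSSE3 */
#endif
}
```

    A caller would typically evaluate such a check once and cache
    the result before choosing between the SIMD and serial paths.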
    
    Ignore data alignment in the SSSE3 case since unaligned access
    is no longer penalized on current-generation Intel CPUs. Do
    honor alignment in the NEON case, however, to avoid the extra
    cost of unaligned memory access on ARMv8/v7.
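
    One common way to honor alignment is to handle a short head of
    the buffer serially until the pointer reaches a 16-byte
    boundary, then run the vector loop on aligned data. The sketch
    below only computes the length of that head; the function name
    is hypothetical and the 16-byte boundary is an assumption based
    on the 128-bit width of NEON registers, not a statement about
    the committed code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of leading bytes to process serially
 * so that (p + result) is 16-byte aligned. Returns 0..15. */
size_t head_bytes_to_align16(const void *p)
{
    uintptr_t addr = (uintptr_t)p;

    /* (16 - addr % 16) % 16, written with masks. */
    return (size_t)((16u - (addr & 15u)) & 15u);
}
```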
    
    NEON credits: the v_s1/s2 vector component accumulate code was
    provided by Adenilson Cavalcanti. The uint16 column vector sum
    code is from libdeflate, with corrections to process NMAX input
    bytes, which improves performance by 3% for large buffers.
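
    For context on NMAX, here is a minimal serial Adler-32 sketch.
    It is not the SIMD code from this change; the function and
    macro names are illustrative. It shows the usual technique of
    deferring the modulo: up to NMAX (5552) bytes can be summed
    into 32-bit accumulators before a reduction mod 65521 is needed
    to avoid overflow.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u  /* largest prime smaller than 65536 */
#define ADLER_NMAX 5552u   /* max bytes before 32-bit sums can overflow */

/* Illustrative serial Adler-32. Pass adler = 1 to start a new checksum,
 * or a previous result to continue one. */
uint32_t adler32_serial(uint32_t adler, const unsigned char *buf, size_t len)
{
    uint32_t s1 = adler & 0xffffu;          /* running byte sum */
    uint32_t s2 = (adler >> 16) & 0xffffu;  /* running sum of s1 values */

    while (len > 0) {
        /* Process at most NMAX bytes between modulo reductions. */
        size_t n = len < ADLER_NMAX ? len : ADLER_NMAX;
        len -= n;

        while (n--) {
            s1 += *buf++;
            s2 += s1;
        }

        s1 %= ADLER_BASE;
        s2 %= ADLER_BASE;
    }

    return (s2 << 16) | s1;
}
```

    Processing full NMAX-sized blocks keeps the number of modulo
    operations to a minimum, which is the property the correction
    mentioned above relies on.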
    
    Update BUILD.gn to put the code in its own source set, and add
    it conditionally to the zlib library build rule. On ARM, build
    the SIMD code with the max-speed config to produce the smallest
    code.
    
    No change in behavior, covered by many existing tests.
    
    Bug: 762564
    Change-Id: I14a39940ae113b5a67ba70a99c3741e289b1796b
    Reviewed-on: https://chromium-review.googlesource.com/660019
    
    
    Commit-Queue: Chris Blume <cblume@chromium.org>
    Reviewed-by: Adenilson Cavalcanti <cavalcantii@chromium.org>
    Reviewed-by: Chris Blume <cblume@chromium.org>
    Cr-Original-Commit-Position: refs/heads/master@{#505447}
    Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src
    Cr-Mirrored-Commit: 09b784fd12f255a9da38107ac6e0386f4dde6d68