  1. Dec 20, 2017
  2. Dec 13, 2017
  3. Dec 12, 2017
  4. Dec 08, 2017
    • Improve zlib inflate speed by using SSE2 chunk copy · 64ffef0b
      Noel Gordon authored
      Using SSE2 chunk copies improves the decoding rate of the PNG 140
      corpus by an average 17%, giving a total 37% performance increase
      when combined with SIMD adler32 code (https://crbug.com/772870#c3
      for details).
      
      Move the arm-specific code back into the main chunk copy code and
      generalize the SIMD parts of chunkset_core() with inline function
      calls for ARM and Intel SSE2 devices. This removes the TODO from
      arm/chunkcopy_arm.h, and that file can be deleted as a result.
      
      Add SSE2 vector load / store SSE helpers for chunkset_core(). The
      existing NEON load code had alignment issues, as noted in review.
      Fix that: use unaligned loads in the ARM helper code.
      
      Change chunkcopy.h to use __builtin_memcpy if it's available, use
      zmemcpy otherwise such as on MSVC. Also call x86_check_features()
      in inflateInit2_() to keep the adler32 SIMD code path enabled.
      
      Update BUILD.gn to conditionally compile the SIMD chunk copy code
      on Intel SSE2 and ARM NEON devices. Update names.h to add the new
      symbol defined by the inflate chunk copy code path.
      
      Code had various comment styles; pick one and use it consistently
      everywhere. Add inffast_chunk.h TODO(cblume).
      
      Bug: 772870
      Change-Id: I47004c68ee675acf418825fb0e1f8fa8018d4342
      Reviewed-on: https://chromium-review.googlesource.com/708834
      
      
      Commit-Queue: Noel Gordon <noel@chromium.org>
      Reviewed-by: Chris Blume <cblume@chromium.org>
      Cr-Original-Commit-Position: refs/heads/master@{#522764}
      Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src
      Cr-Mirrored-Commit: c293a3255eb27dee8879f85f2c45dedff58e2452
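The chunk copy idea described in this commit can be sketched as below. This is a minimal illustration, not the code in chunkcopy.h: the names `chunk_t`, `load_chunk`, `store_chunk` and `chunk_copy` are hypothetical, and the non-SSE2 branch uses plain memcpy where the real change adds NEON helpers. It assumes the source and destination do not overlap and that both buffers have at least 15 bytes of slack past `len`, since the first chunk is always moved whole.

```c
#include <stddef.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
typedef __m128i chunk_t;
/* Unaligned 16-byte load and store using SSE2 intrinsics. */
static inline chunk_t load_chunk(const unsigned char *p)
{
    return _mm_loadu_si128((const __m128i *)p);
}
static inline void store_chunk(unsigned char *p, chunk_t v)
{
    _mm_storeu_si128((__m128i *)p, v);
}
#else
/* Portable stand-in: a 16-byte struct moved with memcpy. */
typedef struct { unsigned char b[16]; } chunk_t;
static inline chunk_t load_chunk(const unsigned char *p)
{
    chunk_t v;
    memcpy(v.b, p, sizeof v.b);
    return v;
}
static inline void store_chunk(unsigned char *p, chunk_t v)
{
    memcpy(p, v.b, sizeof v.b);
}
#endif

/* Copy len bytes (len > 0) from src to dst in 16-byte chunks.
 * The first chunk absorbs the odd remainder so every later chunk is whole.
 * May read and write up to 15 bytes past len when len < 16. */
unsigned char *chunk_copy(unsigned char *dst, const unsigned char *src,
                          size_t len)
{
    size_t first = ((len - 1) % 16) + 1;   /* 1..16 bytes consumed first */
    store_chunk(dst, load_chunk(src));
    dst += first;
    src += first;
    len -= first;
    while (len > 0) {                      /* len is now a multiple of 16 */
        store_chunk(dst, load_chunk(src));
        dst += 16;
        src += 16;
        len -= 16;
    }
    return dst;                            /* one past the last byte copied */
}
```

Copying a whole chunk first and then advancing by the remainder avoids a byte-wise tail loop, at the cost of requiring slack in both buffers.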
  5. Nov 30, 2017
    • Revert "Using ARMv8 CRC32 specific instruction" · 0f473a1d
      Boris Sazonov authored
      This reverts commit 35988c821c051a57e30c76f9fcd87b7b677bd9bd.
      
      Reason for revert: broke build ('cpu-features.h' not found)
      https://uberchromegw.corp.google.com/i/internal.client.clank/builders/x64-builder/builds/13697
      
      Original change's description:
      > Using ARMv8 CRC32 specific instruction
      > 
      > CRC32 affects performance for both image decompression (PNG)
      > and general browsing when accessing websites that serve
      > compressed content (e.g. Content-Encoding: gzip).
      >
      > This patch implements an optimized CRC32 function using the
      > dedicated instruction available in ARMv8. This instruction is available
      > in new Android devices featuring an ARMv8 SoC, like the Nexus 5x and
      > Google Pixel.
      >
      > It should be between 6x (A53: 116 ms vs. 22 ms for a 4Kx4Kx4 buffer) and
      > 10x faster (A72: 91 ms vs. 9 ms) than the C implementation currently used
      > by zlib.
      >
      > PNG decoding performance gains should be around 5-9%.
      >
      > Finally, it also introduces code to perform ARM CPU feature detection
      > using getauxval() on Linux/CrOS or android_getCpuFeatures() on Android.
      > We pre-build and link the CRC32 instruction-dependent code but decide
      > whether to use it at run time.
      >
      > If the feature is not supported, we fall back to the C implementation.
      >
      > This approach allows using the instruction in both 32-bit and 64-bit
      > builds and works on either ARMv7 or ARMv8 processors. I tested the
      > generated Chromium apk on both ARMv7 (Nexus 4 and 6) and ARMv8 (Nexus 5x and
      > Google Pixel) devices.
      > 
      > Change-Id: I069408ebc06c49a3c2be4ba3253319e025ee09d7
      > Bug: 709716
      > Reviewed-on: https://chromium-review.googlesource.com/612629
      
      
      > Reviewed-by: Chris Blume <cblume@chromium.org>
      > Commit-Queue: Adenilson Cavalcanti <cavalcantii@chromium.org>
      > Cr-Commit-Position: refs/heads/master@{#520377}
      
      TBR=agl@chromium.org,noel@chromium.org,cavalcantii@chromium.org,cblume@chromium.org,mtklein@chromium.org,adenilson.cavalcanti@arm.com
      
      Change-Id: Ief2c32df5c8a37635f937cd6a671f5574f5a53a3
      No-Presubmit: true
      No-Tree-Checks: true
      No-Try: true
      Bug: 709716
      Reviewed-on: https://chromium-review.googlesource.com/799930
      
      
      Reviewed-by: Chris Blume <cblume@chromium.org>
      Reviewed-by: Boris Sazonov <bsazonov@chromium.org>
      Commit-Queue: Boris Sazonov <bsazonov@chromium.org>
      Cr-Original-Commit-Position: refs/heads/master@{#520497}
      Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src
      Cr-Mirrored-Commit: e7d9a4649bde6f047105d29f0026dd8c3d54143a
    • Using ARMv8 CRC32 specific instruction · d7601c23
      Adenilson Cavalcanti authored
      CRC32 affects performance for both image decompression (PNG)
      and general browsing when accessing websites that serve
      compressed content (e.g. Content-Encoding: gzip).
      
      This patch implements an optimized CRC32 function using the
      dedicated instruction available in ARMv8. This instruction is available
      in new Android devices featuring an ARMv8 SoC, like the Nexus 5x and
      Google Pixel.
      
      It should be between 6x (A53: 116 ms vs. 22 ms for a 4Kx4Kx4 buffer) and
      10x faster (A72: 91 ms vs. 9 ms) than the C implementation currently used
      by zlib.
      
      PNG decoding performance gains should be around 5-9%.
      
      Finally, it also introduces code to perform ARM CPU feature detection
      using getauxval() on Linux/CrOS or android_getCpuFeatures() on Android.
      We pre-build and link the CRC32 instruction-dependent code but decide
      whether to use it at run time.
      
      If the feature is not supported, we fall back to the C implementation.
      
      This approach allows using the instruction in both 32-bit and 64-bit
      builds and works on either ARMv7 or ARMv8 processors. I tested the
      generated Chromium apk on both ARMv7 (Nexus 4 and 6) and ARMv8 (Nexus 5x and
      Google Pixel) devices.
      
      Change-Id: I069408ebc06c49a3c2be4ba3253319e025ee09d7
      Bug: 709716
      Reviewed-on: https://chromium-review.googlesource.com/612629
      
      
      Reviewed-by: Chris Blume <cblume@chromium.org>
      Commit-Queue: Adenilson Cavalcanti <cavalcantii@chromium.org>
      Cr-Original-Commit-Position: refs/heads/master@{#520377}
      Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src
      Cr-Mirrored-Commit: 35988c821c051a57e30c76f9fcd87b7b677bd9bd
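The run-time dispatch described in this commit might look roughly like the sketch below. It is an illustration under stated assumptions, not the patch itself: the function names are hypothetical, only the 64-bit Linux getauxval() path is shown (the Android android_getCpuFeatures() path is omitted), and the hardware path is compiled only when the compiler already defines `__ARM_FEATURE_CRC32`. On any other target the code falls through to a portable bitwise CRC-32.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>
#include <asm/hwcap.h>
#endif
#if defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>
#endif

/* Portable fallback: bitwise CRC-32 with the reflected polynomial
 * 0xEDB88320, matching the checksum zlib's crc32() computes. */
uint32_t crc32_portable(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc = ~crc;
    while (len--) {
        crc ^= *buf++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

#if defined(__ARM_FEATURE_CRC32)
/* Hardware path: one CRC32B instruction per input byte. */
uint32_t crc32_armv8(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc = ~crc;
    while (len--)
        crc = __crc32b(crc, *buf++);
    return ~crc;
}
#endif

/* Ask the kernel whether the CPU implements the CRC32 instructions. */
int cpu_has_crc32(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_CRC32)
    return (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
#else
    return 0;   /* unknown platform: assume the feature is absent */
#endif
}

/* Pick the implementation at run time, falling back to portable C. */
uint32_t crc32_dispatch(uint32_t crc, const unsigned char *buf, size_t len)
{
#if defined(__ARM_FEATURE_CRC32)
    if (cpu_has_crc32())
        return crc32_armv8(crc, buf, len);
#endif
    return crc32_portable(crc, buf, len);
}
```

A real implementation would process several bytes per instruction and cache the detection result rather than querying it on every call.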
  6. Nov 15, 2017
  7. Nov 13, 2017
  8. Nov 10, 2017
  9. Nov 04, 2017
  10. Nov 03, 2017
  11. Nov 02, 2017
  12. Oct 31, 2017
  13. Oct 30, 2017
    • Isolating ARM specific code in inffast · f44229bb
      Adenilson Cavalcanti authored
      The NEON specific code will be hosted in the folder
      'contrib/optimizations/arm' while the platform independent
      C code is hosted in the upper directory.
      
      This makes it easy to implement the inffast optimization for other
      architectures by implementing 2 functions and including the
      necessary header in chunk_copy.h (which is used by inflate and inffast).
      
      The idea is to move all optimizations to this new folder over time.
      
      Bug: 769880
      Change-Id: I404ec0fdf3f6867c9c124da859ca38bf57b25447
      Reviewed-on: https://chromium-review.googlesource.com/740907
      
      
      Reviewed-by: Chris Blume <cblume@chromium.org>
      Commit-Queue: Adenilson Cavalcanti <cavalcantii@chromium.org>
      Cr-Original-Commit-Position: refs/heads/master@{#512542}
      Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src
      Cr-Mirrored-Commit: 626df311481f7fac07b58799fbc94e09c848f01d
  14. Oct 18, 2017
    • Changing the FileAccessor API in zip.h to improve performance over IPC. · 945bc9fb
      Jay Civelli authored
      When using zip::Zip() with an IPC based FileAccessor, zipping
      directories with a large number of files triggers many IPC calls,
      making the entire operation significantly slower than with direct file
      access.
      In order to alleviate this performance hit, this patch groups file
      reads by modifying the FileAccessor read method so it reads multiple
      files at once. zip::Zip() can then group these reads when writing the
      ZIP file.
      The writing code has been factored out into a new ZipWriter class to
      make that code more readable.
      
      Bug: 773310
      Change-Id: I8121980bf05d87a174c63164840ec6bf325c7e52
      Reviewed-on: https://chromium-review.googlesource.com/719356
      
      
      Commit-Queue: Jay Civelli <jcivelli@chromium.org>
      Reviewed-by: Ilya Sherman <isherman@chromium.org>
      Cr-Original-Commit-Position: refs/heads/master@{#509693}
      Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src
      Cr-Mirrored-Commit: d0cb5e408404d652492171bbed9c8ecd3d44a9aa
  15. Oct 13, 2017
  16. Oct 12, 2017
  17. Sep 29, 2017
    • zlib adler_simd.c · 17bbb3d7
      Noel Gordon authored
      Add SSSE3 implementation of the adler32 checksum, suitable for
      both large workloads, and small workloads commonly seen during
      PNG image decoding. Add a NEON implementation.
      
      Speed is comparable to the serial adler32 computation but near
      64 bytes of input data, the SIMD code paths begin to be faster
      than the serial path: 3x faster at 256 bytes of input data, to
      ~8x faster for 1M of input data (~4x on ARMv8 NEON).
      
      For the PNG 140 image corpus, PNG decoding speed is ~8% faster
      on average on the desktop machines tested, and ~2% on an ARMv8
      Pixel C Android (N) tablet, https://crbug.com/762564#c41
      
      Update x86.{c,h} to runtime detect SSSE3 support and use it to
      enable the adler32_simd code path and update inflate.c to call
      x86_check_features(). Update the name mangler file names.h for
      the new symbols added, add FIXME about simd.patch.
      
      Ignore data alignment in the SSSE3 case since unaligned access
      is no longer penalized on current generation Intel CPUs. Respect
      alignment in the NEON case, however, to avoid the extra costs of
      unaligned memory access on ARMv8/v7.
      
      NEON credits: the v_s1/s2 vector component accumulate code was
      provided by Adenilson Cavalcanti. The uint16 column vector sum
      code is from libdeflate with corrections to process NMAX input
      bytes which improves performance by 3% for large buffers.
      
      Update BUILD.gn to put the code in its own source set, and add
      it conditionally to the zlib library build rule. On ARM, build
      the SIMD with max-speed config to produce the smallest code.
      
      No change in behavior, covered by many existing tests.
      
      Bug: 762564
      Change-Id: I14a39940ae113b5a67ba70a99c3741e289b1796b
      Reviewed-on: https://chromium-review.googlesource.com/660019
      
      
      Commit-Queue: Chris Blume <cblume@chromium.org>
      Reviewed-by: Adenilson Cavalcanti <cavalcantii@chromium.org>
      Reviewed-by: Chris Blume <cblume@chromium.org>
      Cr-Original-Commit-Position: refs/heads/master@{#505447}
      Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src
      Cr-Mirrored-Commit: 09b784fd12f255a9da38107ac6e0386f4dde6d68
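For reference, the serial computation that the SIMD paths are measured against can be sketched as follows. This is not the code in adler_simd.c and contains no SIMD: it is a plain C Adler-32 with a hypothetical name, `adler32_serial`, showing the NMAX blocking mentioned above, where the two sums are reduced modulo 65521 only once per block of at most 5552 bytes so the 32-bit accumulators cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u   /* largest prime smaller than 65536 */
#define ADLER_NMAX 5552u    /* most bytes that can be summed before reducing */

/* Serial Adler-32. Pass adler = 1 to start a new checksum. */
uint32_t adler32_serial(uint32_t adler, const unsigned char *buf, size_t len)
{
    uint32_t s1 = adler & 0xffffu;          /* running sum of bytes */
    uint32_t s2 = (adler >> 16) & 0xffffu;  /* running sum of s1 values */

    while (len > 0) {
        /* Process one block, deferring the expensive modulo to its end. */
        size_t n = len < ADLER_NMAX ? len : ADLER_NMAX;
        len -= n;
        while (n--) {
            s1 += *buf++;
            s2 += s1;
        }
        s1 %= ADLER_BASE;
        s2 %= ADLER_BASE;
    }
    return (s2 << 16) | s1;
}
```

A vectorized version keeps the same block structure but accumulates many bytes per step inside each block, which is why processing the full NMAX bytes per block matters for large buffers.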
    • zlib: inflate using wider loads and stores · 3060dcbd
      Adenilson Cavalcanti authored
      In inflate_fast() the output pointer always has plenty of room to write.
      This means that so long as the target is capable, wide un-aligned
      loads and stores can be used to transfer several bytes at once.
      
      When the reference distance is too short simply unroll the data a
      little to increase the distance. Patch by Simon Hosie.
      
      PNG decoding performance gains should be around 30-33%.
      
      This also includes the fix reported in madler/zlib#245.
      
      Bug: 697280
      Change-Id: I90a9866cc56aa766df5de472cd10c007f4b560d8
      Reviewed-on: https://chromium-review.googlesource.com/689961
      
      
      Reviewed-by: Chris Blume <cblume@chromium.org>
      Commit-Queue: Adenilson Cavalcanti <cavalcantii@chromium.org>
      Cr-Original-Commit-Position: refs/heads/master@{#505276}
      Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src
      Cr-Mirrored-Commit: 78104f4d73e3bbb4155fa804d00ed66682180556
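The short-distance case described above can be illustrated with the sketch below. It is one possible way to "unroll the data to increase the distance", written for clarity rather than speed, and is not Simon Hosie's patch: `match_copy` and `CHUNK` are hypothetical names, memcpy stands in for the wide loads and stores, and the tail is copied exactly so the example needs no slack in the output buffer.

```c
#include <stddef.h>
#include <string.h>

#define CHUNK 8   /* hypothetical width of one wide load/store, in bytes */

/* Copy len bytes from (out - dist) to out, as an LZ77 match copy does.
 * The source and destination overlap whenever dist < len. Returns the
 * new output position. */
unsigned char *match_copy(unsigned char *out, size_t dist, size_t len)
{
    /* Distance too short for a whole chunk: copy one period of the
     * pattern, which doubles the distance at which the same data repeats. */
    while (dist < CHUNK && len > dist) {
        memcpy(out, out - dist, dist);
        out += dist;
        len -= dist;
        dist *= 2;
    }
    /* With dist >= CHUNK, each chunk's source and destination are disjoint. */
    while (dist >= CHUNK && len >= CHUNK) {
        memcpy(out, out - dist, CHUNK);
        out += CHUNK;
        len -= CHUNK;
    }
    /* Remaining bytes: here len <= dist, so this copy cannot overlap. */
    memcpy(out, out - dist, len);
    return out + len;
}
```

For example, with "abc" already in the buffer, a match of distance 3 and length 10 yields "abcabcabcabca". Code that can rely on spare room in the output buffer could round the tail up to a whole chunk instead of copying it exactly.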
  18. Sep 28, 2017
  19. Sep 21, 2017
  20. Sep 19, 2017
  21. Sep 16, 2017
    • Revert "Reland "zlib: inflate using wider loads and stores"" · 567815a5
      Mike Klein authored
      This reverts commit 0397489124ce7e6aced020f8b85f5034c7d5f49b.
      
      Reason for revert: appears to break Cronet.
      
      Original change's description:
      > Reland "zlib: inflate using wider loads and stores"
      > 
      > This reverts commit e1f30a329eccf19ce1c8772e873abf88970cb65c.
      > 
      > Reason for revert: This patch was originally reverted because
      > we have an ARMv6 build target (part of Cronet) which
      > incorrectly set NEON support in our build files. As a result,
      > the NEON intrinsics were included despite ARMv6 not
      > supporting NEON.
      > 
      > The build problem was addressed here:
      > https://chromium-review.googlesource.com/c/chromium/src/+/639931
      > 
      > Now that ARMv6 builds do not set NEON support, these zlib
      > changes (which are only applied if NEON support is set in the
      > build) will not cause errors on ARMv6 targets.
      > 
      > BUG=697280
      > 
      > Original change's description:
      > > zlib: inflate using wider loads and stores
      > > 
      > > In inflate_fast() the output pointer always has plenty of room to write.  
      > > This means that so long as the target is capable, wide un-aligned 
      > > loads and stores can be used to transfer several bytes at once.
      > > 
      > > When the reference distance is too short simply unroll the data a 
      > > little to increase the distance. Patch by Simon Hosie.
      > > 
      > > PNG decoding performance gains should be around 30-33%.
      > > 
      > > BUG=697280
      > > 
      > > Change-Id: I59854eb25d2b1e43561c8a2afaf9175bf10cf674
      > > Reviewed-on: https://chromium-review.googlesource.com/627098
      
      
      > > Reviewed-by: Adenilson Cavalcanti <cavalcantii@chromium.org>
      > > Reviewed-by: Mike Klein <mtklein@chromium.org>
      > > Commit-Queue: Adenilson Cavalcanti <cavalcantii@chromium.org>
      > > Cr-Commit-Position: refs/heads/master@{#497866}
      > >
      > 
      > Change-Id: I0b4cd1a393464c960c6a1e48a022a20340781e75
      > Reviewed-on: https://chromium-review.googlesource.com/641575
      
      
      > Commit-Queue: Chris Blume <cblume@chromium.org>
      > Reviewed-by: Adenilson Cavalcanti <cavalcantii@chromium.org>
      > Reviewed-by: Leon Scroggins <scroggo@chromium.org>
      > Cr-Commit-Position: refs/heads/master@{#498580}
      
      TBR=scroggo@chromium.org,agl@chromium.org,cavalcantii@chromium.org,cblume@chromium.org,mtklein@chromium.org,adenilson.cavalcanti@arm.com
      
      # Not skipping CQ checks because original CL landed > 1 day ago.
      
      Bug: 697280
      Change-Id: I200e0e3b9cb9c884acfd6868067464a5f6cef804
      Reviewed-on: https://chromium-review.googlesource.com/669259
      
      
      Reviewed-by: Mike Klein <mtklein@chromium.org>
      Reviewed-by: Adenilson Cavalcanti <cavalcantii@chromium.org>
      Commit-Queue: Mike Klein <mtklein@chromium.org>
      Cr-Original-Commit-Position: refs/heads/master@{#502474}
      Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src
      Cr-Mirrored-Commit: 7a620575b0512b33708e723c260fe66b90b5b103
    • Revert "Zlib patch: prevent uninitialized use of state->check" · 4c5e9406
      Mike Klein authored
      This reverts commit f4b484415281f09d0bbc0880f5d41dbdde96c209.
      
      Reason for revert: need to revert previous CL, which this depends on (because it added contrib/arm/inflate.c).
      
      Original change's description:
      > Zlib patch: prevent uninitialized use of state->check
      > 
      > No need to call the Adler32 checksum function, just set the
      > struct field to the expected value.
      > 
      > Upstream bug: madler/zlib#245
      > 
      > Bug: chromium:697481
      > Change-Id: Ib972cc2507c8e7ca0b0b48464db33880ef960fb8
      > Reviewed-on: https://chromium-review.googlesource.com/644505
      
      
      > Commit-Queue: Adenilson Cavalcanti <cavalcantii@chromium.org>
      > Reviewed-by: Mike Klein <mtklein@chromium.org>
      > Cr-Commit-Position: refs/heads/master@{#498998}
      
      TBR=scroggo@chromium.org,agl@chromium.org,cavalcantii@chromium.org,npm@chromium.org,cblume@chromium.org,mtklein@chromium.org,adenilson.cavalcanti@arm.com
      
      # Not skipping CQ checks because original CL landed > 1 day ago.
      
      Bug: chromium:697481
      Change-Id: I12c6ca6867d1d7e97c9f372f2d592ed75d51f093
      Reviewed-on: https://chromium-review.googlesource.com/669480
      
      
      Reviewed-by: Mike Klein <mtklein@chromium.org>
      Commit-Queue: Mike Klein <mtklein@chromium.org>
      Cr-Original-Commit-Position: refs/heads/master@{#502449}
      Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src
      Cr-Mirrored-Commit: 4e96bed9f68c0d6ae106c1042a3041eafcf59b9a
  22. Sep 15, 2017
  23. Sep 07, 2017
  24. Sep 05, 2017
    • Add Adenilson to zlib owners · 223ce020
      Chris Blume authored
      Echoing Adenilson's words:
      
      "I was reading the documentation related to ownership of folders in Chromium
      (https://chromium.googlesource.com/chromium/src/+/master/docs/code_reviews.md)
      and found the section related to 'Expectation of owners'. I checked the
      requirements and feel I may have the right profile:
      
      a) Be already acting as an owner: I started working with zlib in past January
      (when Chromium was still using zlib 1.2.8), pointed to Matt Sarrett the Mozilla
      paper (https://wiki.mozilla.org/images/0/09/Zlib-report.pdf) suggesting to
      upgrade from zlib 1.2.8 to 1.2.11.
      
      Next, I contacted the PDFium team and helped them to migrate to zlib 1.2.11 and
      verify the feasibility of using Chromium's zlib (please see attached message).
      This removed YAZC (Yet Another Zlib Copy) from the code base.
      
      b) Be a Chromium project member with full commit access of at least 6 months: I
      started contributing to WebKit way back in 2011 (where I'm a committer) and
      became a Blink committer a few months after the fork in 2013.
      
      c) Have submitted a substantial number of non-trivial changes: ongoing work on
      ARM optimizations (inffast, Adler-32, CRC32).
      
      d) Have committed or reviewed work in the last 90 days: landed the first ARM
      specific optimization and helping to review/test the inffast64 patch.
      
      e) Have bandwidth to contribute to reviews in a timely manner: I'm used to
      check my email in the weekends and I'm always willing to go the extra mile.
      
      I understand this is a long term commitment, independent of my position at ARM
      (i.e. I'm an ARM employee).
      
      But as an example, last year I led the work to solve a tough platform
      predictability bug (https://bugs.chromium.org/p/chromium/issues/detail?id=559258)
      that required coordination with Mozilla (verification on FF) + Microsoft (i.e.
      reverting the quirky behavior in MS Edge) + Google (i.e. required fixes in mobile
      gmail and mobile gcalendar plus landing the fix in Chromium).
      
      The bulk of the work was done in my spare time while I was working for a startup
      that had nothing to do with Browsers or opensource.
      
      I shared this story because I hope it helps to demonstrate my sense of
      commitment to the Chromium project (and the web as whole)."
      
      Change-Id: Ie79fe8f31f58b181e27084c5bc5bde6e444fd8d3
      Reviewed-on: https://chromium-review.googlesource.com/650487
      
      
      Reviewed-by: Mike Klein <mtklein@chromium.org>
      Commit-Queue: Adenilson Cavalcanti <cavalcantii@chromium.org>
      Cr-Original-Commit-Position: refs/heads/master@{#499738}
      Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src
      Cr-Mirrored-Commit: fd3c2dfe28485d69364bf6bcd0e42f0c8a749cff
  25. Aug 31, 2017
  26. Aug 30, 2017
  27. Aug 28, 2017
  28. Aug 08, 2017
  29. Aug 04, 2017
    • Improve Zip File Scanning on Mac · b4298b0c
      mortonm authored
      This CL fixes two aspects of broken ZIP processing on Mac. First, it ensures
      that .app files are treated as directories and as such do not break binary
      feature extraction, causing analysis to fail. Second, it performs
      type-sniffing to identify the existence of executable MachO files that do not
      have any file extension, as is the usual case on Mac.
      
      BUG=600392
      
      Review-Url: https://codereview.chromium.org/2961373002
      Cr-Original-Commit-Position: refs/heads/master@{#492032}
      Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src
      Cr-Mirrored-Commit: 034ecb569929e1202f39cf744a32e8deeade06c8
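The type-sniffing step could be approximated by checking a file's first four bytes against the Mach-O and universal (fat) binary magic numbers, as in the sketch below. This is an assumption about how such a check might be written, not the code from this CL: `looks_like_macho` is a hypothetical name, and the constants are the values defined in Apple's `mach-o/loader.h` and `mach-o/fat.h`.

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if the buffer starts with a Mach-O or fat-binary magic number,
 * 0 otherwise. The first four bytes are read as a big-endian value, so both
 * byte orders of each magic number are listed. */
int looks_like_macho(const unsigned char *buf, size_t len)
{
    if (len < 4)
        return 0;

    uint32_t magic = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                     ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];

    switch (magic) {
    case 0xFEEDFACEu:   /* MH_MAGIC: 32-bit Mach-O */
    case 0xCEFAEDFEu:   /* MH_CIGAM: 32-bit, byte-swapped */
    case 0xFEEDFACFu:   /* MH_MAGIC_64: 64-bit Mach-O */
    case 0xCFFAEDFEu:   /* MH_CIGAM_64: 64-bit, byte-swapped */
    case 0xCAFEBABEu:   /* FAT_MAGIC: universal binary */
    case 0xBEBAFECAu:   /* FAT_CIGAM: universal binary, byte-swapped */
        return 1;
    default:
        return 0;
    }
}
```

Note that 0xCAFEBABE is also the magic number of Java class files, so a production check would likely need a further test to tell the two apart.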