  1. Nov 06, 2024
    • kbuild: Add AutoFDO support for Clang build · 315ad878
      Rong Xu authored
      Add the build support for using Clang's AutoFDO. Building the kernel
      with AutoFDO does not reduce the optimization level from the
      compiler. AutoFDO uses hardware sampling to gather information about
      the frequency of execution of different code paths within a binary.
      This information is then used to guide the compiler's optimization
      decisions, resulting in a more efficient binary. Experiments
      showed that the kernel can improve up to 10% in latency.
      
      The support requires a Clang compiler from LLVM 17 or later. This
      submission is limited to x86 platforms that support PMU features like
      LBR on Intel machines and AMD Zen3 BRS. Support for SPE and BRBE on
      ARM is part of planned future work.
      
      Here is an example workflow for AutoFDO kernel:
      
      1) Build the kernel on the host machine with LLVM enabled, for example,
             $ make menuconfig LLVM=1
          Turn on AutoFDO build config:
            CONFIG_AUTOFDO_CLANG=y
          With a configuration that has LLVM enabled, use the following
          command:
             scripts/config -e AUTOFDO_CLANG
          After getting the config, build with
            $ make LLVM=1
      
      2) Install the kernel on the test machine.
      
      3) Run the load tests. The '-c' option in perf specifies the sample
         event period. We suggest using a suitable prime number,
         like 500009, for this purpose.
         For Intel platforms:
            $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c <count> \
              -o <perf_file> -- <loadtest>
         For AMD platforms:
            The supported systems are Zen3 with BRS, or Zen4 with amd_lbr_v2.
            For Zen3:
            $ cat /proc/cpuinfo | grep " brs"
            For Zen4:
            $ cat /proc/cpuinfo | grep amd_lbr_v2
            $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k -a \
              -N -b -c <count> -o <perf_file> -- <loadtest>
      
      4) (Optional) Download the raw perf file to the host machine.
      
      5) To generate an AutoFDO profile, two offline tools are available:
         create_llvm_prof and llvm-profgen. The create_llvm_prof tool is part
         of the AutoFDO project and can be found on GitHub
         (https://github.com/google/autofdo), version v0.30.1 or later. The
         llvm-profgen tool is included in the LLVM compiler itself. Note that
         the version of llvm-profgen doesn't need to match the version of
         Clang, but it needs to be from the LLVM 19 release or later, or
         from the LLVM trunk.
            $ llvm-profgen --kernel --binary=<vmlinux> --perfdata=<perf_file> \
              -o <profile_file>
         or
            $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> \
              --format=extbinary --out=<profile_file>
      
         Note that multiple AutoFDO profile files can be merged into one via:
            $ llvm-profdata merge -o <profile_file>  <profile_1> ... <profile_n>
      
      6) Rebuild the kernel using the AutoFDO profile file with the same config
         as step 1 (note that CONFIG_AUTOFDO_CLANG needs to be enabled):
            $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<profile_file>
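
The AMD feature check from step 3 and the offline steps 5 and 6 could be scripted roughly as follows. This is a minimal sketch and not part of the commit: the function names `amd_branch_feature` and `autofdo_plan`, and the `<profile>.<n>` naming of intermediate profiles, are hypothetical. The tool invocations are copied from the commands above, and the plan is only printed, never executed.

```shell
#!/bin/sh
# Hypothetical helper (not from the commit): report which AMD branch-sampling
# feature a cpuinfo file advertises. Defaults to /proc/cpuinfo; a path can be
# passed in so the check can be tried on a saved copy.
amd_branch_feature() {
    cpuinfo=${1:-/proc/cpuinfo}
    if grep -qw 'amd_lbr_v2' "$cpuinfo"; then
        echo amd_lbr_v2     # Zen4
    elif grep -qw 'brs' "$cpuinfo"; then
        echo brs            # Zen3 with BRS
    else
        echo none
    fi
}

# Hypothetical helper (not from the commit): print the commands for steps 5
# and 6 instead of running them, so the sequence can be reviewed first.
# Usage: autofdo_plan <vmlinux> <profile_file> <perf_file>...
autofdo_plan() {
    vmlinux=$1
    profile=$2
    shift 2
    n=0
    parts=
    for perf_file in "$@"; do
        n=$((n + 1))
        # One intermediate profile per raw perf file (the naming is an assumption).
        echo "llvm-profgen --kernel --binary=$vmlinux --perfdata=$perf_file -o $profile.$n"
        parts="$parts $profile.$n"
    done
    # Merge the intermediate profiles, then rebuild with the merged profile.
    echo "llvm-profdata merge -o $profile$parts"
    echo "make LLVM=1 CLANG_AUTOFDO_PROFILE=$profile"
}
```

For example, `autofdo_plan vmlinux kernel.afdo a.data b.data` prints two llvm-profgen commands, one merge command, and the final make command. Printing rather than executing keeps the sketch safe to run on a machine without the LLVM tools installed.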
      
      Co-developed-by: Han Shen <shenhan@google.com>
      Signed-off-by: Han Shen <shenhan@google.com>
      Signed-off-by: Rong Xu <xur@google.com>
      Suggested-by: Sriraman Tallam <tmsriram@google.com>
      Suggested-by: Krzysztof Pszeniczny <kpszeniczny@google.com>
      Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
      Suggested-by: Stephane Eranian <eranian@google.com>
      Tested-by: Yonghong Song <yonghong.song@linux.dev>
      Tested-by: Yabin Cui <yabinc@google.com>
      Tested-by: Nathan Chancellor <nathan@kernel.org>
      Reviewed-by: Kees Cook <kees@kernel.org>
      Tested-by: Peter Jung <ptr1337@cachyos.org>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    • kbuild: simplify rustfmt target · 397a479b
      Masahiro Yamada authored
      
      There is no need to prune the rust/alloc directory because it was
      removed by commit 9d0441ba ("rust: alloc: remove our fork of the
      `alloc` crate").
      
      There is no need to prune the rust/test directory because no '*.rs'
      files are generated within it.
      
      To avoid forking the 'grep -Fv generated' process, filter out generated
      files using the option, ! -name '*generated*'.
      
      Now that the '-path ... -prune' option is no longer used, there is no
      need to use the absolute path. Searching in $(srctree), which can be
      a relative path, is sufficient.
      
      The comment mentions the use case where $(srctree) is '..', that is,
      $(objtree) is a sub-directory of $(srctree). In this scenario, all
      '*.rs' files under $(objtree) are generated files and are filtered out by
      the '*generated*' pattern.
      
      Add $(RCS_FIND_IGNORE) as a shortcut. Although I do not believe '*.rs'
      files would exist under the .git directory, there is no need for the
      'find' command to traverse it.
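
The filtering described above can be illustrated with a standalone sketch. It is not the Makefile rule from this commit: `list_rust_sources` is a hypothetical name, and `-name .git -prune -o` is a simplified stand-in for $(RCS_FIND_IGNORE), which covers more version-control directories.

```shell
#!/bin/sh
# Hypothetical helper (not the kernel's rustfmt rule): list the '*.rs' files
# under a directory that a formatter should touch.
#   - '-name .git -prune -o' keeps find from descending into .git
#     (simplified stand-in for $(RCS_FIND_IGNORE)).
#   - "! -name '*generated*'" drops generated files inside find itself,
#     so no separate 'grep -Fv generated' process is needed.
list_rust_sources() {
    ( cd "$1" && find . -name .git -prune -o \
        -type f -name '*.rs' ! -name '*generated*' -print | sort )
}
```

One difference worth noting: `grep -Fv generated` matches anywhere in the path, while `! -name '*generated*'` only looks at the file name. Running the helper on a relative path works because nothing in the expression depends on an absolute `-path` pattern.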
      
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nicolas Schier <n.schier@avm.de>
      Acked-by: Miguel Ojeda <ojeda@kernel.org>
  2. Nov 05, 2024
  3. Nov 04, 2024
  4. Nov 03, 2024
    • Merge tag 'mm-hotfixes-stable-2024-11-03-10-50' of... · a8cc7432
      Linus Torvalds authored
      Merge tag 'mm-hotfixes-stable-2024-11-03-10-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
      
      Pull misc fixes from Andrew Morton:
       "17 hotfixes.  9 are cc:stable.  13 are MM and 4 are non-MM.
      
        The usual collection of singletons - please see the changelogs"
      
      * tag 'mm-hotfixes-stable-2024-11-03-10-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        mm: multi-gen LRU: use {ptep,pmdp}_clear_young_notify()
        mm: multi-gen LRU: remove MM_LEAF_OLD and MM_NONLEAF_TOTAL stats
        mm, mmap: limit THP alignment of anonymous mappings to PMD-aligned sizes
        mm: shrinker: avoid memleak in alloc_shrinker_info
        .mailmap: update e-mail address for Eugen Hristev
        vmscan,migrate: fix page count imbalance on node stats when demoting pages
        mailmap: update Jarkko's email addresses
        mm: allow set/clear page_type again
        nilfs2: fix potential deadlock with newly created symlinks
        Squashfs: fix variable overflow in squashfs_readpage_block
        kasan: remove vmalloc_percpu test
        tools/mm: -Werror fixes in page-types/slabinfo
        mm, swap: avoid over reclaim of full clusters
        mm: fix PSWPIN counter for large folios swap-in
        mm: avoid VM_BUG_ON when try to map an anon large folio to zero page.
        mm/codetag: fix null pointer check logic for ref and tag
        mm/gup: stop leaking pinned pages in low memory conditions
    • Merge tag 'phy-fixes-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy · d5aaa0bc
      Linus Torvalds authored
      Pull phy fixes from Vinod Koul:
      
       - Qualcomm QMP driver fixes for null deref on suspend, bogus supplies
         fix and reset entries fix
      
       - BCM usb driver init array fix
      
       - cadence array offset fix
      
       - starfive link configuration fix
      
       - config dependency fix for rockchip driver
      
       - freescale reset signal fix before pll lock
      
       - tegra driver fix for error pointer check
      
      * tag 'phy-fixes-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
        phy: tegra: xusb: Add error pointer check in xusb.c
        dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Fix X1E80100 resets entries
        phy: freescale: imx8m-pcie: Do CMN_RST just before PHY PLL lock check
        phy: phy-rockchip-samsung-hdptx: Depend on CONFIG_COMMON_CLK
        phy: ti: phy-j721e-wiz: fix usxgmii configuration
        phy: starfive: jh7110-usb: Fix link configuration to controller
        phy: qcom: qmp-pcie: drop bogus x1e80100 qref supplies
        phy: qcom: qmp-combo: move driver data initialisation earlier
        phy: qcom: qmp-usbc: fix NULL-deref on runtime suspend
        phy: qcom: qmp-usb-legacy: fix NULL-deref on runtime suspend
        phy: qcom: qmp-usb: fix NULL-deref on runtime suspend
        dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: add missing x1e80100 pipediv2 clocks
        phy: usb: disable COMMONONN for dual mode
        phy: cadence: Sierra: Fix offset of DEQ open eye algorithm control register
        phy: usb: Fix missing elements in BCM4908 USB init array
    • Merge tag 'dmaengine-fix-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine · e8529dcb
      Linus Torvalds authored
      Pull dmaengine fixes from Vinod Koul:
      
       - TI driver fix to set EOP for cyclic BCDMA transfers
      
       - sh rz-dmac driver fix for handling config with zero address
      
      * tag 'dmaengine-fix-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
        dmaengine: ti: k3-udma: Set EOP for all TRs in cyclic BCDMA transfer
        dmaengine: sh: rz-dmac: handle configs where one address is zero
    • Merge tag 'driver-core-6.12-rc6' of... · 886b7e80
      Linus Torvalds authored
      Merge tag 'driver-core-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core revert from Greg KH:
       "Here is a single driver core revert for 6.12-rc6. It reverts a change
        that came in -rc1 that was supposed to resolve a reported problem, but
        caused another one, so revert it for now so that we can get this all
        worked out properly in 6.13.
      
        The revert has been in linux-next all week with no reported issues"
      
      * tag 'driver-core-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        Revert "driver core: Fix uevent_show() vs driver detach race"
    • Merge tag 'usb-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · be5bfa13
      Linus Torvalds authored
      Pull USB / Thunderbolt fixes from Greg KH:
       "Here are some small USB and Thunderbolt driver fixes for 6.12-rc6 that
        have been sitting in my tree this week. Included in here are the
        following:
      
         - thunderbolt driver fixes for reported issues
      
         - USB typec driver fixes
      
         - xhci driver fixes for reported problems
      
         - dwc2 driver revert for a broken change
      
         - usb phy driver fix
      
         - usbip tool fix
      
        All of these have been in linux-next this week with no reported
        issues"
      
      * tag 'usb-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: typec: tcpm: restrict SNK_WAIT_CAPABILITIES_TIMEOUT transitions to non self-powered devices
        usb: phy: Fix API devm_usb_put_phy() can not release the phy
        usb: typec: use cleanup facility for 'altmodes_node'
        usb: typec: fix unreleased fwnode_handle in typec_port_register_altmodes()
        usb: typec: qcom-pmic-typec: fix missing fwnode removal in error path
        usb: typec: qcom-pmic-typec: use fwnode_handle_put() to release fwnodes
        usb: acpi: fix boot hang due to early incorrect 'tunneled' USB3 device links
        Revert "usb: dwc2: Skip clock gating on Broadcom SoCs"
        xhci: Fix Link TRB DMA in command ring stopped completion event
        xhci: Use pm_runtime_get to prevent RPM on unsupported systems
        usbip: tools: Fix detach_port() invalid port error path
        thunderbolt: Honor TMU requirements in the domain when setting TMU mode
        thunderbolt: Fix KASAN reported stack out-of-bounds read in tb_retimer_scan()