  1. Nov 27, 2024
  2. Nov 06, 2024
    • objtool: Fix unreachable instruction warnings for weak functions · 18e88509
      Rong Xu authored
      
      In the presence of both weak and strong function definitions, the
      linker drops the weak symbol in favor of a strong symbol, but
      leaves the code in place. ignore_unreachable_insn() has some
      heuristics to suppress the warning, but they do not work when
      -ffunction-sections is enabled.
      
      Suppose function foo has both strong and weak definitions.
      Case 1: The strong definition has an annotated section name,
      like .init.text. Only the weak definition will be placed into
      .text.foo. But since the section has no symbols, there will be no
      "hole" in the section.
      
      Case 2: Both definitions are without an annotated section name.
      Both will be placed into the .text.foo section, but there will be only
      one symbol (the strong one). If the weak code is before the strong code,
      no "hole" is detected, because the lookup fails to find the right-most
      symbol before the offset.
      
      The fix is to use the first node to compute the hole if hole.sym
      is empty. If there is no symbol in the section, the first node
      will be NULL, in which case, -1 is returned to skip the whole
      section.
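
      The lookup described above can be modelled in a few lines of shell.
      This is only a sketch under stated assumptions, not the objtool code
      (which is C and walks a tree of symbols): hole_size and the
      "start:len" encoding of symbols are hypothetical names invented for
      this illustration, and the symbol list is assumed to be sorted and
      non-overlapping.

```shell
# Hypothetical model of the hole lookup after the fix.
# Usage: hole_size <offset> [<start>:<len> ...]
# Symbols must be sorted by start offset and must not overlap.
# Prints 0 if the offset lies inside a symbol (not a hole), the distance
# to the next symbol if the offset lies in a hole, or -1 if no symbol
# follows the offset (including the case of a section with no symbols).
hole_size() {
    offset=$1
    shift
    for sym in "$@"; do
        start=${sym%%:*}
        len=${sym##*:}
        end=$((start + len))
        if [ "$offset" -ge "$start" ] && [ "$offset" -lt "$end" ]; then
            printf '%s\n' 0
            return
        fi
        if [ "$start" -gt "$offset" ]; then
            # Either the symbol after the right-most preceding one, or,
            # when nothing precedes the offset, the first symbol.
            printf '%s\n' $((start - offset))
            return
        fi
    done
    printf '%s\n' -1
}
```

      In this model, weak code at offset 0 followed by a single strong
      symbol at offset 32 (Case 2) is reported as a hole of 32 bytes, and
      a section with no symbols (Case 1) yields -1.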
      
      Co-developed-by: Han Shen <shenhan@google.com>
      Signed-off-by: Han Shen <shenhan@google.com>
      Signed-off-by: Rong Xu <xur@google.com>
      Suggested-by: Sriraman Tallam <tmsriram@google.com>
      Suggested-by: Krzysztof Pszeniczny <kpszeniczny@google.com>
      Tested-by: Yonghong Song <yonghong.song@linux.dev>
      Tested-by: Yabin Cui <yabinc@google.com>
      Tested-by: Nathan Chancellor <nathan@kernel.org>
      Reviewed-by: Kees Cook <kees@kernel.org>
      Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    • kbuild: Add AutoFDO support for Clang build · 315ad878
      Rong Xu authored
      Add the build support for using Clang's AutoFDO. Building the kernel
      with AutoFDO does not reduce the optimization level from the
      compiler. AutoFDO uses hardware sampling to gather information about
      the frequency of execution of different code paths within a binary.
      This information is then used to guide the compiler's optimization
      decisions, resulting in a more efficient binary. Experiments
      showed latency improvements of up to 10%.
      
      The support requires a Clang compiler after LLVM 17. This submission
      is limited to x86 platforms that support PMU features like LBR on
      Intel machines and AMD Zen3 BRS. Support for SPE and BRBE on ARM is
      part of planned future work.
      
      Here is an example workflow for AutoFDO kernel:
      
      1) Build the kernel on the host machine with LLVM enabled, for example,
            $ make menuconfig LLVM=1
         Turn on AutoFDO build config:
            CONFIG_AUTOFDO_CLANG=y
         With a configuration that has LLVM enabled, use the following
         command:
            $ scripts/config -e AUTOFDO_CLANG
         After getting the config, build with
            $ make LLVM=1
      
      2) Install the kernel on the test machine.
      
      3) Run the load tests. The '-c' option in perf specifies the sample
         event period. We suggest using a suitable prime number,
         like 500009, for this purpose.
         For Intel platforms:
            $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c <count> \
              -o <perf_file> -- <loadtest>
         For AMD platforms:
            The supported systems are: Zen3 with BRS, or Zen4 with amd_lbr_v2.
            For Zen3:
            $ cat /proc/cpuinfo | grep " brs"
            For Zen4:
            $ cat /proc/cpuinfo | grep amd_lbr_v2
            On either system, record with:
            $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k -a \
              -N -b -c <count> -o <perf_file> -- <loadtest>
      
      4) (Optional) Download the raw perf file to the host machine.
      
      5) To generate an AutoFDO profile, two offline tools are available:
         create_llvm_prof and llvm-profgen. The create_llvm_prof tool is part
         of the AutoFDO project and can be found on GitHub
         (https://github.com/google/autofdo), version v0.30.1 or later. The
         llvm-profgen tool is included in the LLVM compiler itself. Note that
         the version of llvm-profgen doesn't need to match the version of
         Clang, but it needs to be the LLVM 19 release or later, or from the
         LLVM trunk.
            $ llvm-profgen --kernel --binary=<vmlinux> --perfdata=<perf_file> \
              -o <profile_file>
         or
            $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> \
              --format=extbinary --out=<profile_file>
      
         Note that multiple AutoFDO profile files can be merged into one via:
            $ llvm-profdata merge -o <profile_file>  <profile_1> ... <profile_n>
      
      6) Rebuild the kernel using the AutoFDO profile file with the same config
         as step 1 (note that CONFIG_AUTOFDO_CLANG must be enabled):
            $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<profile_file>
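
      The platform-dependent choice in step 3 can be wrapped in a small
      helper. The following is a minimal sketch, not part of this patch:
      the function names autofdo_perf_event and autofdo_perf_cmd are
      hypothetical, and detecting Intel through the "GenuineIntel" vendor
      string in cpuinfo is an assumption of this sketch. The event names
      and perf options are the ones listed in step 3. The second function
      only prints the command so it can be reviewed before it is run.

```shell
# Hypothetical helper: print the perf event option for this machine.
# Usage: autofdo_perf_event [<cpuinfo_file>]   (default: /proc/cpuinfo)
autofdo_perf_event() {
    cpuinfo=${1:-/proc/cpuinfo}
    if grep -q 'GenuineIntel' "$cpuinfo"; then
        # Assumption: Intel machines are identified by their vendor string.
        printf '%s\n' '-e BR_INST_RETIRED.NEAR_TAKEN:k'
    elif grep -q -e ' brs' -e 'amd_lbr_v2' "$cpuinfo"; then
        # AMD Zen3 with BRS, or Zen4 with amd_lbr_v2.
        printf '%s\n' '--pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k'
    else
        printf '%s\n' "no supported branch sampling feature in $cpuinfo" >&2
        return 1
    fi
}

# Hypothetical helper: print (do not run) the perf record command.
# Usage: autofdo_perf_cmd <cpuinfo_file> <count> <perf_file> <loadtest...>
autofdo_perf_cmd() {
    cpuinfo_file=$1
    count=$2
    perf_file=$3
    shift 3
    event=$(autofdo_perf_event "$cpuinfo_file") || return 1
    printf '%s\n' "perf record $event -a -N -b -c $count -o $perf_file -- $*"
}
```

      For example, "autofdo_perf_cmd /proc/cpuinfo 500009 perf.data
      ./loadtest" prints the command line for the current machine, where
      ./loadtest stands in for the real load test.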
      
      Co-developed-by: Han Shen <shenhan@google.com>
      Signed-off-by: Han Shen <shenhan@google.com>
      Signed-off-by: Rong Xu <xur@google.com>
      Suggested-by: Sriraman Tallam <tmsriram@google.com>
      Suggested-by: Krzysztof Pszeniczny <kpszeniczny@google.com>
      Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
      Suggested-by: Stephane Eranian <eranian@google.com>
      Tested-by: Yonghong Song <yonghong.song@linux.dev>
      Tested-by: Yabin Cui <yabinc@google.com>
      Tested-by: Nathan Chancellor <nathan@kernel.org>
      Reviewed-by: Kees Cook <kees@kernel.org>
      Tested-by: Peter Jung <ptr1337@cachyos.org>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    • kbuild: simplify rustfmt target · 397a479b
      Masahiro Yamada authored
      
      There is no need to prune the rust/alloc directory because it was
      removed by commit 9d0441ba ("rust: alloc: remove our fork of the
      `alloc` crate").
      
      There is no need to prune the rust/test directory because no '*.rs'
      files are generated within it.
      
      To avoid forking the 'grep -Fv generated' process, filter out generated
      files using the option, ! -name '*generated*'.
      
      Now that the '-path ... -prune' option is no longer used, there is no
      need to use the absolute path. Searching in $(srctree), which can be
      a relative path, is sufficient.
      
      The comment mentions the use case where $(srctree) is '..', that is,
      $(objtree) is a sub-directory of $(srctree). In this scenario, all
      '*.rs' files under $(objtree) are generated files and are filtered
      out by the '*generated*' pattern.
      
      Add $(RCS_FIND_IGNORE) as a shortcut. Although I do not believe '*.rs'
      files would exist under the .git directory, there is no need for the
      'find' command to traverse it.
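
      The filtering described above can be tried outside of kbuild with a
      plain 'find' invocation. This is a rough sketch rather than the
      Makefile rule itself: list_rustfmt_sources is a hypothetical name,
      and pruning only '.git' is a simplified stand-in for
      $(RCS_FIND_IGNORE).

```shell
# Hypothetical helper: list the '*.rs' files that would be formatted.
# Usage: list_rustfmt_sources <srctree>   (a relative path is fine)
list_rustfmt_sources() {
    # Prune .git so 'find' never descends into it, then keep regular
    # '*.rs' files whose names do not contain "generated".
    find "$1" -name .git -prune -o \
        -type f -name '*.rs' ! -name '*generated*' -print
}
```

      Because the name filter is part of the 'find' expression, no
      separate 'grep -Fv generated' process is needed, and the output can
      be piped directly to xargs.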
      
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nicolas Schier <n.schier@avm.de>
      Acked-by: Miguel Ojeda <ojeda@kernel.org>
  3. Nov 05, 2024