Skip to content
Snippets Groups Projects
  1. Oct 09, 2020
    • Masahiro Yamada's avatar
      kbuild: split the build log of kallsyms · 08beb669
      Masahiro Yamada authored
      
      Currently, the build log shows KSYM + object name.
      
      Precisely speaking, kallsyms generates a .S file and then the compiler
      compiles it into a .o file. Split the build log into two.
      
      [Before]
      
        GEN     modules.builtin
        LD      .tmp_vmlinux.kallsyms1
        KSYM    .tmp_vmlinux.kallsyms1.o
        LD      .tmp_vmlinux.kallsyms2
        KSYM    .tmp_vmlinux.kallsyms2.o
        LD      vmlinux
      
      [After]
      
        GEN     modules.builtin
        LD      .tmp_vmlinux.kallsyms1
        KSYMS   .tmp_vmlinux.kallsyms1.S
        AS      .tmp_vmlinux.kallsyms1.o
        LD      .tmp_vmlinux.kallsyms2
        KSYMS   .tmp_vmlinux.kallsyms2.S
        AS      .tmp_vmlinux.kallsyms2.o
        LD      vmlinux
      
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      08beb669
  2. Sep 23, 2020
  3. Aug 09, 2020
    • Masahiro Yamada's avatar
      kbuild: do not export LDFLAGS_vmlinux · 3ec8a5b3
      Masahiro Yamada authored
      
      When you clean the build tree for ARCH=arm, you may see the following
      error message from 'nm' command:
      
      $ make -j24 ARCH=arm clean
        CLEAN   arch/arm/crypto
        CLEAN   arch/arm/kernel
        CLEAN   arch/arm/mach-at91
        CLEAN   arch/arm/mach-omap2
        CLEAN   arch/arm/vdso
        CLEAN   certs
        CLEAN   lib
        CLEAN   usr
        CLEAN   net/wireless
        CLEAN   drivers/firmware/efi/libstub
      nm: 'arch/arm/boot/compressed/../../../../vmlinux': No such file
      /bin/sh: 1: arithmetic expression: expecting primary: " "
        CLEAN   arch/arm/boot/compressed
        CLEAN   drivers/scsi
        CLEAN   drivers/tty/vt
        CLEAN   arch/arm/boot
        CLEAN   vmlinux.symvers modules.builtin modules.builtin.modinfo
      
      Even if you rerun the same command, the error message will not be
      shown despite vmlinux is already gone.
      
      To reproduce it, the parallel option -j is needed. Single thread
      cleaning always executes 'archclean', 'vmlinuxclean' in this order,
      so vmlinux still exists when arch/arm/boot/compressed/ is cleaned.
      
      Looking at arch/arm/boot/compressed/Makefile does not help understand
      the reason of the error message. Both KBSS_SZ and LDFLAGS_vmlinux are
      assigned with '=' operator, hence, they are not expanded unless used.
      Obviously, 'make clean' does not use them.
      
      In fact, the root cause exists in the top Makefile:
      
        export LDFLAGS_vmlinux
      
      Since LDFLAGS_vmlinux is an exported variable, LDFLAGS_vmlinux in
      arch/arm/boot/compressed/Makefile is expanded when scripts/Makefile.clean
      has a command to execute. This is why the error message shows up only
      when there exist build artifacts in arch/arm/boot/compressed/.
      
      Adding 'unexport LDFLAGS_vmlinux' to arch/arm/boot/compressed/Makefile
      will fix it as far as ARCH=arm is concerned, but I think the proper fix
      is to get rid of 'export LDFLAGS_vmlinux' from the top Makefile.
      
      LDFLAGS_vmlinux in the top Makefile contains linker flags for the top
      vmlinux. LDFLAGS_vmlinux in arch/arm/boot/compressed/Makefile is for
      arch/arm/boot/compressed/vmlinux. They just happen to have the same
      variable name, but are used for different purposes. Stop shadowing
      LDFLAGS_vmlinux.
      
      This commit passes LDFLAGS_vmlinux to scripts/link-vmlinux.sh via a
      command line parameter instead of via an environment variable. LD and
      KBUILD_LDFLAGS are exported, but I did the same for consistency. Anyway,
      they must be included in cmd_link-vmlinux to allow if_changed to detect
      the changes in LD or KBUILD_LDFLAGS.
      
      The following Makefiles are not affected:
      
        arch/arm/boot/compressed/Makefile
        arch/h8300/boot/compressed/Makefile
        arch/nios2/boot/compressed/Makefile
        arch/parisc/boot/compressed/Makefile
        arch/s390/boot/compressed/Makefile
        arch/sh/boot/compressed/Makefile
        arch/sh/boot/romimage/Makefile
        arch/x86/boot/compressed/Makefile
      
      They use ':=' or '=' to clear the LDFLAGS_vmlinux inherited from the
      top Makefile.
      
      We need to take a closer look at the impact to unicore32 and xtensa.
      
      arch/unicore32/boot/compressed/Makefile only uses '+=' operator for
      LDFLAGS_vmlinux. So, the decompressor previously inherited the linker
      flags from the top Makefile.
      
      However, commit 70fac51f ("unicore32 additional architecture files:
      boot process") was merged before commit 1f2bfbd0 ("kbuild: link of
      vmlinux moved to a script"). So, I rather consider this is a bug fix of
      1f2bfbd0.
      
      arch/xtensa/boot/boot-elf/Makefile is also affected, but this is also
      considered a fix for the same reason. It did not inherit LDFLAGS_vmlinux
      when commit 4bedea94 ("[PATCH] xtensa: Architecture support for
      Tensilica Xtensa Part 2") was merged. I deleted $(LDFLAGS_vmlinux),
      which is now empty.
      
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      Tested-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      3ec8a5b3
  4. Jul 13, 2020
  5. Jun 09, 2020
  6. Jun 06, 2020
    • Masahiro Yamada's avatar
      modpost: generate vmlinux.symvers and reuse it for the second modpost · 269a535c
      Masahiro Yamada authored
      
      The full build runs modpost twice, first for vmlinux.o and second for
      modules.
      
      The first pass dumps all the vmlinux symbols into Module.symvers, but
      the second pass parses vmlinux again instead of reusing the dump file,
      presumably because it needs to avoid accumulating stale symbols.
      
      Loading symbol info from a dump file is faster than parsing an ELF object.
      Besides, modpost deals with various issues to parse vmlinux in the second
      pass.
      
      A solution is to make the first pass dumps symbols into a separate file,
      vmlinux.symvers. The second pass reads it, and parses module .o files.
      The merged symbol information is dumped into Module.symvers in the same
      way as before.
      
      This makes further modpost cleanups possible.
      
      Also, it fixes the problem of 'make vmlinux', which previously overwrote
      Module.symvers, throwing away module symbols.
      
      I slightly touched scripts/link-vmlinux.sh so that vmlinux is re-linked
      when you cross this commit. Otherwise, vmlinux.symvers would not be
      generated.
      
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      269a535c
  7. Apr 22, 2020
  8. Mar 19, 2020
  9. Mar 04, 2020
    • Kees Cook's avatar
      kbuild: Remove debug info from kallsyms linking · af73d78b
      Kees Cook authored
      
      When CONFIG_DEBUG_INFO is enabled, the two kallsyms linking steps spend
      time collecting and writing the dwarf sections to the temporary output
      files. kallsyms does not need this information, and leaving it off
      halves their linking time. This is especially noticeable without
      CONFIG_DEBUG_INFO_REDUCED. The BTF linking stage, however, does still
      need those details.
      
      Refactor the BTF and kallsyms generation stages slightly for more
      regularized temporary names. Skip debug during kallsyms links.
      Additionally move "info BTF" to the correct place since commit
      8959e392 ("kbuild: Parameterize kallsyms generation and correct
      reporting"), which added "info LD ..." to vmlinux_link calls.
      
      For a full debug info build with BTF, my link time goes from 1m06s to
      0m54s, saving about 12 seconds, or 18%.
      
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/202003031814.4AEA3351@keescook
      af73d78b
  10. Feb 10, 2020
    • Masahiro Yamada's avatar
      kbuild: fix mismatch between .version and include/generated/compile.h · 083bc0e1
      Masahiro Yamada authored
      
      Since commit 56d58936 ("kbuild: do not create orphan built-in.a or
      obj-y objects"), scripts/link-vmlinux.sh does nothing when descending
      into init/.
      
      Once the version number becomes out of sync between .version and
      include/generated/compile.h, it is not self-healing.
      
      [How to reproduce]
      
       $ echo 100 > .version
       $ make
      
      You will see the number in the .version is always bigger than that in
      compile.h by one. After this, every time you run 'make', the vmlinux is
      re-linked even when none of source files is updated.
      
      Fixes: 56d58936 ("kbuild: do not create orphan built-in.a or obj-y objects")
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      083bc0e1
  11. Jan 22, 2020
  12. Jan 06, 2020
    • Masahiro Yamada's avatar
      kbuild: create modules.builtin without Makefile.modbuiltin or tristate.conf · 8b41fc44
      Masahiro Yamada authored
      
      Commit bc081dd6 ("kbuild: generate modules.builtin") added
      infrastructure to generate modules.builtin, the list of all
      builtin modules.
      
      Basically, it works like this:
      
        - Kconfig generates include/config/tristate.conf, the list of
          tristate CONFIG options with a value in a capital letter.
      
        - scripts/Makefile.modbuiltin makes Kbuild descend into
          directories to collect the information of builtin modules.
      
      I am not a big fan of it because Kbuild ends up with traversing
      the source tree twice.
      
      I am not sure how perfectly it should work, but this approach cannot
      avoid false positives; even if the relevant CONFIG option is tristate,
      some Makefiles forces obj-m to obj-y.
      
      Some examples are:
      
        arch/powerpc/platforms/powermac/Makefile:
          obj-$(CONFIG_NVRAM:m=y)         += nvram.o
      
        net/ipv6/Makefile:
          obj-$(subst m,y,$(CONFIG_IPV6)) += inet6_hashtables.o
      
        net/netlabel/Makefile:
          obj-$(subst m,y,$(CONFIG_IPV6)) += netlabel_calipso.o
      
      Nobody has complained about (or noticed) it, so it is probably fine to
      have false positives in modules.builtin.
      
      This commit simplifies the implementation. Let's exploit the fact
      that every module has MODULE_LICENSE(). (modpost shows a warning if
      MODULE_LICENSE is missing. If so, 0-day bot would already have blocked
      such a module.)
      
      I added MODULE_FILE to <linux/module.h>. When the code is being compiled
      as builtin, it will be filled with the file path of the module, and
      collected into modules.builtin.info. Then, scripts/link-vmlinux.sh
      extracts the list of builtin modules out of it.
      
      This new approach fixes the false-positives above, but adds another
      type of false-positives; non-modular code may have MODULE_LICENSE()
      by mistake. This is not a big deal, it is just the code is always
      orphan. We can clean it up if we like. You can see cleanup examples by:
      
        $ git log --grep='make.* explicitly non-modular'
      
      To sum up, this commits deletes lots of code, but still produces almost
      equivalent results. Please note it does not increase the vmlinux size at
      all. As you can see in include/asm-generic/vmlinux.lds.h, the .modinfo
      section is discarded in the link stage.
      
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      8b41fc44
  13. Dec 13, 2019
  14. Nov 29, 2019
    • Stanislav Fomichev's avatar
      bpf: Force .BTF section start to zero when dumping from vmlinux · df786c9b
      Stanislav Fomichev authored
      
      While trying to figure out why fentry_fexit selftest doesn't pass for me
      (old pahole, broken BTF), I found out that my latest patch can break vmlinux
      .BTF generation. objcopy preserves section start when doing --only-section,
      so there is a chance (depending on where pahole inserts .BTF section) to
      have leading empty zeroes. Let's explicitly force section offset to zero.
      
      Before:
      
      $ objcopy --set-section-flags .BTF=alloc -O binary \
      	--only-section=.BTF vmlinux .btf.vmlinux.bin
      $ xxd .btf.vmlinux.bin | head -n1
      00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      
      After:
      
      $ objcopy --change-section-address .BTF=0 \
      	--set-section-flags .BTF=alloc -O binary \
      	--only-section=.BTF vmlinux .btf.vmlinux.bin
      $ xxd .btf.vmlinux.bin | head -n1
      00000000: 9feb 0100 1800 0000 0000 0000 80e1 1c00  ................
                ^BTF magic
      
      As part of this change, I'm also dropping '2>/dev/null' from objcopy
      invocation to be able to catch possible other issues (objcopy doesn't
      produce any warnings for me anymore, it did before with --dump-section).
      
      Fixes: da5fb182 ("bpf: Support pre-2.25-binutils objcopy for vmlinux BTF")
      Signed-off-by: default avatarStanislav Fomichev <sdf@google.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20191127225759.39923-1-sdf@google.com
      df786c9b
  15. Nov 27, 2019
  16. Sep 06, 2019
  17. Aug 21, 2019
  18. Aug 13, 2019
    • Andrii Nakryiko's avatar
      btf: rename /sys/kernel/btf/kernel into /sys/kernel/btf/vmlinux · 7fd78568
      Andrii Nakryiko authored
      
      Expose kernel's BTF under the name vmlinux to be more uniform with using
      kernel module names as file names in the future.
      
      Fixes: 341dfcf8 ("btf: expose BTF info through sysfs")
      Suggested-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      7fd78568
    • Andrii Nakryiko's avatar
      btf: expose BTF info through sysfs · 341dfcf8
      Andrii Nakryiko authored
      
      Make .BTF section allocated and expose its contents through sysfs.
      
      /sys/kernel/btf directory is created to contain all the BTFs present
      inside kernel. Currently there is only kernel's main BTF, represented as
      /sys/kernel/btf/kernel file. Once kernel modules' BTFs are supported,
      each module will expose its BTF as /sys/kernel/btf/<module-name> file.
      
      Current approach relies on a few pieces coming together:
      1. pahole is used to take almost final vmlinux image (modulo .BTF and
         kallsyms) and generate .BTF section by converting DWARF info into
         BTF. This section is not allocated and not mapped to any segment,
         though, so is not yet accessible from inside kernel at runtime.
      2. objcopy dumps .BTF contents into binary file and subsequently
         convert binary file into linkable object file with automatically
         generated symbols _binary__btf_kernel_bin_start and
         _binary__btf_kernel_bin_end, pointing to start and end, respectively,
         of BTF raw data.
      3. final vmlinux image is generated by linking this object file (and
         kallsyms, if necessary). sysfs_btf.c then creates
         /sys/kernel/btf/kernel file and exposes embedded BTF contents through
         it. This allows, e.g., libbpf and bpftool access BTF info at
         well-known location, without resorting to searching for vmlinux image
         on disk (location of which is not standardized and vmlinux image
         might not be even available in some scenarios, e.g., inside qemu
         during testing).
      
      Alternative approach using .incbin assembler directive to embed BTF
      contents directly was attempted but didn't work, because sysfs_proc.o is
      not re-compiled during link-vmlinux.sh stage. This is required, though,
      to update embedded BTF data (initially empty data is embedded, then
      pahole generates BTF info and we need to regenerate sysfs_btf.o with
      updated contents, but it's too late at that point).
      
      If BTF couldn't be generated due to missing or too old pahole,
      sysfs_btf.c handles that gracefully by detecting that
      _binary__btf_kernel_bin_start (weak symbol) is 0 and not creating
      /sys/kernel/btf at all.
      
      v2->v3:
      - added Documentation/ABI/testing/sysfs-kernel-btf (Greg K-H);
      - created proper kobject (btf_kobj) for btf directory (Greg K-H);
      - undo v2 change of reusing vmlinux, as it causes extra kallsyms pass
        due to initially missing  __binary__btf_kernel_bin_{start/end} symbols;
      
      v1->v2:
      - allow kallsyms stage to re-use vmlinux generated by gen_btf();
      
      Reviewed-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      341dfcf8
  19. Jul 31, 2019
    • Masahiro Yamada's avatar
      kbuild: modpost: do not parse unnecessary rules for vmlinux modpost · a721588d
      Masahiro Yamada authored
      
      Since commit ff9b45c5 ("kbuild: modpost: read modules.order instead
      of $(MODVERDIR)/*.mod"), 'make vmlinux' emits a warning, like this:
      
      $ make defconfig vmlinux
        [ snip ]
        LD      vmlinux.o
      cat: modules.order: No such file or directory
        MODPOST vmlinux.o
        MODINFO modules.builtin.modinfo
        KSYM    .tmp_kallsyms1.o
        KSYM    .tmp_kallsyms2.o
        LD      vmlinux
        SORTEX  vmlinux
        SYSMAP  System.map
      
      When building only vmlinux, KBUILD_MODULES is not set. Hence, the
      modules.order is not generated. For the vmlinux modpost, it is not
      necessary at all.
      
      Separate scripts/Makefile.modpost for the vmlinux/modules stages.
      This works more efficiently because the vmlinux modpost does not
      need to include .*.cmd files.
      
      Fixes: ff9b45c5 ("kbuild: modpost: read modules.order instead of $(MODVERDIR)/*.mod")
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      a721588d
  20. May 07, 2019
    • Alexey Gladkov's avatar
      moduleparam: Save information about built-in modules in separate file · 898490c0
      Alexey Gladkov authored
      
      Problem:
      
      When a kernel module is compiled as a separate module, some important
      information about the kernel module is available via .modinfo section of
      the module.  In contrast, when the kernel module is compiled into the
      kernel, that information is not available.
      
      Information about built-in modules is necessary in the following cases:
      
      1. When it is necessary to find out what additional parameters can be
      passed to the kernel at boot time.
      
      2. When you need to know which module names and their aliases are in
      the kernel. This is very useful for creating an initrd image.
      
      Proposal:
      
      The proposed patch does not remove .modinfo section with module
      information from the vmlinux at the build time and saves it into a
      separate file after kernel linking. So, the kernel does not increase in
      size and no additional information remains in it. Information is stored
      in the same format as in the separate modules (null-terminated string
      array). Because the .modinfo section is already exported with a separate
      modules, we are not creating a new API.
      
      It can be easily read in the userspace:
      
      $ tr '\0' '\n' < modules.builtin.modinfo
      ext4.softdep=pre: crc32c
      ext4.license=GPL
      ext4.description=Fourth Extended Filesystem
      ext4.author=Remy Card, Stephen Tweedie, Andrew Morton, Andreas Dilger, Theodore Ts'o and others
      ext4.alias=fs-ext4
      ext4.alias=ext3
      ext4.alias=fs-ext3
      ext4.alias=ext2
      ext4.alias=fs-ext2
      md_mod.alias=block-major-9-*
      md_mod.alias=md
      md_mod.description=MD RAID framework
      md_mod.license=GPL
      md_mod.parmtype=create_on_open:bool
      md_mod.parmtype=start_dirty_degraded:int
      ...
      
      Co-Developed-by: default avatarGleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
      Signed-off-by: default avatarGleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
      Signed-off-by: default avatarAlexey Gladkov <gladkov.alexey@gmail.com>
      Acked-by: default avatarJessica Yu <jeyu@kernel.org>
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      898490c0
  21. May 06, 2019
  22. Apr 16, 2019
    • Andrii Nakryiko's avatar
      kbuild: handle old pahole more gracefully when generating BTF · 68e5ab1f
      Andrii Nakryiko authored
      
      When CONFIG_DEBUG_INFO_BTF is enabled but available version of pahole is too
      old to support BTF generation, build script is supposed to emit warning and
      proceed with the build. Due to using exit instead of return from BASH function,
      existing handling code prematurely exits exit code 0, not completing some of
      the build steps. This patch fixes issue by correctly returning just from
      gen_btf() function only.
      
      Fixes: e83b9f55 ("kbuild: add ability to generate BTF type info for vmlinux")
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Acked-by: default avatarSong Liu <songliubraving@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      68e5ab1f
  23. Apr 02, 2019
    • Andrii Nakryiko's avatar
      kbuild: add ability to generate BTF type info for vmlinux · e83b9f55
      Andrii Nakryiko authored
      This patch adds new config option to trigger generation of BTF type
      information from DWARF debuginfo for vmlinux and kernel modules through
      pahole, which in turn relies on libbpf for btf_dedup() algorithm.
      
      The intent is to record compact type information of all types used
      inside kernel, including all the structs/unions/typedefs/etc. This
      enables BPF's compile-once-run-everywhere ([0]) approach, in which
      tracing programs that are inspecting kernel's internal data (e.g.,
      struct task_struct) can be compiled on a system running some kernel
      version, but would be possible to run on other kernel versions (and
      configurations) without recompilation, even if the layout of structs
      changed and/or some of the fields were added, removed, or renamed.
      
      This is only possible if BPF loader can get kernel type info to adjust
      all the offsets correctly. This patch is a first time in this direction,
      making sure that BTF type info is part of Linux kernel image in
      non-loadable ELF section.
      
      BTF deduplication ([1]) algorithm typically provides 100x savings
      compared to DWARF data, so resulting .BTF section is not big as is
      typically about 2MB in size.
      
      [0] http://vger.kernel.org/lpc-bpf2018.html#session-2
      [1] https://facebookmicrosites.github.io/bpf/blog/2018/11/14/btf-enhancement.html
      
      
      
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Acked-by: default avatarDavid S. Miller <davem@davemloft.net>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      e83b9f55
  24. Mar 13, 2019
  25. Jan 28, 2019
  26. Aug 23, 2018
  27. May 17, 2018
  28. Mar 25, 2018
  29. Mar 02, 2018
  30. Nov 02, 2017
    • Greg Kroah-Hartman's avatar
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman authored
      
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information it it.
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from of the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging was:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there was new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      
      Reviewed-by: default avatarKate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: default avatarPhilippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2441318
  31. Oct 09, 2017
  32. Jun 30, 2017
Loading