.. SPDX-License-Identifier: (GPL-2.0+ OR CC-BY-4.0)
.. [see the bottom of this file for redistribution information]
=========================================
How to verify bugs and bisect regressions
=========================================
This document describes how to check whether a Linux kernel problem occurs in
code currently supported by developers. It then explains how to locate the
change causing the issue if it is a regression (i.e. it did not happen with
earlier versions).
The text aims at people running kernels from mainstream Linux distributions on
commodity hardware who want to report a kernel bug to the upstream Linux
developers. Despite this intent, the instructions work just as well for users
who are already familiar with building their own kernels: they help avoid
mistakes occasionally made even by experienced developers.
..
Note: if you see this note, you are reading the text's source file. You
might want to switch to a rendered version: it makes it a lot easier to
read and navigate this document -- especially when you want to look something
up in the reference section, then jump back to where you left off.
..
Find the latest rendered version of this text here:
https://docs.kernel.org/admin-guide/verify-bugs-and-bisect-regressions.rst.html
The essence of the process (aka 'TL;DR')
========================================
*[If you are new to building or bisecting Linux, ignore this section and head
over to the* ':ref:`step-by-step guide <introguide_bissbs>`' *below. It utilizes
the same commands as this section while describing them in brief fashion. The
steps are nevertheless easy to follow and together with accompanying entries
in a reference section mention many alternatives, pitfalls, and additional
aspects, all of which might be essential in your present case.]*
**In case you want to check if a bug is present in code currently supported by
developers**, execute just the *preparations* and *segment 1*; while doing so,
consider the newest Linux kernel you regularly use to be the 'working' kernel.
In the following example that's assumed to be 6.0, which is why its sources
will be used to prepare the .config file.
**In case you face a regression**, follow the steps at least till the end of
*segment 2*. Then you can submit a preliminary report -- or continue with
*segment 3*, which describes how to perform a bisection needed for a
full-fledged regression report. In the following example 6.0.13 is assumed to be
the 'working' kernel and 6.1.5 to be the first 'broken', which is why 6.0
will be considered the 'good' release and used to prepare the .config file.
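The mapping from a 'working' stable version to the 'good' mainline release can be sketched in shell. This is only an illustration: ``good_tag`` is a hypothetical helper name, and it assumes a plain ``major.minor[.patch]`` version string like the ones used in the example above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the kernel tree): derive the mainline
# tag to treat as 'good' from the version of the last 'working' kernel.
# Assumes input of the form major.minor[.patch], e.g. 6.0.13.
good_tag() {
    case $1 in
        [0-9]*.[0-9]*) ;;
        *) echo "unexpected version: $1" >&2; return 1 ;;
    esac
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # text before the next dot, if any
    printf 'v%s.%s\n' "$major" "$minor"
}

good_tag 6.0.13   # prints v6.0
```

The result is the tag later passed to ``git switch --detach`` and ``git bisect good``.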
* **Preparations**: set up everything to build your own kernels::
# * Remove any software that depends on externally maintained kernel modules
# or builds any automatically during bootup.
# * Ensure Secure Boot permits booting self-compiled Linux kernels.
# * If you are not already running the 'working' kernel, reboot into it.
# * Install compilers and everything else needed for building Linux.
# * Ensure you have 15 gigabytes of free space in your home directory.
git clone -o mainline --no-checkout \
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ~/linux/
cd ~/linux/
git remote add -t master stable \
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
git switch --detach v6.0
# * Hint: if you used an existing clone, ensure no stale .config is around.
make olddefconfig
# * Ensure the former command picked the .config of the 'working' kernel.
# * Connect external hardware (USB keys, tokens, ...), start a VM, bring up
# VPNs, mount network shares, and briefly try the feature that is broken.
yes '' | make localmodconfig
./scripts/config --set-str CONFIG_LOCALVERSION '-local'
./scripts/config -e CONFIG_LOCALVERSION_AUTO
# * Note, when short on storage space, check the guide for an alternative:
./scripts/config -d DEBUG_INFO_NONE -e KALLSYMS_ALL -e DEBUG_KERNEL \
-e DEBUG_INFO -e DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT -e KALLSYMS
# * Hint: at this point you might want to adjust the build configuration;
# you'll have to, if you are running Debian.
make olddefconfig
cp .config ~/kernel-config-working
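The free-space requirement from the preparations can be checked up front with a few lines of shell. The following is a minimal sketch; ``has_free_space`` is a hypothetical helper name, and the check treats the 15 Gigabyte from the text as GiB, which is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the filesystem holding directory $1
# has at least $2 GiB available. Relies on the POSIX output format of
# 'df -P', where the fourth column of the second line is the available
# space in KiB.
has_free_space() {
    need_kib=$(( $2 * 1024 * 1024 ))
    avail_kib=$(df -Pk "$1" 2>/dev/null | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kib" ] || return 1
    [ "$avail_kib" -ge "$need_kib" ]
}

if has_free_space "$HOME" 15; then
    echo "enough free space for the kernel sources and build artifacts"
else
    echo "free up some space in your home directory first"
fi
```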
* **Segment 1**: build a kernel from the latest mainline codebase.
Among other things, this checks whether the problem was fixed already and which
developers later need to be told about it; in case of a regression, it also
rules out a .config change as the root of the problem.
a) Checking out latest mainline code::
cd ~/linux/
git switch --discard-changes --detach mainline/master
b) Build, install, and boot a kernel::
cp ~/kernel-config-working .config
make olddefconfig
make -j $(nproc --all)
# * Make sure there is enough disk space to hold another kernel:
df -h /boot/ /lib/modules/
# * Note: on Arch Linux, its derivatives and a few other distributions
# the following commands will do nothing at all or only part of the
# job. See the step-by-step guide for further details.
sudo make modules_install
command -v installkernel && sudo make install
# * Check how much space your self-built kernel actually needs, which
# enables you to make better estimates later:
du -ch /boot/*$(make -s kernelrelease)* | tail -n 1
du -sh /lib/modules/$(make -s kernelrelease)/
# * Hint: the output of the following command will help you pick the
# right kernel from the boot menu:
make -s kernelrelease | tee -a ~/kernels-built
reboot
# * Once booted, ensure you are running the kernel you just built by
# checking if the output of the next two commands matches:
tail -n 1 ~/kernels-built
uname -r
c) Check if the problem occurs with this kernel as well.
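The comparison of ``tail -n 1 ~/kernels-built`` and ``uname -r`` from segment 1 can also be scripted. This is a minimal sketch; ``running_last_built`` is a hypothetical helper name, and both arguments are optional so the check can be tried against sample data.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the running kernel is the one recorded
# last in the list of built kernels.
#   $1: file written by 'make -s kernelrelease | tee -a ~/kernels-built'
#   $2: release to compare against (defaults to the running kernel)
running_last_built() {
    built_list=${1:-"$HOME/kernels-built"}
    running=${2:-$(uname -r)}
    [ -r "$built_list" ] || return 2
    [ "$(tail -n 1 "$built_list")" = "$running" ]
}

if running_last_built; then
    echo "running the kernel built last"
else
    echo "NOT running the kernel built last; check the boot menu"
fi
```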
* **Segment 2**: ensure the 'good' kernel is also a 'working' kernel.
Among other things, this verifies that the trimmed .config file actually works
well, as bisecting with it would otherwise be a waste of time:
a) Start by checking out the sources of the 'good' version::
cd ~/linux/
git switch --discard-changes --detach v6.0
b) Build, install, and boot a kernel as described earlier in *segment 1,
section b* -- just feel free to skip the 'du' commands, as you have a rough
estimate already.
c) Ensure the feature that regressed with the 'broken' kernel actually works
with this one.
* **Segment 3**: perform and validate the bisection.
a) Retrieve the sources for your 'bad' version::
git remote set-branches --add stable linux-6.1.y
git fetch stable
b) Initialize the bisection::
cd ~/linux/
git bisect start
git bisect good v6.0
git bisect bad v6.1.5
c) Build, install, and boot a kernel as described earlier in *segment 1,
section b*.
In case building or booting the kernel fails for unrelated reasons, run
``git bisect skip``. In all other cases, check if the regressed feature
works with the newly built kernel. If it does, tell Git by executing
``git bisect good``; if it does not, run ``git bisect bad`` instead.
All three commands will make Git check out another commit; then re-execute
this step (i.e. build, install, boot, and test a kernel to then tell Git
the outcome). Do so again and again until Git shows which commit broke
things. If you run short of disk space during this process, check the
section 'Supplementary tasks: cleanup during and after the process'
below.
d) Once you have finished the bisection, put a few things away::
cd ~/linux/
git bisect log > ~/bisect-log
cp .config ~/bisection-config-culprit
git bisect reset
e) Try to verify the bisection result::
git switch --discard-changes --detach mainline/master
git revert --no-edit cafec0cacaca0
cp ~/kernel-config-working .config
./scripts/config --set-str CONFIG_LOCALVERSION '-local-cafec0cacaca0-reverted'
This is optional, as some commits are impossible to revert. But if the
second command worked flawlessly, build, install, and boot one more
kernel; just this time skip the first command copying the base .config file
over, as that already has been taken care of.
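Two parts of segment 3 lend themselves to a small shell sketch: the decision which ``git bisect`` command to run after each test (section c), and looking up the culprit in the saved log for the revert attempt (section e). Both function names are hypothetical. The second function assumes the log saved by ``git bisect log`` ends with a ``# first bad commit: [<id>] <subject>`` comment, which Git writes once a bisection has finished.

```shell
#!/bin/sh
# Hypothetical helper: print the git bisect command matching an outcome.
#   $1: 0 if the kernel built and booted, non-zero otherwise
#   $2: 0 if the regressed feature works, non-zero otherwise
bisect_verdict() {
    if [ "$1" -ne 0 ]; then
        echo "git bisect skip"
    elif [ "$2" -eq 0 ]; then
        echo "git bisect good"
    else
        echo "git bisect bad"
    fi
}

# Hypothetical helper: print the id of the first bad commit recorded in
# a log saved with 'git bisect log > ~/bisect-log'. Prints nothing if
# the bisection was not finished when the log was written.
first_bad_commit() {
    sed -n 's/^# first bad commit: \[\([0-9a-f]\{1,\}\)\].*/\1/p' "$1"
}

bisect_verdict 0 1   # kernel booted, feature broken: prints "git bisect bad"
```

The id printed by ``first_bad_commit ~/bisect-log`` is what takes the place of the placeholder ``cafec0cacaca0`` in the ``git revert`` call.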
* **Supplementary tasks**: cleanup during and after the process.
a) To avoid running out of disk space during a bisection, you might need to
remove some kernels you built earlier. You most likely want to keep those
you built during segment 1 and 2 around for a while, but you will most
likely no longer need kernels tested during the actual bisection
(Segment 3 c). You can list them in build order using::
ls -ltr /lib/modules/*-local*
For example, to then erase a kernel that identifies itself as
'6.0-rc1-local-gcafec0cacaca0', use this::
sudo rm -rf /lib/modules/6.0-rc1-local-gcafec0cacaca0
sudo kernel-install -v remove 6.0-rc1-local-gcafec0cacaca0
# * Note, on some distributions kernel-install is missing
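The cleanup can be prepared with a dry run that only prints the removal commands for review. This is a sketch under assumptions: ``list_removal_commands`` is a hypothetical helper name, it relies on the ``-local`` tag set via CONFIG_LOCALVERSION earlier, and it skips only the running kernel, so the kernels from segment 1 and 2 still need to be taken off the list by hand.

```shell
#!/bin/sh
# Hypothetical helper: print (but do not run) the commands that would
# remove self-built kernels, skipping the currently running one.
#   $1: modules directory (defaults to /lib/modules)
#   $2: release to keep (defaults to the running kernel)
list_removal_commands() {
    modules_dir=${1:-/lib/modules}
    running=${2:-$(uname -r)}
    for dir in "$modules_dir"/*-local*; do
        [ -d "$dir" ] || continue          # glob matched nothing
        release=${dir##*/}
        [ "$release" = "$running" ] && continue
        printf 'sudo rm -rf %s\n' "$dir"
        printf 'sudo kernel-install -v remove %s\n' "$release"
    done
}

# Review the output before copying any of the printed lines to a shell.
list_removal_commands
```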