Commits · c8a74954cde651726e0a2380de3907d0e903f809 · CodeLinaro / aic / kernel

Mar 14, 2025

Merge tag 'drm-intel-fixes-2025-03-13' of... · c8a74954

Dave Airlie authored 1 month ago

Merge tag 'drm-intel-fixes-2025-03-13' of https://gitlab.freedesktop.org/drm/i915/kernel

 into drm-fixes

- Do cdclk post plane programming later (Ville)
- Bump MMAP_GTT_VERSION: missing indication of partial mmaps support (Jose)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z9MG4fH-6Q8dTHE1@intel.com

c8a74954

Merge tag 'drm-misc-fixes-2025-03-13' of... · d703575a

Dave Airlie authored 1 month ago

Merge tag 'drm-misc-fixes-2025-03-13' of https://gitlab.freedesktop.org/drm/misc/kernel

 into drm-fixes

A null pointer check for gma500, two clippy fixes for panic, a fix for
an interaction between DPMS and atomic leading to dropped frames, and
a locking fix for dp_mst.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250313-holistic-clay-moose-fead28@houat

d703575a

Mar 13, 2025

Merge tag 'amd-drm-fixes-6.14-2025-03-12' of... · 385b6432

Dave Airlie authored 1 month ago

Merge tag 'amd-drm-fixes-6.14-2025-03-12' of https://gitlab.freedesktop.org/agd5f/linux

 into drm-fixes

amd-drm-fixes-6.14-2025-03-12:

amdgpu:
- GC 12.x DCC fix
- DC DCE 6.x fix
- Hibernation fix
- HPD fix
- Backlight fixes
- Color depth fix
- UAF fix in hdcp_work
- VCE 2.x fix
- GC 12.x PTE fix

amdkfd:
- Queue eviction fix

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250312190931.216506-1-alexander.deucher@amd.com

385b6432

Mar 12, 2025

drm/amdgpu: NULL-check BO's backing store when determining GFX12 PTE flags · 6cc30748

Natalie Vock authored 1 month ago


PRT BOs may not have any backing store, so bo->tbo.resource will be
NULL. Check for that before dereferencing.

Fixes: 0cce5f28 ("drm/amdkfd: Check correct memory types for is_system variable")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3e3fcd29)
Cc: stable@vger.kernel.org # 6.12.x

6cc30748
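
A minimal sketch of the guard described in the commit above, written as plain C outside the kernel. The types and names (`sketch_resource`, `sketch_bo`, `sketch_bo_is_system`) and the memory-type enum are hypothetical stand-ins, not the amdgpu definitions; only the shape of the check (test the backing store before dereferencing it) comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; not the amdgpu definitions. */
enum sketch_mem_type { SKETCH_MEM_SYSTEM, SKETCH_MEM_VRAM };

struct sketch_resource {
	enum sketch_mem_type mem_type;
};

struct sketch_bo {
	struct sketch_resource *resource; /* NULL when there is no backing store */
};

/*
 * Report whether the BO is backed by system memory. A BO without a backing
 * store (as with a PRT BO) is treated as "not system" instead of being
 * dereferenced through a NULL pointer.
 */
bool sketch_bo_is_system(const struct sketch_bo *bo)
{
	assert(bo != NULL);

	if (!bo->resource)
		return false;
	return bo->resource->mem_type == SKETCH_MEM_SYSTEM;
}
```

Treating a missing backing store as "not system" is a choice made for this sketch; the commit message does not say which flag value the driver picks in that case.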

drm/amd/amdkfd: Evict all queues even HWS remove queue failed · 0882ca4e

Yifan Zha authored 1 month ago


[Why]
If a reset is detected and kfd needs to evict working queues, the HWS queue removal fails.
The remaining queues are then not evicted and stay in the active state.

After the reset is done, kfd uses HWS to terminate the remaining active queues, but HWS has been reset,
so removing the queues fails again.

[How]
Keep removing all queues even if HWS returns a failure.
It will not affect cpsch as it checks reset_domain->sem.

v2: If any queue failed, evict queue returns error.
v3: Declare err inside the if-block.

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 42c854b8)
Cc: stable@vger.kernel.org

0882ca4e
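
The loop shape described in the commit above can be sketched in user-space C as follows. Everything here is hypothetical (`sketch_queue`, `sketch_evict_all`, the `remove_rc` field that simulates what the scheduler returns); it only illustrates "keep going on failure, report an error at the end". Which error the real code returns when several removals fail is not stated in the commit message; this sketch keeps the first one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical queue record; remove_rc simulates the scheduler's return code. */
struct sketch_queue {
	int remove_rc; /* 0 on success, negative errno-style value on failure */
	int evicted;   /* set once the queue has been marked evicted */
};

/*
 * Evict every queue. A failed removal does not stop the loop; the first
 * error seen is returned after all queues have been processed.
 */
int sketch_evict_all(struct sketch_queue *queues, size_t count)
{
	int ret = 0;

	assert(queues != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		queues[i].evicted = 1;
		if (queues[i].remove_rc) {
			/* err lives inside the if-block, mirroring the v3 note */
			int err = queues[i].remove_rc;

			if (!ret)
				ret = err;
		}
	}
	return ret;
}
```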

Mar 11, 2025

drm/i915: Increase I915_PARAM_MMAP_GTT_VERSION version to indicate support for partial mmaps · a8045e46

José Roberto de Souza authored 1 month ago

Commit 255fc170 ("drm/i915/gem: Calculate object page offset for partial memory mapping")
was the last patch of several patches fixing multiple partial mmaps.
But without a bump in I915_PARAM_MMAP_GTT_VERSION there is no clean
way for UMD to know if it can do multiple partial mmaps.

Fixes: 255fc170 ("drm/i915/gem: Calculate object page offset for partial memory mapping")
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306210827.171147-1-jose.souza@intel.com

(cherry picked from commit bfef148f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

a8045e46

drm/dp_mst: Fix locking when skipping CSN before topology probing · 12d8f318

Imre Deak authored 1 month ago


The handling of the MST Connection Status Notify message is skipped if
the probing of the topology is still pending. Acquiring the
drm_dp_mst_topology_mgr::probe_lock for this in
drm_dp_mst_handle_up_req() is problematic: the task/work this function
is called from is also responsible for handling MST down-request replies
(in drm_dp_mst_handle_down_rep()). Thus drm_dp_mst_link_probe_work() -
holding already probe_lock - could be blocked waiting for an MST
down-request reply while drm_dp_mst_handle_up_req() is waiting for
probe_lock while processing a CSN message. This leads to the probe
work's down-request message timing out.

A scenario similar to the above leading to a down-request timeout is
handling a CSN message in drm_dp_mst_handle_conn_stat(), holding the
probe_lock and sending down-request messages while a second CSN message
sent by the sink subsequently is handled by drm_dp_mst_handle_up_req().

Fix the above by moving the logic to skip the CSN handling to
drm_dp_mst_process_up_req(). This function is called from a work
(separate from the task/work handling new up/down messages), already
holding probe_lock. This solves the above timeout issue, since handling
of down-request replies won't be blocked by probe_lock.

Fixes: ddf98348 ("drm/dp_mst: Skip CSN if topology probing is not done yet")
Cc: Wayne Lin <Wayne.Lin@amd.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org # v6.6+
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250307183152.3822170-1-imre.deak@intel.com

12d8f318

Mar 10, 2025

drm/amdgpu/vce2: fix ip block reference · ded6ad4c

Alex Deucher authored 1 month ago

Need to use the correct IP block type.  VCE vs VCN.
Fixes mclk issues on Hawaii.

Suggested by selendym.

Fixes: 82ae6619 ("drm/amdgpu: update the handle ptr in wait_for_idle")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3997


Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Cc: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 02438acd)
Cc: stable@vger.kernel.org

ded6ad4c

drm/amd/display: Fix slab-use-after-free on hdcp_work · e65e7bea

Mario Limonciello authored 1 month ago

[Why]
A slab-use-after-free is reported when HDCP is destroyed but the
property_validate_dwork queue is still running.

[How]
Cancel the delayed work when destroying workqueue.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4006


Fixes: da3fd7ac ("drm/amd/display: Update CP property based on HW query")
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 725a04ba)
Cc: stable@vger.kernel.org

e65e7bea

drm/amd/display: Assign normalized_pix_clk when color depth = 14 · 79e31396

Alex Hung authored 1 month ago


[WHY & HOW]
A warning message "WARNING: CPU: 4 PID: 459 at ... /dc_resource.c:3397
calculate_phy_pix_clks+0xef/0x100 [amdgpu]" occurs because the
display_color_depth == COLOR_DEPTH_141414 is not handled. This is
observed in Radeon RX 6600 XT.

It is fixed by assigning pix_clk * (14 * 3) / 24, the same as the other color depths.

Also fixes the indentation in get_norm_pix_clk.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 274a87eb)
Cc: stable@vger.kernel.org

79e31396
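
The arithmetic named in the commit above, pix_clk * (14 * 3) / 24, can be sketched as a small helper. The function name `sketch_norm_pix_clk` and the plain-integer `bpc` parameter are assumptions for illustration; the driver itself switches on a color-depth enum. The generalisation to other depths follows the "same as the others" wording and is not quoted from the driver source.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scale a pixel clock by bits-per-component relative to 8 bpc (24 bits per
 * pixel). The multiplication is done in 64 bits so the intermediate product
 * cannot overflow; the caller is assumed to pass values whose result fits
 * in 32 bits.
 */
uint32_t sketch_norm_pix_clk(uint32_t pix_clk, unsigned int bpc)
{
	assert(bpc > 0);
	return (uint32_t)(((uint64_t)pix_clk * (bpc * 3u)) / 24u);
}
```

For a clock value of 240000, a 14-bit depth scales it by 42/24 to 420000, while 8 bits per component leaves it unchanged.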

drm/amd/display: Restore correct backlight brightness after a GPU reset · 5760388d

Mario Limonciello authored 1 month ago


[Why]
GPU reset will attempt to restore cached state, but brightness doesn't
get restored. It will come back at 100% brightness, but userspace thinks
it's the previous value.

[How]
When running the resume sequence while the GPU is in reset, restore brightness
to the previous value.

Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5e19e2b5)
Cc: stable@vger.kernel.org

5760388d

drm/amd/display: fix default brightness · b5a981e1

Mario Limonciello authored 1 month ago


[Why]
To avoid flickering during boot, the default brightness level set by BIOS
should be maintained for as much of the boot as feasible.
Commit 2fe87f54 ("drm/amd/display: Set default brightness according
to ACPI") attempted to set the right levels for AC vs DC, but brightness
still got reset to maximum level in initialization code for
setup_backlight_device().

[How]
Remove the hardcoded initialization in setup_backlight_device() and
instead program brightness value to match BIOS (AC or DC).  This avoids a
brightness flicker from kernel changing the value.  Userspace may however
still change it during boot.

Fixes: 2fe87f54 ("drm/amd/display: Set default brightness according to ACPI")
Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0747acf3)
Cc: stable@vger.kernel.org

b5a981e1

drm/amd/display: Disable unneeded hpd interrupts during dm_init · 40b8c149

Leo Li authored 2 months ago


[Why]

It seems HPD interrupts are enabled by default for all connectors, even
if the hpd source isn't valid. An eDP for example, does not have a valid
hpd source (but does have a valid hpdrx source; see construct_phy()).
Thus, eDPs should have their hpd interrupt disabled.

In the past, this wasn't really an issue. Although the driver gets
interrupted, then acks by writing to hw registers, there weren't any
subscribed handlers that did anything meaningful (see
register_hpd_handlers()).

But things changed with the introduction of IPS. s2idle requires that
the driver allows IPS for DMUB fw to put hw to sleep. Since register
access requires hw to be awake, the driver will block IPS entry to do
so. And no IPS means no hw sleep during s2idle.

This was the observation on DCN35 systems with an eDP. During suspend,
the eDP toggled its hpd pin as part of the panel power down sequence.
The driver was then interrupted, and acked by writing to registers,
blocking IPS entry.

[How]

Since DC marks eDP connections as having invalid hpd sources (see
construct_phy()), DM should disable them at the hw level. Do so in
amdgpu_dm_hpd_init() by disabling all hpd ints first, then selectively
enabling ones for connectors that have valid hpd sources.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7b1ba19e)
Cc: stable@vger.kernel.org

40b8c149
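
The two-pass approach described under [How] in the commit above (disable everything first, then enable selectively) can be sketched like this. The `sketch_connector` type and `sketch_hpd_init` function are hypothetical and model the interrupt state as a boolean; they are not the amdgpu_dm structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical connector model; not the amdgpu_dm connector structure. */
struct sketch_connector {
	bool hpd_source_valid; /* false for an eDP in this sketch */
	bool hpd_irq_enabled;  /* stands in for the hardware enable bit */
};

void sketch_hpd_init(struct sketch_connector *conns, size_t count)
{
	assert(conns != NULL || count == 0);

	/* Pass 1: disable every HPD interrupt, whatever the default was. */
	for (size_t i = 0; i < count; i++)
		conns[i].hpd_irq_enabled = false;

	/* Pass 2: re-enable only connectors with a valid HPD source. */
	for (size_t i = 0; i < count; i++)
		if (conns[i].hpd_source_valid)
			conns[i].hpd_irq_enabled = true;
}
```

Starting from "all disabled" means a connector that is never visited in the second pass cannot keep an interrupt that was enabled by default.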

drm/amd: Keep display off while going into S4 · 4afacc99

Mario Limonciello authored 1 month ago


When userspace invokes S4 the flow is:

1) amdgpu_pmops_prepare()
2) amdgpu_pmops_freeze()
3) Create hibernation image
4) amdgpu_pmops_thaw()
5) Write out image to disk
6) Turn off system

Then on resume amdgpu_pmops_restore() is called.

The problem with this flow is that amdgpu_pmops_thaw() calls
amdgpu_device_resume(), which resumes the whole GPU.

This includes turning the display hardware back on and discovering
connectors again.

Users do not expect the display to turn back on at this point.
Adjust the flow so that during the S4 sequence display hardware is
not turned back on.

Reported-by: Xaver Hugl <xaver.hugl@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2038


Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Link: https://lore.kernel.org/r/20250306185124.44780-1-mario.limonciello@amd.com


Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 68bfdc8d)

4afacc99

drm/amd/display: fix missing .is_two_pixels_per_container · e204aab7

Aliaksei Urbanski authored 1 month ago


Starting from 6.11, the AMDGPU driver, when loaded with amdgpu.dc=1,
causes a NULL pointer dereference on PCs with old GPUs, such as R9 280X,
because dce60_tg_funcs lacks the .is_two_pixels_per_container function.

This fix adds the missing .is_two_pixels_per_container to dce60_tg_funcs.

Reported-by: Rosen Penev <rosenp@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3942


Fixes: e6a901a0 ("drm/amd/display: use even ODM slice width for two pixels per container")
Signed-off-by: Aliaksei Urbanski <aliaksei.urbanski@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bd4b125e)
Cc: stable@vger.kernel.org

e204aab7
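
The failure mode in the commit above is the generic "ops table with an unset hook" problem. A hedged sketch in plain C: the `sketch_*` names are hypothetical, and what the real hook returns on DCE 6.x hardware is not stated in the commit message, so the `false` returned here is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_tg;

/* Hypothetical function table with a single hook. */
struct sketch_tg_funcs {
	bool (*is_two_pixels_per_container)(const struct sketch_tg *tg);
};

struct sketch_tg {
	const struct sketch_tg_funcs *funcs;
};

/* Assumed behaviour for old hardware: never two pixels per container. */
bool sketch_never_two_pixels(const struct sketch_tg *tg)
{
	(void)tg;
	return false;
}

/* Table as it was before the fix: the hook is left NULL. */
const struct sketch_tg_funcs sketch_funcs_missing = {
	.is_two_pixels_per_container = NULL,
};

/* Table after the fix: the hook is populated. */
const struct sketch_tg_funcs sketch_funcs_fixed = {
	.is_two_pixels_per_container = sketch_never_two_pixels,
};

/* True when a caller could invoke the hook without dereferencing NULL. */
bool sketch_hook_present(const struct sketch_tg *tg)
{
	assert(tg != NULL && tg->funcs != NULL);
	return tg->funcs->is_two_pixels_per_container != NULL;
}
```

A caller that invokes the hook unconditionally crashes with the first table and works with the second, which is why the fix populates the table instead of adding checks at every call site.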

drm/amdgpu/display: Allow DCC for video formats on GFX12 · df1e82e7

David Rosca authored 2 months ago


We advertise DCC as supported for NV12/P010 formats on GFX12,
but it would fail on this check on atomic commit.

Signed-off-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ba795235)
Cc: stable@vger.kernel.org # 6.12.x

df1e82e7

drm/atomic: Filter out redundant DPMS calls · de93ddf8

Ville Syrjälä authored 2 months ago


Video players (e.g. mpv) do periodic XResetScreenSaver() calls to
keep the screen on while the video is playing. The modesetting ddx
plumbs these straight through into the kernel as DPMS setproperty
ioctls, without any filtering whatsoever. When implemented via
atomic these end up as empty commits on the crtc (which will
nonetheless take one full frame), which leads to a dropped
frame every time XResetScreenSaver() is called.

Let's just filter out redundant DPMS property changes in the
kernel to avoid this issue.

v2: Explain the resulting commits a bit better (Sima)
    Document the behaviour in uapi docs (Sima)

Cc: stable@vger.kernel.org
Testcase: igt/kms_flip/flip-vs-dpms-on-nop
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250219160239.17502-1-ville.syrjala@linux.intel.com

de93ddf8

drm/i915/cdclk: Do cdclk post plane programming later · 6266f4a7

Ville Syrjälä authored 2 months ago


We currently call intel_set_cdclk_post_plane_update() far
too early. When pipes are active during the reprogramming
the current spot only works for the cd2x divider update
case, as that is synchronized to the pipe's vblank. Squashing
and crawling are not synchronized in any way, so doing the
programming while the pipes/planes are potentially still using
the old hardware state could lead to underruns.

Move the post plane reprogramming to a spot where we know
that the pipes/planes have switched over to the new hardware
state.

Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250218211913.27867-2-ville.syrjala@linux.intel.com


Reviewed-by: Vinod Govindapillai <vinod.govindapillai@intel.com>
(cherry picked from commit fb64f556)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

6266f4a7

Mar 09, 2025

Linux 6.14-rc6 · 80e54e84
Linus Torvalds authored 1 month ago

v6.14-rc6

80e54e84

Merge tag 'kbuild-fixes-v6.14-3' of... · 9712d38c

Linus Torvalds authored 1 month ago

Merge tag 'kbuild-fixes-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Use the specified $(LD) when building userprogs with Clang

 - Pass the correct target triple when compile-testing UAPI headers
   with Clang

 - Fix pacman-pkg build error with KBUILD_OUTPUT

* tag 'kbuild-fixes-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: install-extmod-build: Fix build when specifying KBUILD_OUTPUT
  docs: Kconfig: fix defconfig description
  kbuild: hdrcheck: fix cross build with clang
  kbuild: userprogs: use correct lld when linking through clang

9712d38c

Merge tag 'usb-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 0dc1f314

Linus Torvalds authored 1 month ago

Pull USB fixes from Greg KH:
 "Here are some small USB driver fixes for some reported issues. These
  contain:

   - typec driver fixes

   - dwc3 driver fixes

   - xhci driver fixes

   - renesas controller fixes

   - gadget driver fixes

   - a new USB quirk added

  All of these have been in linux-next with no reported issues"

* tag 'usb-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: typec: ucsi: Fix NULL pointer access
  usb: quirks: Add DELAY_INIT and NO_LPM for Prolific Mass Storage Card Reader
  usb: xhci: Fix host controllers "dying" after suspend and resume
  usb: dwc3: Set SUSPENDENABLE soon after phy init
  usb: hub: lack of clearing xHC resources
  usb: renesas_usbhs: Flush the notify_hotplug_work
  usb: renesas_usbhs: Use devm_usb_get_phy()
  usb: renesas_usbhs: Call clk_put()
  usb: dwc3: gadget: Prevent irq storm when TH re-executes
  usb: gadget: Check bmAttributes only if configuration is valid
  xhci: Restrict USB4 tunnel detection for USB3 devices to Intel hosts
  usb: xhci: Enable the TRB overfetch quirk on VIA VL805
  usb: gadget: Fix setting self-powered state on suspend
  usb: typec: ucsi: increase timeout for PPM reset operations
  acpi: typec: ucsi: Introduce a ->poll_cci method
  usb: typec: tcpci_rt1711h: Unmask alert interrupts to fix functionality
  usb: gadget: Set self-powered based on MaxPower and bmAttributes
  usb: gadget: u_ether: Set is_suspend flag if remote wakeup fails
  usb: atm: cxacru: fix a flaw in existing endpoint checks

0dc1f314

Merge tag 'driver-core-6.14-rc6' of... · 51b38f3c

Linus Torvalds authored 1 month ago

Merge tag 'driver-core-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fix from Greg KH:
 "Here is a single driver core fix that resolves a reported memory leak.

  It's been in linux-next for 2 weeks now with no reported problems"

* tag 'driver-core-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  drivers: core: fix device leak in __fw_devlink_relax_cycles()

51b38f3c

Merge tag 'char-misc-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 2cc699b3

Linus Torvalds authored 1 month ago

Pull char/misc/IIO driver fixes from Greg KH:
 "Here are a number of misc and char and iio driver fixes that have been
  sitting in my tree for way too long. They contain:

   - iio driver fixes for reported issues

   - regression fix for rtsx_usb card reader

   - mei and mhi driver fixes

   - small virt driver fixes

   - ntsync permissions fix

   - other tiny driver fixes for reported problems.

  All of these have been in linux-next for quite a while with no
  reported issues"

* tag 'char-misc-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (30 commits)
  Revert "drivers/card_reader/rtsx_usb: Restore interrupt based detection"
  ntsync: Check wait count based on byte size.
  bus: simple-pm-bus: fix forced runtime PM use
  char: misc: deallocate static minor in error path
  eeprom: digsy_mtc: Make GPIO lookup table match the device
  drivers: virt: acrn: hsm: Use kzalloc to avoid info leak in pmcmd_ioctl
  binderfs: fix use-after-free in binder_devices
  slimbus: messaging: Free transaction ID in delayed interrupt scenario
  vbox: add HAS_IOPORT dependency
  cdx: Fix possible UAF error in driver_override_show()
  intel_th: pci: Add Panther Lake-P/U support
  intel_th: pci: Add Panther Lake-H support
  intel_th: pci: Add Arrow Lake support
  intel_th: msu: Fix less trivial kernel-doc warnings
  intel_th: msu: Fix kernel-doc warnings
  MAINTAINERS: change maintainer for FSI
  ntsync: Set the permissions to be 0666
  bus: mhi: host: pci_generic: Use pci_try_reset_function() to avoid deadlock
  mei: vsc: Use "wakeuphostint" when getting the host wakeup GPIO
  mei: me: add panther lake P DID
  ...

2cc699b3

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · a382b06d

Linus Torvalds authored 1 month ago

Pull KVM fixes from Paolo Bonzini:
 "arm64:

   - Fix a couple of bugs affecting pKVM's PSCI relay implementation
     when running in the hVHE mode, resulting in the host being entered
     with the MMU in an unknown state, and EL2 being in the wrong mode

  x86:

   - Set RFLAGS.IF in C code on SVM to get VMRUN out of the STI shadow

   - Ensure DEBUGCTL is context switched on AMD to avoid running the
     guest with the host's value, which can lead to unexpected bus lock
     #DBs

   - Suppress DEBUGCTL.BTF on AMD (to match Intel), as KVM doesn't
     properly emulate BTF. KVM's lack of context switching has meant BTF
     has always been broken to some extent

   - Always save DR masks for SNP vCPUs if DebugSwap is *supported*, as
     the guest can enable DebugSwap without KVM's knowledge

   - Fix a bug in mmu_stress_tests where a vCPU could finish the "writes
     to RO memory" phase without actually generating a write-protection
     fault

   - Fix a printf() goof in the SEV smoke test that causes build
     failures with -Werror

   - Explicitly zero EAX and EBX in CPUID.0x8000_0022 output when
     PERFMON_V2 isn't supported by KVM"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Explicitly zero EAX and EBX when PERFMON_V2 isn't supported by KVM
  KVM: selftests: Fix printf() format goof in SEV smoke test
  KVM: selftests: Ensure all vCPUs hit -EFAULT during initial RO stage
  KVM: SVM: Don't rely on DebugSwap to restore host DR0..DR3
  KVM: SVM: Save host DR masks on CPUs with DebugSwap
  KVM: arm64: Initialize SCTLR_EL1 in __kvm_hyp_init_cpu()
  KVM: arm64: Initialize HCR_EL2.E2H early
  KVM: x86: Snapshot the host's DEBUGCTL after disabling IRQs
  KVM: SVM: Manually context switch DEBUGCTL if LBR virtualization is disabled
  KVM: x86: Snapshot the host's DEBUGCTL in common x86
  KVM: SVM: Suppress DEBUGCTL.BTF on AMD
  KVM: SVM: Drop DEBUGCTL[5:2] from guest's effective value
  KVM: selftests: Assert that STI blocking isn't set after event injection
  KVM: SVM: Set RFLAGS.IF=1 in C code, to get VMRUN out of the STI shadow

a382b06d

Merge tag 'kvm-x86-fixes-6.14-rcN.2' of https://github.com/kvm-x86/linux into HEAD · ea9bd29a

Paolo Bonzini authored 1 month ago

KVM x86 fixes for 6.14-rcN #2

 - Set RFLAGS.IF in C code on SVM to get VMRUN out of the STI shadow.

 - Ensure DEBUGCTL is context switched on AMD to avoid running the guest with
   the host's value, which can lead to unexpected bus lock #DBs.

 - Suppress DEBUGCTL.BTF on AMD (to match Intel), as KVM doesn't properly
   emulate BTF.  KVM's lack of context switching has meant BTF has always been
   broken to some extent.

 - Always save DR masks for SNP vCPUs if DebugSwap is *supported*, as the guest
   can enable DebugSwap without KVM's knowledge.

 - Fix a bug in mmu_stress_tests where a vCPU could finish the "writes to RO
   memory" phase without actually generating a write-protection fault.

 - Fix a printf() goof in the SEV smoke test that causes build failures with
   -Werror.

 - Explicitly zero EAX and EBX in CPUID.0x8000_0022 output when PERFMON_V2
   isn't supported by KVM.

ea9bd29a

Merge tag 'kvmarm-fixes-6.14-4' of... · 1cdad678

Paolo Bonzini authored 1 month ago

Merge tag 'kvmarm-fixes-6.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.14, take #4

- Fix a couple of bugs affecting pKVM's PSCI relay implementation
  when running in the hVHE mode, resulting in the host being entered
  with the MMU in an unknown state, and EL2 being in the wrong mode.

1cdad678

Merge tag 'mm-hotfixes-stable-2025-03-08-16-27' of... · 1110ce6a

Linus Torvalds authored 1 month ago

Merge tag 'mm-hotfixes-stable-2025-03-08-16-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "33 hotfixes. 24 are cc:stable and the remainder address post-6.13
  issues or aren't considered necessary for -stable kernels.

  26 are for MM and 7 are for non-MM.

   - "mm: memory_failure: unmap poisoned folio during migrate properly"
     from Ma Wupeng fixes a couple of two year old bugs involving the
     migration of hwpoisoned folios.

   - "selftests/damon: three fixes for false results" from SeongJae Park
     fixes three one year old bugs in the DAMON selftest code.

  The remainder are singletons and doubletons. Please see the individual
  changelogs for details"

* tag 'mm-hotfixes-stable-2025-03-08-16-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (33 commits)
  mm/page_alloc: fix uninitialized variable
  rapidio: add check for rio_add_net() in rio_scan_alloc_net()
  rapidio: fix an API misues when rio_add_net() fails
  MAINTAINERS: .mailmap: update Sumit Garg's email address
  Revert "mm/page_alloc.c: don't show protection in zone's ->lowmem_reserve[] for empty zone"
  mm: fix finish_fault() handling for large folios
  mm: don't skip arch_sync_kernel_mappings() in error paths
  mm: shmem: remove unnecessary warning in shmem_writepage()
  userfaultfd: fix PTE unmapping stack-allocated PTE copies
  userfaultfd: do not block on locking a large folio with raised refcount
  mm: zswap: use ATOMIC_LONG_INIT to initialize zswap_stored_pages
  mm: shmem: fix potential data corruption during shmem swapin
  mm: fix kernel BUG when userfaultfd_move encounters swapcache
  selftests/damon/damon_nr_regions: sort collected regiosn before checking with min/max boundaries
  selftests/damon/damon_nr_regions: set ops update for merge results check to 100ms
  selftests/damon/damos_quota: make real expectation of quota exceeds
  include/linux/log2.h: mark is_power_of_2() with __always_inline
  NFS: fix nfs_release_folio() to not deadlock via kcompactd writeback
  mm, swap: avoid BUG_ON in relocate_cluster()
  mm: swap: use correct step in loop to wait all clusters in wait_for_allocation()
  ...

1110ce6a

Mar 08, 2025

Merge tag 'x86-urgent-2025-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b7c90e3e

Linus Torvalds authored 1 month ago

Pull more x86 fixes from Ingo Molnar:

 - Add more model IDs to the AMD microcode version check, more people
   are hitting these checks

 - Fix a Xen guest boot warning related to AMD northbridge setup

 - Fix SEV guest bugs related to recent changes in its locking logic

 - Fix a missing definition of PTRS_PER_PMD that assembly builds can hit

* tag 'x86-urgent-2025-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode/AMD: Add some forgotten models to the SHA check
  x86/mm: Define PTRS_PER_PMD for assembly code too
  virt: sev-guest: Move SNP Guest Request data pages handling under snp_cmd_mutex
  virt: sev-guest: Allocate request data dynamically
  x86/amd_nb: Use rdmsr_safe() in amd_get_mmconfig_range()

b7c90e3e

x86/microcode/AMD: Add some forgotten models to the SHA check · 058a6bec

Borislav Petkov (AMD) authored 1 month ago


Add some more forgotten models to the SHA check.

Fixes: 50cef76d ("x86/microcode/AMD: Load only SHA256-checksummed patches")
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Toralf Förster <toralf.foerster@gmx.de>
Link: https://lore.kernel.org/r/20250307220256.11816-1-bp@kernel.org

058a6bec

Merge branch 'linus' into x86/urgent, to pick up dependent patches · 14296d0e
Ingo Molnar authored 1 month ago

Signed-off-by: Ingo Molnar <mingo@kernel.org>

14296d0e

Merge tag 'loongarch-fixes-6.14-2' of... · 2e51e0ac

Linus Torvalds authored 1 month ago

Merge tag 'loongarch-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
 "Fix bugs in kernel build, hibernation, memory management and KVM"

* tag 'loongarch-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: KVM: Fix GPA size issue about VM
  LoongArch: KVM: Reload guest CSR registers after sleep
  LoongArch: KVM: Add interrupt checking for AVEC
  LoongArch: Set hugetlb mmap base address aligned with pmd size
  LoongArch: Set max_pfn with the PFN of the last page
  LoongArch: Use polling play_dead() when resuming from hibernation
  LoongArch: Eliminate superfluous get_numa_distances_cnt()
  LoongArch: Convert unreachable() to BUG()

2e51e0ac

LoongArch: KVM: Fix GPA size issue about VM · 6bdbb73d

Bibo Mao authored 1 month ago


The physical address space is 48 bits on a Loongson-3A5000 physical machine,
but only 47 bits for a VM on a Loongson-3A5000 system. The size of the
physical address space of a VM is the same as the size of the virtual user
space (one half) of the physical machine.

The variable cpu_vabits represents the user address space; the kernel address
space is not included (user space and kernel space are each one half of the
total). So cpu_vabits, rather than cpu_vabits - 1, represents the size of the
guest physical address space.

Also add strict checking of the page fault GPA: inject an error if it is
larger than the maximum GPA of the VM.

Cc: stable@vger.kernel.org
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

6bdbb73d
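
The sizing rule in the commit above can be sketched as a shift by the number of address bits, followed by a range check. The helper names are hypothetical, and passing the bit count as a parameter (instead of reading cpu_vabits) is a simplification for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size of the guest physical address space for a given number of bits. */
uint64_t sketch_gpa_size(unsigned int vabits)
{
	assert(vabits > 0 && vabits < 64);
	return UINT64_C(1) << vabits;
}

/* A faulting GPA is acceptable only if it lies below the end of the space. */
bool sketch_gpa_valid(uint64_t gpa, unsigned int vabits)
{
	return gpa < sketch_gpa_size(vabits);
}
```

With 47 bits, as quoted for a VM on Loongson-3A5000, the last valid address is 2^47 - 1. Using one bit fewer would halve the space and reject addresses the guest is entitled to use.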

LoongArch: KVM: Reload guest CSR registers after sleep · 78d7bc5a

Bibo Mao authored 1 month ago


On the host, the HW guest CSR registers are lost after a suspend and resume
operation. Because last_vcpu of the boot CPU still records the latest vCPU
pointer, reloading of the guest CSR registers is skipped when the boot CPU
resumes and the vCPU is scheduled.

Clear last_vcpu so that the guest CSR registers are reloaded from the
scheduled vCPU context after suspend and resume.

Cc: stable@vger.kernel.org
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

78d7bc5a

LoongArch: KVM: Add interrupt checking for AVEC · 6fb1867d

Bibo Mao authored 1 month ago


There is a newly added macro INT_AVEC for the CSR ESTAT register: bit 14,
used for LoongArch AVEC support. The AVEC interrupt status bit 14 is covered
by the macro CSR_ESTAT_IS, so replace the hard-coded value 0x1fff with
CSR_ESTAT_IS so that the AVEC interrupt status is also supported by KVM.

Cc: stable@vger.kernel.org
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

6fb1867d
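
Why the hard-coded mask in the commit above misses AVEC can be shown with a few lines of bit arithmetic. 0x1fff and bit 14 come from the commit message; the value 0x7fff used for the wider mask is an assumption (bits 0 through 14 set) and is not quoted from the kernel headers. All `SKETCH_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_INT_AVEC      14u              /* bit number named in the commit */
#define SKETCH_OLD_IS_MASK   UINT32_C(0x1fff) /* hard-coded value: bits 0..12 */
#define SKETCH_ESTAT_IS_MASK UINT32_C(0x7fff) /* assumed wider mask: bits 0..14 */

/* True when the given interrupt bit survives masking of the ESTAT value. */
bool sketch_irq_visible(uint32_t estat, uint32_t mask, unsigned int bit)
{
	assert(bit < 32);
	return (((estat & mask) >> bit) & 1u) != 0;
}
```

An ESTAT value with only bit 14 set is invisible through the old mask and visible through the wider one, which is the behaviour change the commit describes.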

LoongArch: Set hugetlb mmap base address aligned with pmd size · 3109d5ff

Bibo Mao authored 1 month ago


With the ltp test case "testcases/bin/hugefork02", dmesg reports an error
such as:

 kernel BUG at mm/hugetlb.c:5550!
 Oops - BUG[#1]:
 CPU: 0 UID: 0 PID: 1517 Comm: hugefork02 Not tainted 6.14.0-rc2+ #241
 Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022
 pc 90000000004eaf1c ra 9000000000485538 tp 900000010edbc000 sp 900000010edbf940
 a0 900000010edbfb00 a1 9000000108d20280 a2 00007fffe9474000 a3 00007ffff3474000
 a4 0000000000000000 a5 0000000000000003 a6 00000000003cadd3 a7 0000000000000000
 t0 0000000001ffffff t1 0000000001474000 t2 900000010ecd7900 t3 00007fffe9474000
 t4 00007fffe9474000 t5 0000000000000040 t6 900000010edbfb00 t7 0000000000000001
 t8 0000000000000005 u0 90000000004849d0 s9 900000010edbfa00 s0 9000000108d20280
 s1 00007fffe9474000 s2 0000000002000000 s3 9000000108d20280 s4 9000000002b38b10
 s5 900000010edbfb00 s6 00007ffff3474000 s7 0000000000000406 s8 900000010edbfa08
    ra: 9000000000485538 unmap_vmas+0x130/0x218
   ERA: 90000000004eaf1c __unmap_hugepage_range+0x6f4/0x7d0
  PRMD: 00000004 (PPLV0 +PIE -PWE)
  EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
 ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
 PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
 Process hugefork02 (pid: 1517, threadinfo=00000000a670eaf4, task=000000007a95fc64)
 Call Trace:
 [<90000000004eaf1c>] __unmap_hugepage_range+0x6f4/0x7d0
 [<9000000000485534>] unmap_vmas+0x12c/0x218
 [<9000000000494068>] exit_mmap+0xe0/0x308
 [<900000000025fdc4>] mmput+0x74/0x180
 [<900000000026a284>] do_exit+0x294/0x898
 [<900000000026aa30>] do_group_exit+0x30/0x98
 [<900000000027bed4>] get_signal+0x83c/0x868
 [<90000000002457b4>] arch_do_signal_or_restart+0x54/0xfa0
 [<90000000015795e8>] irqentry_exit_to_user_mode+0xb8/0x138
 [<90000000002572d0>] tlb_do_page_fault_1+0x114/0x1b4

The problem is that the base address allocated from hugetlbfs is not
aligned to the pmd size. Add a check for hugetlbfs and align the base
address to the pmd size. With this patch the test case
"testcases/bin/hugefork02" passes.
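
The alignment itself is ordinary power-of-two rounding. The sketch below is illustrative only: align_down() and align_up() are hypothetical helpers rather than the kernel's macros, and the 32 MiB pmd size and the sample address in the usage note are assumed example values:

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of size; size must be a power of two. */
static uint64_t align_down(uint64_t addr, uint64_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return addr & ~(size - 1);
}

/* Round addr up to a multiple of size; size must be a power of two. */
static uint64_t align_up(uint64_t addr, uint64_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return (addr + size - 1) & ~(size - 1);
}
```

For an assumed pmd size of 0x2000000 (32 MiB), align_down(0x7fffe9474000, 0x2000000) gives 0x7fffe8000000 and align_up() of the same address gives 0x7fffea000000. A top-down mmap base would typically be rounded down and a bottom-up base rounded up.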

This is similar to the commit 7f24cbc9 ("mm/mmap: teach
generic_get_unmapped_area{_topdown} to handle hugetlb mappings").

Cc: stable@vger.kernel.org  # 6.13+
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

3109d5ff

LoongArch: Set max_pfn with the PFN of the last page · c8477bb0

Bibo Mao authored 1 month ago


Currently max_pfn is zero. As a result, users cannot get some page
information through the /proc filesystem, such as kpagecount. The
stress-ng test suite displays the following messages when run with the
command "stress-ng --verbose --physpage 1 -t 1".

 # stress-ng --verbose --physpage 1 -t 1
 stress-ng: error: [1691] physpage: cannot read page count for address 0x134ac000 in /proc/kpagecount, errno=22 (Invalid argument)
 stress-ng: error: [1691] physpage: cannot read page count for address 0x7ffff207c3a8 in /proc/kpagecount, errno=22 (Invalid argument)
 stress-ng: error: [1691] physpage: cannot read page count for address 0x134b0000 in /proc/kpagecount, errno=22 (Invalid argument)
 ...

After applying this patch, the kernel can pass the test.

 # stress-ng --verbose --physpage 1 -t 1
 stress-ng: debug: [1701] physpage: [1701] started (instance 0 on CPU 3)
 stress-ng: debug: [1701] physpage: [1701] exited (instance 0 on CPU 3)
 stress-ng: debug: [1700] physpage: [1701] terminated (success)
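
To illustrate why a zero max_pfn hides every page, here is a rough user-space sketch. It is not the kernel's code: struct mem_range, compute_max_pfn() and pfn_below_max() are hypothetical, a 16 KiB page size is assumed, and the sketch uses the one-past-the-end convention for max_pfn, which the commit message does not spell out:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_ASSUMED 14 /* assumption: 16 KiB pages */

struct mem_range {
	uint64_t base; /* physical start address */
	uint64_t size; /* length in bytes */
};

/* Highest end PFN over all ranges (one past the last page). */
static uint64_t compute_max_pfn(const struct mem_range *r, size_t n)
{
	uint64_t max_pfn = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t end_pfn = (r[i].base + r[i].size) >> PAGE_SHIFT_ASSUMED;

		if (end_pfn > max_pfn)
			max_pfn = end_pfn;
	}
	return max_pfn;
}

/* A PFN-indexed interface can only serve PFNs below max_pfn. */
static bool pfn_below_max(uint64_t pfn, uint64_t max_pfn)
{
	return pfn < max_pfn;
}
```

With max_pfn left at zero, pfn_below_max() is false for every PFN, which mirrors the symptom of no page being readable through such an interface.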

Cc: stable@vger.kernel.org  # 6.8+
Fixes: ff6c3d81 ("NUMA: optimize detection of memory with no node id assigned by firmware")
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

c8477bb0

LoongArch: Use polling play_dead() when resuming from hibernation · c9117434

Huacai Chen authored 1 month ago


When CONFIG_RANDOM_KMALLOC_CACHES or other randomization infrastructure
is enabled, the idle_task's stack may differ between the booting kernel
and the target kernel. So when resuming from hibernation, an
ACTION_BOOT_CPU IPI wakes up the idle instruction in arch_cpu_idle_dead()
and jumps to the interrupt handler. But since the stack pointer has
changed, the interrupt handler cannot restore the correct context.

So rename the current arch_cpu_idle_dead() to idle_play_dead(), make it
the default version of play_dead(), and have the new arch_cpu_idle_dead()
call play_dead() directly. For hibernation, implement an arch-specific
hibernate_resume_nonboot_cpu_disable() that uses the polling version of
play_dead(), i.e. poll_play_dead() (the idle instruction is replaced by
nop, and irq is disabled), to keep the IPI handler from corrupting the
idle_task's stack when resuming from hibernation.
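
The dispatch through play_dead() can be modelled as a function pointer. This is a simplified user-space sketch, not the actual patch: the function bodies are counters standing in for the real idle and polling loops, resume_nonboot_cpu_disable_sketch() is a hypothetical stand-in for hibernate_resume_nonboot_cpu_disable(), and restoring the default afterwards is an assumption of the sketch:

```c
#include <stddef.h>

typedef void (*play_dead_fn)(void);

static unsigned int idle_calls; /* times the idle-based variant ran */
static unsigned int poll_calls; /* times the polling variant ran */

/* Stand-in for the default variant, which waits in an idle instruction. */
static void idle_play_dead(void)
{
	idle_calls++;
}

/* Stand-in for the polling variant (nop loop with irq disabled). */
static void poll_play_dead(void)
{
	poll_calls++;
}

static play_dead_fn play_dead = idle_play_dead;

/* The new arch_cpu_idle_dead() just dispatches through the pointer. */
static void arch_cpu_idle_dead(void)
{
	if (play_dead != NULL)
		play_dead();
}

/* Select the polling variant only while secondary CPUs are taken down. */
static void resume_nonboot_cpu_disable_sketch(void)
{
	play_dead = poll_play_dead;
	arch_cpu_idle_dead();       /* stands in for offlining a CPU */
	play_dead = idle_play_dead; /* assumption: default restored after */
}
```

Keeping the choice behind a single pointer means that ordinary CPU offlining keeps its idle-based behaviour and only the hibernation resume path pays for polling.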

This solution is a little similar to commit 406f992e ("x86 /
hibernate: Use hlt_play_dead() when resuming from hibernation").

Cc: stable@vger.kernel.org
Tested-by: Erpeng Xu <xuerpeng@uniontech.com>
Tested-by: Yuli Wang <wangyuli@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

c9117434

LoongArch: Eliminate superfluous get_numa_distances_cnt() · a0d3c8bc

Yuli Wang authored 1 month ago

In LoongArch, get_numa_distances_cnt() isn't in use, resulting in a
compiler warning.

Fix the following error with clang-18 when W=1e:

arch/loongarch/kernel/acpi.c:259:28: error: unused function 'get_numa_distances_cnt' [-Werror,-Wunused-function]
  259 | static inline unsigned int get_numa_distances_cnt(struct acpi_table_slit *slit)
      |                            ^~~~~~~~~~~~~~~~~~~~~~
1 error generated.

Link: https://lore.kernel.org/all/Z7bHPVUH4lAezk0E@kernel.org/


Signed-off-by: Yuli Wang <wangyuli@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

a0d3c8bc

LoongArch: Convert unreachable() to BUG() · da64a235

Tiezhu Yang authored 1 month ago


When compiling on LoongArch, there exists the following objtool warning
in arch/loongarch/kernel/machine_kexec.o:

  kexec_reboot() falls through to next function crash_shutdown_secondary()

Avoid using unreachable() as it can (and will in the absence of UBSAN)
generate fall-through code. Use BUG() so we get a "break BRK_BUG" trap
(with unreachable annotation).
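
The difference can be sketched in user space. This is an analogy rather than the kernel change itself: BUG_SKETCH() is a hypothetical macro that uses abort() as a stand-in for the kernel's BUG() trap, and reboot_mode_name() is an invented example function:

```c
#include <stdlib.h>

/*
 * Hypothetical stand-in for BUG(): stop at a defined trap point. A bare
 * unreachable hint would instead let the compiler emit nothing here, so
 * execution could run on into whatever code follows the function.
 */
#define BUG_SKETCH() abort()

/* Invented example: map a mode number to a name, trapping on bad input. */
static const char *reboot_mode_name(int mode)
{
	switch (mode) {
	case 0:
		return "cold";
	case 1:
		return "warm";
	}
	BUG_SKETCH(); /* traps instead of falling off the end */
}
```

Valid modes return normally; any other value terminates at a well-defined point rather than continuing into adjacent code.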

Cc: stable@vger.kernel.org  # 6.12+
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

da64a235

Merge tag 's390-6.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 2a520073

Linus Torvalds authored 1 month ago

Pull s390 fixes from Vasily Gorbik:

 - Fix return address recovery of traced function in ftrace to ensure
   reliable stack unwinding

 - Fix compiler warnings and runtime crashes of vDSO selftests on s390
   by introducing a dedicated GNU hash bucket pointer with correct
   32-bit entry size

 - Fix test_monitor_call() inline asm, which misses CC clobber, by
   switching to an instruction that doesn't modify CC

* tag 's390-6.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/ftrace: Fix return address recovery of traced function
  selftests/vDSO: Fix GNU hash table entry size for s390x
  s390/traps: Fix test_monitor_call() inline assembly

2a520073