- Nov 03, 2022
-
-
Xiaobo Liu authored
[ Upstream commit d8bde3bf ] Then the input contains '\0' or '\n', proc_mpc_write has read them, so the return value needs +1. Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by:
Xiaobo Liu <cppcoffee@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
José Expósito authored
[ Upstream commit bb5f0c85 ] Under certain conditions the Magic Trackpad can group 2 reports in a single packet. The packet is split and the raw event function is invoked recursively for each part. However, after processing each part, the BTN_MOUSE status is updated, sending multiple click events. [1] Return after processing double reports to avoid this issue. Link: https://gitlab.freedesktop.org/libinput/libinput/-/issues/811 # [1] Fixes: a462230e ("HID: magicmouse: enable Magic Trackpad support") Reported-by:
Nulo <git@nulo.in> Signed-off-by:
José Expósito <jose.exposito89@gmail.com> Signed-off-by:
Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20221009182747.90730-1-jose.exposito89@gmail.com Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Tony Luck authored
[ Upstream commit f6ec01da ] If there is no user space consumer of extlog_mem trace records, then Linux properly handles multiple error records in an ELOG block extlog_print() print_extlog_rcd() __print_extlog_rcd() cper_estatus_print() apei_estatus_for_each_section() But the other code path hard codes looking for a single record to output a trace record. Fix by using the same apei_estatus_for_each_section() iterator to step over all records. Fixes: 2dfb7d51 ("trace, RAS: Add eMCA trace event interface") Signed-off-by:
Tony Luck <tony.luck@intel.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Filipe Manana authored
[ Upstream commit 4fc7b572 ] When processing delayed data references during backref walking and we are using a share context (we are being called through fiemap), whenever we find a delayed data reference for an inode different from the one we are interested in, then we immediately exit and consider the data extent as shared. This is wrong, because: 1) This might be a DROP reference that will cancel out a reference in the extent tree; 2) Even if it's an ADD reference, it may be followed by a DROP reference that cancels it out. In either case we should not exit immediately. Fix this by never exiting when we find a delayed data reference for another inode - instead add the reference and if it does not cancel out other delayed reference, we will exit early when we call extent_is_shared() after processing all delayed references. If we find a drop reference, then signal the code that processes references from the extent tree (add_inline_refs() and add_keyed_refs()) to not exit immediately if it finds there a reference for another inode, since we have delayed drop references that may cancel it out. In this later case we exit once we don't have references in the rb trees that cancel out each other and have two references for different inodes. Example reproducer for case 1): $ cat test-1.sh #!/bin/bash DEV=/dev/sdj MNT=/mnt/sdj mkfs.btrfs -f $DEV mount $DEV $MNT xfs_io -f -c "pwrite 0 64K" $MNT/foo cp --reflink=always $MNT/foo $MNT/bar echo echo "fiemap after cloning:" xfs_io -c "fiemap -v" $MNT/foo rm -f $MNT/bar echo echo "fiemap after removing file bar:" xfs_io -c "fiemap -v" $MNT/foo umount $MNT Running it before this patch, the extent is still listed as shared, it has the flag 0x2000 (FIEMAP_EXTENT_SHARED) set: $ ./test-1.sh fiemap after cloning: /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..127]: 26624..26751 128 0x2001 fiemap after removing file bar: /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..127]: 26624..26751 128 0x2001 Example reproducer for case 2): $ cat test-2.sh #!/bin/bash DEV=/dev/sdj MNT=/mnt/sdj mkfs.btrfs -f $DEV mount $DEV $MNT xfs_io -f -c "pwrite 0 64K" $MNT/foo cp --reflink=always $MNT/foo $MNT/bar # Flush delayed references to the extent tree and commit current # transaction. sync echo echo "fiemap after cloning:" xfs_io -c "fiemap -v" $MNT/foo rm -f $MNT/bar echo echo "fiemap after removing file bar:" xfs_io -c "fiemap -v" $MNT/foo umount $MNT Running it before this patch, the extent is still listed as shared, it has the flag 0x2000 (FIEMAP_EXTENT_SHARED) set: $ ./test-2.sh fiemap after cloning: /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..127]: 26624..26751 128 0x2001 fiemap after removing file bar: /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..127]: 26624..26751 128 0x2001 After this patch, after deleting bar in both tests, the extent is not reported with the 0x2000 flag anymore, it gets only the flag 0x1 (which is FIEMAP_EXTENT_LAST): $ ./test-1.sh fiemap after cloning: /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..127]: 26624..26751 128 0x2001 fiemap after removing file bar: /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..127]: 26624..26751 128 0x1 $ ./test-2.sh fiemap after cloning: /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..127]: 26624..26751 128 0x2001 fiemap after removing file bar: /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..127]: 26624..26751 128 0x1 These tests will later be converted to a test case for fstests. Fixes: dc046b10 ("Btrfs: make fiemap not blow when you have lots of snapshots") Signed-off-by:
Filipe Manana <fdmanana@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jean-Francois Le Fillatre authored
commit 1bd3a383 upstream. The Lenovo OneLink+ Dock contains an RTL8153 controller that behaves as a broken CDC device by default. Add the custom Lenovo PID to the r8152 driver to support it properly. Also, systems compatible with this dock provide a BIOS option to enable MAC address passthrough (as per Lenovo document "ThinkPad Docking Solutions 2017"). Add the custom PID to the MAC passthrough list too. Tested on a ThinkPad 13 1st gen with the expected results: passthrough disabled: Invalid header when reading pass-thru MAC addr passthrough enabled: Using pass-thru MAC addr XX:XX:XX:XX:XX:XX Signed-off-by:
Jean-Francois Le Fillatre <jflf_kernel@gmx.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
James Morse authored
commit 44b3834b upstream. Cortex-A57 and Cortex-A72 have an erratum where an interrupt that occurs between a pair of AES instructions in aarch32 mode may corrupt the ELR. The task will subsequently produce the wrong AES result. The AES instructions are part of the cryptographic extensions, which are optional. User-space software will detect the support for these instructions from the hwcaps. If the platform doesn't support these instructions a software implementation should be used. Remove the hwcap bits on affected parts to indicate user-space should not use the AES instructions. Acked-by:
Ard Biesheuvel <ardb@kernel.org> Signed-off-by:
James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20220714161523.279570-3-james.morse@arm.com Signed-off-by:
Will Deacon <will@kernel.org> [florian: resolved conflicts in arch/arm64/tools/cpucaps and cpu_errata.c] Signed-off-by:
Florian Fainelli <f.fainelli@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Ren authored
commit c000a260 upstream. With some PCIe topologies, restoring a guest fails while parsing the ITS device tables. Reproducer hints: 1. Create ARM virt VM with pxb-pcie bus which adds extra host bridges, with qemu command like: ``` -device pxb-pcie,bus_nr=8,id=pci.x,numa_node=0,bus=pcie.0 \ -device pcie-root-port,..,bus=pci.x \ ... -device pxb-pcie,bus_nr=37,id=pci.y,numa_node=1,bus=pcie.0 \ -device pcie-root-port,..,bus=pci.y \ ... ``` 2. Ensure the guest uses 2-level device table 3. Perform VM migration which calls save/restore device tables In that setup, we get a big "offset" between 2 device_ids, which makes unsigned "len" round up a big positive number, causing the scan loop to continue with a bad GPA. For example: 1. L1 table has 2 entries; 2. and we are now scanning at L2 table entry index 2075 (pointed to by L1 first entry) 3. if next device id is 9472, we will get a big offset: 7397; 4. with unsigned 'len', 'len -= offset * esz', len will underflow to a positive number, mistakenly into next iteration with a bad GPA; (It should break out of the current L2 table scanning, and jump into the next L1 table entry) 5. that bad GPA fails the guest read. Fix it by stopping the L2 table scan when the next device id is outside of the current table, allowing the scan to continue from the next L1 table entry. Thanks to Eric Auger for the fix suggestion. Fixes: 920a7a8f ("KVM: arm64: vgic-its: Add infrastructure for tableookup") Suggested-by:
Eric Auger <eric.auger@redhat.com> Signed-off-by:
Eric Ren <renzhengeek@gmail.com> [maz: commit message tidy-up] Signed-off-by:
Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/d9c3a564af9e2c5bf63f48a7dcbf08cd593c5c0b.1665802985.git.renzhengeek@gmail.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kai-Heng Feng authored
commit 1e41e693 upstream. UBSAN complains about array-index-out-of-bounds: [ 1.980703] kernel: UBSAN: array-index-out-of-bounds in /build/linux-9H675w/linux-5.15.0/drivers/ata/libahci.c:968:41 [ 1.980709] kernel: index 15 is out of range for type 'ahci_em_priv [8]' [ 1.980713] kernel: CPU: 0 PID: 209 Comm: scsi_eh_8 Not tainted 5.15.0-25-generic #25-Ubuntu [ 1.980716] kernel: Hardware name: System manufacturer System Product Name/P5Q3, BIOS 1102 06/11/2010 [ 1.980718] kernel: Call Trace: [ 1.980721] kernel: <TASK> [ 1.980723] kernel: show_stack+0x52/0x58 [ 1.980729] kernel: dump_stack_lvl+0x4a/0x5f [ 1.980734] kernel: dump_stack+0x10/0x12 [ 1.980736] kernel: ubsan_epilogue+0x9/0x45 [ 1.980739] kernel: __ubsan_handle_out_of_bounds.cold+0x44/0x49 [ 1.980742] kernel: ahci_qc_issue+0x166/0x170 [libahci] [ 1.980748] kernel: ata_qc_issue+0x135/0x240 [ 1.980752] kernel: ata_exec_internal_sg+0x2c4/0x580 [ 1.980754] kernel: ? vprintk_default+0x1d/0x20 [ 1.980759] kernel: ata_exec_internal+0x67/0xa0 [ 1.980762] kernel: sata_pmp_read+0x8d/0xc0 [ 1.980765] kernel: sata_pmp_read_gscr+0x3c/0x90 [ 1.980768] kernel: sata_pmp_attach+0x8b/0x310 [ 1.980771] kernel: ata_eh_revalidate_and_attach+0x28c/0x4b0 [ 1.980775] kernel: ata_eh_recover+0x6b6/0xb30 [ 1.980778] kernel: ? ahci_do_hardreset+0x180/0x180 [libahci] [ 1.980783] kernel: ? ahci_stop_engine+0xb0/0xb0 [libahci] [ 1.980787] kernel: ? ahci_do_softreset+0x290/0x290 [libahci] [ 1.980792] kernel: ? trace_event_raw_event_ata_eh_link_autopsy_qc+0xe0/0xe0 [ 1.980795] kernel: sata_pmp_eh_recover.isra.0+0x214/0x560 [ 1.980799] kernel: sata_pmp_error_handler+0x23/0x40 [ 1.980802] kernel: ahci_error_handler+0x43/0x80 [libahci] [ 1.980806] kernel: ata_scsi_port_error_handler+0x2b1/0x600 [ 1.980810] kernel: ata_scsi_error+0x9c/0xd0 [ 1.980813] kernel: scsi_error_handler+0xa1/0x180 [ 1.980817] kernel: ? scsi_unjam_host+0x1c0/0x1c0 [ 1.980820] kernel: kthread+0x12a/0x150 [ 1.980823] kernel: ? set_kthread_struct+0x50/0x50 [ 1.980826] kernel: ret_from_fork+0x22/0x30 [ 1.980831] kernel: </TASK> This happens because sata_pmp_init_links() initialize link->pmp up to SATA_PMP_MAX_PORTS while em_priv is declared as 8 elements array. I can't find the maximum Enclosure Management ports specified in AHCI spec v1.3.1, but "12.2.1 LED message type" states that "Port Multiplier Information" can utilize 4 bits, which implies it can support up to 16 ports. Hence, use SATA_PMP_MAX_PORTS as EM_MAX_SLOTS to resolve the issue. BugLink: https://bugs.launchpad.net/bugs/1970074 Cc: stable@vger.kernel.org Signed-off-by:
Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by:
Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alexander Stein authored
commit 979556f1 upstream. 'ahci:' is an invalid prefix, preventing the module from autoloading. Fix this by using the 'platform:' prefix and DRV_NAME. Fixes: 9e54eae2 ("ahci_imx: add ahci sata support on imx platforms") Cc: stable@vger.kernel.org Signed-off-by:
Alexander Stein <alexander.stein@ew.tq-group.com> Reviewed-by:
Fabio Estevam <festevam@gmail.com> Signed-off-by:
Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Borislav Petkov authored
commit e7ad18d1 upstream. Currently, the patch application logic checks whether the revision needs to be applied on each logical CPU (SMT thread). Therefore, on SMT designs where the microcode engine is shared between the two threads, the application happens only on one of them as that is enough to update the shared microcode engine. However, there are microcode patches which do per-thread modification, see Link tag below. Therefore, drop the revision check and try applying on each thread. This is what the BIOS does too so this method is very much tested. Btw, change only the early paths. On the late loading paths, there's no point in doing per-thread modification because if is it some case like in the bugzilla below - removing a CPUID flag - the kernel cannot go and un-use features it has detected are there early. For that, one should use early loading anyway. [ bp: Fixes does not contain the oldest commit which did check for equality but that is good enough. ] Fixes: 8801b3fc ("x86/microcode/AMD: Rework container parsing") Reported-by:
Ștefan Talpalaru <stefantalpalaru@yahoo.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Tested-by:
Ștefan Talpalaru <stefantalpalaru@yahoo.com> Cc: <stable@vger.kernel.org> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216211 Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Joseph Qi authored
commit 759a7c61 upstream. Commit b1529a41 "ocfs2: should reclaim the inode if '__ocfs2_mknod_locked' returns an error" tried to reclaim the claimed inode if __ocfs2_mknod_locked() fails later. But this introduce a race, the freed bit may be reused immediately by another thread, which will update dinode, e.g. i_generation. Then iput this inode will lead to BUG: inode->i_generation != le32_to_cpu(fe->i_generation) We could make this inode as bad, but we did want to do operations like wipe in some cases. Since the claimed inode bit can only affect that an dinode is missing and will return back after fsck, it seems not a big problem. So just leave it as is by revert the reclaim logic. Link: https://lkml.kernel.org/r/20221017130227.234480-1-joseph.qi@linux.alibaba.com Fixes: b1529a41 ("ocfs2: should reclaim the inode if '__ocfs2_mknod_locked' returns an error") Signed-off-by:
Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by:
Yan Wang <wangyan122@huawei.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Joseph Qi authored
commit 28f4821b upstream. In ocfs2_mknod(), if error occurs after dinode successfully allocated, ocfs2 i_links_count will not be 0. So even though we clear inode i_nlink before iput in error handling, it still won't wipe inode since we'll refresh inode from dinode during inode lock. So just like clear inode i_nlink, we clear ocfs2 i_links_count as well. Also do the same change for ocfs2_symlink(). Link: https://lkml.kernel.org/r/20221017130227.234480-2-joseph.qi@linux.alibaba.com Signed-off-by:
Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by:
Yan Wang <wangyan122@huawei.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Nov 01, 2022
-
-
Greg Kroah-Hartman authored
Link: https://lore.kernel.org/r/20221031070140.108124105@linuxfoundation.org Tested-by:
Jon Hunter <jonathanh@nvidia.com> Tested-by:
Linux Kernel Functional Testing <lkft@linaro.org> Tested-by:
Guenter Roeck <linux@roeck-us.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daniel Sneddon authored
commit 2b129932 upstream. tl;dr: The Enhanced IBRS mitigation for Spectre v2 does not work as documented for RET instructions after VM exits. Mitigate it with a new one-entry RSB stuffing mechanism and a new LFENCE. == Background == Indirect Branch Restricted Speculation (IBRS) was designed to help mitigate Branch Target Injection and Speculative Store Bypass, i.e. Spectre, attacks. IBRS prevents software run in less privileged modes from affecting branch prediction in more privileged modes. IBRS requires the MSR to be written on every privilege level change. To overcome some of the performance issues of IBRS, Enhanced IBRS was introduced. eIBRS is an "always on" IBRS, in other words, just turn it on once instead of writing the MSR on every privilege level change. When eIBRS is enabled, more privileged modes should be protected from less privileged modes, including protecting VMMs from guests. == Problem == Here's a simplification of how guests are run on Linux' KVM: void run_kvm_guest(void) { // Prepare to run guest VMRESUME(); // Clean up after guest runs } The execution flow for that would look something like this to the processor: 1. Host-side: call run_kvm_guest() 2. Host-side: VMRESUME 3. Guest runs, does "CALL guest_function" 4. VM exit, host runs again 5. Host might make some "cleanup" function calls 6. Host-side: RET from run_kvm_guest() Now, when back on the host, there are a couple of possible scenarios of post-guest activity the host needs to do before executing host code: * on pre-eIBRS hardware (legacy IBRS, or nothing at all), the RSB is not touched and Linux has to do a 32-entry stuffing. * on eIBRS hardware, VM exit with IBRS enabled, or restoring the host IBRS=1 shortly after VM exit, has a documented side effect of flushing the RSB except in this PBRSB situation where the software needs to stuff the last RSB entry "by hand". IOW, with eIBRS supported, host RET instructions should no longer be influenced by guest behavior after the host retires a single CALL instruction. However, if the RET instructions are "unbalanced" with CALLs after a VM exit as is the RET in #6, it might speculatively use the address for the instruction after the CALL in #3 as an RSB prediction. This is a problem since the (untrusted) guest controls this address. Balanced CALL/RET instruction pairs such as in step #5 are not affected. == Solution == The PBRSB issue affects a wide variety of Intel processors which support eIBRS. But not all of them need mitigation. Today, X86_FEATURE_RSB_VMEXIT triggers an RSB filling sequence that mitigates PBRSB. Systems setting RSB_VMEXIT need no further mitigation - i.e., eIBRS systems which enable legacy IBRS explicitly. However, such systems (X86_FEATURE_IBRS_ENHANCED) do not set RSB_VMEXIT and most of them need a new mitigation. Therefore, introduce a new feature flag X86_FEATURE_RSB_VMEXIT_LITE which triggers a lighter-weight PBRSB mitigation versus RSB_VMEXIT. The lighter-weight mitigation performs a CALL instruction which is immediately followed by a speculative execution barrier (INT3). This steers speculative execution to the barrier -- just like a retpoline -- which ensures that speculation can never reach an unbalanced RET. Then, ensure this CALL is retired before continuing execution with an LFENCE. In other words, the window of exposure is opened at VM exit where RET behavior is troublesome. While the window is open, force RSB predictions sampling for RET targets to a dead end at the INT3. Close the window with the LFENCE. There is a subset of eIBRS systems which are not vulnerable to PBRSB. Add these systems to the cpu_vuln_whitelist[] as NO_EIBRS_PBRSB. Future systems that aren't vulnerable will set ARCH_CAP_PBRSB_NO. [ bp: Massage, incorporate review comments from Andy Cooper. ] Signed-off-by:
Daniel Sneddon <daniel.sneddon@linux.intel.com> Co-developed-by:
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by:
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by:
Borislav Petkov <bp@suse.de> [ bp: Adjust patch to account for kvm entry being in c ] Signed-off-by:
Suraj Jitindar Singh <surajjs@amazon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pawan Gupta authored
commit eb23b5ef upstream. IBRS mitigation for spectre_v2 forces write to MSR_IA32_SPEC_CTRL at every kernel entry/exit. On Enhanced IBRS parts setting MSR_IA32_SPEC_CTRL[IBRS] only once at boot is sufficient. MSR writes at every kernel entry/exit incur unnecessary performance loss. When Enhanced IBRS feature is present, print a warning about this unnecessary performance loss. Signed-off-by:
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/2a5eaf54583c2bfe0edc4fea64006656256cca17.1657814857.git.pawan.kumar.gupta@linux.intel.com Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nathan Chancellor authored
commit db886979 upstream. Clang warns: arch/x86/kernel/cpu/bugs.c:58:21: error: section attribute is specified on redeclared variable [-Werror,-Wsection] DEFINE_PER_CPU(u64, x86_spec_ctrl_current); ^ arch/x86/include/asm/nospec-branch.h:283:12: note: previous declaration is here extern u64 x86_spec_ctrl_current; ^ 1 error generated. The declaration should be using DECLARE_PER_CPU instead so all attributes stay in sync. Cc: stable@vger.kernel.org Fixes: fc02735b ("KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS") Reported-by:
kernel test robot <lkp@intel.com> Signed-off-by:
Nathan Chancellor <nathan@kernel.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pawan Gupta authored
commit 4ad3278d upstream. Some Intel processors may use alternate predictors for RETs on RSB-underflow. This condition may be vulnerable to Branch History Injection (BHI) and intramode-BTI. Kernel earlier added spectre_v2 mitigation modes (eIBRS+Retpolines, eIBRS+LFENCE, Retpolines) which protect indirect CALLs and JMPs against such attacks. However, on RSB-underflow, RET target prediction may fallback to alternate predictors. As a result, RET's predicted target may get influenced by branch history. A new MSR_IA32_SPEC_CTRL bit (RRSBA_DIS_S) controls this fallback behavior when in kernel mode. When set, RETs will not take predictions from alternate predictors, hence mitigating RETs as well. Support for this is enumerated by CPUID.7.2.EDX[RRSBA_CTRL] (bit2). For spectre v2 mitigation, when a user selects a mitigation that protects indirect CALLs and JMPs against BHI and intramode-BTI, set RRSBA_DIS_S also to protect RETs for RSB-underflow case. Signed-off-by:
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by:
Borislav Petkov <bp@suse.de> [bwh: Backported to 5.15: adjust context in scattered.c] Signed-off-by:
Ben Hutchings <ben@decadent.org.uk> [sam: Fixed for missing X86_FEATURE_ENTRY_IBPB context] Signed-off-by:
Samuel Mendoza-Jonas <samjonas@amazon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pawan Gupta authored
commit f54d4537 upstream. Cannon lake is also affected by RETBleed, add it to the list. Fixes: 6ad0ad2b ("x86/bugs: Report Intel retbleed vulnerability") Signed-off-by:
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> [ bp: Adjust cpu model name CANNONLAKE_L -> CANNONLAKE_MOBILE ] Signed-off-by:
Suraj Jitindar Singh <surajjs@amazon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andrew Cooper authored
commit 26aae8cc upstream. BTC_NO indicates that hardware is not susceptible to Branch Type Confusion. Zen3 CPUs don't suffer BTC. Hypervisors are expected to synthesise BTC_NO when it is appropriate given the migration pool, to prevent kernels using heuristics. Signed-off-by:
Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> [ bp: Adjust context ] Signed-off-by:
Suraj Jitindar Singh <surajjs@amazon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit 7a05bc95 upstream. The whole MMIO/RETBLEED enumeration went overboard on steppings. Get rid of all that and simply use ANY. If a future stepping of these models would not be affected, it had better set the relevant ARCH_CAP_$FOO_NO bit in IA32_ARCH_CAPABILITIES. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Acked-by:
Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Josh Poimboeuf authored
commit 9756bba2 upstream. Prevent RSB underflow/poisoning attacks with RSB. While at it, add a bunch of comments to attempt to document the current state of tribal knowledge about RSB attacks and what exactly is being mitigated. Signed-off-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> [ bp: Adjust for the fact that vmexit is in inline assembly ] Signed-off-by:
Suraj Jitindar Singh <surajjs@amazon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Josh Poimboeuf authored
commit bea7e31a upstream. For legacy IBRS to work, the IBRS bit needs to be always re-written after vmexit, even if it's already on. Signed-off-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Josh Poimboeuf authored
commit fc02735b upstream. On eIBRS systems, the returns in the vmexit return path from __vmx_vcpu_run() to vmx_vcpu_run() are exposed to RSB poisoning attacks. Fix that by moving the post-vmexit spec_ctrl handling to immediately after the vmexit. Signed-off-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> [ bp: Adjust for the fact that vmexit is in inline assembly ] Signed-off-by:
Suraj Jitindar Singh <surajjs@amazon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Josh Poimboeuf authored
commit acac5e98 upstream. This mask has been made redundant by kvm_spec_ctrl_test_value(). And it doesn't even work when MSR interception is disabled, as the guest can just write to SPEC_CTRL directly. Signed-off-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Josh Poimboeuf authored
commit bbb69e8b upstream. There's no need to recalculate the host value for every entry/exit. Just use the cached value in spec_ctrl_current(). Signed-off-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Josh Poimboeuf authored
commit 56aa4d22 upstream. If the SMT state changes, SSBD might get accidentally disabled. Fix that. Signed-off-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Josh Poimboeuf authored
commit e6aa1362 upstream. The firmware entry code may accidentally clear STIBP or SSBD. Fix that. Signed-off-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Josh Poimboeuf authored
commit b2620fac upstream. If a kernel is built with CONFIG_RETPOLINE=n, but the user still wants to mitigate Spectre v2 using IBRS or eIBRS, the RSB filling will be silently disabled. There's nothing retpoline-specific about RSB buffer filling. Remove the CONFIG_RETPOLINE guards around it. Signed-off-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pawan Gupta authored
commit ba6e31af upstream. RSB fill sequence does not have any protection for miss-prediction of conditional branch at the end of the sequence. CPU can speculatively execute code immediately after the sequence, while RSB filling hasn't completed yet. #define __FILL_RETURN_BUFFER(reg, nr, sp) \ mov $(nr/2), reg; \ 771: \ ANNOTATE_INTRA_FUNCTION_CALL; \ call 772f; \ 773: /* speculation trap */ \ UNWIND_HINT_EMPTY; \ pause; \ lfence; \ jmp 773b; \ 772: \ ANNOTATE_INTRA_FUNCTION_CALL; \ call 774f; \ 775: /* speculation trap */ \ UNWIND_HINT_EMPTY; \ pause; \ lfence; \ jmp 775b; \ 774: \ add $(BITS_PER_LONG/8) * 2, sp; \ dec reg; \ jnz 771b; <----- CPU can miss-predict here. Before RSB is filled, RETs that come in program order after this macro can be executed speculatively, making them vulnerable to RSB-based attacks. Mitigate it by adding an LFENCE after the conditional branch to prevent speculation while RSB is being filled. Suggested-by:
Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by:
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit 089dd8e5 upstream. Change FILL_RETURN_BUFFER so that objtool groks it and can generate correct ORC unwind information. - Since ORC is alternative invariant; that is, all alternatives should have the same ORC entries, the __FILL_RETURN_BUFFER body can not be part of an alternative. Therefore, move it out of the alternative and keep the alternative as a sort of jump_label around it. - Use the ANNOTATE_INTRA_FUNCTION_CALL annotation to white-list these 'funny' call instructions to nowhere. - Use UNWIND_HINT_EMPTY to 'fill' the speculation traps, otherwise objtool will consider them unreachable. - Move the RSP adjustment into the loop, such that the loop has a deterministic stack layout. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Alexandre Chartre <alexandre.chartre@oracle.com> Acked-by:
Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200428191700.032079304@infradead.org [ bp: no intra-function call validation support ] Signed-off-by:
Suraj Jitindar Singh <surajjs@amazon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit bf5835bc upstream. Having IBRS enabled while the SMT sibling is idle unnecessarily slows down the running sibling. OTOH, disabling IBRS around idle takes two MSR writes, which will increase the idle latency. Therefore, only disable IBRS around deeper idle states. Shallow idle states are bounded by the tick in duration, since NOHZ is not allowed for them by virtue of their short target residency. Only do this for mwait-driven idle, since that keeps interrupts disabled across idle, which makes disabling IBRS vs IRQ-entry a non-issue. Note: C6 is a random threshold, most importantly C1 probably shouldn't disable IBRS, benchmarking needed. Suggested-by:
Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Borislav Petkov <bp@suse.de> [cascardo: no CPUIDLE_FLAG_IRQ_ENABLE] Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> [ bp: Adjust context ] Signed-off-by:
Suraj Jitindar Singh <surajjs@amazon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit 6ad0ad2b upstream. Skylake suffers from RSB underflow speculation issues; report this vulnerability and it's mitigation (spectre_v2=ibrs). [jpoimboe: cleanups, eibrs] Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> [ bp: Adjust to different processor family names ] Signed-off-by:
Suraj Jitindar Singh <surajjs@amazon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit 166115c0 upstream. retbleed will depend on spectre_v2, while spectre_v2_user depends on retbleed. Break this cycle. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pawan Gupta authored
commit 7c693f54 upstream. Extend spectre_v2= boot option with Kernel IBRS. [jpoimboe: no STIBP with IBRS] Signed-off-by:
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit c779bc1a upstream. When changing SPEC_CTRL for user control, the WRMSR can be delayed until return-to-user when KERNEL_IBRS has been enabled. This avoids an MSR write during context switch. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thadeu Lima de Souza Cascardo authored
commit 2dbb887e upstream. Implement Kernel IBRS - currently the only known option to mitigate RSB underflow speculation issues on Skylake hardware. Note: since IBRS_ENTER requires fuller context established than UNTRAIN_RET, it must be placed after it. However, since UNTRAIN_RET itself implies a RET, it must come after IBRS_ENTER. This means IBRS_ENTER needs to also move UNTRAIN_RET. Note 2: KERNEL_IBRS is sub-optimal for XenPV. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Borislav Petkov <bp@suse.de> [cascardo: conflict at arch/x86/entry/entry_64_compat.S] [cascardo: conflict fixups, no ANNOTATE_NOENDBR] Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> [ bp: Adjust context ] Signed-off-by:
Suraj Jitindar Singh <surajjs@amazon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit caa0ff24 upstream. Due to TIF_SSBD and TIF_SPEC_IB the actual IA32_SPEC_CTRL value can differ from x86_spec_ctrl_base. As such, keep a per-CPU value reflecting the current task's MSR content. [jpoimboe: rename] Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alexandre Chartre authored
commit 7fbf47c7 upstream. Add the "retbleed=<value>" boot parameter to select a mitigation for RETBleed. Possible values are "off", "auto" and "unret" (JMP2RET mitigation). The default value is "auto". Currently, "retbleed=auto" will select the unret mitigation on AMD and Hygon and no mitigation on Intel (JMP2RET is not effective on Intel). [peterz: rebase; add hygon] [jpoimboe: cleanups] Signed-off-by:
Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> [ bp: Adjust context - no jmp2ret mitigation exists ] Signed-off-by:
Suraj Jitindar Singh <surajjs@amazon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alexandre Chartre authored
commit 6b80b59b upstream. Report that AMD x86 CPUs are vulnerable to the RETBleed (Arbitrary Speculative Code Execution with Return Instructions) attack. [peterz: add hygon] [kim: invert parity; fam15h] Co-developed-by:
Kim Phillips <kim.phillips@amd.com> Signed-off-by:
Kim Phillips <kim.phillips@amd.com> Signed-off-by:
Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> [ bp: Adjust context ] Signed-off-by:
Suraj Jitindar Singh <surajjs@amazon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit a883d624 upstream. In order to extend the RETPOLINE features to 4, move them to word 11 where there is still room. This mostly keeps DISABLE_RETPOLINE simple. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-