- Dec 14, 2024
-
-
Christophe JAILLET authored
[ Upstream commit ad980b04 ] The type of the last parameter given to devm_add_action_or_reset() is "struct caam_drv_private *", but in caam_qi_shutdown(), it is casted to "struct device *". Pass the correct parameter to devm_add_action_or_reset() so that the resources are released as expected. Fixes: f414de2e ("crypto: caam - use devres to de-initialize QI") Signed-off-by:
Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Thadeu Lima de Souza Cascardo authored
[ Upstream commit 1c82587c ] Devices block sizes may change. One of these cases is a loop device by using ioctl LOOP_SET_BLOCK_SIZE. While this may cause other issues like IO being rejected, in the case of hfsplus, it will allocate a block by using that size and potentially write out-of-bounds when hfsplus_read_wrapper calls hfsplus_submit_bio and the latter function reads a different io_size. Using a new min_io_size initally set to sb_min_blocksize works for the purposes of the original fix, since it will be set to the max between HFSPLUS_SECTOR_SIZE and the first seen logical block size. We still use the max between HFSPLUS_SECTOR_SIZE and min_io_size in case the latter is not initialized. Tested by mounting an hfsplus filesystem with loop block sizes 512, 1024 and 4096. The produced KASAN report before the fix looks like this: [ 419.944641] ================================================================== [ 419.945655] BUG: KASAN: slab-use-after-free in hfsplus_read_wrapper+0x659/0xa0a [ 419.946703] Read of size 2 at addr ffff88800721fc00 by task repro/10678 [ 419.947612] [ 419.947846] CPU: 0 UID: 0 PID: 10678 Comm: repro Not tainted 6.12.0-rc5-00008-gdf56e0f2f3ca #84 [ 419.949007] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 [ 419.950035] Call Trace: [ 419.950384] <TASK> [ 419.950676] dump_stack_lvl+0x57/0x78 [ 419.951212] ? hfsplus_read_wrapper+0x659/0xa0a [ 419.951830] print_report+0x14c/0x49e [ 419.952361] ? __virt_addr_valid+0x267/0x278 [ 419.952979] ? kmem_cache_debug_flags+0xc/0x1d [ 419.953561] ? hfsplus_read_wrapper+0x659/0xa0a [ 419.954231] kasan_report+0x89/0xb0 [ 419.954748] ? hfsplus_read_wrapper+0x659/0xa0a [ 419.955367] hfsplus_read_wrapper+0x659/0xa0a [ 419.955948] ? __pfx_hfsplus_read_wrapper+0x10/0x10 [ 419.956618] ? do_raw_spin_unlock+0x59/0x1a9 [ 419.957214] ? _raw_spin_unlock+0x1a/0x2e [ 419.957772] hfsplus_fill_super+0x348/0x1590 [ 419.958355] ? hlock_class+0x4c/0x109 [ 419.958867] ? __pfx_hfsplus_fill_super+0x10/0x10 [ 419.959499] ? __pfx_string+0x10/0x10 [ 419.960006] ? lock_acquire+0x3e2/0x454 [ 419.960532] ? bdev_name.constprop.0+0xce/0x243 [ 419.961129] ? __pfx_bdev_name.constprop.0+0x10/0x10 [ 419.961799] ? pointer+0x3f0/0x62f [ 419.962277] ? __pfx_pointer+0x10/0x10 [ 419.962761] ? vsnprintf+0x6c4/0xfba [ 419.963178] ? __pfx_vsnprintf+0x10/0x10 [ 419.963621] ? setup_bdev_super+0x376/0x3b3 [ 419.964029] ? snprintf+0x9d/0xd2 [ 419.964344] ? __pfx_snprintf+0x10/0x10 [ 419.964675] ? lock_acquired+0x45c/0x5e9 [ 419.965016] ? set_blocksize+0x139/0x1c1 [ 419.965381] ? sb_set_blocksize+0x6d/0xae [ 419.965742] ? __pfx_hfsplus_fill_super+0x10/0x10 [ 419.966179] mount_bdev+0x12f/0x1bf [ 419.966512] ? __pfx_mount_bdev+0x10/0x10 [ 419.966886] ? vfs_parse_fs_string+0xce/0x111 [ 419.967293] ? __pfx_vfs_parse_fs_string+0x10/0x10 [ 419.967702] ? __pfx_hfsplus_mount+0x10/0x10 [ 419.968073] legacy_get_tree+0x104/0x178 [ 419.968414] vfs_get_tree+0x86/0x296 [ 419.968751] path_mount+0xba3/0xd0b [ 419.969157] ? __pfx_path_mount+0x10/0x10 [ 419.969594] ? kmem_cache_free+0x1e2/0x260 [ 419.970311] do_mount+0x99/0xe0 [ 419.970630] ? __pfx_do_mount+0x10/0x10 [ 419.971008] __do_sys_mount+0x199/0x1c9 [ 419.971397] do_syscall_64+0xd0/0x135 [ 419.971761] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 419.972233] RIP: 0033:0x7c3cb812972e [ 419.972564] Code: 48 8b 0d f5 46 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c2 46 0d 00 f7 d8 64 89 01 48 [ 419.974371] RSP: 002b:00007ffe30632548 EFLAGS: 00000286 ORIG_RAX: 00000000000000a5 [ 419.975048] RAX: ffffffffffffffda RBX: 00007ffe306328d8 RCX: 00007c3cb812972e [ 419.975701] RDX: 0000000020000000 RSI: 0000000020000c80 RDI: 00007ffe306325d0 [ 419.976363] RBP: 00007ffe30632720 R08: 00007ffe30632610 R09: 0000000000000000 [ 419.977034] R10: 0000000000200008 R11: 0000000000000286 R12: 0000000000000000 [ 419.977713] R13: 00007ffe306328e8 R14: 00005a0eb298bc68 R15: 00007c3cb8356000 [ 419.978375] </TASK> [ 419.978589] Fixes: 6596528e ("hfsplus: ensure bio requests are not smaller than the hardware sectors") Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Link: https://lore.kernel.org/r/20241107114109.839253-1-cascardo@igalia.com Signed-off-by:
Christian Brauner <brauner@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Masahiro Yamada authored
[ Upstream commit 0708967e ] Building the kernel with ARCH=s390 creates a weird arch/arch/ directory. $ find arch/arch arch/arch arch/arch/s390 arch/arch/s390/include arch/arch/s390/include/generated arch/arch/s390/include/generated/asm arch/arch/s390/include/generated/uapi arch/arch/s390/include/generated/uapi/asm The root cause is 'targets' in arch/s390/kernel/syscalls/Makefile, where the relative path is incorrect. Strictly speaking, 'targets' was not necessary in the first place because this Makefile uses 'filechk' instead of 'if_changed'. However, this commit keeps it, as it will be useful when converting 'filechk' to 'if_changed' later. Fixes: 5c75824d ("s390/syscalls: add Makefile to generate system call header files") Signed-off-by:
Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20241111134603.2063226-1-masahiroy@kernel.org Signed-off-by:
Heiko Carstens <hca@linux.ibm.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Aleksandr Mishin authored
[ Upstream commit 1a9de2f6 ] In case of error in gtdt_parse_timer_block() invalid 'gtdt_frame' will be used in 'do {} while (i-- >= 0 && gtdt_frame--);' statement block because do{} block will be executed even if 'i == 0'. Adjust error handling procedure by replacing 'i-- >= 0' with 'i-- > 0'. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: a712c3ed ("acpi/arm64: Add memory-mapped timer support in GTDT driver") Signed-off-by:
Aleksandr Mishin <amishin@t-argos.ru> Acked-by:
Hanjun Guo <guohanjun@huawei.com> Acked-by:
Sudeep Holla <sudeep.holla@arm.com> Acked-by:
Aleksandr Mishin <amishin@t-argos.ru> Link: https://lore.kernel.org/r/20240827101239.22020-1-amishin@t-argos.ru Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Masahiro Yamada authored
[ Upstream commit 340fd66c ] Commit be288182 ("arm64/build: Assert for unwanted sections") introduced an assertion to ensure that the .data.rel.ro section does not exist. However, this check does not work when CONFIG_LTO_CLANG is enabled, because .data.rel.ro matches the .data.[0-9a-zA-Z_]* pattern in the DATA_MAIN macro. Move the ASSERT() above the RW_DATA() line. Fixes: be288182 ("arm64/build: Assert for unwanted sections") Signed-off-by:
Masahiro Yamada <masahiroy@kernel.org> Acked-by:
Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20241106161843.189927-1-masahiroy@kernel.org Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Daniel Palmer authored
[ Upstream commit 077b33b9 ] Commit a38eaa07 ("m68k/mvme147: config.c - Remove unused functions"), removed the console functionality for the mvme147 instead of wiring it up to an early console. Put the console write function back and wire it up like mvme16x does so it's possible to see Linux boot on this fine hardware once more. Fixes: a38eaa07 ("m68k/mvme147: config.c - Remove unused functions") Signed-off-by:
Daniel Palmer <daniel@0x0f.com> Co-developed-by:
Finn Thain <fthain@linux-m68k.org> Signed-off-by:
Finn Thain <fthain@linux-m68k.org> Reviewed-by:
Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/a82e8f0068a8722996a0ccfe666abb5e0a5c120d.1730850684.git.fthain@linux-m68k.org Signed-off-by:
Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Geert Uytterhoeven authored
[ Upstream commit dcec33c1 ] When building with W=1: arch/m68k/mvme16x/config.c:208:6: warning: no previous prototype for ‘mvme16x_cons_write’ [-Wmissing-prototypes] 208 | void mvme16x_cons_write(struct console *co, const char *str, unsigned count) | ^~~~~~~~~~~~~~~~~~ Fix this by introducing a new header file "mvme16x.h" for holding the prototypes of functions implemented in arch/m68k/mvme16x/. Signed-off-by:
Geert Uytterhoeven <geert@linux-m68k.org> Acked-by:
Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/6200cc3b26fad215c4524748af04692e38c5ecd2.1694613528.git.geert@linux-m68k.org Stable-dep-of: 077b33b9 ("m68k: mvme147: Reinstate early console") Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Daniel Palmer authored
[ Upstream commit 47bc8744 ] Sometime long ago the m68k IRQ code was refactored and the interrupt numbers for SCSI controller on this board ended up wrong, and it hasn't worked since. The PCC adds 0x40 to the vector for its interrupts so they end up in the user interrupt range. Hence, the kernel number should be the kernel offset for user interrupt range + the PCC interrupt number. Fixes: 200a3d35 ("[PATCH] m68k: convert VME irq code") Signed-off-by:
Daniel Palmer <daniel@0x0f.com> Reviewed-by:
Finn Thain <fthain@linux-m68k.org> Reviewed-by:
Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/0e7636a21a0274eea35bfd5d874459d5078e97cc.1727926187.git.fthain@linux-m68k.org Signed-off-by:
Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Christoph Hellwig authored
[ Upstream commit 3c2fb1ca ] The HMB descriptor table is sized to the maximum number of descriptors that could be used for a given device, but __nvme_alloc_host_mem could break out of the loop earlier on memory allocation failure and end up using less descriptors than planned for, which leads to an incorrect size passed to dma_free_coherent. In practice this was not showing up because the number of descriptors tends to be low and the dma coherent allocator always allocates and frees at least a page. Fixes: 87ad72a5 ("nvme-pci: implement host memory buffer support") Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Keith Busch <kbusch@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
David Disseldorp authored
[ Upstream commit e017671f ] The initramfs filename field is defined in Documentation/driver-api/early-userspace/buffer-format.rst as: 37 cpio_file := ALGN(4) + cpio_header + filename + "\0" + ALGN(4) + data ... 55 ============= ================== ========================= 56 Field name Field size Meaning 57 ============= ================== ========================= ... 70 c_namesize 8 bytes Length of filename, including final \0 When extracting an initramfs cpio archive, the kernel's do_name() path handler assumes a zero-terminated path at @collected, passing it directly to filp_open() / init_mkdir() / init_mknod(). If a specially crafted cpio entry carries a non-zero-terminated filename and is followed by uninitialized memory, then a file may be created with trailing characters that represent the uninitialized memory. The ability to create an initramfs entry would imply already having full control of the system, so the buffer overrun shouldn't be considered a security vulnerability. Append the output of the following bash script to an existing initramfs and observe any created /initramfs_test_fname_overrunAA* path. E.g. ./reproducer.sh | gzip >> /myinitramfs It's easiest to observe non-zero uninitialized memory when the output is gzipped, as it'll overflow the heap allocated @out_buf in __gunzip(), rather than the initrd_start+initrd_size block. ---- reproducer.sh ---- nilchar="A" # change to "\0" to properly zero terminate / pad magic="070701" ino=1 mode=$(( 0100777 )) uid=0 gid=0 nlink=1 mtime=1 filesize=0 devmajor=0 devminor=1 rdevmajor=0 rdevminor=0 csum=0 fname="initramfs_test_fname_overrun" namelen=$(( ${#fname} + 1 )) # plus one to account for terminator printf "%s%08x%08x%08x%08x%08x%08x%08x%08x%08x%08x%08x%08x%08x%s" \ $magic $ino $mode $uid $gid $nlink $mtime $filesize \ $devmajor $devminor $rdevmajor $rdevminor $namelen $csum $fname termpadlen=$(( 1 + ((4 - ((110 + $namelen) & 3)) % 4) )) printf "%.s${nilchar}" $(seq 1 $termpadlen) ---- reproducer.sh ---- Symlink filename fields handled in do_symlink() won't overrun past the data segment, due to the explicit zero-termination of the symlink target. Fix filename buffer overrun by aborting the initramfs FSM if any cpio entry doesn't carry a zero-terminator at the expected (name_len - 1) offset. Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by:
David Disseldorp <ddiss@suse.de> Link: https://lore.kernel.org/r/20241030035509.20194-2-ddiss@suse.de Signed-off-by:
Christian Brauner <brauner@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jonas Gorski authored
[ Upstream commit da099359 ] When MIPS_FP_SUPPORT is disabled, __sanitize_fcr31() is defined as nothing, which triggers a gcc warning: In file included from kernel/sched/core.c:79: kernel/sched/core.c: In function 'context_switch': ./arch/mips/include/asm/switch_to.h:114:39: warning: suggest braces around empty body in an 'if' statement [-Wempty-body] 114 | __sanitize_fcr31(next); \ | ^ kernel/sched/core.c:5316:9: note: in expansion of macro 'switch_to' 5316 | switch_to(prev, next, prev); | ^~~~~~~~~ Fix this by providing an empty body for __sanitize_fcr31() like one is defined for __mips_mt_fpaff_switch_to(). Fixes: 36a49803 ("MIPS: Avoid FCSR sanitization when CONFIG_MIPS_FP_SUPPORT=n") Signed-off-by:
Jonas Gorski <jonas.gorski@gmail.com> Reviewed-by:
Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by:
Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Josh Poimboeuf authored
[ Upstream commit 82694854 ] This indirect jump is harmless; annotate it to keep objtool's retpoline validation happy. Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Signed-off-by:
Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by:
Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/4797c72a258b26e06741c58ccd4a75c42db39c1d.1611263462.git.jpoimboe@redhat.com Stable-dep-of: e8fbc0d9 ("x86/pvh: Call C code via the kernel virtual mapping") Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Andre Przywara authored
[ Upstream commit 96dddb7b ] When checking MTE tags, we print some diagnostic messages when the tests fail. Some variables uses there are "longs", however we only use "%x" for the format specifier. Update the format specifiers to "%lx", to match the variable types they are supposed to print. Fixes: f3b2a26c ("kselftest/arm64: Verify mte tag inclusion via prctl") Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Reviewed-by:
Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240816153251.2833702-9-andre.przywara@arm.com Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Borislav Petkov (AMD) authored
commit 04c30245 upstream. AMD does not have the requirement for a synchronization barrier when acccessing a certain group of MSRs. Do not incur that unnecessary penalty there. There will be a CPUID bit which explicitly states that a MFENCE is not needed. Once that bit is added to the APM, this will be extended with it. While at it, move to processor.h to avoid include hell. Untangling that file properly is a matter for another day. Some notes on the performance aspect of why this is relevant, courtesy of Kishon VijayAbraham <Kishon.VijayAbraham@amd.com>: On a AMD Zen4 system with 96 cores, a modified ipi-bench[1] on a VM shows x2AVIC IPI rate is 3% to 4% lower than AVIC IPI rate. The ipi-bench is modified so that the IPIs are sent between two vCPUs in the same CCX. This also requires to pin the vCPU to a physical core to prevent any latencies. This simulates the use case of pinning vCPUs to the thread of a single CCX to avoid interrupt IPI latency. In order to avoid run-to-run variance (for both x2AVIC and AVIC), the below configurations are done: 1) Disable Power States in BIOS (to prevent the system from going to lower power state) 2) Run the system at fixed frequency 2500MHz (to prevent the system from increasing the frequency when the load is more) With the above configuration: *) Performance measured using ipi-bench for AVIC: Average Latency: 1124.98ns [Time to send IPI from one vCPU to another vCPU] Cumulative throughput: 42.6759M/s [Total number of IPIs sent in a second from 48 vCPUs simultaneously] *) Performance measured using ipi-bench for x2AVIC: Average Latency: 1172.42ns [Time to send IPI from one vCPU to another vCPU] Cumulative throughput: 40.9432M/s [Total number of IPIs sent in a second from 48 vCPUs simultaneously] From above, x2AVIC latency is ~4% more than AVIC. However, the expectation is x2AVIC performance to be better or equivalent to AVIC. Upon analyzing the perf captures, it is observed significant time is spent in weak_wrmsr_fence() invoked by x2apic_send_IPI(). With the fix to skip weak_wrmsr_fence() *) Performance measured using ipi-bench for x2AVIC: Average Latency: 1117.44ns [Time to send IPI from one vCPU to another vCPU] Cumulative throughput: 42.9608M/s [Total number of IPIs sent in a second from 48 vCPUs simultaneously] Comparing the performance of x2AVIC with and without the fix, it can be seen the performance improves by ~4%. Performance captured using an unmodified ipi-bench using the 'mesh-ipi' option with and without weak_wrmsr_fence() on a Zen4 system also showed significant performance improvement without weak_wrmsr_fence(). The 'mesh-ipi' option ignores CCX or CCD and just picks random vCPU. Average throughput (10 iterations) with weak_wrmsr_fence(), Cumulative throughput: 4933374 IPI/s Average throughput (10 iterations) without weak_wrmsr_fence(), Cumulative throughput: 6355156 IPI/s [1] https://github.com/bytedance/kvm-utils/tree/master/microbenchmark/ipi-bench Signed-off-by:
Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230622095212.20940-1-bp@alien8.de Signed-off-by:
Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Puranjay Mohan authored
[ Upstream commit 7c2fd760 ] On an NVMe namespace that does not support metadata, it is possible to send an IO command with metadata through io-passthru. This allows issues like [1] to trigger in the completion code path. nvme_map_user_request() doesn't check if the namespace supports metadata before sending it forward. It also allows admin commands with metadata to be processed as it ignores metadata when bdev == NULL and may report success. Reject an IO command with metadata when the NVMe namespace doesn't support it and reject an admin command if it has metadata. [1] https://lore.kernel.org/all/mb61pcylvnym8.fsf@amazon.com/ Suggested-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Puranjay Mohan <pjy@amazon.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Sagi Grimberg <sagi@grimberg.me> Reviewed-by:
Anuj Gupta <anuj20.g@samsung.com> Signed-off-by:
Keith Busch <kbusch@kernel.org> [ Move the changes from nvme_map_user_request() to nvme_submit_user_cmd() to make it work on 5.10 ] Signed-off-by:
Puranjay Mohan <pjy@amazon.com> Signed-off-by:
Hagar Hemdan <hagarhem@amazon.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Pali Rohár authored
commit e2a8910a upstream. ReparseDataLength is sum of the InodeType size and DataBuffer size. So to get DataBuffer size it is needed to subtract InodeType's size from ReparseDataLength. Function cifs_strndup_from_utf16() is currentlly accessing buf->DataBuffer at position after the end of the buffer because it does not subtract InodeType size from the length. Fix this problem and correctly subtract variable len. Member InodeType is present only when reparse buffer is large enough. Check for ReparseDataLength before accessing InodeType to prevent another invalid memory access. Major and minor rdev values are present also only when reparse buffer is large enough. Check for reparse buffer size before calling reparse_mkdev(). Fixes: d5ecebc4 ("smb3: Allow query of symlinks stored as reparse points") Reviewed-by:
Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by:
Pali Rohár <pali@kernel.org> Signed-off-by:
Steve French <stfrench@microsoft.com> [use variable name symlink_buf, the other buf->InodeType accesses are not used in current version so skip] Signed-off-by:
Mahmoud Adam <mngyadam@amazon.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Paul E. McKenney authored
commit 5c9a9ca4 upstream. Any idle task corresponding to an offline CPU is in an RCU Tasks Trace quiescent state. This commit causes rcu_tasks_trace_postscan() to ignore idle tasks for offline CPUs, which it can do safely due to CPU-hotplug operations being disabled. Signed-off-by:
Paul E. McKenney <paulmck@kernel.org> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Martin KaFai Lau <kafai@fb.com> Cc: KP Singh <kpsingh@kernel.org> Signed-off-by:
Krister Johansen <kjlx@templeofstupid.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Breno Leitao authored
[ Upstream commit e28acc9c ] Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the following code flow, the RCU read lock is not held, causing the following error when `RCU_PROVE` is not held. The same problem might show up in the IPv6 code path. 6.12.0-rc5-kbuilder-01145-gbac17284bdcb #33 Tainted: G E N ----------------------------- net/ipv4/ipmr_base.c:313 RCU-list traversed in non-reader section!! rcu_scheduler_active = 2, debug_locks = 1 2 locks held by RetransmitAggre/3519: #0: ffff88816188c6c0 (nlk_cb_mutex-ROUTE){+.+.}-{3:3}, at: __netlink_dump_start+0x8a/0x290 #1: ffffffff83fcf7a8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_dumpit+0x6b/0x90 stack backtrace: lockdep_rcu_suspicious mr_table_dump ipmr_rtm_dumproute rtnl_dump_all rtnl_dumpit netlink_dump __netlink_dump_start rtnetlink_rcv_msg netlink_rcv_skb netlink_unicast netlink_sendmsg This is not a problem per see, since the RTNL lock is held here, so, it is safe to iterate in the list without the RCU read lock, as suggested by Eric. To alleviate the concern, modify the code to use list_for_each_entry_rcu() with the RTNL-held argument. The annotation will raise an error only if RTNL or RCU read lock are missing during iteration, signaling a legitimate problem, otherwise it will avoid this false positive. This will solve the IPv6 case as well, since ip6mr_rtm_dumproute() calls this function as well. Signed-off-by:
Breno Leitao <leitao@debian.org> Reviewed-by:
David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20241108-ipmr_rcu-v2-1-c718998e209b@debian.org Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Eryk Zagorski authored
[ Upstream commit 6f891ca1 ] This patch switches the P-125 quirk entry to use a composite quirk as the P-125 supplies both MIDI and Audio like many of the other Yamaha keyboards Signed-off-by:
Eryk Zagorski <erykzagorski@gmail.com> Link: https://patch.msgid.link/20241111164520.9079-2-erykzagorski@gmail.com Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
David Wang authored
[ Upstream commit 84b9749a ] seq_printf is costy, on a system with n CPUs, reading /proc/softirqs would yield 10*n decimal values, and the extra cost parsing format string grows linearly with number of cpus. Replace seq_printf with seq_put_decimal_ull_width have significant performance improvement. On an 8CPUs system, reading /proc/softirqs show ~40% performance gain with this patch. Signed-off-by:
David Wang <00107082@163.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Luo Yifan authored
[ Upstream commit 23569c8b ] This patch checks if div is less than or equal to zero (div <= 0). If div is zero or negative, the function returns -EINVAL, ensuring the division operation is safe to perform. Signed-off-by:
Luo Yifan <luoyifan@cmss.chinamobile.com> Reviewed-by:
Olivier Moysan <olivier.moysan@foss.st.com> Link: https://patch.msgid.link/20241107015936.211902-1-luoyifan@cmss.chinamobile.com Signed-off-by:
Mark Brown <broonie@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Luo Yifan authored
[ Upstream commit 63c1c879 ] This patch checks if div is less than or equal to zero (div <= 0). If div is zero or negative, the function returns -EINVAL, ensuring the division operation (*prate / div) is safe to perform. Signed-off-by:
Luo Yifan <luoyifan@cmss.chinamobile.com> Link: https://patch.msgid.link/20241106014654.206860-1-luoyifan@cmss.chinamobile.com Signed-off-by:
Mark Brown <broonie@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Alexander Hölzl authored
[ Upstream commit b6ec62e0 ] The description of PDU1 format usage mistakenly referred to PDU2 format. Signed-off-by:
Alexander Hölzl <alexander.hoelzl@gmx.net> Acked-by:
Oleksij Rempel <o.rempel@pengutronix.de> Acked-by:
Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://patch.msgid.link/20241023145257.82709-1-alexander.hoelzl@gmx.net Signed-off-by:
Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Mikhail Rudenko authored
[ Upstream commit 5e53e4a6 ] Currently, RK809's BUCK3 regulator is modelled in the driver as a configurable regulator with 0.5-2.4V voltage range. But the voltage setting is not actually applied, because when bit 6 of PMIC_POWER_CONFIG register is set to 0 (default), BUCK3 output voltage is determined by the external feedback resistor. Fix this, by setting bit 6 when voltage selection is set. Existing users which do not specify voltage constraints in their device trees will not be affected by this change, since no voltage setting is applied in those cases, and bit 6 is not enabled. Signed-off-by:
Mikhail Rudenko <mike.rudenko@gmail.com> Link: https://patch.msgid.link/20241017-rk809-dcdc3-v1-1-e3c3de92f39c@gmail.com Signed-off-by:
Mark Brown <broonie@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Charles Han authored
[ Upstream commit e694d2b5 ] devm_kasprintf() can return a NULL pointer on failure but this returned value in qcom_socinfo_probe() is not checked. Signed-off-by:
Charles Han <hanchunchao@inspur.com> Link: https://lore.kernel.org/r/20240929072349.202520-1-hanchunchao@inspur.com Signed-off-by:
Bjorn Andersson <andersson@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Benoît Monin authored
[ Upstream commit 6b3f18a7 ] Add support for Quectel RG650V which is based on Qualcomm SDX65 chip. The composition is DIAG / NMEA / AT / AT / QMI. T: Bus=02 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 4 Spd=5000 MxCh= 0 D: Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=2c7c ProdID=0122 Rev=05.15 S: Manufacturer=Quectel S: Product=RG650V-EU S: SerialNumber=xxxxxxx C: #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=896mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=9ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=9ms I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=05(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=88(I) Atr=03(Int.) MxPS= 8 Ivl=9ms Signed-off-by:
Benoît Monin <benoit.monin@gmx.fr> Reviewed-by:
Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241024151113.53203-1-benoit.monin@gmx.fr Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Arnd Bergmann authored
[ Upstream commit fce9642c ] node_to_amd_nb() is defined to NULL in non-AMD configs: drivers/platform/x86/amd/hsmp/plat.c: In function 'init_platform_device': drivers/platform/x86/amd/hsmp/plat.c:165:68: error: dereferencing 'void *' pointer [-Werror] 165 | sock->root = node_to_amd_nb(i)->root; | ^~ drivers/platform/x86/amd/hsmp/plat.c:165:68: error: request for member 'root' in something not a structure or union Users of the interface who also allow COMPILE_TEST will cause the above build error so provide an inline stub to fix that. [ bp: Massage commit message. ] Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by:
Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20241029092329.3857004-1-arnd@kernel.org Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Piyush Raj Chouhan authored
[ Upstream commit ef5fbdf7 ] Infinix ZERO BOOK 13 has a 2+2 speaker system which isn't probed correctly. This patch adds a quirk with the proper pin connections. Also The mic in this laptop suffers too high gain resulting in mostly fan noise being recorded, This patch Also limit mic boost. HW Probe for device; https://linux-hardware.org/?probe=a2e892c47b Test: All 4 speaker works, Mic has low noise. Signed-off-by:
Piyush Raj Chouhan <piyushchouhan1598@gmail.com> Link: https://patch.msgid.link/20241028155516.15552-1-piyuschouhan1598@gmail.com Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Li Zhijian authored
[ Upstream commit dc1308be ] When running watchdog-test with 'make run_tests', the watchdog-test will be terminated by a timeout signal(SIGTERM) due to the test timemout. And then, a system reboot would happen due to watchdog not stop. see the dmesg as below: ``` [ 1367.185172] watchdog: watchdog0: watchdog did not stop! ``` Fix it by registering more signals(including SIGTERM) in watchdog-test, where its signal handler will stop the watchdog. After that # timeout 1 ./watchdog-test Watchdog Ticking Away! . Stopping watchdog ticks... Link: https://lore.kernel.org/all/20241029031324.482800-1-lizhijian@fujitsu.com/ Signed-off-by:
Li Zhijian <lizhijian@fujitsu.com> Reviewed-by:
Shuah Khan <skhan@linuxfoundation.org> Signed-off-by:
Shuah Khan <skhan@linuxfoundation.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Ben Greear authored
[ Upstream commit 9b15c6cf ] ieee80211_calc_hw_conf_chan was ignoring the configured user_txpower. If it is set, use it to potentially decrease txpower as requested. Signed-off-by:
Ben Greear <greearb@candelatech.com> Link: https://patch.msgid.link/20241010203954.1219686-1-greearb@candelatech.com Signed-off-by:
Johannes Berg <johannes.berg@intel.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Hans de Goede authored
[ Upstream commit 0107f28f ] The Vexia Edu Atla 10 tablet mostly uses the BYTCR tablet defaults, but as happens on more models it is using IN1 instead of IN3 for its internal mic and JD_SRC_JD2_IN4N instead of JD_SRC_JD1_IN4P for jack-detection. Add a DMI quirk for this to fix the internal-mic and jack-detection. Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20241024211615.79518-2-hdegoede@redhat.com Signed-off-by:
Mark Brown <broonie@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Lorenzo Stoakes authored
[ Upstream commit 5de19506 ] The mmap_region() function is somewhat terrifying, with spaghetti-like control flow and numerous means by which issues can arise and incomplete state, memory leaks and other unpleasantness can occur. A large amount of the complexity arises from trying to handle errors late in the process of mapping a VMA, which forms the basis of recently observed issues with resource leaks and observable inconsistent state. Taking advantage of previous patches in this series we move a number of checks earlier in the code, simplifying things by moving the core of the logic into a static internal function __mmap_region(). Doing this allows us to perform a number of checks up front before we do any real work, and allows us to unwind the writable unmap check unconditionally as required and to perform a CONFIG_DEBUG_VM_MAPLE_TREE validation unconditionally also. We move a number of things here: 1. We preallocate memory for the iterator before we call the file-backed memory hook, allowing us to exit early and avoid having to perform complicated and error-prone close/free logic. We carefully free iterator state on both success and error paths. 2. The enclosing mmap_region() function handles the mapping_map_writable() logic early. Previously the logic had the mapping_map_writable() at the point of mapping a newly allocated file-backed VMA, and a matching mapping_unmap_writable() on success and error paths. We now do this unconditionally if this is a file-backed, shared writable mapping. If a driver changes the flags to eliminate VM_MAYWRITE, however doing so does not invalidate the seal check we just performed, and we in any case always decrement the counter in the wrapper. We perform a debug assert to ensure a driver does not attempt to do the opposite. 3. We also move arch_validate_flags() up into the mmap_region() function. This is only relevant on arm64 and sparc64, and the check is only meaningful for SPARC with ADI enabled. We explicitly add a warning for this arch if a driver invalidates this check, though the code ought eventually to be fixed to eliminate the need for this. With all of these measures in place, we no longer need to explicitly close the VMA on error paths, as we place all checks which might fail prior to a call to any driver mmap hook. This eliminates an entire class of errors, makes the code easier to reason about and more robust. Link: https://lkml.kernel.org/r/6e0becb36d2f5472053ac5d544c0edfe9b899e25.1730224667.git.lorenzo.stoakes@oracle.com Fixes: deb0f656 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails") Signed-off-by:
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by:
Jann Horn <jannh@google.com> Reviewed-by:
Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by:
Vlastimil Babka <vbabka@suse.cz> Tested-by:
Mark Brown <broonie@kernel.org> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Xu <peterx@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lorenzo Stoakes authored
[ Upstream commit 5baf8b03 ] Currently MTE is permitted in two circumstances (desiring to use MTE having been specified by the VM_MTE flag) - where MAP_ANONYMOUS is specified, as checked by arch_calc_vm_flag_bits() and actualised by setting the VM_MTE_ALLOWED flag, or if the file backing the mapping is shmem, in which case we set VM_MTE_ALLOWED in shmem_mmap() when the mmap hook is activated in mmap_region(). The function that checks that, if VM_MTE is set, VM_MTE_ALLOWED is also set is the arm64 implementation of arch_validate_flags(). Unfortunately, we intend to refactor mmap_region() to perform this check earlier, meaning that in the case of a shmem backing we will not have invoked shmem_mmap() yet, causing the mapping to fail spuriously. It is inappropriate to set this architecture-specific flag in general mm code anyway, so a sensible resolution of this issue is to instead move the check somewhere else. We resolve this by setting VM_MTE_ALLOWED much earlier in do_mmap(), via the arch_calc_vm_flag_bits() call. This is an appropriate place to do this as we already check for the MAP_ANONYMOUS case here, and the shmem file case is simply a variant of the same idea - we permit RAM-backed memory. This requires a modification to the arch_calc_vm_flag_bits() signature to pass in a pointer to the struct file associated with the mapping, however this is not too egregious as this is only used by two architectures anyway - arm64 and parisc. So this patch performs this adjustment and removes the unnecessary assignment of VM_MTE_ALLOWED in shmem_mmap(). [akpm@linux-foundation.org: fix whitespace, per Catalin] Link: https://lkml.kernel.org/r/ec251b20ba1964fb64cf1607d2ad80c47f3873df.1730224667.git.lorenzo.stoakes@oracle.com Fixes: deb0f656 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails") Signed-off-by:
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Suggested-by:
Catalin Marinas <catalin.marinas@arm.com> Reported-by:
Jann Horn <jannh@google.com> Reviewed-by:
Catalin Marinas <catalin.marinas@arm.com> Reviewed-by:
Vlastimil Babka <vbabka@suse.cz> Cc: Andreas Larsson <andreas@gaisler.com> Cc: David S. Miller <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Brown <broonie@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lorenzo Stoakes authored
[ Upstream commit 4080ef15 ] Incorrect invocation of VMA callbacks when the VMA is no longer in a consistent state is bug prone and risky to perform. With regards to the important vm_ops->close() callback We have gone to great lengths to try to track whether or not we ought to close VMAs. Rather than doing so and risking making a mistake somewhere, instead unconditionally close and reset vma->vm_ops to an empty dummy operations set with a NULL .close operator. We introduce a new function to do so - vma_close() - and simplify existing vms logic which tracked whether we needed to close or not. This simplifies the logic, avoids incorrect double-calling of the .close() callback and allows us to update error paths to simply call vma_close() unconditionally - making VMA closure idempotent. Link: https://lkml.kernel.org/r/28e89dda96f68c505cb6f8e9fc9b57c3e9f74b42.1730224667.git.lorenzo.stoakes@oracle.com Fixes: deb0f656 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails") Signed-off-by:
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by:
Jann Horn <jannh@google.com> Reviewed-by:
Vlastimil Babka <vbabka@suse.cz> Reviewed-by:
Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by:
Jann Horn <jannh@google.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Brown <broonie@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lorenzo Stoakes authored
[ Upstream commit 3dd6ed34 ] Patch series "fix error handling in mmap_region() and refactor (hotfixes)", v4. mmap_region() is somewhat terrifying, with spaghetti-like control flow and numerous means by which issues can arise and incomplete state, memory leaks and other unpleasantness can occur. A large amount of the complexity arises from trying to handle errors late in the process of mapping a VMA, which forms the basis of recently observed issues with resource leaks and observable inconsistent state. This series goes to great lengths to simplify how mmap_region() works and to avoid unwinding errors late on in the process of setting up the VMA for the new mapping, and equally avoids such operations occurring while the VMA is in an inconsistent state. The patches in this series comprise the minimal changes required to resolve existing issues in mmap_region() error handling, in order that they can be hotfixed and backported. There is additionally a follow up series which goes further, separated out from the v1 series and sent and updated separately. This patch (of 5): After an attempted mmap() fails, we are no longer in a situation where we can safely interact with VMA hooks. This is currently not enforced, meaning that we need complicated handling to ensure we do not incorrectly call these hooks. We can avoid the whole issue by treating the VMA as suspect the moment that the file->f_ops->mmap() function reports an error by replacing whatever VMA operations were installed with a dummy empty set of VMA operations. We do so through a new helper function internal to mm - mmap_file() - which is both more logically named than the existing call_mmap() function and correctly isolates handling of the vm_op reassignment to mm. All the existing invocations of call_mmap() outside of mm are ultimately nested within the call_mmap() from mm, which we now replace. It is therefore safe to leave call_mmap() in place as a convenience function (and to avoid churn). The invokers are: ovl_file_operations -> mmap -> ovl_mmap() -> backing_file_mmap() coda_file_operations -> mmap -> coda_file_mmap() shm_file_operations -> shm_mmap() shm_file_operations_huge -> shm_mmap() dma_buf_fops -> dma_buf_mmap_internal -> i915_dmabuf_ops -> i915_gem_dmabuf_mmap() None of these callers interact with vm_ops or mappings in a problematic way on error, quickly exiting out. Link: https://lkml.kernel.org/r/cover.1730224667.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/d41fd763496fd0048a962f3fd9407dc72dd4fd86.1730224667.git.lorenzo.stoakes@oracle.com Fixes: deb0f656 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails") Signed-off-by:
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by:
Jann Horn <jannh@google.com> Reviewed-by:
Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by:
Vlastimil Babka <vbabka@suse.cz> Reviewed-by:
Jann Horn <jannh@google.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Brown <broonie@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andrew Morton authored
commit d1aa0c04 upstream. Revert d949d1d1 ("mm: shmem: fix data-race in shmem_getattr()") as suggested by Chuck [1]. It is causing deadlocks when accessing tmpfs over NFS. As Hugh commented, "added just to silence a syzbot sanitizer splat: added where there has never been any practical problem". Link: https://lkml.kernel.org/r/ZzdxKF39VEmXSSyN@tissot.1015granger.net [1] Fixes: d949d1d1 ("mm: shmem: fix data-race in shmem_getattr()") Acked-by:
Hugh Dickins <hughd@google.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Jeongjun Park <aha310510@gmail.com> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Hugh Dickins <hughd@google.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chuck Lever authored
[ Upstream commit 8286f8b6 ] The error flow in nfsd4_copy() calls cleanup_async_copy(), which already decrements nn->pending_async_copies. Reported-by:
Olga Kornievskaia <okorniev@redhat.com> Fixes: aadc3bbe ("NFSD: Limit the number of concurrent async COPY operations") Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chuck Lever authored
[ Upstream commit 63fab04c ] Ensure the refcount and async_copies fields are initialized early. cleanup_async_copy() will reference these fields if an error occurs in nfsd4_copy(). If they are not correctly initialized, at the very least, a refcount underflow occurs. Reported-by:
Olga Kornievskaia <okorniev@redhat.com> Fixes: aadc3bbe ("NFSD: Limit the number of concurrent async COPY operations") Reviewed-by:
Jeff Layton <jlayton@kernel.org> Tested-by:
Olga Kornievskaia <okorniev@redhat.com> Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chuck Lever authored
[ Upstream commit aadc3bbe ] Nothing appears to limit the number of concurrent async COPY operations that clients can start. In addition, AFAICT each async COPY can copy an unlimited number of 4MB chunks, so can run for a long time. Thus IMO async COPY can become a DoS vector. Add a restriction mechanism that bounds the number of concurrent background COPY operations. Start simple and try to be fair -- this patch implements a per-namespace limit. An async COPY request that occurs while this limit is exceeded gets NFS4ERR_DELAY. The requesting client can choose to send the request again after a delay or fall back to a traditional read/write style copy. If there is need to make the mechanism more sophisticated, we can visit that in future patches. Cc: stable@vger.kernel.org Reviewed-by:
Jeff Layton <jlayton@kernel.org> Link: https://nvd.nist.gov/vuln/detail/CVE-2024-49974 Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chuck Lever authored
[ Upstream commit 9ed666eb ] Currently, when NFSD handles an asynchronous COPY, it returns a zero write verifier, relying on the subsequent CB_OFFLOAD callback to pass the write verifier and a stable_how4 value to the client. However, if the CB_OFFLOAD never arrives at the client (for example, if a network partition occurs just as the server sends the CB_OFFLOAD operation), the client will never receive this verifier. Thus, if the client sends a follow-up COMMIT, there is no way for the client to assess the COMMIT result. The usual recovery for a missing CB_OFFLOAD is for the client to send an OFFLOAD_STATUS operation, but that operation does not carry a write verifier in its result. Neither does it carry a stable_how4 value, so the client /must/ send a COMMIT in this case -- which will always fail because currently there's still no write verifier in the COPY result. Thus the server needs to return a normal write verifier in its COPY result even if the COPY operation is to be performed asynchronously. If the server recognizes the callback stateid in subsequent OFFLOAD_STATUS operations, then obviously it has not restarted, and the write verifier the client received in the COPY result is still valid and can be used to assess a COMMIT of the copied data, if one is needed. Reviewed-by:
Jeff Layton <jlayton@kernel.org> [ cel: adjusted to apply to origin/linux-5.10.y ] Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-