- Oct 15, 2022
-
-
Hu Weiwen authored
commit 7cb99947 upstream. Clear O_TRUNC from the flags sent in the MDS create request. `atomic_open' is called before permission check. We should not do any modification to the file here. The caller will do the truncation afterward. Fixes: 124e68e7 ("ceph: file operations") Signed-off-by:
Hu Weiwen <sehuww@mail.scut.edu.cn> Reviewed-by:
Xiubo Li <xiubli@redhat.com> Signed-off-by:
Ilya Dryomov <idryomov@gmail.com> [Xiubo: fixed a trivial conflict for 5.10 backport] Signed-off-by:
Xiubo Li <xiubli@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ryusuke Konishi authored
commit 723ac751208f6d6540191689cfbf6c77135a7a1b upstream. If creation or finalization of a checkpoint fails due to anomalies in the checkpoint metadata on disk, a kernel warning is generated. This patch replaces the WARN_ONs by nilfs_error, so that a kernel, booted with panic_on_warn, does not panic. A nilfs_error is appropriate here to handle the abnormal filesystem condition. This also replaces the detected error codes with an I/O error so that neither of the internal error codes is returned to callers. Link: https://lkml.kernel.org/r/20220929123330.19658-1-konishi.ryusuke@gmail.com Signed-off-by:
Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by:
<syzbot+fbb3e0b24e8dae5a16ee@syzkaller.appspotmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ryusuke Konishi authored
commit d0d51a97063db4704a5ef6bc978dddab1636a306 upstream. If nilfs_attach_log_writer() failed to create a log writer thread, it frees a data structure of the log writer without any cleanup. After commit e912a5b6 ("nilfs2: use root object to get ifile"), this causes a leak of struct nilfs_root, which started to leak an ifile metadata inode and a kobject on that struct. In addition, if the kernel is booted with panic_on_warn, the above ifile metadata inode leak will cause the following panic when the nilfs2 kernel module is removed: kmem_cache_destroy nilfs2_inode_cache: Slab cache still has objects when called from nilfs_destroy_cachep+0x16/0x3a [nilfs2] WARNING: CPU: 8 PID: 1464 at mm/slab_common.c:494 kmem_cache_destroy+0x138/0x140 ... RIP: 0010:kmem_cache_destroy+0x138/0x140 Code: 00 20 00 00 e8 a9 55 d8 ff e9 76 ff ff ff 48 8b 53 60 48 c7 c6 20 70 65 86 48 c7 c7 d8 69 9c 86 48 8b 4c 24 28 e8 ef 71 c7 00 <0f> 0b e9 53 ff ff ff c3 48 81 ff ff 0f 00 00 77 03 31 c0 c3 53 48 ... Call Trace: <TASK> ? nilfs_palloc_freev.cold.24+0x58/0x58 [nilfs2] nilfs_destroy_cachep+0x16/0x3a [nilfs2] exit_nilfs_fs+0xa/0x1b [nilfs2] __x64_sys_delete_module+0x1d9/0x3a0 ? __sanitizer_cov_trace_pc+0x1a/0x50 ? syscall_trace_enter.isra.19+0x119/0x190 do_syscall_64+0x34/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd ... </TASK> Kernel panic - not syncing: panic_on_warn set ... This patch fixes these issues by calling nilfs_detach_log_writer() cleanup function if spawning the log writer thread fails. Link: https://lkml.kernel.org/r/20221007085226.57667-1-konishi.ryusuke@gmail.com Fixes: e912a5b6 ("nilfs2: use root object to get ifile") Signed-off-by:
Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by:
<syzbot+7381dc4ad60658ca4c05@syzkaller.appspotmail.com> Tested-by:
Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ryusuke Konishi authored
commit 21a87d88c2253350e115029f14fe2a10a7e6c856 upstream. If the i_mode field in inode of metadata files is corrupted on disk, it can cause the initialization of bmap structure, which should have been called from nilfs_read_inode_common(), not to be called. This causes a lockdep warning followed by a NULL pointer dereference at nilfs_bmap_lookup_at_level(). This patch fixes these issues by adding a missing sanitiy check for the i_mode field of metadata file's inode. Link: https://lkml.kernel.org/r/20221002030804.29978-1-konishi.ryusuke@gmail.com Signed-off-by:
Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by:
<syzbot+2b32eb36c1a825b7a74c@syzkaller.appspotmail.com> Reported-by:
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by:
Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Krzysztof authored
commit 766279a8 upstream. The use of strncpy() is considered deprecated for NUL-terminated strings[1]. Replace strncpy() with strscpy_pad(), to keep existing pad-behavior of strncpy, similarly to commit 08de420a ("rpmsg: glink: Replace strncpy() with strscpy_pad()"). This fixes W=1 warning: In function ‘qcom_glink_rx_close’, inlined from ‘qcom_glink_work’ at ../drivers/rpmsg/qcom_glink_native.c:1638:4: drivers/rpmsg/qcom_glink_native.c:1549:17: warning: ‘strncpy’ specified bound 32 equals destination size [-Wstringop-truncation] 1549 | strncpy(chinfo.name, channel->name, sizeof(chinfo.name)); [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings Signed-off-by:
Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by:
Stephen Boyd <sboyd@kernel.org> Signed-off-by:
Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20220519073330.7187-1-krzysztof.kozlowski@linaro.org Signed-off-by:
Andrew Chernyakov <acherniakov@astralinux.ru> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Brian Norris authored
commit e9233917 upstream. This loop intends to retry a max of 10 times, with some implicit termination based on the SD_{R,}OCR_S18A bit. Unfortunately, the termination condition depends on the value reported by the SD card (*rocr), which may or may not correctly reflect what we asked it to do. Needless to say, it's not wise to rely on the card doing what we expect; we should at least terminate the loop regardless. So, check both the input and output values, so we ensure we will terminate regardless of the SD card behavior. Note that SDIO learned a similar retry loop in commit 0797e5f1 ("mmc: core: Fixup signal voltage switch"), but that used the 'ocr' result, and so the current pre-terminating condition looks like: rocr & ocr & R4_18V_PRESENT (i.e., it doesn't have the same bug.) This addresses a number of crash reports seen on ChromeOS that look like the following: ... // lots of repeated: ... <4>[13142.846061] mmc1: Skipping voltage switch <4>[13143.406087] mmc1: Skipping voltage switch <4>[13143.964724] mmc1: Skipping voltage switch <4>[13144.526089] mmc1: Skipping voltage switch <4>[13145.086088] mmc1: Skipping voltage switch <4>[13145.645941] mmc1: Skipping voltage switch <3>[13146.153969] INFO: task halt:30352 blocked for more than 122 seconds. ... Fixes: f2119df6 ("mmc: sd: add support for signal voltage switch procedure") Cc: <stable@vger.kernel.org> Signed-off-by:
Brian Norris <briannorris@chromium.org> Reviewed-by:
Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20220914014010.2076169-1-briannorris@chromium.org Signed-off-by:
Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
ChanWoo Lee authored
commit e4272664 upstream. SD_ROCR_S18A is already defined and is used to check the rocr value, so let's replace with already defined values for readability. Signed-off-by:
ChanWoo Lee <cw9316.lee@samsung.com> Reviewed-by:
Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20220706004840.24812-1-cw9316.lee@samsung.com Signed-off-by:
Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by:
Brian Norris <briannorris@chromium.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Johan Hovold authored
commit 7bd7ad3c upstream. The 300 bps rate of SIO devices has been mapped to 9600 bps since 2003... Let's fix the regression. Cc: stable@vger.kernel.org Signed-off-by:
Johan Hovold <johan@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tadeusz Struk authored
commit a659daf6 upstream. Syzbot found an issue in usbmon module, where the user space client can corrupt the monitor's internal memory, causing the usbmon module to crash the kernel with segfault, UAF, etc. The reproducer mmaps the /dev/usbmon memory to user space, and overwrites it with arbitrary data, which causes all kinds of issues. Return an -EPERM error from mon_bin_mmap() if the flag VM_WRTIE is set. Also clear VM_MAYWRITE to make it impossible to change it to writable later. Cc: "Dmitry Vyukov" <dvyukov@google.com> Cc: stable <stable@kernel.org> Fixes: 6f23ee1f ("USB: add binary API to usbmon") Suggested-by: PaX Team <pageexec@freemail.hu> # for the VM_MAYRITE portion Link: https://syzkaller.appspot.com/bug?id=2eb1f35d6525fa4a74d75b4244971e5b1411c95a Reported-by:
<syzbot+23f57c5ae902429285d7@syzkaller.appspotmail.com> Signed-off-by:
Tadeusz Struk <tadeusz.struk@linaro.org> Link: https://lore.kernel.org/r/20220919215957.205681-1-tadeusz.struk@linaro.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
David Gow authored
[ Upstream commit bd71558d ] Since binutils 2.39, ld will print a warning if any stack section is executable, which is the default for stack sections on files without a .note.GNU-stack section. This was fixed for x86 in commit ffcf9c57 ("x86: link vdso and boot with -z noexecstack --no-warn-rwx-segments"), but remained broken for UML, resulting in several warnings: /usr/bin/ld: warning: arch/x86/um/vdso/vdso.o: missing .note.GNU-stack section implies executable stack /usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker /usr/bin/ld: warning: .tmp_vmlinux.kallsyms1 has a LOAD segment with RWX permissions /usr/bin/ld: warning: .tmp_vmlinux.kallsyms1.o: missing .note.GNU-stack section implies executable stack /usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker /usr/bin/ld: warning: .tmp_vmlinux.kallsyms2 has a LOAD segment with RWX permissions /usr/bin/ld: warning: .tmp_vmlinux.kallsyms2.o: missing .note.GNU-stack section implies executable stack /usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker /usr/bin/ld: warning: vmlinux has a LOAD segment with RWX permissions Link both the VDSO and vmlinux with -z noexecstack, fixing the warnings about .note.GNU-stack sections. In addition, pass --no-warn-rwx-segments to dodge the remaining warnings about LOAD segments with RWX permissions in the kallsyms objects. (Note that this flag is apparently not available on lld, so hide it behind a test for BFD, which is what the x86 patch does.) Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ffcf9c5700e49c0aee42dcba9a12ba21338e8136 Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ba951afb99912da01a6e8434126b8fac7aa75107 Signed-off-by:
David Gow <davidgow@google.com> Reviewed-by:
Lukas Straub <lukasstraub2@web.de> Tested-by:
Lukas Straub <lukasstraub2@web.de> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by:
Richard Weinberger <richard@nod.at> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Lukas Straub authored
[ Upstream commit d27fff34 ] arch.tls_array is statically allocated so checking for NULL doesn't make sense. This causes the compiler warning below. Remove the checks to silence these warnings. ../arch/x86/um/tls_32.c: In function 'get_free_idx': ../arch/x86/um/tls_32.c:68:13: warning: the comparison will always evaluate as 'true' for the address of 'tls_array' will never be NULL [-Waddress] 68 | if (!t->arch.tls_array) | ^ In file included from ../arch/x86/um/asm/processor.h:10, from ../include/linux/rcupdate.h:30, from ../include/linux/rculist.h:11, from ../include/linux/pid.h:5, from ../include/linux/sched.h:14, from ../arch/x86/um/tls_32.c:7: ../arch/x86/um/asm/processor_32.h:22:31: note: 'tls_array' declared here 22 | struct uml_tls_struct tls_array[GDT_ENTRY_TLS_ENTRIES]; | ^~~~~~~~~ ../arch/x86/um/tls_32.c: In function 'get_tls_entry': ../arch/x86/um/tls_32.c:243:13: warning: the comparison will always evaluate as 'true' for the address of 'tls_array' will never be NULL [-Waddress] 243 | if (!t->arch.tls_array) | ^ ../arch/x86/um/asm/processor_32.h:22:31: note: 'tls_array' declared here 22 | struct uml_tls_struct tls_array[GDT_ENTRY_TLS_ENTRIES]; | ^~~~~~~~~ Signed-off-by:
Lukas Straub <lukasstraub2@web.de> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by:
Richard Weinberger <richard@nod.at> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Lukas Straub authored
[ Upstream commit 61670b4d ] Like in f4f03f29 "um: Cleanup syscall_handler_t definition/cast, fix warning", remove the cast to to fix the compiler warning. Signed-off-by:
Lukas Straub <lukasstraub2@web.de> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by:
Richard Weinberger <richard@nod.at> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Haimin Zhang authored
[ Upstream commit 94160108 ] There is uninit value bug in dgram_sendmsg function in net/ieee802154/socket.c when the length of valid data pointed by the msg->msg_name isn't verified. We introducing a helper function ieee802154_sockaddr_check_size to check namelen. First we check there is addr_type in ieee802154_addr_sa. Then, we check namelen according to addr_type. Also fixed in raw_bind, dgram_bind, dgram_connect. Signed-off-by:
Haimin Zhang <tcs_kernel@tencent.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Letu Ren authored
[ Upstream commit fbfe9686 ] In __qedf_probe(), if qedf->cdev is NULL which means qed_ops->common->probe() failed, then the program will goto label err1, and scsi_host_put() will free lport->host pointer. Because the memory qedf points to is allocated by libfc_host_alloc(), it will be freed by scsi_host_put(). However, the if statement below label err0 only checks whether qedf is NULL but doesn't check whether the memory has been freed. So a UAF bug can occur. There are two ways to reach the statements below err0. The first one is described as before, "qedf" should be set to NULL. The second one is goto "err0" directly. In the latter scenario qedf hasn't been changed and it has the initial value NULL. As a result the if statement is not reachable in any situation. The KASAN logs are as follows: [ 2.312969] BUG: KASAN: use-after-free in __qedf_probe+0x5dcf/0x6bc0 [ 2.312969] [ 2.312969] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 2.312969] Call Trace: [ 2.312969] dump_stack_lvl+0x59/0x7b [ 2.312969] print_address_description+0x7c/0x3b0 [ 2.312969] ? __qedf_probe+0x5dcf/0x6bc0 [ 2.312969] __kasan_report+0x160/0x1c0 [ 2.312969] ? __qedf_probe+0x5dcf/0x6bc0 [ 2.312969] kasan_report+0x4b/0x70 [ 2.312969] ? kobject_put+0x25d/0x290 [ 2.312969] kasan_check_range+0x2ca/0x310 [ 2.312969] __qedf_probe+0x5dcf/0x6bc0 [ 2.312969] ? selinux_kernfs_init_security+0xdc/0x5f0 [ 2.312969] ? trace_rpm_return_int_rcuidle+0x18/0x120 [ 2.312969] ? rpm_resume+0xa5c/0x16e0 [ 2.312969] ? qedf_get_generic_tlv_data+0x160/0x160 [ 2.312969] local_pci_probe+0x13c/0x1f0 [ 2.312969] pci_device_probe+0x37e/0x6c0 Link: https://lore.kernel.org/r/20211112120641.16073-1-fantasquex@gmail.com Reported-by:
Zheyu Ma <zheyuma97@gmail.com> Acked-by:
Saurav Kashyap <skashyap@marvell.com> Co-developed-by:
Wende Tan <twd2.me@gmail.com> Signed-off-by:
Wende Tan <twd2.me@gmail.com> Signed-off-by:
Letu Ren <fantasquex@gmail.com> Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Sergei Antonov authored
[ Upstream commit 02181e68 ] Driver moxart-mmc.c has .compatible = "moxa,moxart-mmc". But moxart .dts/.dtsi and the documentation file moxa,moxart-dma.txt contain compatible = "moxa,moxart-sdhci". Change moxart .dts/.dtsi files and moxa,moxart-dma.txt to match the driver. Replace 'sdhci' with 'mmc' in names too, since SDHCI is a different controller from FTSDC010. Suggested-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Sergei Antonov <saproj@gmail.com> Cc: Jonas Jensen <jonas.jensen@gmail.com> Link: https://lore.kernel.org/r/20220907175341.1477383-1-saproj@gmail.com ' Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Swati Agarwal authored
[ Upstream commit 8f2b6bc7 ] The driver does not handle the failure case while calling dma_set_mask_and_coherent API. In case of failure, capture the return value of API and then report an error. Addresses-coverity: Unchecked return value (CHECKED_RETURN) Signed-off-by:
Swati Agarwal <swati.agarwal@xilinx.com> Reviewed-by:
Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Link: https://lore.kernel.org/r/20220817061125.4720-4-swati.agarwal@xilinx.com Signed-off-by:
Vinod Koul <vkoul@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Swati Agarwal authored
[ Upstream commit 462bce79 ] Free the allocated resources for missing xlnx,num-fstores property. Signed-off-by:
Swati Agarwal <swati.agarwal@xilinx.com> Link: https://lore.kernel.org/r/20220817061125.4720-3-swati.agarwal@xilinx.com Signed-off-by:
Vinod Koul <vkoul@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Cristian Marussi authored
[ Upstream commit dea796fc ] Currently, when removing the SCMI PM driver not all the resources registered with genpd subsystem are properly de-registered. As a side effect of this after a driver unload/load cycle you get a splat with a few warnings like this: | debugfs: Directory 'BIG_CPU0' with parent 'pm_genpd' already present! | debugfs: Directory 'BIG_CPU1' with parent 'pm_genpd' already present! | debugfs: Directory 'LITTLE_CPU0' with parent 'pm_genpd' already present! | debugfs: Directory 'LITTLE_CPU1' with parent 'pm_genpd' already present! | debugfs: Directory 'LITTLE_CPU2' with parent 'pm_genpd' already present! | debugfs: Directory 'LITTLE_CPU3' with parent 'pm_genpd' already present! | debugfs: Directory 'BIG_SSTOP' with parent 'pm_genpd' already present! | debugfs: Directory 'LITTLE_SSTOP' with parent 'pm_genpd' already present! | debugfs: Directory 'DBGSYS' with parent 'pm_genpd' already present! | debugfs: Directory 'GPUTOP' with parent 'pm_genpd' already present! Add a proper scmi_pm_domain_remove callback to the driver in order to take care of all the needed cleanups not handled by devres framework. Link: https://lore.kernel.org/r/20220817172731.1185305-7-cristian.marussi@arm.com Signed-off-by:
Cristian Marussi <cristian.marussi@arm.com> Signed-off-by:
Sudeep Holla <sudeep.holla@arm.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Dongliang Mu authored
commit 2e488f13 upstream. In alloc_inode, inode_init_always() could return -ENOMEM if security_inode_alloc() fails, which causes inode->i_private uninitialized. Then nilfs_is_metadata_file_inode() returns true and nilfs_free_inode() wrongly calls nilfs_mdt_destroy(), which frees the uninitialized inode->i_private and leads to crashes(e.g., UAF/GPF). Fix this by moving security_inode_alloc just prior to this_cpu_inc(nr_inodes) Link: https://lkml.kernel.org/r/CAFcO6XOcf1Jj2SeGt=jJV59wmhESeSKpfR0omdFRq+J9nD1vfQ@mail.gmail.com Reported-by:
butt3rflyh4ck <butterflyhuangxx@gmail.com> Reported-by:
Hao Sun <sunhao.th@gmail.com> Reported-by:
Jiacheng Xu <stitch@zju.edu.cn> Reviewed-by:
Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by:
Dongliang Mu <mudongliangabcd@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alexey Dobriyan authored
commit 128dbd78 upstream. strdup() prototype doesn't live in stdlib.h . Add limits.h for PATH_MAX definition as well. This fixes the build on Android. Signed-off-by:
Alexey Dobriyan (SK hynix) <adobriyan@gmail.com> Acked-by:
Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/YRukaQbrgDWhiwGr@localhost.localdomain Signed-off-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
Florian Fainelli <f.fainelli@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Steven Price authored
commit 8782fb61 upstream. The mmap lock protects the page walker from changes to the page tables during the walk. However a read lock is insufficient to protect those areas which don't have a VMA as munmap() detaches the VMAs before downgrading to a read lock and actually tearing down PTEs/page tables. For users of walk_page_range() the solution is to simply call pte_hole() immediately without checking the actual page tables when a VMA is not present. We now never call __walk_page_range() without a valid vma. For walk_page_range_novma() the locking requirements are tightened to require the mmap write lock to be taken, and then walking the pgd directly with 'no_vma' set. This in turn means that all page walkers either have a valid vma, or it's that special 'novma' case for page table debugging. As a result, all the odd '(!walk->vma && !walk->no_vma)' tests can be removed. Fixes: dd2283f2 ("mm: mmap: zap pages with read mmap_sem in munmap") Reported-by:
Jann Horn <jannh@google.com> Signed-off-by:
Steven Price <steven.price@arm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> [manually backported. backport note: walk_page_range_novma() does not exist in 5.4, so I'm omitting it from the backport] Signed-off-by:
Jann Horn <jannh@google.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Oct 07, 2022
-
-
Greg Kroah-Hartman authored
Link: https://lore.kernel.org/r/20221005113210.255710920@linuxfoundation.org Reviewed-by:
Guenter Roeck <linux@roeck-us.net> Tested-by:
Jon Hunter <jonathanh@nvidia.com> Tested-by:
Linux Kernel Functional Testing <lkft@linaro.org> Tested-by:
Slade Watkins <srw@sladewatkins.net> Tested-by:
Allen Pais <apais@linux.microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Shuah Khan authored
commit 8bfdfa0d upstream. Update mediator information in the CoC interpretation document. Signed-off-by:
Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20220901212319.56644-1-skhan@linuxfoundation.org Cc: stable@vger.kernel.org Signed-off-by:
Jonathan Corbet <corbet@lwn.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sami Tolvanen authored
commit 21206351 upstream. We enable -Wcast-function-type globally in the kernel to warn about mismatching types in function pointer casts. Compilers currently warn only about ABI incompability with this flag, but Clang 16 will enable a stricter version of the check by default that checks for an exact type match. This will be very noisy in the kernel, so disable -Wcast-function-type-strict without W=1 until the new warnings have been addressed. Cc: stable@vger.kernel.org Link: https://reviews.llvm.org/D134831 Link: https://github.com/ClangBuiltLinux/linux/issues/1724 Suggested-by:
Nathan Chancellor <nathan@kernel.org> Signed-off-by:
Sami Tolvanen <samitolvanen@google.com> Signed-off-by:
Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220930203310.4010564-1-samitolvanen@google.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Greg Kroah-Hartman authored
This reverts commit c89849ec which is commit 66f99628 upstream. It is reported to cause problems on 5.4.y so it should be reverted for now. Reported-by:
Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/7af02bc3-c0f2-7326-e467-02549e88c9ce@linuxfoundation.org Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Sasha Levin <sashal@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
YueHaibing authored
commit b3531f5f upstream. fs/xfs/xfs_inode.c: In function 'xfs_itruncate_extents_flags': fs/xfs/xfs_inode.c:1523:8: warning: unused variable 'done' [-Wunused-variable] commit 4bbb04ab ("xfs: truncate should remove all blocks, not just to the end of the page cache") left behind this, so remove it. Fixes: 4bbb04ab ("xfs: truncate should remove all blocks, not just to the end of the page cache") Reported-by:
Hulk Robot <hulkci@huawei.com> Reported-by:
Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by:
YueHaibing <yuehaibing@huawei.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Acked-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandan.babu@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Darrick J. Wong authored
commit 54027a49 upstream. Dan Carpenter pointed out that error is uninitialized. While there never should be an attr leaf block with zero entries, let's not leave that logic bomb there. Fixes: 0bb9d159 ("xfs: streamline xfs_attr3_leaf_inactive") Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Allison Collins <allison.henderson@oracle.com> Reviewed-by:
Eric Sandeen <sandeen@redhat.com> Acked-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandan.babu@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Darrick J. Wong authored
commit 0bb9d159 upstream. Now that we know we don't have to take a transaction to stale the incore buffers for a remote value, get rid of the unnecessary memory allocation in the leaf walker and call the rmt_stale function directly. Flatten the loop while we're at it. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Acked-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandan.babu@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christoph Hellwig authored
commit a39f089a upstream. Move the abstract in-memory version of various btree block headers out of xfs_da_format.h as they aren't on-disk formats. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Acked-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandan.babu@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Darrick J. Wong authored
commit e8db2aaf upstream. [Replaced XFS_IS_CORRUPT() calls with ASSERT() for 5.4.y backport] While running generic/103, I observed what looks like memory corruption and (with slub debugging turned on) a slub redzone warning on i386 when inactivating an inode with a 64k remote attr value. On a v5 filesystem, maximally sized remote attr values require one block more than 64k worth of space to hold both the remote attribute value header (64 bytes). On a 4k block filesystem this results in a 68k buffer; on a 64k block filesystem, this would be a 128k buffer. Note that even though we'll never use more than 65,600 bytes of this buffer, XFS_MAX_BLOCKSIZE is 64k. This is a problem because the definition of struct xfs_buf_log_format allows for XFS_MAX_BLOCKSIZE worth of dirty bitmap (64k). On i386 when we invalidate a remote attribute, xfs_trans_binval zeroes all 68k worth of the dirty map, writing right off the end of the log item and corrupting memory. We've gotten away with this on x86_64 for years because the compiler inserts a u32 padding on the end of struct xfs_buf_log_format. Fortunately for us, remote attribute values are written to disk with xfs_bwrite(), which is to say that they are not logged. Fix the problem by removing all places where we could end up creating a buffer log item for a remote attribute value and leave a note explaining why. Next, replace the open-coded buffer invalidation with a call to the helper we created in the previous patch that does better checking for bad metadata before marking the buffer stale. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Acked-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandan.babu@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Darrick J. Wong authored
commit 8edbb26b upstream. [Replaced XFS_IS_CORRUPT() calls with ASSERT() for 5.4.y backport] Hoist the code that invalidates remote extended attribute value buffers into a separate helper function. This prepares us for a memory corruption fix in the next patch. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Acked-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandan.babu@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christoph Hellwig authored
commit 7b53b868 upstream. Direct I/O reads can also be used with RWF_NOWAIT & co. Fix the inode locking in xfs_file_dio_aio_read to take IOCB_NOWAIT into account. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Acked-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandan.babu@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Darrick J. Wong authored
commit 932befe3 upstream. I observed a hang in generic/308 while running fstests on a i686 kernel. The hang occurred when trying to purge the pagecache on a large sparse file that had a page created past MAX_LFS_FILESIZE, which caused an integer overflow in the pagecache xarray and resulted in an infinite loop. I then noticed that Linus changed the definition of MAX_LFS_FILESIZE in commit 0cc3b0ec ("Clarify (and fix) MAX_LFS_FILESIZE macros") so that it is now one page short of the maximum page index on 32-bit kernels. Because the XFS function to compute max offset open-codes the 2005-era MAX_LFS_FILESIZE computation and neither the vfs nor mm perform any sanity checking of s_maxbytes, the code in generic/308 can create a page above the pagecache's limit and kaboom. Fix all this by setting s_maxbytes to MAX_LFS_FILESIZE directly and aborting the mount with a warning if our assumptions ever break. I have no answer for why this seems to have been broken for years and nobody noticed. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Acked-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandan.babu@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Darrick J. Wong authored
commit 4bbb04ab upstream. xfs_itruncate_extents_flags() is supposed to unmap every block in a file from EOF onwards. Oddly, it uses s_maxbytes as the upper limit to the bunmapi range, even though s_maxbytes reflects the highest offset the pagecache can support, not the highest offset that XFS supports. The result of this confusion is that if you create a 20T file on a 64-bit machine, mount the filesystem on a 32-bit machine, and remove the file, we leak everything above 16T. Fix this by capping the bunmapi request at the maximum possible block offset, not s_maxbytes. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Acked-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandan.babu@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Darrick J. Wong authored
commit a5084865 upstream. Introduce a new #define for the maximum supported file block offset. We'll use this in the next patch to make it more obvious that we're doing some operation for all possible inode fork mappings after a given offset. We can't use ULLONG_MAX here because bunmapi uses that to detect when it's done. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Acked-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandan.babu@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christoph Hellwig authored
commit 780d2905 upstream. XFS_ATTR_INCOMPLETE is a flag in the on-disk attribute format, and thus in a different namespace as the ATTR_* flags in xfs_da_args.flags. Switch to using a XFS_DA_OP_INCOMPLETE flag in op_flags instead. Without this users might be able to inject this flag into operations using the attr by handle ioctl. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Acked-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Chandan Babu R <chandan.babu@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daniel Sneddon authored
commit 2b129932 upstream. tl;dr: The Enhanced IBRS mitigation for Spectre v2 does not work as documented for RET instructions after VM exits. Mitigate it with a new one-entry RSB stuffing mechanism and a new LFENCE. == Background == Indirect Branch Restricted Speculation (IBRS) was designed to help mitigate Branch Target Injection and Speculative Store Bypass, i.e. Spectre, attacks. IBRS prevents software run in less privileged modes from affecting branch prediction in more privileged modes. IBRS requires the MSR to be written on every privilege level change. To overcome some of the performance issues of IBRS, Enhanced IBRS was introduced. eIBRS is an "always on" IBRS, in other words, just turn it on once instead of writing the MSR on every privilege level change. When eIBRS is enabled, more privileged modes should be protected from less privileged modes, including protecting VMMs from guests. == Problem == Here's a simplification of how guests are run on Linux' KVM: void run_kvm_guest(void) { // Prepare to run guest VMRESUME(); // Clean up after guest runs } The execution flow for that would look something like this to the processor: 1. Host-side: call run_kvm_guest() 2. Host-side: VMRESUME 3. Guest runs, does "CALL guest_function" 4. VM exit, host runs again 5. Host might make some "cleanup" function calls 6. Host-side: RET from run_kvm_guest() Now, when back on the host, there are a couple of possible scenarios of post-guest activity the host needs to do before executing host code: * on pre-eIBRS hardware (legacy IBRS, or nothing at all), the RSB is not touched and Linux has to do a 32-entry stuffing. * on eIBRS hardware, VM exit with IBRS enabled, or restoring the host IBRS=1 shortly after VM exit, has a documented side effect of flushing the RSB except in this PBRSB situation where the software needs to stuff the last RSB entry "by hand". IOW, with eIBRS supported, host RET instructions should no longer be influenced by guest behavior after the host retires a single CALL instruction. However, if the RET instructions are "unbalanced" with CALLs after a VM exit as is the RET in #6, it might speculatively use the address for the instruction after the CALL in #3 as an RSB prediction. This is a problem since the (untrusted) guest controls this address. Balanced CALL/RET instruction pairs such as in step #5 are not affected. == Solution == The PBRSB issue affects a wide variety of Intel processors which support eIBRS. But not all of them need mitigation. Today, X86_FEATURE_RSB_VMEXIT triggers an RSB filling sequence that mitigates PBRSB. Systems setting RSB_VMEXIT need no further mitigation - i.e., eIBRS systems which enable legacy IBRS explicitly. However, such systems (X86_FEATURE_IBRS_ENHANCED) do not set RSB_VMEXIT and most of them need a new mitigation. Therefore, introduce a new feature flag X86_FEATURE_RSB_VMEXIT_LITE which triggers a lighter-weight PBRSB mitigation versus RSB_VMEXIT. The lighter-weight mitigation performs a CALL instruction which is immediately followed by a speculative execution barrier (INT3). This steers speculative execution to the barrier -- just like a retpoline -- which ensures that speculation can never reach an unbalanced RET. Then, ensure this CALL is retired before continuing execution with an LFENCE. In other words, the window of exposure is opened at VM exit where RET behavior is troublesome. While the window is open, force RSB predictions sampling for RET targets to a dead end at the INT3. Close the window with the LFENCE. There is a subset of eIBRS systems which are not vulnerable to PBRSB. Add these systems to the cpu_vuln_whitelist[] as NO_EIBRS_PBRSB. Future systems that aren't vulnerable will set ARCH_CAP_PBRSB_NO. [ bp: Massage, incorporate review comments from Andy Cooper. ] Signed-off-by:
Daniel Sneddon <daniel.sneddon@linux.intel.com> Co-developed-by:
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by:
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by:
Borislav Petkov <bp@suse.de> [cascardo: no intra-function validation] Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pawan Gupta authored
commit eb23b5ef upstream. IBRS mitigation for spectre_v2 forces write to MSR_IA32_SPEC_CTRL at every kernel entry/exit. On Enhanced IBRS parts setting MSR_IA32_SPEC_CTRL[IBRS] only once at boot is sufficient. MSR writes at every kernel entry/exit incur unnecessary performance loss. When Enhanced IBRS feature is present, print a warning about this unnecessary performance loss. Signed-off-by:
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/2a5eaf54583c2bfe0edc4fea64006656256cca17.1657814857.git.pawan.kumar.gupta@linux.intel.com Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nathan Chancellor authored
commit db886979 upstream. Clang warns: arch/x86/kernel/cpu/bugs.c:58:21: error: section attribute is specified on redeclared variable [-Werror,-Wsection] DEFINE_PER_CPU(u64, x86_spec_ctrl_current); ^ arch/x86/include/asm/nospec-branch.h:283:12: note: previous declaration is here extern u64 x86_spec_ctrl_current; ^ 1 error generated. The declaration should be using DECLARE_PER_CPU instead so all attributes stay in sync. Cc: stable@vger.kernel.org Fixes: fc02735b ("KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS") Reported-by:
kernel test robot <lkp@intel.com> Signed-off-by:
Nathan Chancellor <nathan@kernel.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pawan Gupta authored
commit 4ad3278d upstream. Some Intel processors may use alternate predictors for RETs on RSB-underflow. This condition may be vulnerable to Branch History Injection (BHI) and intramode-BTI. Kernel earlier added spectre_v2 mitigation modes (eIBRS+Retpolines, eIBRS+LFENCE, Retpolines) which protect indirect CALLs and JMPs against such attacks. However, on RSB-underflow, RET target prediction may fallback to alternate predictors. As a result, RET's predicted target may get influenced by branch history. A new MSR_IA32_SPEC_CTRL bit (RRSBA_DIS_S) controls this fallback behavior when in kernel mode. When set, RETs will not take predictions from alternate predictors, hence mitigating RETs as well. Support for this is enumerated by CPUID.7.2.EDX[RRSBA_CTRL] (bit2). For spectre v2 mitigation, when a user selects a mitigation that protects indirect CALLs and JMPs against BHI and intramode-BTI, set RRSBA_DIS_S also to protect RETs for RSB-underflow case. Signed-off-by:
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by:
Borislav Petkov <bp@suse.de> [cascardo: no tools/arch/x86/include/asm/msr-index.h] Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-