- Mar 20, 2023
-
-
Ondrej Mosnacek authored
Linux Security Modules (LSMs) that implement the "capable" hook will usually emit an access denial message to the audit log whenever they "block" the current task from using the given capability based on their security policy. The occurrence of a denial is used as an indication that the given task has attempted an operation that requires the given access permission, so the callers of functions that perform LSM permission checks must take care to avoid calling them too early (before it is decided if the permission is actually needed to perform the requested operation). The __sys_setres[ug]id() functions violate this convention by first calling ns_capable_setid() and only then checking if the operation requires the capability or not. It means that any caller that has the capability granted by DAC (task's capability set) but not by MAC (LSMs) will generate a "denied" audit record, even if it is doing an operation for which the capability is not required. Fix this by reordering the checks such that ns_capable_setid() is checked last and -EPERM is returned immediately if it returns false. While there, also do two small optimizations: * move the capability check before prepare_creds() and * bail out early in case of a no-op. Link: https://lkml.kernel.org/r/20230217162154.837549-1-omosnace@redhat.com Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by:
Ondrej Mosnacek <omosnace@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
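A minimal userspace sketch of the check ordering described above; the types and the capable_setid_stub() helper are invented stand-ins rather than the kernel's API, and the real __sys_setresuid() handles more cases. The point is only that the no-op and "is the capability even needed" decisions happen before the LSM hook can log a denial.

```c
#include <stdbool.h>
#include <stdio.h>

typedef unsigned int xuid_t;            /* stand-in for the kernel's uid types */
#define NO_CHANGE ((xuid_t)-1)

struct creds { xuid_t uid, euid, suid; };

/* Stand-in for ns_capable_setid(); a real LSM would log a denial here. */
static bool capable_setid_stub(void)
{
    puts("LSM asked about CAP_SETUID");
    return true;
}

static int setresuid_sketch(const struct creds *old, xuid_t ruid, xuid_t euid, xuid_t suid)
{
    /* A new id only "needs" CAP_SETUID if it is not already in the old set. */
    bool ruid_new = ruid != NO_CHANGE && ruid != old->uid && ruid != old->euid && ruid != old->suid;
    bool euid_new = euid != NO_CHANGE && euid != old->uid && euid != old->euid && euid != old->suid;
    bool suid_new = suid != NO_CHANGE && suid != old->uid && suid != old->euid && suid != old->suid;

    if (!ruid_new && !euid_new && !suid_new)
        return 0;                       /* no-op: bail out before any LSM check */

    if (!capable_setid_stub())
        return -1;                      /* -EPERM in the kernel */

    /* ... only now allocate the new creds and commit the change ... */
    return 0;
}

int main(void)
{
    struct creds old = { 1000, 1000, 1000 };
    /* Same values as the current creds: returns without ever consulting the LSM. */
    return setresuid_sketch(&old, 1000, 1000, 1000);
}
```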
-
Enric Balletbo i Serra authored
Map Enric's old corporate addresses to his kernel.org address. Link: https://lkml.kernel.org/r/20230314115455.188818-1-eballetbo@kernel.org Signed-off-by:
Enric Balletbo i Serra <eballetbo@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
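For context, a .mailmap entry maps an old commit address to the canonical identity, with the canonical identity listed first; the old address below is a placeholder for illustration, not Enric's actual former address.

```
# canonical identity first, the address to be remapped last
Enric Balletbo i Serra <eballetbo@kernel.org> <enric.balletbo@oldcorp.example>
```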
-
Konrad Dybcio authored
Sai's old email is still picked up by the likes of get_maintainer.pl and keeps bouncing like all other @codeaurora.org addresses. Map it to his current one. Link: https://lkml.kernel.org/r/20230314125604.2734146-1-konrad.dybcio@linaro.org Signed-off-by:
Konrad Dybcio <konrad.dybcio@linaro.org> Cc: Sai Prakash Ranjan <quic_saipraka@quicinc.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Konrad Dybcio authored
Rajendra's old email is still picked up by the likes of get_maintainer.pl and keeps bouncing like all other @codeaurora.org addresses. Map it to his current one. Link: https://lkml.kernel.org/r/20230313090343.2148346-1-konrad.dybcio@linaro.org Signed-off-by:
Konrad Dybcio <konrad.dybcio@linaro.org> Cc: Rajendra Nayak <quic_rjendra@quicinc.com> Cc: Andy Gross <agross@kernel.org> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Bjorn Andersson <andersson@kernel.org> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Kirill Tkhai <tkhai@ya.ru> Cc: Marijn Suijten <marijn.suijten@somainline.org> Cc: Qais Yousef <qyousef@layalina.io> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Vasily Averin <vasily.averin@linux.dev> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Peter Collingbourne authored
This reverts commit 487a32ec. should_skip_kasan_poison() reads the PG_skip_kasan_poison flag from page->flags. However, this line of code in free_pages_prepare(): page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; clears most of page->flags, including PG_skip_kasan_poison, before calling should_skip_kasan_poison(), which meant that it would never return true as a result of the page flag being set. Therefore, fix the code to call should_skip_kasan_poison() before clearing the flags, as we were doing before the reverted patch. This fixes a measurable performance regression introduced in the reverted commit, where munmap() takes longer than intended if HW tags KASAN is supported and enabled at runtime. Without this patch, we see a single-digit percentage performance regression in a particular mmap()-heavy benchmark when enabling HW tags KASAN, and with the patch, there is no statistically significant performance impact when enabling HW tags KASAN. Link: https://lkml.kernel.org/r/20230310042914.3805818-2-pcc@google.com Fixes: 487a32ec ("kasan: drop skip_kasan_poison variable in free_pages_prepare") Link: https://linux-review.googlesource.com/id/Ic4f13affeebd20548758438bb9ed9ca40e312b79 Signed-off-by:
Peter Collingbourne <pcc@google.com> Reviewed-by:
Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> [arm64] Cc: Evgenii Stepanov <eugenis@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> [6.1] Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
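A self-contained toy model of the ordering fix, with invented flag names and a plain struct standing in for struct page; it only demonstrates why the flag must be read before the mass-clear of page->flags.

```c
#include <assert.h>
#include <stdbool.h>

#define PG_SKIP_KASAN_POISON_X (1UL << 3)   /* stand-in flag bit */
#define FLAGS_CHECK_AT_PREP_X  (~0UL)       /* stand-in mask cleared at prep time */

struct page_x { unsigned long flags; };

static bool should_skip_kasan_poison_x(const struct page_x *page)
{
    return page->flags & PG_SKIP_KASAN_POISON_X;
}

static void free_pages_prepare_x(struct page_x *page)
{
    /* Fixed ordering: consult the flag first, then mass-clear page->flags.
     * Swapping these two lines reproduces the regression: the answer is
     * always false because the bit has already been wiped. */
    bool skip = should_skip_kasan_poison_x(page);
    page->flags &= ~FLAGS_CHECK_AT_PREP_X;
    assert(skip);
}

int main(void)
{
    struct page_x page = { .flags = PG_SKIP_KASAN_POISON_X };
    free_pages_prepare_x(&page);
    return 0;
}
```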
-
Tobias Klauser authored
Map my old email addresses to the current address. Link: https://lkml.kernel.org/r/20230310123508.22079-1-tklauser@distanz.ch Signed-off-by:
Tobias Klauser <tklauser@distanz.ch> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Marco Elver authored
With appropriate compiler support [1], KASAN builds use __asan prefixed memintrinsics, and KASAN no longer overrides memcpy/memset/memmove. If compiler support is detected (CC_HAS_KASAN_MEMINTRINSIC_PREFIX), define memintrinsics normally (do not prefix '__'). On powerpc, KASAN is the only user of __mem functions, which are used to define instrumented memintrinsics. Alias the normal versions for KASAN to use in its implementation. Link: https://lore.kernel.org/all/20230224085942.1791837-1-elver@google.com/ [1] Link: https://lore.kernel.org/oe-kbuild-all/202302271348.U5lvmo0S-lkp@intel.com/ Link: https://lkml.kernel.org/r/20230227094726.3833247-1-elver@google.com Signed-off-by:
Marco Elver <elver@google.com> Reported-by:
kernel test robot <lkp@intel.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Daniel Axtens <dja@axtens.net> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
exit_mmap() will tear down the VMAs and maple tree with the mmap_lock held in write mode. Ensure that the maple tree is still valid by checking ksm_test_exit() after taking the mmap_lock in read mode, but before the for_each_vma() iterator dereferences a destroyed maple tree. Since the maple tree is destroyed, the flag telling lockdep to check an external lock has been cleared. Skip the for_each_vma() iterator to avoid dereferencing a maple tree without the external lock flag, which would create a lockdep warning. Link: https://lkml.kernel.org/r/20230308220310.3119196-1-Liam.Howlett@oracle.com Fixes: a5f18ba0 ("mm/ksm: use vma iterators instead of vma linked list") Signed-off-by:
Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by:
Pengfei Xu <pengfei.xu@intel.com> Link: https://lore.kernel.org/lkml/ZAdUUhSbaa6fHS36@xpf.sh.intel.com/ Reported-by:
<syzbot+2ee18845e89ae76342c5@syzkaller.appspotmail.com> Link: https://syzkaller.appspot.com/bug?id=64a3e95957cd3deab99df7cd7b5a9475af92c93e Acked-by:
David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: <heng.su@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
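A schematic pthread-based model of the guard described above; mm_x, its exiting field, and walk_vmas_sketch() are hypothetical stand-ins for mm_struct, ksm_test_exit() and the for_each_vma() walk, not kernel code.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct mm_x {
    pthread_rwlock_t lock;
    bool exiting;                    /* set (under the write lock) by the exit path */
    int nr_vmas;
};

static int walk_vmas_sketch(struct mm_x *mm)
{
    int seen = 0;

    pthread_rwlock_rdlock(&mm->lock);
    if (mm->exiting) {               /* ksm_test_exit() analogue: the tree is gone */
        pthread_rwlock_unlock(&mm->lock);
        return 0;                    /* skip the iterator entirely */
    }
    for (int i = 0; i < mm->nr_vmas; i++)
        seen++;                      /* safe: the structure is still valid */
    pthread_rwlock_unlock(&mm->lock);
    return seen;
}

int main(void)                       /* build with -lpthread */
{
    struct mm_x mm = { .lock = PTHREAD_RWLOCK_INITIALIZER, .exiting = false, .nr_vmas = 3 };
    printf("walked %d vmas\n", walk_vmas_sketch(&mm));
    return 0;
}
```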
-
Peter Xu authored
Sync prctl.h after the changes in b507808e ("mm: implement memory-deny-write-execute as a prctl") [joey.gouly@arm.com: add commit message] Link: https://lkml.kernel.org/r/20230308190423.46491-5-joey.gouly@arm.com Fixes: 4cf1fe34 ("kselftest: vm: add tests for memory-deny-write-execute") Signed-off-by:
Peter Xu <peterx@redhat.com> Signed-off-by:
Joey Gouly <joey.gouly@arm.com> Acked-by:
Catalin Marinas <catalin.marinas@arm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexey Izbyshev <izbyshev@ispras.ru> Cc: Kees Cook <keescook@chromium.org> Cc: nd <nd@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Peter Xu authored
Remove unused variable from the MDWE test. [joey.gouly@arm.com: add commit message] Link: https://lkml.kernel.org/r/20230308190423.46491-4-joey.gouly@arm.com Fixes: 4cf1fe34 ("kselftest: vm: add tests for memory-deny-write-execute") Signed-off-by:
Peter Xu <peterx@redhat.com> Signed-off-by:
Joey Gouly <joey.gouly@arm.com> Acked-by:
Catalin Marinas <catalin.marinas@arm.com> Cc: Alexey Izbyshev <izbyshev@ispras.ru> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: nd <nd@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Joey Gouly authored
Commit 4a18419f ("mm/mprotect: use mmu_gather") changed 'goto out;' to 'break' in the loop. This wasn't noticed while rebasing the MDWE patches, so fix it now. Link: https://lkml.kernel.org/r/20230308190423.46491-3-joey.gouly@arm.com Fixes: b507808e ("mm: implement memory-deny-write-execute as a prctl") Signed-off-by:
Joey Gouly <joey.gouly@arm.com> Reported-by:
Alexey Izbyshev <izbyshev@ispras.ru> Link: https://lore.kernel.org/linux-arm-kernel/8408d8901e9d7ee6b78db4c6cba04b78@ispras.ru/ Reviewed-by:
Catalin Marinas <catalin.marinas@arm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: nd <nd@arm.com> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Joey Gouly authored
Patch series "Fixes for MDWE prctl" These are four small fixes for the recent memory-write-deny-execute prctl patches [1]. Two reported by Alexey about error handling and two tooling fixes by Peter. This patch (of 4): Commit cc8d1b09 ("mmap: clean up mmap_region() unrolling") deduplicated the error handling, do the same for the return value of `map_deny_write_exec`. Link: https://lkml.kernel.org/r/20230308190423.46491-1-joey.gouly@arm.com Link: https://lkml.kernel.org/r/20230308190423.46491-2-joey.gouly@arm.com Link: https://lore.kernel.org/linux-arm-kernel/20230119160344.54358-1-joey.gouly@arm.com/ [1] Fixes: b507808e ("mm: implement memory-deny-write-execute as a prctl") Signed-off-by:
Joey Gouly <joey.gouly@arm.com> Reported-by:
Alexey Izbyshev <izbyshev@ispras.ru> Link: https://lore.kernel.org/linux-arm-kernel/8408d8901e9d7ee6b78db4c6cba04b78@ispras.ru/ Reviewed-by:
Catalin Marinas <catalin.marinas@arm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: nd <nd@arm.com> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Tiezhu Yang authored
fstat is replaced by statx on the new architecture, so an exception is added to the checksyscalls script to silence the following build warning on LoongArch: CALL scripts/checksyscalls.sh <stdin>:569:2: warning: #warning syscall fstat not implemented [-Wcpp] Link: https://lkml.kernel.org/r/1678175940-20872-1-git-send-email-yangtiezhu@loongson.cn Signed-off-by:
Tiezhu Yang <yangtiezhu@loongson.cn> Suggested-by:
WANG Xuerui <kernel@xen0n.name> Suggested-by:
Arnd Bergmann <arnd@arndb.de> Reviewed-by:
Arnd Bergmann <arnd@arndb.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
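A rough sketch of the mechanism only, not the actual hunk added by this patch: the generated checker warns about every syscall an architecture lacks unless a matching __IGNORE_* define marks the omission as intentional.

```c
#include <stdio.h>

/* LoongArch does not define __NR_fstat ... */
#define __IGNORE_fstat 1          /* ... and declares the omission intentional */

#if !defined(__NR_fstat) && !defined(__IGNORE_fstat)
#warning syscall fstat not implemented
#endif

int main(void)
{
    puts("no warning emitted");
    return 0;
}
```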
-
Ryusuke Konishi authored
The ioctl helper function nilfs_ioctl_wrap_copy(), which exchanges a metadata array to/from user space, may copy uninitialized buffer regions to user space memory for read-only ioctl commands NILFS_IOCTL_GET_SUINFO and NILFS_IOCTL_GET_CPINFO. This can occur when the element size of the user space metadata given by the v_size member of the argument nilfs_argv structure is larger than the size of the metadata element (nilfs_suinfo structure or nilfs_cpinfo structure) on the file system side. KMSAN-enabled kernels detect this issue as follows: BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:121 [inline] BUG: KMSAN: kernel-infoleak in _copy_to_user+0xc0/0x100 lib/usercopy.c:33 instrument_copy_to_user include/linux/instrumented.h:121 [inline] _copy_to_user+0xc0/0x100 lib/usercopy.c:33 copy_to_user include/linux/uaccess.h:169 [inline] nilfs_ioctl_wrap_copy+0x6fa/0xc10 fs/nilfs2/ioctl.c:99 nilfs_ioctl_get_info fs/nilfs2/ioctl.c:1173 [inline] nilfs_ioctl+0x2402/0x4450 fs/nilfs2/ioctl.c:1290 nilfs_compat_ioctl+0x1b8/0x200 fs/nilfs2/ioctl.c:1343 __do_compat_sys_ioctl fs/ioctl.c:968 [inline] __se_compat_sys_ioctl+0x7dd/0x1000 fs/ioctl.c:910 __ia32_compat_sys_ioctl+0x93/0xd0 fs/ioctl.c:910 do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline] __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178 do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203 do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246 entry_SYSENTER_compat_after_hwframe+0x70/0x82 Uninit was created at: __alloc_pages+0x9f6/0xe90 mm/page_alloc.c:5572 alloc_pages+0xab0/0xd80 mm/mempolicy.c:2287 __get_free_pages+0x34/0xc0 mm/page_alloc.c:5599 nilfs_ioctl_wrap_copy+0x223/0xc10 fs/nilfs2/ioctl.c:74 nilfs_ioctl_get_info fs/nilfs2/ioctl.c:1173 [inline] nilfs_ioctl+0x2402/0x4450 fs/nilfs2/ioctl.c:1290 nilfs_compat_ioctl+0x1b8/0x200 fs/nilfs2/ioctl.c:1343 __do_compat_sys_ioctl fs/ioctl.c:968 [inline] __se_compat_sys_ioctl+0x7dd/0x1000 fs/ioctl.c:910 __ia32_compat_sys_ioctl+0x93/0xd0 fs/ioctl.c:910 do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline] __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178 do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203 do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246 entry_SYSENTER_compat_after_hwframe+0x70/0x82 Bytes 16-127 of 3968 are uninitialized ... This eliminates the leak issue by initializing the page allocated as buffer using get_zeroed_page(). Link: https://lkml.kernel.org/r/20230307085548.6290-1-konishi.ryusuke@gmail.com Signed-off-by:
Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by:
<syzbot+132fdd2f1e1805fdc591@syzkaller.appspotmail.com> Link: https://lkml.kernel.org/r/000000000000a5bd2d05f63f04ae@google.com Tested-by:
Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
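A userspace analogy (not the nilfs code) of why the buffer must start out zeroed: when the caller-specified element size v_size is larger than what is actually written per element, any untouched bytes in the transfer buffer reach the caller, so calloc() here plays the role of get_zeroed_page().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct suinfo_x { unsigned long nblocks; };     /* stand-in metadata element */

int main(void)
{
    size_t v_size = 128;                        /* caller-specified element size */
    size_t n = 4;

    /* Zeroed allocation: no stale bytes can leak into the unwritten tail. */
    unsigned char *buf = calloc(n, v_size);
    if (!buf)
        return 1;

    for (size_t i = 0; i < n; i++) {
        struct suinfo_x si = { .nblocks = i };
        memcpy(buf + i * v_size, &si, sizeof(si));  /* only sizeof(si) bytes written */
    }
    /* Bytes sizeof(si)..v_size-1 of each slot are guaranteed zero, not stale data. */
    printf("slot 0 tail byte: %d\n", buf[sizeof(struct suinfo_x)]);
    free(buf);
    return 0;
}
```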
-
Liam R. Howlett authored
Test robust filling of an entire area of the tree, then test one beyond. This tests walking back up the tree at the end of nodes and the error condition. Test inspired by the reproducer code provided by Snild Dolkow. The last test in the function tests for the case of a corrupted maple state caused by the incorrect limits set during mas_skip_node(). There needs to be a gap in the second last child and last child, but the search must rule out the second last child's gap. This would avoid correcting the maple state to the correct max limit and return an error. Link: https://lkml.kernel.org/r/20230307180247.2220303-3-Liam.Howlett@oracle.com Cc: Snild Dolkow <snild@sony.com> Link: https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.com/ Fixes: e15e06a8 ("lib/test_maple_tree: add testing for maple tree") Signed-off-by:
Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Patch series "Fix mas_skip_node() for mas_empty_area()", v2. mas_empty_area() was incorrectly returning an error when there was room. The issue was tracked down to mas_skip_node() using the incorrect end-of-slot count. Instead of using the nodes hard limit, the limit of data should be used. mas_skip_node() was also setting the min and max to that of the child node, which was unnecessary. Within these limits being set, there was also a bug that corrupted the maple state's max if the offset was set to the maximum node pivot. The bug was without consequence unless there was a sufficient gap in the next child node which would cause an error to be returned. This patch set fixes these errors by removing the limit setting from mas_skip_node() and uses the mas_data_end() for slot limits, and adds tests for all failures discovered. This patch (of 2): mas_skip_node() is used to move the maple state to the node with a higher limit. It does this by walking up the tree and increasing the slot count. Since slot count may not be able to be increased, it may need to walk up multiple times to find room to walk right to a higher limit node. The limit of slots that was being used was the node limit and not the last location of data in the node. This would cause the maple state to be shifted outside actual data and enter an error state, thus returning -EBUSY. The result of the incorrect error state means that mas_awalk() would return an error instead of finding the allocation space. The fix is to use mas_data_end() in mas_skip_node() to detect the nodes data end point and continue walking the tree up until it is safe to move to a node with a higher limit. The walk up the tree also sets the maple state limits so remove the buggy code from mas_skip_node(). Setting the limits had the unfortunate side effect of triggering another bug if the parent node was full and the there was no suitable gap in the second last child, but room in the next child. mas_skip_node() may also be passed a maple state in an error state from mas_anode_descend() when no allocations are available. Return on such an error state immediately. Link: https://lkml.kernel.org/r/20230307180247.2220303-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20230307180247.2220303-2-Liam.Howlett@oracle.com Fixes: 54a611b6 ("Maple Tree: add new data structure") Signed-off-by:
Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by:
Snild Dolkow <snild@sony.com> Link: https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.com/ Tested-by:
Snild Dolkow <snild@sony.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Michal Hocko authored
Gao Xiang has reported that the page allocator complains about a high order __GFP_NOFAIL request coming from the vmalloc core: __alloc_pages+0x1cb/0x5b0 mm/page_alloc.c:5549 alloc_pages+0x1aa/0x270 mm/mempolicy.c:2286 vm_area_alloc_pages mm/vmalloc.c:2989 [inline] __vmalloc_area_node mm/vmalloc.c:3057 [inline] __vmalloc_node_range+0x978/0x13c0 mm/vmalloc.c:3227 kvmalloc_node+0x156/0x1a0 mm/util.c:606 kvmalloc include/linux/slab.h:737 [inline] kvmalloc_array include/linux/slab.h:755 [inline] kvcalloc include/linux/slab.h:760 [inline] it seems that I completely missed the case of high order allocations backing vmalloc areas when implementing __GFP_NOFAIL support. This means that [k]vmalloc et al. can allocate higher order allocations with __GFP_NOFAIL which can trigger the OOM killer for non-costly orders easily or cause a lot of reclaim/compaction activity if those requests cannot be satisfied. Fix the issue by falling back to zero order allocations for __GFP_NOFAIL requests if the high order request fails. Link: https://lkml.kernel.org/r/ZAXynvdNqcI0f6Us@dhcp22.suse.cz Fixes: 9376130c ("mm/vmalloc: add support for __GFP_NOFAIL") Reported-by:
Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lkml.kernel.org/r/20230305053035.1911-1-hsiangkao@linux.alibaba.com Signed-off-by:
Michal Hocko <mhocko@suse.com> Reviewed-by:
Uladzislau Rezki (Sony) <urezki@gmail.com> Acked-by:
Vlastimil Babka <vbabka@suse.cz> Cc: Baoquan He <bhe@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
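A schematic userspace model of the fallback, with alloc_pages_stub() standing in for the page allocator and a plain flag standing in for __GFP_NOFAIL; it is a sketch of the strategy, not the vmalloc code.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define PAGE_SZ 4096

/* Pretend high-order (order > 0) allocations fail under fragmentation. */
static void *alloc_pages_stub(unsigned int order)
{
    if (order > 0)
        return NULL;
    return malloc(PAGE_SZ);
}

static int vm_area_alloc_pages_sketch(void **pages, unsigned int nr, unsigned int order, bool nofail)
{
    unsigned int got = 0;

    while (got < nr) {
        void *p = alloc_pages_stub(order);
        if (!p && order > 0 && nofail) {
            order = 0;              /* __GFP_NOFAIL fallback: retry with order-0 */
            continue;
        }
        if (!p)
            return -1;
        pages[got++] = p;           /* a real order-N block would fill 2^N slots at once */
    }
    return 0;
}

int main(void)
{
    void *pages[8];
    int ret = vm_area_alloc_pages_sketch(pages, 8, 2, true);

    printf("nofail request %s\n", ret ? "failed" : "satisfied via the order-0 fallback");
    if (!ret)
        for (unsigned int i = 0; i < 8; i++)
            free(pages[i]);
    return 0;
}
```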
-
ye xingchen authored
The path for SCHED_DEBUG is /sys/kernel/debug/sched. So, SCHED_DEBUG should depend on DEBUG_FS, not PROC_FS. Link: https://lkml.kernel.org/r/202301291110098787982@zte.com.cn Signed-off-by:
ye xingchen <ye.xingchen@zte.com.cn> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Zhaoyang Huang <zhaoyang.huang@unisoc.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
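A hedged sketch, written from memory rather than the verbatim hunk, of what the corrected dependency looks like in Kconfig syntax:

```
config SCHED_DEBUG
	bool "Collect scheduler debugging info"
	depends on DEBUG_KERNEL && DEBUG_FS
	default y
```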
-
- Mar 08, 2023
-
-
SeongJae Park authored
damon_pa_mark_accessed_or_deactivate() is accessing a folio via folio_nr_pages() after folio_put() for the folio has been invoked. Fix it. Link: https://lkml.kernel.org/r/20230304193949.296391-3-sj@kernel.org Fixes: f70da5ee ("mm/damon: convert damon_pa_mark_accessed_or_deactivate() to use folios") Signed-off-by:
SeongJae Park <sj@kernel.org> Reviewed-by:
Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
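A userspace analogy of the use-after-put pattern fixed here, with a hypothetical refcounted folio_x type: read whatever is still needed from the object before dropping the reference that may free it.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a folio. */
struct folio_x { int refcount; long nr_pages; };

static void folio_put_x(struct folio_x **f)
{
    if (--(*f)->refcount == 0) {
        free(*f);
        *f = NULL;          /* the object must not be touched after this point */
    }
}

int main(void)
{
    struct folio_x *f = malloc(sizeof(*f));

    if (!f)
        return 1;
    f->refcount = 1;
    f->nr_pages = 8;

    long nr = f->nr_pages;  /* read what is still needed first ... */
    folio_put_x(&f);        /* ... then drop the (possibly last) reference */
    printf("accounted %ld pages\n", nr);
    return 0;
}
```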
-
SeongJae Park authored
Patch series "mm/damon/paddr: Fix folio-use-after-put bugs". There are two folio accesses after folio_put() in mm/damon/paddr.c file. Fix those. This patch (of 2): damon_pa_young() is accessing a folio via folio_size() after folio_put() for the folio has invoked. Fix it. Link: https://lkml.kernel.org/r/20230304193949.296391-1-sj@kernel.org Link: https://lkml.kernel.org/r/20230304193949.296391-2-sj@kernel.org Fixes: 397b0c3a ("mm/damon/paddr: remove folio_sz field from damon_pa_access_chk_result") Signed-off-by:
SeongJae Park <sj@kernel.org> Reviewed-by:
Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: <stable@vger.kernel.org> [6.2.x] Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Jan Kara via Ocfs2-devel authored
When a buffered write fails to copy data into the underlying page cache page, ocfs2_write_end_nolock() just zeroes out and dirties the page. This can leave a dirty page beyond EOF, and if page writeback tries to write this page before the write succeeds and expands i_size, the page gets into an inconsistent state where the page dirty bit is clear but the buffer dirty bits stay set, resulting in page data never getting written and so the data copied to the page is lost. Fix the problem by invalidating the page beyond EOF after a failed write. Link: https://lkml.kernel.org/r/20230302153843.18499-1-jack@suse.cz Fixes: 6dbf7bb5 ("fs: Don't invalidate page buffers in block_write_full_page()") Signed-off-by:
Jan Kara <jack@suse.cz> Reviewed-by:
Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Huang Ying authored
When we have locked more than one folio, we cannot wait for a lock or bit (e.g., page lock, buffer head lock, writeback bit) synchronously. Otherwise, a deadlock may be triggered. This makes it hard to batch synchronous migration directly. This patch re-enables batched synchronous migration by first trying to migrate the folios asynchronously in a batch; any folios that fail to be migrated asynchronously are then migrated synchronously, one by one. Testing shows that this can effectively restore the TLB flushing batching performance for synchronous migration. Link: https://lkml.kernel.org/r/20230303030155.160983-4-ying.huang@intel.com Fixes: 5dfab109 ("migrate_pages: batch _unmap and _move") Signed-off-by:
"Huang, Ying" <ying.huang@intel.com> Tested-by:
Hugh Dickins <hughd@google.com> Reviewed-by:
Baolin Wang <baolin.wang@linux.alibaba.com> Cc: "Xu, Pengfei" <pengfei.xu@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Stefan Roesch <shr@devkernel.io> Cc: Tejun Heo <tj@kernel.org> Cc: Xin Hao <xhao@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
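A userspace model of that strategy, with migrate_one_stub() as an invented stand-in for the real per-folio migration: one non-blocking batched pass first, then a blocking pass over the failures one at a time, so at most one folio is ever held while waiting.

```c
#include <stdbool.h>
#include <stdio.h>

enum mig_mode { MIG_ASYNC, MIG_SYNC };

/* Invented stand-in: pretend even-numbered folios must block (e.g. wait for
 * writeback), so they can only succeed in the synchronous pass. */
static bool migrate_one_stub(int folio, enum mig_mode mode)
{
    return (folio % 2 == 0) ? (mode == MIG_SYNC) : true;
}

static int migrate_batch_sketch(const int *folios, int n)
{
    int failed[16], nfail = 0, ok = 0;   /* assumes n <= 16 for this sketch */

    for (int i = 0; i < n; i++) {        /* batched, non-blocking pass */
        if (migrate_one_stub(folios[i], MIG_ASYNC))
            ok++;
        else
            failed[nfail++] = folios[i];
    }
    for (int i = 0; i < nfail; i++)      /* blocking fallback, one folio at a time */
        if (migrate_one_stub(failed[i], MIG_SYNC))
            ok++;
    return ok;
}

int main(void)
{
    int folios[] = { 1, 2, 3, 4, 5, 6 };

    printf("migrated %d of 6 folios\n", migrate_batch_sketch(folios, 6));
    return 0;
}
```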
-
Huang Ying authored
Simplify the code logic and reduce the line count. Link: https://lkml.kernel.org/r/20230303030155.160983-3-ying.huang@intel.com Fixes: 5dfab109 ("migrate_pages: batch _unmap and _move") Signed-off-by:
"Huang, Ying" <ying.huang@intel.com> Reviewed-by:
Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Hugh Dickins <hughd@google.com> Cc: "Xu, Pengfei" <pengfei.xu@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Stefan Roesch <shr@devkernel.io> Cc: Tejun Heo <tj@kernel.org> Cc: Xin Hao <xhao@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Huang Ying authored
Patch series "migrate_pages: fix deadlock in batched synchronous migration", v2. Two deadlock bugs were reported for the migrate_pages() batching series. Thanks Hugh and Pengfei. Analysis shows that if we have locked some other folios except the one we are migrating, it's not safe in general to wait synchronously, for example, to wait the writeback to complete or wait to lock the buffer head. So 1/3 fixes the deadlock in a simple way, where the batching support for the synchronous migration is disabled. The change is straightforward and easy to be understood. While 3/3 re-introduce the batching for synchronous migration via trying to migrate asynchronously in batch optimistically, then fall back to migrate synchronously one by one for fail-to-migrate folios. Test shows that this can restore the TLB flushing batching performance for synchronous migration effectively. This patch (of 3): Two deadlock bugs were reported for the migrate_pages() batching series. Thanks Hugh and Pengfei! For example, in the following deadlock trace snippet, INFO: task kworker/u4:0:9 blocked for more than 147 seconds. Not tainted 6.2.0-rc4-kvm+ #1314 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u4:0 state:D stack:0 pid:9 ppid:2 flags:0x00004000 Workqueue: loop4 loop_rootcg_workfn Call Trace: <TASK> __schedule+0x43b/0xd00 schedule+0x6a/0xf0 io_schedule+0x4a/0x80 folio_wait_bit_common+0x1b5/0x4e0 ? __pfx_wake_page_function+0x10/0x10 __filemap_get_folio+0x73d/0x770 shmem_get_folio_gfp+0x1fd/0xc80 shmem_write_begin+0x91/0x220 generic_perform_write+0x10e/0x2e0 __generic_file_write_iter+0x17e/0x290 ? generic_write_checks+0x12b/0x1a0 generic_file_write_iter+0x97/0x180 ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20 do_iter_readv_writev+0x13c/0x210 ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20 do_iter_write+0xf6/0x330 vfs_iter_write+0x46/0x70 loop_process_work+0x723/0xfe0 loop_rootcg_workfn+0x28/0x40 process_one_work+0x3cc/0x8d0 worker_thread+0x66/0x630 ? __pfx_worker_thread+0x10/0x10 kthread+0x153/0x190 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x29/0x50 </TASK> INFO: task repro:1023 blocked for more than 147 seconds. Not tainted 6.2.0-rc4-kvm+ #1314 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:repro state:D stack:0 pid:1023 ppid:360 flags:0x00004004 Call Trace: <TASK> __schedule+0x43b/0xd00 schedule+0x6a/0xf0 io_schedule+0x4a/0x80 folio_wait_bit_common+0x1b5/0x4e0 ? compaction_alloc+0x77/0x1150 ? __pfx_wake_page_function+0x10/0x10 folio_wait_bit+0x30/0x40 folio_wait_writeback+0x2e/0x1e0 migrate_pages_batch+0x555/0x1ac0 ? __pfx_compaction_alloc+0x10/0x10 ? __pfx_compaction_free+0x10/0x10 ? __this_cpu_preempt_check+0x17/0x20 ? lock_is_held_type+0xe6/0x140 migrate_pages+0x100e/0x1180 ? __pfx_compaction_free+0x10/0x10 ? __pfx_compaction_alloc+0x10/0x10 compact_zone+0xe10/0x1b50 ? lock_is_held_type+0xe6/0x140 ? check_preemption_disabled+0x80/0xf0 compact_node+0xa3/0x100 ? __sanitizer_cov_trace_const_cmp8+0x1c/0x30 ? 
_find_first_bit+0x7b/0x90 sysctl_compaction_handler+0x5d/0xb0 proc_sys_call_handler+0x29d/0x420 proc_sys_write+0x2b/0x40 vfs_write+0x3a3/0x780 ksys_write+0xb7/0x180 __x64_sys_write+0x26/0x30 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7f3a2471f59d RSP: 002b:00007ffe567f7288 EFLAGS: 00000217 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3a2471f59d RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000005 RBP: 00007ffe567f72a0 R08: 0000000000000010 R09: 0000000000000010 R10: 0000000000000010 R11: 0000000000000217 R12: 00000000004012e0 R13: 00007ffe567f73e0 R14: 0000000000000000 R15: 0000000000000000 </TASK> The page migration task has held the lock of the shmem folio A, and is waiting the writeback of the folio B of the file system on the loop block device to complete. While the loop worker task which writes back the folio B is waiting to lock the shmem folio A, because the folio A backs the folio B in the loop device. Thus deadlock is triggered. In general, if we have locked some other folios except the one we are migrating, it's not safe to wait synchronously, for example, to wait the writeback to complete or wait to lock the buffer head. To fix the deadlock, in this patch, we avoid to batch the page migration except for MIGRATE_ASYNC mode. In MIGRATE_ASYNC mode, synchronous waiting is avoided. The fix can be improved further. We will do that as soon as possible. Link: https://lkml.kernel.org/r/20230303030155.160983-1-ying.huang@intel.com Link: https://lore.kernel.org/linux-mm/87a6c8c-c5c1-67dc-1e32-eb30831d6e3d@google.com/ Link: https://lore.kernel.org/linux-mm/874jrg7kke.fsf@yhuang6-desk2.ccr.corp.intel.com/ Link: https://lore.kernel.org/linux-mm/20230227110614.dngdub2j3exr6dfp@quack3/ Link: https://lkml.kernel.org/r/20230303030155.160983-2-ying.huang@intel.com Fixes: 5dfab109 ("migrate_pages: batch _unmap and _move") Signed-off-by:
"Huang, Ying" <ying.huang@intel.com> Reported-by:
Hugh Dickins <hughd@google.com> Reported-by:
"Xu, Pengfei" <pengfei.xu@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Stefan Roesch <shr@devkernel.io> Cc: Tejun Heo <tj@kernel.org> Cc: Xin Hao <xhao@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Alexandre Ghiti authored
I'm no longer employed by Canonical which results in email bouncing so add an entry to my personal email address. Link: https://lkml.kernel.org/r/20230301090132.280475-1-alexghiti@rivosinc.com Signed-off-by:
Alexandre Ghiti <alex@ghiti.fr> Reported-by:
Conor Dooley <conor.dooley@microchip.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Konrad Dybcio authored
I recently sent a patch to map Dikshita's old CAF address to his current one @ Qualcomm. It turned out however, that he has two of them, with the @quicinc.com one meant for upstream contributions. Fix it. Link: https://lkml.kernel.org/r/20230301110012.1290379-1-konrad.dybcio@linaro.org Signed-off-by:
Konrad Dybcio <konrad.dybcio@linaro.org> Cc: Dikshita Agarwal <quic_dikshita@quicinc.com> Cc: Andy Gross <agross@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Bjorn Andersson <andersson@kernel.org> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Kirill Tkhai <tkhai@ya.ru> Cc: Marijn Suijten <marijn.suijten@somainline.org> Cc: Qais Yousef <qyousef@layalina.io> Cc: Vasily Averin <vasily.averin@linux.dev> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Jarkko Sakkinen authored
Update to my current employer: https://research.tuni.fi/nisec/ Link: https://lkml.kernel.org/r/20230301235443.6663-1-jarkko@kernel.org Signed-off-by:
Jarkko Sakkinen <jarkko@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Ben Widawsky <bwidawsk@kernel.org> Cc: Bjorn Andersson <andersson@kernel.org> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Kirill Tkhai <tkhai@ya.ru> Cc: Qais Yousef <qyousef@layalina.io> Cc: Vasily Averin <vasily.averin@linux.dev> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
Currently, we'd lose the userfaultfd-wp marker when PTE-mapping a huge zeropage, resulting in the next write faults in the PMD range not triggering uffd-wp events. Various actions (partial MADV_DONTNEED, partial mremap, partial munmap, partial mprotect) could trigger this. However, most importantly, un-protecting a single sub-page from the userfaultfd-wp handler when processing a uffd-wp event will PTE-map the shared huge zeropage and lose the uffd-wp bit for the remainder of the PMD. Let's properly propagate the uffd-wp bit to the PMDs. #define _GNU_SOURCE #include <stdio.h> #include <stdlib.h> #include <stdint.h> #include <stdbool.h> #include <inttypes.h> #include <fcntl.h> #include <unistd.h> #include <errno.h> #include <poll.h> #include <pthread.h> #include <sys/mman.h> #include <sys/syscall.h> #include <sys/ioctl.h> #include <linux/userfaultfd.h> static size_t pagesize; static int uffd; static volatile bool uffd_triggered; #define barrier() __asm__ __volatile__("": : :"memory") static void uffd_wp_range(char *start, size_t size, bool wp) { struct uffdio_writeprotect uffd_writeprotect; uffd_writeprotect.range.start = (unsigned long) start; uffd_writeprotect.range.len = size; if (wp) { uffd_writeprotect.mode = UFFDIO_WRITEPROTECT_MODE_WP; } else { uffd_writeprotect.mode = 0; } if (ioctl(uffd, UFFDIO_WRITEPROTECT, &uffd_writeprotect)) { fprintf(stderr, "UFFDIO_WRITEPROTECT failed: %d\n", errno); exit(1); } } static void *uffd_thread_fn(void *arg) { static struct uffd_msg msg; ssize_t nread; while (1) { struct pollfd pollfd; int nready; pollfd.fd = uffd; pollfd.events = POLLIN; nready = poll(&pollfd, 1, -1); if (nready == -1) { fprintf(stderr, "poll() failed: %d\n", errno); exit(1); } nread = read(uffd, &msg, sizeof(msg)); if (nread <= 0) continue; if (msg.event != UFFD_EVENT_PAGEFAULT || !(msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP)) { printf("FAIL: wrong uffd-wp event fired\n"); exit(1); } /* un-protect the single page. */ uffd_triggered = true; uffd_wp_range((char *)(uintptr_t)msg.arg.pagefault.address, pagesize, false); } return arg; } static int setup_uffd(char *map, size_t size) { struct uffdio_api uffdio_api; struct uffdio_register uffdio_register; pthread_t thread; uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY); if (uffd < 0) { fprintf(stderr, "syscall() failed: %d\n", errno); return -errno; } uffdio_api.api = UFFD_API; uffdio_api.features = UFFD_FEATURE_PAGEFAULT_FLAG_WP; if (ioctl(uffd, UFFDIO_API, &uffdio_api) < 0) { fprintf(stderr, "UFFDIO_API failed: %d\n", errno); return -errno; } if (!(uffdio_api.features & UFFD_FEATURE_PAGEFAULT_FLAG_WP)) { fprintf(stderr, "UFFD_FEATURE_WRITEPROTECT missing\n"); return -ENOSYS; } uffdio_register.range.start = (unsigned long) map; uffdio_register.range.len = size; uffdio_register.mode = UFFDIO_REGISTER_MODE_WP; if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) < 0) { fprintf(stderr, "UFFDIO_REGISTER failed: %d\n", errno); return -errno; } pthread_create(&thread, NULL, uffd_thread_fn, NULL); return 0; } int main(void) { const size_t size = 4 * 1024 * 1024ull; char *map, *cur; pagesize = getpagesize(); map = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0); if (map == MAP_FAILED) { fprintf(stderr, "mmap() failed\n"); return -errno; } if (madvise(map, size, MADV_HUGEPAGE)) { fprintf(stderr, "MADV_HUGEPAGE failed\n"); return -errno; } if (setup_uffd(map, size)) return 1; /* Read the whole range, populating zeropages. 
*/ madvise(map, size, MADV_POPULATE_READ); /* Write-protect the whole range. */ uffd_wp_range(map, size, true); /* Make sure uffd-wp triggers on each page. */ for (cur = map; cur < map + size; cur += pagesize) { uffd_triggered = false; barrier(); /* Trigger a write fault. */ *cur = 1; barrier(); if (!uffd_triggered) { printf("FAIL: uffd-wp did not trigger\n"); return 1; } } printf("PASS: uffd-wp triggered\n"); return 0; } Link: https://lkml.kernel.org/r/20230302175423.589164-1-david@redhat.com Fixes: e06f1e1d ("userfaultfd: wp: enabled write protection in userfaultfd API") Signed-off-by:
David Hildenbrand <david@redhat.com> Acked-by:
Peter Xu <peterx@redhat.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Shaohua Li <shli@fb.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
James Houghton authored
By checking huge_pte_none(), we incorrectly classify PTE markers as "present". Instead, check huge_pte_none_mostly(), classifying PTE markers the same as if the PTE were completely blank. PTE markers, unlike other kinds of swap entries, don't reference any physical page and don't indicate that a physical page was mapped previously. As such, treat them as non-present for the sake of mincore(). Link: https://lkml.kernel.org/r/20230302222404.175303-1-jthoughton@google.com Fixes: 5c041f5d ("mm: teach core mm about pte markers") Signed-off-by:
James Houghton <jthoughton@google.com> Acked-by:
Peter Xu <peterx@redhat.com> Acked-by:
David Hildenbrand <david@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: James Houghton <jthoughton@google.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
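A toy classification using an invented encoding, not the kernel's PTE representation; it only shows the behavioural difference between treating a marker as present and treating it as "mostly none".

```c
#include <stdbool.h>
#include <stdio.h>

enum hugetlb_entry { ENTRY_NONE, ENTRY_MARKER, ENTRY_PRESENT };

/* Analogue of huge_pte_none_mostly(): empty entries and markers are both
 * "none" for this purpose, since neither references a physical page. */
static bool entry_none_mostly(enum hugetlb_entry e)
{
    return e == ENTRY_NONE || e == ENTRY_MARKER;
}

static int mincore_hugetlb_sketch(enum hugetlb_entry e)
{
    return entry_none_mostly(e) ? 0 : 1;   /* 0 = not present, 1 = present */
}

int main(void)
{
    printf("none=%d marker=%d present=%d\n",
           mincore_hugetlb_sketch(ENTRY_NONE),
           mincore_hugetlb_sketch(ENTRY_MARKER),
           mincore_hugetlb_sketch(ENTRY_PRESENT));
    return 0;
}
```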
-
- Mar 05, 2023
-
-
Linus Torvalds authored
-
Linus Torvalds authored
Commit aa47a7c2 ("lib/cpumask: deprecate nr_cpumask_bits") resulted in the cpumask operations potentially becoming hugely less efficient, because suddenly the cpumask was always considered to be variable-sized. The optimization was then later added back in a limited form by commit 6f9c07be ("lib/cpumask: add FORCE_NR_CPUS config option"), but that FORCE_NR_CPUS option is not useful in a generic kernel and more of a special case for embedded situations with fixed hardware. Instead, just re-introduce the optimization, with some changes. Instead of depending on CPUMASK_OFFSTACK being false, and then always using the full constant cpumask width, this introduces three different cpumask "sizes": - the exact size (nr_cpumask_bits) remains identical to nr_cpu_ids. This is used for situations where we should use the exact size. - the "small" size (small_cpumask_bits) is the NR_CPUS constant if it fits in a single word and the bitmap operations thus end up able to trigger the "small_const_nbits()" optimizations. This is used for the operations that have optimized single-word cases that get inlined, notably the bit find and scanning functions. - the "large" size (large_cpumask_bits) is the NR_CPUS constant if it is a sufficiently small constant that makes simple "copy" and "clear" operations more efficient. This is arbitrarily set at four words or less. As an example of this situation, without this fixed size optimization, cpumask_clear() will generate code like movl nr_cpu_ids(%rip), %edx addq $63, %rdx shrq $3, %rdx andl $-8, %edx callq memset@PLT on x86-64, because it would calculate the "exact" number of longwords that need to be cleared. In contrast, with this patch, using a MAX_CPU of 64 (which is quite a reasonable value to use), the above becomes a single movq $0,cpumask instruction instead, because instead of caring to figure out exactly how many CPU's the system has, it just knows that the cpumask will be a single word and can just clear it all. Note that this does end up tightening the rules a bit from the original version in another way: operations that set bits in the cpumask are now limited to the actual nr_cpu_ids limit, whereas we used to do the nr_cpumask_bits thing almost everywhere in the cpumask code. But if you just clear bits, or scan for bits, we can use the simpler compile-time constants. In the process, remove 'cpumask_complement()' and 'for_each_cpu_not()' which were not useful, and which fundamentally have to be limited to 'nr_cpu_ids'. Better remove them now than have somebody introduce use of them later. Of course, on x86-64 with MAXSMP there is no sane small compile-time constant for the cpumask sizes, and we end up using the actual CPU bits, and will generate the above kind of horrors regardless. Please don't use MAXSMP unless you really expect to have machines with thousands of cores. Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
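An illustrative userspace sketch (GCC/Clang assumed for __builtin_constant_p; all names invented, not the kernel's) of why a compile-time bound helps: when the bitmap provably fits in one word, clearing it collapses to a single store instead of a run-time-sized memset.

```c
#include <stdio.h>
#include <string.h>

#define NR_CPUS_X        64
#define BITS_PER_LONG_X  (8 * sizeof(long))
#define WORDS_X(nbits)   (((nbits) + BITS_PER_LONG_X - 1) / BITS_PER_LONG_X)
#define SMALL_CONST_NBITS_X(nbits) (__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG_X)

struct cpumask_x { unsigned long bits[WORDS_X(NR_CPUS_X)]; };

#define cpumask_clear_x(m, nbits)                                        \
    do {                                                                 \
        if (SMALL_CONST_NBITS_X(nbits))                                  \
            (m)->bits[0] = 0;         /* one store, like "movq $0, mask" */ \
        else                                                             \
            memset((m)->bits, 0, WORDS_X(nbits) * sizeof(long));         \
    } while (0)

int main(void)
{
    struct cpumask_x mask = { { ~0UL } };

    cpumask_clear_x(&mask, NR_CPUS_X);   /* NR_CPUS_X is a compile-time constant */
    printf("%lx\n", mask.bits[0]);
    return 0;
}
```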
-
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds authored
Pull crypto fix from Herbert Xu: "Fix a regression in the caam driver" * tag 'v6.3-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: caam - Fix edesc/iv ordering mixup
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 updates from Thomas Gleixner: "A small set of updates for x86: - Return -EIO instead of success when the certificate buffer for SEV guests is not large enough - Allow STIBP to be enabled with legacy IBRS. Legacy IBRS is cleared on return to userspace for performance reasons, but this leaves user space vulnerable to cross-thread attacks which STIBP prevents. Update the documentation accordingly" * tag 'x86-urgent-2023-03-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: virt/sev-guest: Return -EIO if certificate buffer is not large enough Documentation/hw-vuln: Document the interaction between IBRS and STIBP x86/speculation: Allow enabling STIBP with legacy IBRS
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull irq updates from Thomas Gleixner: "A set of updates for the interrupt susbsystem: - Prevent possible NULL pointer derefences in irq_data_get_affinity_mask() and irq_domain_create_hierarchy() - Take the per device MSI lock before invoking code which relies on it being hold - Make sure that MSI descriptors are unreferenced before freeing them. This was overlooked when the platform MSI code was converted to use core infrastructure and results in a fals positive warning - Remove dead code in the MSI subsystem - Clarify the documentation for pci_msix_free_irq() - More kobj_type constification" * tag 'irq-urgent-2023-03-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/msi, platform-msi: Ensure that MSI descriptors are unreferenced genirq/msi: Drop dead domain name assignment irqdomain: Add missing NULL pointer check in irq_domain_create_hierarchy() genirq/irqdesc: Make kobj_type structures constant PCI/MSI: Clarify usage of pci_msix_free_irq() genirq/msi: Take the per-device MSI lock before validating the control structure genirq/ipi: Fix NULL pointer deref in irq_data_get_affinity_mask()
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull vfs update from Al Viro: "Adding Christian Brauner as VFS co-maintainer" * tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: Adding VFS co-maintainer
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull VM_FAULT_RETRY fixes from Al Viro: "Some of the page fault handlers do not deal with the following case correctly: - handle_mm_fault() has returned VM_FAULT_RETRY - there is a pending fatal signal - fault had happened in kernel mode Correct action in such case is not "return unconditionally" - fatal signals are handled only upon return to userland and something like copy_to_user() would end up retrying the faulting instruction and triggering the same fault again and again. What we need to do in such case is to make the caller to treat that as failed uaccess attempt - handle exception if there is an exception handler for faulting instruction or oops if there isn't one. Over the years some architectures had been fixed and now are handling that case properly; some still do not. This series should fix the remaining ones. Status: - m68k, riscv, hexagon, parisc: tested/acked by maintainers. - alpha, sparc32, sparc64: tested locally - bug has been reproduced on the unpatched kernel and verified to be fixed by this series. - ia64, microblaze, nios2, openrisc: build, but otherwise completely untested" * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: openrisc: fix livelock in uaccess nios2: fix livelock in uaccess microblaze: fix livelock in uaccess ia64: fix livelock in uaccess sparc: fix livelock in uaccess alpha: fix livelock in uaccess parisc: fix livelock in uaccess hexagon: fix livelock in uaccess riscv: fix livelock in uaccess m68k: fix livelock in uaccess
-
Masahiro Yamada authored
include/linux/compiler-intel.h had no update in the past 3 years. We often forget about the third C compiler to build the kernel. For example, commit a0a12c3e ("asm goto: eradicate CC_HAS_ASM_GOTO") only mentioned GCC and Clang. init/Kconfig defines CC_IS_GCC and CC_IS_CLANG but not CC_IS_ICC, and nobody has reported any issue. I guess the Intel Compiler support is broken, and nobody is caring about it. Harald Arnesen pointed out ICC (classic Intel C/C++ compiler) is deprecated: $ icc -v icc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated and will be removed from product release in the second half of 2023. The Intel(R) oneAPI DPC++/C++ Compiler (ICX) is the recommended compiler moving forward. Please transition to use this compiler. Use '-diag-disable=10441' to disable this message. icc version 2021.7.0 (gcc version 12.1.0 compatibility) Arnd Bergmann provided a link to the article, "Intel C/C++ compilers complete adoption of LLVM". lib/zstd/common/compiler.h and lib/zstd/compress/zstd_fast.c were kept untouched for better sync with https://github.com/facebook/zstd Link: https://www.intel.com/content/www/us/en/developer/articles/technical/adoption-of-llvm-complete-icx.html Signed-off-by:
Masahiro Yamada <masahiroy@kernel.org> Acked-by:
Arnd Bergmann <arnd@arndb.de> Reviewed-by:
Nick Desaulniers <ndesaulniers@google.com> Reviewed-by:
Nathan Chancellor <nathan@kernel.org> Reviewed-by:
Miguel Ojeda <ojeda@kernel.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Al Viro authored
Acked-by:
Christian Brauner <brauner@kernel.org> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- Mar 04, 2023
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linuxLinus Torvalds authored
Pull more i2c updates from Wolfram Sang: "Some improvements/fixes for the newly added GXP driver and a Kconfig dependency fix" * tag 'i2c-for-6.3-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: gxp: fix an error code in probe i2c: gxp: return proper error on address NACK i2c: gxp: remove "empty" switch statement i2c: Disable I2C_APPLE when I2C_PASEMI is a builtin
-
Linus Torvalds authored
The migration code ends up temporarily stashing information of the wrong type in unused fields of the newly allocated destination folio. That all works fine, but gcc does complain about the pointer type mis-use: mm/migrate.c: In function ‘__migrate_folio_extract’: mm/migrate.c:1050:20: note: randstruct: casting between randomized structure pointer types (ssa): ‘struct anon_vma’ and ‘struct address_space’ 1050 | *anon_vmap = (void *)dst->mapping; | ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~ and gcc is actually right to complain since it really doesn't understand that this is a very temporary special case where this is ok. This could be fixed in different ways by just obfuscating the assignment sufficiently that gcc doesn't see what is going on, but the truly "proper C" way to do this is by explicitly using a union. Using unions for type conversions like this is normally hugely ugly and syntactically nasty, but this really is one of the few cases where we want to make it clear that we're not doing type conversion, we're really re-using the value bit-for-bit just using another type. IOW, this should not become a common pattern, but in this one case using that odd union is probably the best way to document to the compiler what is conceptually going on here. [ Side note: there are valid cases where we convert pointers to other pointer types, notably the whole "folio vs page" situation, where the types actually have fundamental commonalities. The fact that the gcc note is limited to just randomized structures means that we don't see equivalent warnings for those cases, but it might also mean that we miss other cases where we do play these kinds of dodgy games, and this kind of explicit conversion might be a good idea. ] I verified that at least for an allmodconfig build on x86-64, this generates the exact same code, apart from line numbers and assembler comment changes. Fixes: 64c8902e ("migrate_pages: split unmap_and_move() to _unmap() and _move()") Cc: Huang, Ying <ying.huang@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
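A small standalone illustration of the union pattern the commit describes, using hypothetical struct names: the same storage holds either pointer type, and no cross-type cast appears anywhere in the source.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the two unrelated kernel structures. */
struct address_space_x { int which; };
struct anon_vma_x      { int which; };

/* The union documents that the same storage is re-used bit-for-bit for either
 * pointer type; readers and the compiler both see exactly what is stored. */
union migration_ptr_x {
    struct address_space_x *mapping;
    struct anon_vma_x      *anon_vma;
};

int main(void)
{
    struct anon_vma_x av = { .which = 42 };
    union migration_ptr_x slot;

    slot.anon_vma = &av;                  /* stash one pointer type ... */
    printf("%d\n", slot.anon_vma->which); /* ... and read it back as that same type */
    return 0;
}
```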
-