- Mar 12, 2025
-
-
Augusto Caringi authored
Signed-off-by:
Augusto Caringi <acaringi@redhat.com>
-
Augusto Caringi authored
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6398 JIRA: https://issues.redhat.com/browse/RHEL-78821 Proactive fixes and minor updates for scheduler related code. This includes needed commits up to v6.14-rc1. There are not as many since there are a few features upstream which we are not taking into rhel9 at this point. Signed-off-by:
Phil Auld <pauld@redhat.com> Approved-by:
Waiman Long <longman@redhat.com> Approved-by:
Herton R. Krzesinski <herton@redhat.com> Approved-by:
Tony Camuso <tcamuso@redhat.com> Approved-by:
Juri Lelli <juri.lelli@redhat.com> Approved-by:
Rafael Aquini <raquini@redhat.com> Approved-by:
CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by:
Augusto Caringi <acaringi@redhat.com>
-
Augusto Caringi authored
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6493 JIRA: https://issues.redhat.com/browse/RHEL-81666 CVE: CVE-2025-21785 ``` commit 875d742cf5327c93cba1f11e12b08d3cce7a88d2 Author: Radu Rendec <rrendec@redhat.com> Date: Thu Feb 6 12:44:20 2025 -0500 arm64: cacheinfo: Avoid out-of-bounds write to cacheinfo array The loop that detects/populates cache information already has a bounds check on the array size but does not account for cache levels with separate data/instructions cache. Fix this by incrementing the index for any populated leaf (instead of any populated level). Fixes: 5d425c18 ("arm64: kernel: add support for cpu cache information") Signed-off-by:
Radu Rendec <rrendec@redhat.com> Link: https://lore.kernel.org/r/20250206174420.2178724-1-rrendec@redhat.com Signed-off-by:
Will Deacon <will@kernel.org> ``` Signed-off-by:
CKI Backport Bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com> --- <small>Created 2025-02-28 03:41 UTC by backporter - [KWF FAQ](https://red.ht/kernel_workflow_doc) - [Slack #team-kernel-workflow](https://redhat-internal.slack.com/archives/C04LRUPMJQ5) - [Source](https://gitlab.com/cki-project/kernel-workflow/-/blob/main/webhook/utils/backporter.py) - [Documentation](https://gitlab.com/cki-project/kernel-workflow/-/blob/main/docs/README.backporter.md) - [Report an issue](https://gitlab.com/cki-project/kernel-workflow/-/issues/new?issue%5Btitle%5D=backporter%20webhook%20issue)</small > Approved-by:
Radu Rendec <rrendec@redhat.com> Approved-by:
Eric Chanudet <echanude@redhat.com> Approved-by:
CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Approved-by:
Mark Langsdorf <mlangsdo@redhat.com> Merged-by:
Augusto Caringi <acaringi@redhat.com>
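The indexing problem described in the cacheinfo commit above can be modelled in a few lines. This is a minimal sketch in Python, not the arm64 kernel code: the function `slots_written`, the `"separate"`/`"unified"` labels and the capacity value are all hypothetical. It only shows why counting one step per level undercounts when a level has separate data and instruction caches.

```python
def slots_written(levels, capacity, count_per_leaf):
    """Count array slots filled by a loop guarded by `index < capacity`.

    `levels` lists each cache level as "separate" (data + instruction,
    two leaves) or "unified" (one leaf). All names here are hypothetical.
    """
    index = 0    # value the loop compares against the array capacity
    written = 0  # slots the loop body actually fills
    for kind in levels:
        if index >= capacity:
            break  # the bounds check the loop already has
        leaves = 2 if kind == "separate" else 1
        written += leaves
        # Counting per level advances by 1; counting per leaf advances
        # by the number of leaves that were just populated.
        index += leaves if count_per_leaf else 1
    return written


levels = ["separate", "separate", "unified", "unified"]
per_level = slots_written(levels, capacity=4, count_per_leaf=False)
per_leaf = slots_written(levels, capacity=4, count_per_leaf=True)
```

In this example the per-level count fills six slots of a four-slot array, while the per-leaf count stops at four.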
-
Augusto Caringi authored
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5978 JIRA: https://issues.redhat.com/browse/RHEL-62922 Signed-off-by:
Charles Mirabile <cmirabil@redhat.com> Approved-by:
Eric Chanudet <echanude@redhat.com> Approved-by:
Mark Langsdorf <mlangsdo@redhat.com> Approved-by:
CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by:
Augusto Caringi <acaringi@redhat.com>
-
Augusto Caringi authored
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6521 JIRA: https://issues.redhat.com/browse/RHEL-73514 Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=66888276 Upstream Status: Commits are found in Linus's Git Tree. Signed-off-by:
Nigel Croxon <ncroxon@redhat.com> Approved-by:
Heinz Mauelshagen <heinzm@redhat.com> Approved-by:
Xiao Ni <xni@redhat.com> Approved-by:
CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by:
Augusto Caringi <acaringi@redhat.com>
-
Augusto Caringi authored
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6326 JIRA: https://issues.redhat.com/browse/RHEL-77271 Signed-off-by:
Eric Chanudet <echanude@redhat.com> Approved-by:
Desnes Nunes <desnesn@redhat.com> Approved-by:
Herton R. Krzesinski <herton@redhat.com> Approved-by:
Rafael Aquini <raquini@redhat.com> Approved-by:
CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by:
Augusto Caringi <acaringi@redhat.com>
-
Augusto Caringi authored
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6416 JIRA: https://issues.redhat.com/browse/RHEL-77173 ``` commit a8258e64ca74314478fda85a42b1c1eb2db29c67 Author: Hangbin Liu <liuhangbin@gmail.com> Date: Wed Dec 6 15:07:55 2023 +0800 selftests/net: convert test_vxlan_mdb.sh to run it in unique namespace Here is the test result after conversion. ]# ./test_vxlan_mdb.sh Control path: Basic (*, G) operations - IPv4 overlay / IPv4 underlay -------------------------------------------------------------------- TEST: MDB entry addition [ OK ] ... Data path: MDB torture test - IPv6 overlay / IPv6 underlay ---------------------------------------------------------- TEST: Torture test [ OK ] Tests passed: 620 Tests failed: 0 Acked-by:
David Ahern <dsahern@kernel.org> Signed-off-by:
Hangbin Liu <liuhangbin@gmail.com> Reviewed-by:
Ido Schimmel <idosch@nvidia.com> Tested-by:
Ido Schimmel <idosch@nvidia.com> Signed-off-by:
David S. Miller <davem@davemloft.net> ``` Signed-off-by:
CKI Backport Bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com> --- <small>Created 2025-02-24 09:49 UTC by backporter</small> Approved-by:
Antoine Tenart <atenart@redhat.com> Approved-by:
Guillaume Nault <gnault@redhat.com> Approved-by:
CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by:
Augusto Caringi <acaringi@redhat.com>
-
Augusto Caringi authored
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6540 Description: Updates for xive papr bit map allocation JIRA: https://issues.redhat.com/browse/RHEL-80803 CVE: CVE-2022-49623 Build Info: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=66908454 Tested: Verified Brew build test kernel RPMs Signed-off-by:
Mamatha Inamdar <minamdar@redhat.com> Approved-by:
Steve Best <sbest@redhat.com> Approved-by:
Tony Camuso <tcamuso@redhat.com> Approved-by:
CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by:
Augusto Caringi <acaringi@redhat.com>
-
Augusto Caringi authored
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6508 This rebases MANA hyperv driver to upstream kernel 6.14-rc5 JIRA: https://issues.redhat.com/browse/RHEL-80098 Tested: Smoke boot test on azure machine, IB portion not yet tested. Signed-off-by:
Maxim Levitsky <mlevitsk@redhat.com> Approved-by:
Kamal Heib <kheib@redhat.com> Approved-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Approved-by:
CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by:
Augusto Caringi <acaringi@redhat.com>
-
Augusto Caringi authored
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6502 JIRA: https://issues.redhat.com/browse/RHEL-81935 Commits: ``` dc287e4c9149ab54a5003b4d4da007818b5fda3d 05793884a1f30509e477de9da233ab73584b1c8c 2844ddbd540fc84d7571cca65d6c43088e4d6952 ``` Signed-off-by:
Mete Durlu <mdurlu@redhat.com> Approved-by:
Steve Best <sbest@redhat.com> Approved-by:
Tony Camuso <tcamuso@redhat.com> Approved-by:
CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by:
Augusto Caringi <acaringi@redhat.com>
-
Augusto Caringi authored
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6181 JIRA: https://issues.redhat.com/browse/RHEL-72430 CVE: CVE-2024-56709 JIRA: https://issues.redhat.com/browse/RHEL-68164 CVE: CVE-2024-53052 JIRA: https://issues.redhat.com/browse/RHEL-63824 CVE: CVE-2024-50060 Signed-off-by:
Jeff Moyer <jmoyer@redhat.com> Approved-by:
Ewan D. Milne <emilne@redhat.com> Approved-by:
Ming Lei <ming.lei@redhat.com> Approved-by:
CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by:
Augusto Caringi <acaringi@redhat.com>
-
Augusto Caringi authored
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6401 JIRA: https://issues.redhat.com/browse/RHEL-78983 CVE: CVE-2024-56690 Upstream Status: linux.git commit 662f2f13e66d3883b9238b0b96b17886179e60e2 Author: Yi Yang <yiyang13@huawei.com> Date: Tue Oct 15 02:09:35 2024 +0000 crypto: pcrypt - Call crypto layer directly when padata_do_parallel() return -EBUSY Since commit 8f4f68e788c3 ("crypto: pcrypt - Fix hungtask for PADATA_RESET"), the pcrypt encryption and decryption operations return -EAGAIN when the CPU goes online or offline. In alg_test(), a WARN is generated when pcrypt_aead_decrypt() or pcrypt_aead_encrypt() returns -EAGAIN, and an unnecessary panic occurs when panic_on_warn is set to 1. Fix this issue by calling the crypto layer directly, without parallelization, in that case. Fixes: 8f4f68e788c3 ("crypto: pcrypt - Fix hungtask for PADATA_RESET") Signed-off-by:
Yi Yang <yiyang13@huawei.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
Herbert Xu <herbert.xu@redhat.com> Approved-by:
Ondrej Mosnáček <omosnacek@gmail.com> Approved-by:
Phil Auld <pauld@redhat.com> Approved-by:
Vladis Dronov <vdronov@redhat.com> Approved-by:
CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by:
Augusto Caringi <acaringi@redhat.com>
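The fallback described in the pcrypt commit above follows a common pattern: try the parallel path, and if it reports that it is busy, do the work directly instead of returning a transient error. The following Python sketch illustrates that pattern only. The function `submit` and its two callbacks are hypothetical and do not correspond to the kernel's pcrypt or padata interfaces.

```python
import errno


def submit(request, do_parallel, do_direct):
    """Try the parallel path first; on -EBUSY, run the request directly.

    `do_parallel` and `do_direct` are hypothetical callbacks that return
    0 on success or a negative errno value.
    """
    err = do_parallel(request)
    if err == -errno.EBUSY:
        # The parallel machinery cannot take the request right now, so
        # process it serially rather than surfacing a transient error.
        return do_direct(request)
    return err


direct_calls = []


def always_busy(request):
    return -errno.EBUSY


def run_directly(request):
    direct_calls.append(request)
    return 0


result = submit("req-1", always_busy, run_directly)
```

Errors other than -EBUSY are passed through unchanged in this sketch, so the caller still sees genuine failures.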
-
Augusto Caringi authored
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5185 JIRA: https://issues.redhat.com/browse/RHEL-58876 CVE: CVE-2024-46689 ``` soc: qcom: cmd-db: Map shared memory as WC, not WB Linux does not write into the cmd-db region. This region of memory is write protected by XPU. XPU may sometimes falsely detect clean cache eviction as "write" into the write protected region, leading to a secure interrupt which causes an endless loop somewhere in Trust Zone. The only reason it is working right now is because Qualcomm Hypervisor maps the same region as Non-Cacheable memory in Stage 2 translation tables. The issue manifests if we want to use another hypervisor (like Xen or KVM), which does not know anything about those specific mappings. Changing the mapping of cmd-db memory from MEMREMAP_WB to MEMREMAP_WT/WC removes the dependency on correct mappings in Stage 2 tables. This patch fixes the issue by updating the mapping to MEMREMAP_WC. I tested this on SA8155P with Xen. Fixes: 312416d9 ("drivers: qcom: add command DB driver") Cc: stable@vger.kernel.org # 5.4+ Signed-off-by:
Volodymyr Babchuk <volodymyr_babchuk@epam.com> Tested-by: Nikita Travkin <nikita@trvn.ru> # sc7180 WoA in EL2 Signed-off-by:
Maulik Shah <quic_mkshah@quicinc.com> Tested-by:
Pavankumar Kondeti <quic_pkondeti@quicinc.com> Reviewed-by:
Caleb Connolly <caleb.connolly@linaro.org> Link: https://lore.kernel.org/r/20240718-cmd_db_uncached-v2-1-f6cf53164c90@quicinc.com Signed-off-by:
Bjorn Andersson <andersson@kernel.org> (cherry picked from commit f9bb896eab221618927ae6a2f1d566567999839d) ``` Signed-off-by:
CKI Backport Bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com> --- <small>Created 2024-09-13 15:15 UTC by backporter</small> Approved-by:
Eric Chanudet <echanude@redhat.com> Approved-by:
Andrew Halaney <ahalaney@redhat.com> Approved-by:
Brian Masney <bmasney@redhat.com> Approved-by:
CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by:
Augusto Caringi <acaringi@redhat.com>
-
- Mar 11, 2025
-
-
Mamatha Inamdar authored
JIRA: https://issues.redhat.com/browse/RHEL-80803
CVE: CVE-2022-49623

commit 19fc5bb93c6bbdce8292b4d7eed04e2fa118d2fe
Author: Nathan Lynch <nathanl@linux.ibm.com>
Date: Thu Jun 23 13:25:09 2022 -0500

powerpc/xive/spapr: correct bitmap allocation size

kasan detects access beyond the end of the xibm->bitmap allocation:

BUG: KASAN: slab-out-of-bounds in _find_first_zero_bit+0x40/0x140
Read of size 8 at addr c00000001d1d0118 by task swapper/0/1

CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc2-00001-g90df023b36dd #28
Call Trace:
[c00000001d98f770] [c0000000012baab8] dump_stack_lvl+0xac/0x108 (unreliable)
[c00000001d98f7b0] [c00000000068faac] print_report+0x37c/0x710
[c00000001d98f880] [c0000000006902c0] kasan_report+0x110/0x354
[c00000001d98f950] [c000000000692324] __asan_load8+0xa4/0xe0
[c00000001d98f970] [c0000000011c6ed0] _find_first_zero_bit+0x40/0x140
[c00000001d98f9b0] [c0000000000dbfbc] xive_spapr_get_ipi+0xcc/0x260
[c00000001d98fa70] [c0000000000d6d28] xive_setup_cpu_ipi+0x1e8/0x450
[c00000001d98fb30] [c000000004032a20] pSeries_smp_probe+0x5c/0x118
[c00000001d98fb60] [c000000004018b44] smp_prepare_cpus+0x944/0x9ac
[c00000001d98fc90] [c000000004009f9c] kernel_init_freeable+0x2d4/0x640
[c00000001d98fd90] [c0000000000131e8] kernel_init+0x28/0x1d0
[c00000001d98fe10] [c00000000000cd54] ret_from_kernel_thread+0x5c/0x64

Allocated by task 0:
kasan_save_stack+0x34/0x70
__kasan_kmalloc+0xb4/0xf0
__kmalloc+0x268/0x540
xive_spapr_init+0x4d0/0x77c
pseries_init_irq+0x40/0x27c
init_IRQ+0x44/0x84
start_kernel+0x2a4/0x538
start_here_common+0x1c/0x20

The buggy address belongs to the object at c00000001d1d0118 which belongs to the cache kmalloc-8 of size 8
The buggy address is located 0 bytes inside of 8-byte region [c00000001d1d0118, c00000001d1d0120)

The buggy address belongs to the physical page:
page:c00c000000074740 refcount:1 mapcount:0 mapping:0000000000000000 index:0xc00000001d1d0558 pfn:0x1d1d
flags: 0x7ffff000000200(slab|node=0|zone=0|lastcpupid=0x7ffff)
raw: 007ffff000000200 c00000001d0003c8 c00000001d0003c8 c00000001d010480
raw: c00000001d1d0558 0000000001e1000a 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 c00000001d1d0000: fc 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 c00000001d1d0080: fc fc 00 fc fc fc fc fc fc fc fc fc fc fc fc fc
>c00000001d1d0100: fc fc fc 02 fc fc fc fc fc fc fc fc fc fc fc fc
                            ^
 c00000001d1d0180: fc fc fc fc 04 fc fc fc fc fc fc fc fc fc fc fc
 c00000001d1d0200: fc fc fc fc fc 04 fc fc fc fc fc fc fc fc fc fc

This happens because the allocation uses the wrong unit (bits) when it should pass (BITS_TO_LONGS(count) * sizeof(long)) or equivalent. With small numbers of bits, the allocated object can be smaller than sizeof(long), which results in invalid accesses.

Use bitmap_zalloc() to allocate and initialize the irq bitmap, paired with bitmap_free() for consistency.

Signed-off-by:
Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by:
Cédric Le Goater <clg@kaod.org> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220623182509.3985625-1-nathanl@linux.ibm.com Signed-off-by:
Mamatha Inamdar <minamdar@redhat.com>
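The sizing rule quoted in the commit above, BITS_TO_LONGS(count) * sizeof(long), can be worked through numerically. The Python sketch below assumes a 64-bit long (an assumption; the commit does not state the word size) and uses hypothetical helper names. It compares the bytes a bitmap needs with the bytes obtained when the bit count is passed where a byte count is expected.

```python
BITS_PER_LONG = 64                   # assumption: 64-bit kernel
BYTES_PER_LONG = BITS_PER_LONG // 8


def bits_to_longs(nbits):
    """Number of longs needed to hold nbits bits, rounded up."""
    return (nbits + BITS_PER_LONG - 1) // BITS_PER_LONG


def bitmap_bytes(nbits):
    """Bytes a bitmap of nbits bits must occupy (whole longs only)."""
    return bits_to_longs(nbits) * BYTES_PER_LONG


def undersized(nbits):
    """True if allocating `nbits` bytes is too small for an nbits-bit bitmap."""
    return nbits < bitmap_bytes(nbits)


# Small bit counts are the dangerous case: 2 bits need 8 bytes, not 2.
small_cases = [n for n in (1, 2, 4, 7, 8, 64, 65) if undersized(n)]
```

Under these assumptions only bit counts below the size of one long are undersized, which is consistent with the commit's remark that the problem shows up with small numbers of bits.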
-
- Mar 10, 2025
-
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 Currently only stacked devices need to explicitly enable atomic writes by setting the BLK_FEAT_ATOMIC_WRITES_STACKED flag. This does not work well for device mapper stacking devices, as many sets of limits are stacked there and which device is the 'bottom' and which is the 'top' can be swapped. This means that BLK_FEAT_ATOMIC_WRITES_STACKED needs to be set for many queue limits, which is messy. Generalize enabling atomic writes by ensuring that all devices must explicitly set a flag - that includes NVMe, SCSI sd, and md raid. Signed-off-by:
John Garry <john.g.garry@oracle.com> Reviewed-by:
Mike Snitzer <snitzer@kernel.org> Link: https://lore.kernel.org/r/20250116170301.474130-2-john.g.garry@oracle.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> (cherry picked from commit 6a7e17b22062c84a111d7073c67cc677c4190f32) Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 For stacking atomic writes, ensure that the start sector is aligned with the device atomic write unit min and any boundary. Otherwise, we may permit misaligned atomic writes. Rework bdev_can_atomic_write() into a common helper to reuse the alignment check. Also use atomic_write_hw_unit_min there, which is more proper (than atomic_write_unit_min). Fixes: d7f36dc446e89 ("block: Support atomic writes limits for stacked devices") Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
John Garry <john.g.garry@oracle.com> Reviewed-by:
Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20250109114000.2299896-2-john.g.garry@oracle.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> (cherry picked from commit 6564862d646e7d630929ba1ff330740bb215bdac) Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
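The alignment requirement in the commit above (the start sector must be aligned with the atomic write unit min and any boundary) can be sketched as a small check. This Python version is illustrative only: the function names are hypothetical, 512-byte sectors are assumed, and both values are assumed to be powers of two so that aligning to the larger one also aligns to the smaller.

```python
SECTOR_SIZE = 512  # assumption: 512-byte sectors


def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def start_sector_aligned(sector, unit_min_bytes, boundary_bytes=0):
    """Check a start sector against unit min and an optional boundary.

    Hypothetical helper; boundary_bytes == 0 means "no boundary".
    """
    if not is_power_of_two(unit_min_bytes):
        raise ValueError("unit_min_bytes must be a power of two")
    if boundary_bytes and not is_power_of_two(boundary_bytes):
        raise ValueError("boundary_bytes must be a power of two")
    # For powers of two, alignment to the larger value implies
    # alignment to the smaller one.
    alignment = max(unit_min_bytes, boundary_bytes)
    return (sector * SECTOR_SIZE) % alignment == 0
```

For example, sector 8 (byte offset 4096) passes with a 4 KiB unit min, but fails once an 8 KiB boundary is also in force.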
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 Allow stacked devices to support atomic writes by aggregating the minimum capability of all bottom devices. Flag BLK_FEAT_ATOMIC_WRITES_STACKED is set for stacked devices which have been enabled to support atomic writes. Some things to note on the implementation:
  - For simplicity, all bottom devices must have the same atomic write boundary value (if any)
  - The atomic write boundary must be a power-of-2 already, but this restriction could be relaxed. Furthermore, it is now required that the chunk sectors for a top device must be aligned with this boundary.
  - If a bottom device atomic write unit min/max are not aligned with the top device chunk sectors, the top device atomic write unit min/max are reduced to a value which works for the chunk sectors.
Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
John Garry <john.g.garry@oracle.com> Reviewed-by:
Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20241118105018.1870052-3-john.g.garry@oracle.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> (cherry picked from commit d7f36dc446e894e0f57b5f05c5628f03c5f9e2d2) Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
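The aggregation described in the stacked-devices commit above can be approximated with a short model. This is a sketch under stated assumptions, not the block layer's implementation: the class `AtomicLimits`, the function `stack_atomic_limits`, and the choice to reduce unit max to the largest power of two dividing the chunk size are all hypothetical. The check that chunk sectors align with the boundary is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class AtomicLimits:
    unit_min: int      # bytes, power of two
    unit_max: int      # bytes, power of two
    boundary: int = 0  # bytes, 0 means no boundary


def largest_pow2_divisor(n):
    """Largest power of two that divides n (n > 0)."""
    return n & -n


def stack_atomic_limits(bottoms, chunk_bytes=0):
    """Combine bottom-device limits; return None if atomic writes are off."""
    if not bottoms:
        return None
    boundary = bottoms[0].boundary
    # For simplicity, every bottom device must share one boundary value.
    if any(b.boundary != boundary for b in bottoms):
        return None
    # The top device can only promise what every bottom device supports.
    unit_min = max(b.unit_min for b in bottoms)
    unit_max = min(b.unit_max for b in bottoms)
    if chunk_bytes:
        # Assumed reduction rule: keep unit_max a power of two that
        # divides the chunk, so a write never straddles two chunks.
        unit_max = min(unit_max, largest_pow2_divisor(chunk_bytes))
    if unit_min > unit_max:
        return None
    return AtomicLimits(unit_min, unit_max, boundary)


a = AtomicLimits(unit_min=4096, unit_max=65536)
b = AtomicLimits(unit_min=8192, unit_max=32768)
top = stack_atomic_limits([a, b])
top_chunked = stack_atomic_limits([a, b], chunk_bytes=24576)
```

Taking the maximum of the unit mins and the minimum of the unit maxes is one way to read "aggregating the minimum capability": the resulting range is the intersection of what the bottom devices accept.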
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit fbe8f2fa971c537571994a0df532c511c4fb5537 Author: Bart Van Assche <bvanassche@acm.org> Date: Wed Feb 12 09:11:07 2025 -0800 md/raid*: Fix the set_queue_limits implementations queue_limits_cancel_update() must only be called if queue_limits_start_update() is called first. Remove the queue_limits_cancel_update() calls from the raid*_set_limits() functions because there is no corresponding queue_limits_start_update() call. Cc: Christoph Hellwig <hch@lst.de> Fixes: c6e56cf6b2e7 ("block: move integrity information into queue_limits") Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Reviewed-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/linux-raid/20250212171108.3483150-1-bvanassche@acm.org/ Signed-off-by:
Yu Kuai <yukuai@kernel.org> (cherry picked from commit fbe8f2fa971c537571994a0df532c511c4fb5537) Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit a572593ac80e51eb69ecede7e614289fcccdbf8d Author: Bart Van Assche <bvanassche@acm.org> Date: Wed Jan 29 14:56:35 2025 -0800 md: Fix linear_set_limits() queue_limits_cancel_update() must only be called if queue_limits_start_update() is called first. Remove the queue_limits_cancel_update() call from linear_set_limits() because there is no corresponding queue_limits_start_update() call. This bug was discovered by annotating all mutex operations with clang thread-safety attributes and by building the kernel with clang and -Wthread-safety. Cc: Yu Kuai <yukuai3@huawei.com> Cc: Coly Li <colyli@kernel.org> Cc: Mike Snitzer <snitzer@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Fixes: 127186cfb184 ("md: reintroduce md-linear") Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Reviewed-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250129225636.2667932-1-bvanassche@acm.org Signed-off-by:
Song Liu <song@kernel.org> (cherry picked from commit a572593ac80e51eb69ecede7e614289fcccdbf8d) Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
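The pairing rule from the two commits above (a cancel is only valid after a matching start) can be demonstrated with an ordinary lock. This Python sketch assumes, as the reference to mutex annotations suggests, that the start call takes a lock which the cancel or commit call releases. The class `QueueLimits` and its methods are hypothetical stand-ins, not the kernel API.

```python
import threading


class QueueLimits:
    """Hypothetical model of a limits object guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.values = {}

    def start_update(self):
        # Takes the lock and hands back a private copy to edit.
        self._lock.acquire()
        return dict(self.values)

    def commit_update(self, new_values):
        self.values = new_values
        self._lock.release()

    def cancel_update(self):
        # Only legal when start_update() was called first.
        self._lock.release()


def unpaired_cancel_fails(limits):
    """Return True if cancelling without a prior start raises an error."""
    try:
        limits.cancel_update()
    except RuntimeError:
        return True
    return False
```

Python raises RuntimeError when an unlocked lock is released, so the mistake is visible immediately in this model. A kernel mutex offers no such guarantee, which may be why the commit above credits static thread-safety annotations with finding the bug.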
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit 8d28d0ddb986f56920ac97ae704cc3340a699a30 Author: Yu Kuai <yukuai3@huawei.com> Date: Fri Jan 24 17:20:55 2025 +0800 md/md-bitmap: Synchronize bitmap_get_stats() with bitmap lifetime After commit ec6bb299c7c3 ("md/md-bitmap: add 'sync_size' into struct md_bitmap_stats"), the following panic is reported: Oops: general protection fault, probably for non-canonical address RIP: 0010:bitmap_get_stats+0x2b/0xa0 Call Trace: <TASK> md_seq_show+0x2d2/0x5b0 seq_read_iter+0x2b9/0x470 seq_read+0x12f/0x180 proc_reg_read+0x57/0xb0 vfs_read+0xf6/0x380 ksys_read+0x6c/0xf0 do_syscall_64+0x82/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e The root cause is that bitmap_get_stats() can be called at any time if mddev is still there, even if the bitmap is destroyed or not fully initialized. Dereferencing the bitmap in this case can crash the kernel. Meanwhile, the above commit started dereferencing bitmap->storage, making the problem easier to trigger. Fix the problem by protecting bitmap_get_stats() with bitmap_info.mutex. Cc: stable@vger.kernel.org # v6.12+ Fixes: 32a7627c ("[PATCH] md: optimised resync using Bitmap based intent logging") Reported-and-tested-by:
Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Closes: https://lore.kernel.org/linux-raid/ca3a91a2-50ae-4f68-b317-abd9889f3907@oracle.com/T/#m6e5086c95201135e4941fe38f9efa76daf9666c5 Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250124092055.4050195-1-yukuai1@huaweicloud.com Signed-off-by:
Song Liu <song@kernel.org> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit 62c552070a980363d55a6082b432ebd1cade7a6e Author: Dan Carpenter <dan.carpenter@linaro.org> Date: Wed Jan 15 09:53:52 2025 +0300 md/md-linear: Fix a NULL vs IS_ERR() bug in linear_add() The linear_conf() returns error pointers, it doesn't return NULL. Update the error checking to match. Fixes: 127186cfb184 ("md: reintroduce md-linear") Signed-off-by:
Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by:
Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/add654be-759f-4b2d-93ba-a3726dae380c@stanley.mountain Signed-off-by:
Song Liu <song@kernel.org> (cherry picked from commit 62c552070a980363d55a6082b432ebd1cade7a6e) Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
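The NULL versus IS_ERR() distinction in the commit above comes from the kernel convention of encoding a negative errno in the top of the pointer range. The Python sketch below models that encoding with plain integers, assuming 64-bit pointers. The lower-case helper names are stand-ins for illustration, not the kernel macros themselves.

```python
import errno

MAX_ERRNO = 4095
POINTER_BITS = 64                      # assumption: 64-bit pointers
POINTER_MASK = (1 << POINTER_BITS) - 1
NULL = 0


def err_ptr(err):
    """Encode a negative errno as a pointer-sized unsigned value."""
    return err & POINTER_MASK


def is_err(ptr):
    """True if ptr lies in the top MAX_ERRNO values of the address space."""
    return ptr >= (-MAX_ERRNO & POINTER_MASK)


def ptr_err(ptr):
    """Recover the negative errno from an error pointer."""
    return ptr - (1 << POINTER_BITS)


conf = err_ptr(-errno.ENOMEM)
null_check_catches_it = (conf == NULL)  # what a NULL-only check sees
is_err_catches_it = is_err(conf)
```

In this model an error pointer is a large non-zero value, so a NULL-only test treats it as a valid object.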
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514

commit cd5fc653381811f1e0ba65f5d169918cab61476f
Author: Yu Kuai <yukuai3@huawei.com>
Date: Thu Jan 9 09:51:45 2025 +0800

md/md-bitmap: move bitmap_{start, end}write to md upper layer

There are two BUG reports that raid5 will hang at bitmap_startwrite() ([1], [2]). The root cause is that bitmap start write and end write are unbalanced; it is not quite clear where. While reviewing the raid5 code, it was found that bitmap operations can be optimized.

For example, for a 4-disk raid5 with chunksize=8k, if the user issues an IO (0 + 48k) to the array:

┌────────────────────────────────────────────────────────────┐
│chunk 0                                                     │
│      ┌────────────┬─────────────┬─────────────┬────────────┼
│  sh0 │A0: 0 + 4k  │A1: 8k + 4k  │A2: 16k + 4k │A3: P       │
│      ┼────────────┼─────────────┼─────────────┼────────────┼
│  sh1 │B0: 4k + 4k │B1: 12k + 4k │B2: 20k + 4k │B3: P       │
┼──────┴────────────┴─────────────┴─────────────┴────────────┼
│chunk 1                                                     │
│      ┌────────────┬─────────────┬─────────────┬────────────┤
│  sh2 │C0: 24k + 4k│C1: 32k + 4k │C2: P        │C3: 40k + 4k│
│      ┼────────────┼─────────────┼─────────────┼────────────┼
│  sh3 │D0: 28k + 4k│D1: 36k + 4k │D2: P        │D3: 44k + 4k│
└──────┴────────────┴─────────────┴─────────────┴────────────┘

Before this patch, 4 stripe heads are used, each sh attaches bios for 3 disks, and each attached bio triggers bitmap_startwrite() once, for a total of 12 calls:
  - 3 times (0 + 4k), for (A0, A1 and A2)
  - 3 times (4 + 4k), for (B0, B1 and B2)
  - 3 times (8 + 4k), for (C0, C1 and C3)
  - 3 times (12 + 4k), for (D0, D1 and D3)

After this patch, the md upper layer calculates that IO range (0 + 48k) corresponds to the bitmap range (0 + 16k), and calls bitmap_startwrite() just once.

Note that this patch aligns bitmap ranges to chunks. For example, if the user issues an IO (0 + 4k) to the array:
  - Before this patch, 1 time (0 + 4k), for A0;
  - After this patch, 1 time (0 + 8k) for chunk 0;

Usually, one bitmap bit represents more than one disk chunk, so this makes no difference. Even if the user really created an array in which one chunk contains multiple bits, the only overhead is that more data will be recovered after a power failure.

Also remove STRIPE_BITMAP_PENDING since it's not used anymore.

[1] https://lore.kernel.org/all/CAJpMwyjmHQLvm6zg1cmQErttNNQPDAAXPKM3xgTjMhbfts986Q@mail.gmail.com/
[2] https://lore.kernel.org/all/ADF7D720-5764-4AF3-B68E-1845988737AA@flyingcircus.io/

Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250109015145.158868-6-yukuai1@huaweicloud.com Signed-off-by:
Song Liu <song@kernel.org> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
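The numbers in the commit above (12 calls before, one chunk-aligned range after) can be reproduced with a little arithmetic. The Python sketch below is a simplified model, not the raid5 code: the function names are hypothetical, the 4k page size per stripe head is taken from the example, and reshape or degraded cases are ignored.

```python
KIB = 1024


def bitmap_range(offset, length, chunk, data_disks):
    """Map an array IO range to a chunk-aligned per-disk (start, length).

    A full stripe holds chunk * data_disks bytes of data, and every
    stripe touched by the IO covers one whole chunk on each disk.
    """
    stripe = chunk * data_disks
    first_stripe = offset // stripe
    last_stripe = (offset + length - 1) // stripe
    start = first_stripe * chunk
    end = (last_stripe + 1) * chunk
    return start, end - start


def calls_before(offset, length, page=4 * KIB):
    """Model of the old behaviour: one call per 4k data block touched."""
    return (offset + length - 1) // page - offset // page + 1


# The 4-disk raid5 from the commit message: 3 data disks, 8k chunks.
big = bitmap_range(0, 48 * KIB, chunk=8 * KIB, data_disks=3)
small = bitmap_range(0, 4 * KIB, chunk=8 * KIB, data_disks=3)
old_calls = calls_before(0, 48 * KIB)
```

With these inputs the 48k write maps to the per-disk range (0 + 16k) and the 4k write to (0 + 8k), matching the figures quoted in the commit message.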
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit 9c89f604476cf15c31fbbdb043cff7fbf1dbe0cb Author: Yu Kuai <yukuai3@huawei.com> Date: Thu Jan 9 09:51:44 2025 +0800 md/raid5: implement pers->bitmap_sector() Bitmap is used for the whole array for raid1/raid10, hence IO for the array can be used directly for bitmap. However, bitmap is used for underlying disks for raid5, hence IO for the array can't be used directly for bitmap. Implement pers->bitmap_sector() for raid5 to convert IO ranges from the array to the underlying disks. Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250109015145.158868-5-yukuai1@huaweicloud.com Signed-off-by:
Song Liu <song@kernel.org> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit 0c984a283a3ea3f10bebecd6c57c1d41b2e4f518 Author: Yu Kuai <yukuai3@huawei.com> Date: Thu Jan 9 09:51:43 2025 +0800 md: add a new callback pers->bitmap_sector() This callback will be used in raid5 to convert io ranges from array to bitmap. Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Xiao Ni <xni@redhat.com> Link: https://lore.kernel.org/r/20250109015145.158868-4-yukuai1@huaweicloud.com Signed-off-by:
Song Liu <song@kernel.org> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit 4f0e7d0e03b7b80af84759a9e7cfb0f81ac4adae Author: Yu Kuai <yukuai3@huawei.com> Date: Thu Jan 9 09:51:42 2025 +0800 md/md-bitmap: remove the last parameter for bimtap_ops->endwrite() When IO fails for one rdev, the bit is marked as NEEDED in the following cases: 1) If badblocks is set and rdev is not faulty; 2) If rdev is faulty; Case 1) is useless because synchronizing data to bad blocks makes no sense. Case 2) can be replaced with mddev->degraded. Also remove R1BIO_Degraded, R10BIO_Degraded and STRIPE_DEGRADED since case 2) no longer uses them. Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250109015145.158868-3-yukuai1@huaweicloud.com Signed-off-by:
Song Liu <song@kernel.org> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit 08c50142a128dcb2d7060aa3b4c5db8837f7a46a Author: Yu Kuai <yukuai3@huawei.com> Date: Thu Jan 9 09:51:41 2025 +0800 md/md-bitmap: factor behind write counters out from bitmap_{start/end}write() behind_write is only used in raid1, prepare to refactor bitmap_{start/end}write(), there are no functional changes. Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Xiao Ni <xni@redhat.com> Link: https://lore.kernel.org/r/20250109015145.158868-2-yukuai1@huaweicloud.com Signed-off-by:
Song Liu <song@kernel.org> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit 4fa91616c078c203f1ab6c43f9524b7e352c8217 Author: David Reaver <me@davidreaver.com> Date: Wed Jan 8 11:21:30 2025 -0800 md: Replace deprecated kmap_atomic() with kmap_local_page() kmap_atomic() is deprecated and should be replaced with kmap_local_page() [1][2]. kmap_local_page() is faster in kernels with HIGHMEM enabled, can take page faults, and allows preemption. According to [2], this is safe as long as the code between kmap_atomic() and kunmap_atomic() does not implicitly depend on disabling page faults or preemption. It appears to me that none of the call sites in this patch depend on disabling page faults or preemption; they are all mapping a page to simply extract some information from it or print some debug info. [1] https://lwn.net/Articles/836144/ [2] https://docs.kernel.org/mm/highmem.html#temporary-virtual-mappings Signed-off-by:
David Reaver <me@davidreaver.com> Link: https://lore.kernel.org/r/20250108192131.46843-1-me@davidreaver.com Signed-off-by:
Song Liu <song@kernel.org> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit a1d9b4fd42d93f46c11e7e9d919a55a3f6ca6126 Author: John Garry <john.g.garry@oracle.com> Date: Mon Nov 18 10:50:18 2024 +0000 md/raid10: Atomic write support Set BLK_FEAT_ATOMIC_WRITES_STACKED to enable atomic writes. For an attempt to atomic write to a region which has bad blocks, error the write as we just cannot do this. It is unlikely to find devices which support atomic writes and bad blocks. Reviewed-by:
Yu Kuai <yukuai3@huawei.com> Signed-off-by:
John Garry <john.g.garry@oracle.com> Reviewed-by:
Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20241118105018.1870052-6-john.g.garry@oracle.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit f2a38abf5f1c5aeb3be8e9f4d3d815c867fff7ca Author: John Garry <john.g.garry@oracle.com> Date: Mon Nov 18 10:50:17 2024 +0000 md/raid1: Atomic write support Set BLK_FEAT_ATOMIC_WRITES_STACKED to enable atomic writes. For an attempt to atomic write to a region which has bad blocks, error the write as we just cannot do this. It is unlikely to find devices which support atomic writes and bad blocks. Reviewed-by:
Yu Kuai <yukuai3@huawei.com> Signed-off-by:
John Garry <john.g.garry@oracle.com> Reviewed-by:
Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20241118105018.1870052-5-john.g.garry@oracle.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit fa6fec82811bc6ebd3c4337ae4dae36c802c0fc1 Author: John Garry <john.g.garry@oracle.com> Date: Mon Nov 18 10:50:16 2024 +0000 md/raid0: Atomic write support Set BLK_FEAT_ATOMIC_WRITES_STACKED to enable atomic writes. All other stacked device request queue limits should automatically be set properly. With regards to atomic write max bytes limit, this will be set at hw_max_sectors and this is limited by the stripe width, which we want. Reviewed-by:
Yu Kuai <yukuai3@huawei.com> Signed-off-by:
John Garry <john.g.garry@oracle.com> Reviewed-by:
Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20241118105018.1870052-4-john.g.garry@oracle.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit ea90d270349d51086d0dddc55821a782040d68f5 Author: John Garry <john.g.garry@oracle.com> Date: Tue Nov 12 16:10:18 2024 +0000 md/raid5: Increase r5conf.cache_name size For compiling with W=1, the following warning can be seen: drivers/md/raid5.c: In function ‘setup_conf’: drivers/md/raid5.c:2423:12: error: ‘%s’ directive output may be truncated writing up to 31 bytes into a region of size between 16 and 26 [-Werror=format-truncation=] "raid%d-%s", conf->level, mdname(conf->mddev)); ^~ drivers/md/raid5.c:2422:3: note: ‘snprintf’ output between 7 and 48 bytes into a destination of size 32 snprintf(conf->cache_name[0], namelen, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ "raid%d-%s", conf->level, mdname(conf->mddev)); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors Increase the array size to avoid this warning. Signed-off-by:
John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241112161019.4154616-2-john.g.garry@oracle.com Signed-off-by:
Song Liu <song@kernel.org> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
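
The truncation that the compiler warns about can be reproduced in userspace. The following is a minimal sketch, not the raid5 code: `format_cache_name()` is a hypothetical helper, and the 64-byte buffer in the usage note is an arbitrary illustrative size, since the commit message does not state the new array size.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: formats "raid<level>-<name>" into dst.
 * snprintf() returns the length the full string would have had, so a
 * return value >= size means the output was truncated.
 * Returns 1 on truncation (or error), 0 if the whole name fit.
 */
static int format_cache_name(char *dst, size_t size, int level,
                             const char *name)
{
    assert(dst != NULL && size > 0);
    int n = snprintf(dst, size, "raid%d-%s", level, name);
    return n < 0 || (size_t)n >= size;
}
```

With a 31-character device name, `format_cache_name()` reports truncation for a 32-byte destination (the size quoted in the warning) and succeeds for a 64-byte one, which is the situation the larger array avoids.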
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit 4cf58d9529097328b669e3c8693ed21e3a041903 Author: John Garry <john.g.garry@oracle.com> Date: Mon Nov 11 11:21:50 2024 +0000 md/raid10: Handle bio_split() errors Add proper bio_split() error handling. For any error, call raid_end_bio_io() and return. Except for discard, where we end the bio directly. Reviewed-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Hannes Reinecke <hare@suse.de> Signed-off-by:
John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241111112150.3756529-7-john.g.garry@oracle.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit b1a7ad8b5c4fa28325ee7b369a2d545d3e16ccde Author: John Garry <john.g.garry@oracle.com> Date: Mon Nov 11 11:21:49 2024 +0000 md/raid1: Handle bio_split() errors Add proper bio_split() error handling. For any error, call raid_end_bio_io() and return. For an error in the write path, we need to undo the increment in the rdev pending count and NULLify the r1_bio->bios[] pointers. For a read path failure, we need to undo the rdev pending count increment from the earlier read_balance() call. Reviewed-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Hannes Reinecke <hare@suse.de> Signed-off-by:
John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241111112150.3756529-6-john.g.garry@oracle.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit 74538fdac3e85aae55eb4ed786478ed2384cb85d Author: John Garry <john.g.garry@oracle.com> Date: Mon Nov 11 11:21:48 2024 +0000 md/raid0: Handle bio_split() errors Add proper bio_split() error handling. For any error, set bi_status, end the bio, and return. Reviewed-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Hannes Reinecke <hare@suse.de> Signed-off-by:
John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241111112150.3756529-5-john.g.garry@oracle.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
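
The "set the status, end the bio, and return" pattern from this commit message can be sketched with a userspace model. Everything here is a hypothetical stand-in: `struct request_model`, `split_model()`, `end_request_model()`, and `submit_model()` are illustrative names and do not mirror the kernel's `struct bio` or `bio_split()` signatures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an I/O request; not the kernel's struct bio. */
struct request_model {
    unsigned int sectors; /* remaining length of the request */
    int status;           /* 0 = ok, negative errno-style code on failure */
    int ended;            /* set once the request has been completed */
};

/*
 * Model of a split that can fail: carve the first `sectors` off `r` into
 * `front`. Returns 0 on success or a negative errno-style code.
 */
static int split_model(struct request_model *r, unsigned int sectors,
                       struct request_model *front)
{
    assert(r != NULL && front != NULL);
    if (sectors == 0 || sectors >= r->sectors)
        return -EINVAL;
    front->sectors = sectors;
    front->status = 0;
    front->ended = 0;
    r->sectors -= sectors;
    return 0;
}

static void end_request_model(struct request_model *r)
{
    r->ended = 1;
}

/* Returns 1 if the request went ahead, 0 if it was ended with an error. */
static int submit_model(struct request_model *r, unsigned int chunk_sectors)
{
    struct request_model front;

    if (r->sectors > chunk_sectors) {
        int err = split_model(r, chunk_sectors, &front);
        if (err) {
            /* Record the error, complete the request, and stop. */
            r->status = err;
            end_request_model(r);
            return 0;
        }
        /* On success the front part would be issued here (not modeled). */
    }
    return 1;
}
```

In this model, a failed split leaves the caller with a completed request carrying an error status rather than an unchecked result being used as if the split had succeeded.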
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit fa1944bbe6220eb929e2c02e5e8706b908565711 Author: Xiao Ni <xni@redhat.com> Date: Wed Nov 6 17:51:24 2024 +0800 md/raid5: Wait sync io to finish before changing group cnt A customer reported a bug: raid5 hangs when the thread cnt is changed while resync is running. The stripes are all in conf->handle_list and new threads can't handle them. Commit b39f35ebe86d ("md: don't quiesce in mddev_suspend()") removes pers->quiesce from mddev_suspend/resume. Before that commit, mddev_suspend waited for all ios, including sync io, to finish. Now it only waits for normal io. Fix this by calling raid5_quiesce directly from raid5_store_group_thread_cnt to wait for all sync requests to finish before changing the group cnt. Fixes: b39f35ebe86d ("md: don't quiesce in mddev_suspend()") Cc: stable@vger.kernel.org Signed-off-by:
Xiao Ni <xni@redhat.com> Reviewed-by:
Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20241106095124.74577-1-xni@redhat.com Signed-off-by:
Song Liu <song@kernel.org> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit 6012169e8aae9c0eda38bbedcd7a1540a81220ae Author: Yuan Can <yuancan@huawei.com> Date: Tue Nov 5 21:01:05 2024 +0800 md/md-bitmap: Add missing destroy_work_on_stack() This commit adds the missing destroy_work_on_stack() call for unplug_work.work in bitmap_unplug_async(). Fixes: a022325ab970 ("md/md-bitmap: add a new helper to unplug bitmap asynchrously") Cc: stable@vger.kernel.org Signed-off-by:
Yuan Can <yuancan@huawei.com> Reviewed-by:
Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20241105130105.127336-1-yuancan@huawei.com Signed-off-by:
Song Liu <song@kernel.org> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit 649bfec6908bd2365008db79b7328c6c22e662d8 Author: Yu Kuai <yukuai3@huawei.com> Date: Thu Oct 31 11:31:14 2024 +0800 md/raid5: don't set Faulty rdev for blocked_rdev Faulty rdev should never be accessed anymore, hence there is no point to wait for bad block to be acknowledged in this case while handling write request. Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Tested-by:
Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Link: https://lore.kernel.org/r/20241031033114.3845582-8-yukuai1@huaweicloud.com Signed-off-by:
Song Liu <song@kernel.org> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit d419284c95d369f2b77f71fb20f2d61850aa61b8 Author: Yu Kuai <yukuai3@huawei.com> Date: Thu Oct 31 11:31:13 2024 +0800 md/raid10: don't wait for Faulty rdev in wait_blocked_rdev() Faulty rdev should never be accessed anymore, hence there is no point to wait for bad block to be acknowledged in this case while handling write request. Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Tested-by:
Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Link: https://lore.kernel.org/r/20241031033114.3845582-7-yukuai1@huaweicloud.com Signed-off-by:
Song Liu <song@kernel.org> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit ff31a7ef2b13aae27203d7fc29280ab0a2f8bf18 Author: Yu Kuai <yukuai3@huawei.com> Date: Thu Oct 31 11:31:12 2024 +0800 md/raid1: don't wait for Faulty rdev in wait_blocked_rdev() Faulty rdev should never be accessed anymore, hence there is no point to wait for bad block to be acknowledged in this case while handling write request. Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Tested-by:
Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Link: https://lore.kernel.org/r/20241031033114.3845582-6-yukuai1@huaweicloud.com Signed-off-by:
Song Liu <song@kernel.org> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-
Nigel Croxon authored
JIRA: https://issues.redhat.com/browse/RHEL-73514 commit 88ed59c4cc6c2dbdf03345bce54e0d7a272937bc Author: Yu Kuai <yukuai3@huawei.com> Date: Thu Oct 31 11:31:11 2024 +0800 md/raid1: factor out helper to handle blocked rdev from raid1_write_request() Currently raid1 prepares IO for the underlying disks while checking whether any disk is blocked. If one is, the allocated resources must be released, and raid1 then waits for the rdev to be unblocked and tries to prepare the IO again. Make the code cleaner by checking for a blocked rdev first. It doesn't matter if the rdev is blocked while issuing IO; the IO will wait for the rdev to be unblocked or not. Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Tested-by:
Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Link: https://lore.kernel.org/r/20241031033114.3845582-5-yukuai1@huaweicloud.com Signed-off-by:
Song Liu <song@kernel.org> Signed-off-by:
Nigel Croxon <ncroxon@redhat.com>
-