- Mar 04, 2025
-
-
Suren Baghdasaryan authored
When the restrict_cma_redirect boot parameter was introduced, its default was set to follow the upstream behavior, which does not restrict any movable allocation from using CMA. However, this poses an issue when partners upgrade from previous Android kernels and expect the earlier behavior, which restricts CMA usage to movable allocations carrying __GFP_CMA. Change the default value of the restrict_cma_redirect boot parameter to keep backward compatibility with earlier ACK versions. Partners who need the upstream behavior will need to set restrict_cma_redirect=false explicitly. Bug: 399727765 Change-Id: Ia88008578557e38da54a455bc4ce3dc6f86fe52e Signed-off-by:
Suren Baghdasaryan <surenb@google.com>
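A minimal sketch, assuming the usual boot-parameter plumbing, of how restrict_cma_redirect could be parsed and consulted; the helper name and call site below are illustrative, not the actual ACK code (__GFP_CMA is the ACK-specific flag mentioned above):

    /* Default now preserves the legacy ACK behaviour (restricted). */
    static bool restrict_cma_redirect = true;

    static int __init restrict_cma_redirect_setup(char *buf)
    {
            return kstrtobool(buf, &restrict_cma_redirect);
    }
    early_param("restrict_cma_redirect", restrict_cma_redirect_setup);

    /* Hypothetical check in the movable-allocation fast path. */
    static inline bool movable_gfp_may_use_cma(gfp_t gfp_mask)
    {
            if (!restrict_cma_redirect)
                    return true;                 /* upstream: any movable allocation may use CMA */
            return !!(gfp_mask & __GFP_CMA);     /* legacy ACK: only explicit __GFP_CMA */
    }

Partners wanting the upstream behaviour would then boot with restrict_cma_redirect=false on the kernel command line.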
-
Kalesh Singh authored
cma_get_first_virtzone_base() was not defined if CONFIG_CMA is not set. Define it to return 0 (no-op) if !CONFIG_CMA. Bug: 400651191 Bug: 313807618 Test: tools/bazel run --lto=none //common:kernel_aarch64_microdroid_16k_dist Change-Id: I9083ece26d60cc967baf93e8e29b16d6b0901a15 Signed-off-by:
Kalesh Singh <kaleshsingh@google.com>
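The stub follows the usual !CONFIG_CMA pattern, roughly as below; the return type is an assumption (the log only says it returns 0):

    #ifdef CONFIG_CMA
    unsigned long cma_get_first_virtzone_base(void);
    #else
    static inline unsigned long cma_get_first_virtzone_base(void)
    {
            return 0;       /* no CMA configured, so there is no virtzone base to report */
    }
    #endif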
-
pengzhongcui authored
2 variable symbol(s) added 'struct tracepoint __tracepoint_android_vh_tune_swappiness' 'struct tracepoint __tracepoint_android_vh_shrink_slab_async' Bug: 399777353 Change-Id: If3fb7fa00349160e5b939b53208725396237c999 Signed-off-by:
pengzhongcui <pengzhongcui@xiaomi.corp-partner.google.com>
-
pengzhongcui authored
One vendor hook added: android_vh_do_shrink_slab_ex. Add a vendor hook point in do_shrink_slab() to optimize for user-experience-related threads and time-consuming shrinkers. Bug: 399777353 Change-Id: I63778c73f76930fe27869e33ba6cdb97d50cf543 Signed-off-by:
pengzhongcui <pengzhongcui@xiaomi.corp-partner.google.com>
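ACK vendor hooks generally follow the DECLARE_HOOK pattern sketched below; the argument list of android_vh_do_shrink_slab_ex is not given in this log, so the parameters and call site are assumptions for illustration only:

    #include <trace/hooks/vendor_hooks.h>

    /* Assumed prototype: let a vendor module skip or tune a shrinker run. */
    DECLARE_HOOK(android_vh_do_shrink_slab_ex,
            TP_PROTO(struct shrinker *shrinker, struct shrink_control *sc, bool *skip),
            TP_ARGS(shrinker, sc, skip));

    /* Hypothetical call site inside do_shrink_slab(): */
    bool skip = false;

    trace_android_vh_do_shrink_slab_ex(shrinker, shrinkctl, &skip);
    if (skip)
            return 0;       /* vendor decided this shrinker is too costly right now */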
-
Tangquan Zheng authored
In some cases, our demand for mTHP is not as urgent, while the demand for other resources, such as dma-buf, becomes more pressing. However, the reserved virtual zones may not be efficiently utilized by dma-buf and similar use cases. Therefore, if we are absolutely certain that the product will not require movable zones, we can allow virtual zones to be allocated for requests with unmovable flags. After supporting the use of virtual zones for non-movable allocations, we need to address the large page migration issue triggered by the pin_user_pages function: During large folio performance profiling, we found unnecessary performance loss due to a large number of migrations caused by the pin_user_pages function during boot-up and continuous application startup testing. Call trace: dump_stack+0x18/0x24 folio_add_anon_rmap_ptes+0x294/0x338 remove_migration_pte+0x268/0x514 rmap_walk_anon+0x1c8/0x278 rmap_walk+0x28/0x38 migrate_pages_batch+0xbc8/0x126c migrate_pages+0x16c/0x7a4 __gup_longterm_locked+0x4a4/0x85c pin_user_pages+0x68/0xc4 gup_local_repeat+0x38/0x1cc [mcDrvModule_ffa] tee_mmu_create+0x368/0x804 [mcDrvModule_ffa] client_mmu_create+0x58/0xd4 [mcDrvModule_ffa] wsm_create+0x44/0x11c [mcDrvModule_ffa] session_mc_map+0x174/0x244 [mcDrvModule_ffa] client_mc_map+0x34/0x64 [mcDrvModule_ffa] user_ioctl+0x704/0x830 [mcDrvModule_ffa] __arm64_sys_ioctl+0xa8/0xe4 With TAO enabled, large folios come from either the ZONE_NOSPLIT or ZONE_NOMERGE policy zone. Both policy zones are movable zones (refer to folio_is_zone_movable). Therefore folio_is_longterm_pinnable() returns false, and the large folio is added to the movable_page_list and migrated in the migrate_longterm_unpinnable_pages function. On the other hand, migrating large folios is costly and involves frequent calls to the deferred_split_folio function. What's worse, the destination folio will still be allocated from the two policy zones, since the gfp flags are set to GFP_TRANSHUGE in the alloc_migration_target function. So the whole migration becomes meaningless. Bug: 313807618 Change-Id: I2fdfc4df8b03daa96fd6c2c8c6630d26a8509ad0 Signed-off-by:
Barry Song <v-songbaohua@oppo.com> Signed-off-by:
Shuai Yuanyuan <yuanshuai@oppo.com> Signed-off-by:
Tangquan Zheng <zhengtangquan@oppo.com>
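The migration described above follows from the (simplified) upstream check sketched here: TAO policy-zone folios count as movable, so long-term pins force them onto movable_page_list; the CMA and device-coherent checks are elided:

    /* Simplified sketch of the check referred to above. */
    static inline bool folio_is_longterm_pinnable(struct folio *folio)
    {
            /* CMA and device-coherent cases elided for brevity. */
            return !folio_is_zone_movable(folio);   /* ZONE_NOSPLIT/ZONE_NOMERGE read as movable */
    }

__gup_longterm_locked() then collects the non-pinnable folios and calls migrate_longterm_unpinnable_pages(), which produces the call trace shown in the commit message.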
-
Tangquan Zheng authored
We have discovered a bug: when there is an overlap between CMA and TAO, the cma_alloc() function internally causes splitting, which directly triggers a kernel warning. Call trace: split_free_page+0x29c/0x2f0 isolate_single_pageblock+0x38c/0x478 start_isolate_page_range+0x8c/0x178 alloc_contig_range+0xf8/0x2f0 __cma_alloc+0x3dc/0x668 cma_alloc+0x28/0x40 dma_alloc_from_contiguous+0x4c/0x60 atomic_pool_expand+0x9c/0x338 __dma_atomic_pool_init+0x54/0xc8 dma_atomic_pool_init+0xb8/0x200 do_one_initcall+0x80/0x360 kernel_init_freeable+0x2ac/0x568 kernel_init+0x2c/0x1f0 ret_from_fork+0x10/0x20 The fix for this issue is to skip the virtzones area when reserving the CMA region, ensuring that CMA and virtzones do not overlap. Bug: 313807618 Change-Id: I121c75defa6652777491818fcad1e87d14d0f02f Signed-off-by:
Tangquan Zheng <zhengtangquan@oppo.com>
-
Tangquan Zheng authored
While large folios originate from virtual zones, split_folio() migrates them into nr_pages small folios and returns a value greater than 0. In this case, we should retry the move operation using the new small folios as sources. Otherwise, this may trigger a kernel BUG. [ 64.788670] ------------[ cut here ]------------ [ 64.789179] WARNING: CPU: 0 PID: 126 at mm/userfaultfd.c:1760 move_pages+0x2bc/0x1960 [ 64.790059] Modules linked in: [ 64.790866] CPU: 0 PID: 126 Comm: a.out Tainted: G W 6.6.66-g29bb63ce7190-dirty #216 [ 64.791467] Hardware name: linux,dummy-virt (DT) [ 64.791933] pstate: 21402005 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 64.792412] pc : move_pages+0x2bc/0x1960 [ 64.792810] lr : move_pages+0x1a4/0x1960 [ 64.793194] sp : ffff800083ffbbc0 [ 64.793552] x29: ffff800083ffbc50 x28: 0000ffff850a0000 x27: 0000000000000001 [ 64.794412] x26: 0000ffff850b1000 x25: ffff00000576cd80 x24: ffff80008275aaf0 [ 64.795182] x23: ffff0000057625e8 x22: 0000000000000000 x21: ffff000005762a20 [ 64.795951] x20: 0000000000001000 x19: 0000ffff850b0000 x18: 0000000000000000 [ 64.796738] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000028 [ 64.797534] x14: 000000000000467c x13: 0000000000004679 x12: ffff8000834693c8 [ 64.798309] x11: 0000000000000000 x10: ffff8000825bdc20 x9 : ffff8000803e893c [ 64.799107] x8 : ffff000005123900 x7 : ffff8000826c3000 x6 : 0000000000000000 [ 64.799882] x5 : 0000000000000001 x4 : ffff000005123900 x3 : ffff8000825bc008 [ 64.800665] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000005123900 [ 64.801525] Call trace: [ 64.801929] move_pages+0x2bc/0x1960 [ 64.802346] userfaultfd_ioctl+0x484/0x1b98 [ 64.802742] __arm64_sys_ioctl+0xb4/0x100 [ 64.803137] invoke_syscall+0x50/0x120 [ 64.803521] el0_svc_common.constprop.0+0x48/0xf0 [ 64.803916] do_el0_svc+0x24/0x38 [ 64.804288] el0_svc+0x58/0x148 [ 64.804659] el0t_64_sync_handler+0x120/0x130 [ 64.805041] el0t_64_sync+0x1a4/0x1a8 [ 64.805472] irq event stamp: 492 [ 64.805830] hardirqs last enabled at (491): [<ffff8000803ce140>] uncharge_batch+0xd0/0x198 [ 64.806333] hardirqs last disabled at (492): [<ffff8000816761dc>] el1_dbg+0x24/0x98 [ 64.806829] softirqs last enabled at (486): [<ffff800080063368>] handle_softirqs+0x548/0x570 [ 64.807323] softirqs last disabled at (475): [<ffff800080010934>] __do_softirq+0x1c/0x28 [ 64.807813] ---[ end trace 0000000000000000 ]--- Bug: 313807618 Change-Id: Ia8aef8301ed2c8bad3ce690f129c55788330cd26 Signed-off-by:
Barry Song <v-songbaohua@oppo.com> Signed-off-by:
Tangquan Zheng <zhengtangquan@oppo.com>
-
Tangquan Zheng authored
10 function symbol(s) added 'int __ptep_set_access_flags(struct vm_area_struct*, unsigned long, pte_t*, pte_t, int)' 'int __traceiter_android_vh_alloc_swap_folio_gfp(void*, struct vm_area_struct*, gfp_t*)' 'int __traceiter_android_vh_get_swap_pages_bypass(void*, struct swap_info_struct*, int, bool*)' 'int __traceiter_android_vh_replace_anon_vma_name(void*, struct vm_area_struct*, struct anon_vma_name*)' 'int __traceiter_android_vh_reuse_whole_anon_folio(void*, struct folio*, struct vm_fault*, bool*)' 'int __traceiter_android_vh_should_skip_zone(void*, struct zone*, gfp_t, unsigned int, int, bool*)' 'int __traceiter_android_vh_should_split_folio_to_list(void*, struct folio*, bool*)' 'int __traceiter_android_vh_update_unmapped_area_info(void*, struct vm_unmapped_area_info*)' 'pte_t contpte_ptep_get(pte_t*, pte_t)' 'int contpte_ptep_set_access_flags(struct vm_area_struct*, unsigned long, pte_t*, pte_t, int)' Bug: 313807618 Change-Id: I5f2884ec964be2a15e2052dc6b5c55a88b14a424 Signed-off-by:
Tangquan Zheng <zhengtangquan@oppo.com>
-
Tangquan Zheng authored
Barry Song reported a problem here and provided an RFC patch: https://lore.kernel.org/linux-mm/20240831092339.66085-1-21cnbao@gmail.com/ do_wp_page() has no ability to reuse an mTHP, so it will CoW many small folios when a write-protect fault occurs on a read-only mTHP (for example, due to fork). This can sometimes waste lots of memory. David also addressed this with a more generic approach: https://lkml.kernel.org/r/20240829165627.2256514-1-david@redhat.com Neither has been merged into mm. Before either David's or Barry's method is ready to be merged into the mainline, we need to submit this GKI hook to support this functionality. android_vh_reuse_whole_anon_folio ----This vendor hook is used to entirely reuse the whole anonymous mTHP in do_wp_page. We also need to export the symbol __ptep_set_access_flags because it is called in our vendor hook function. Bug: 313807618 Change-Id: I366569dd645a4a9e5f14c0d87e3768959e63ae17 Signed-off-by:
Tangquan Zheng <zhengtangquan@oppo.com>
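A rough sketch of how such a hook would be consulted in do_wp_page(); the argument order follows the __traceiter_android_vh_reuse_whole_anon_folio prototype in the ABI update listed earlier in this log (folio, vm_fault, bool *), and everything else is illustrative:

    /* Hypothetical call site in do_wp_page(), before falling back to per-page CoW. */
    bool reuse_whole = false;

    trace_android_vh_reuse_whole_anon_folio(folio, vmf, &reuse_whole);
    if (reuse_whole) {
            /* The vendor hook re-mapped the whole mTHP writable (it calls the
             * exported __ptep_set_access_flags), so skip the small-folio CoW. */
            return 0;
    }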
-
Tangquan Zheng authored
We are adding these hooks to customize the mTHP functionality. 1. android_vh_alloc_swap_folio_gfp ----We use this vendor hook to update the allocation flags for swapping in large pages. 2. android_vh_get_swap_pages_bypass ----We use dual zram to avoid swap fragmentation, as Chris's swap reservation has not yet been merged. This vendor hook is used to select different swap devices. 3. android_vh_should_split_folio_to_list ----This vendor hook is used to split shared mapped anonymous mTHP during swap-out. 4. android_vh_should_skip_zone ----This vendor hook is used to prevent mTHP from occupying too much non-virtzone memory. 5. android_vh_update_unmapped_area_info ----This is used to update vm_unmapped_area_info. 6. android_vh_replace_anon_vma_name ----This is used to mark anon_vma_name for mTHP. Bug: 313807618 Change-Id: Ibadb440f89dacad91be17ada9bbff8424e9244d3 Signed-off-by:
Tangquan Zheng <zhengtangquan@oppo.com>
-
- Mar 03, 2025
-
-
Aran Dalton authored
1 function symbol(s) added 'void* devm_pci_remap_cfgspace(struct device*, resource_size_t, resource_size_t)' Bug: 400289337 Change-Id: Ie641be053339715e699f95829c0a17aa4927242c Signed-off-by:
Aran Dalton <arda@allwinnertech.com>
-
Vincent Donnefort authored
A maple tree might need memory allocation during insertion. This is not possible under the rwlock. Therefore, we need to preallocate memory before that lock is taken. This is done with the KVM_DUMMY_PPAGE insert_range. However, to do so, we need to enable the maple tree RCU locking that will ensure the tree is stable between this lock-less insertion and concurrent tree walks. RCU protection must be applied manually. The sole limitation of the RCU protection is that the entry returned by mt_find() might not be valid. This is, however, not a problem, as all the readers are protected from modifiers by the mmu_lock. On the VM destroy path, we can relax the RCU protection: no vCPU can run and, as a consequence, no concurrent tree access will occur. Bug: 278749606 Bug: 395429108 Change-Id: I8400f3b7bdda76d2a60ddcaeb3ea027607898eb2 Signed-off-by:
Vincent Donnefort <vdonnefort@google.com>
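A minimal sketch of the scheme described above, using the generic maple tree API; the tree name and ranges are illustrative, not the actual pKVM code. Readers walk under rcu_read_lock() and tolerate a possibly stale mt_find() result because readers and modifiers both hold the mmu_lock:

    #include <linux/maple_tree.h>

    static struct maple_tree ppage_mt;                  /* illustrative name */

    mt_init_flags(&ppage_mt, MT_FLAGS_USE_RCU);         /* allow lockless walks */

    /* Preallocation outside the rwlock: insert a dummy entry covering the range
     * so the later update under the lock does not need to allocate. */
    mtree_insert_range(&ppage_mt, start, end, KVM_DUMMY_PPAGE, GFP_KERNEL);

    /* Reader side: RCU only. */
    rcu_read_lock();
    entry = mt_find(&ppage_mt, &index, ULONG_MAX);
    rcu_read_unlock();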
-
Roy Luo authored
[ Upstream commit 399a45e5 ] device_del() can lead to new work being scheduled in gadget->work workqueue. This is observed, for example, with the dwc3 driver with the following call stack: device_del() gadget_unbind_driver() usb_gadget_disconnect_locked() dwc3_gadget_pullup() dwc3_gadget_soft_disconnect() usb_gadget_set_state() schedule_work(&gadget->work) Move flush_work() after device_del() to ensure the workqueue is cleaned up. Fixes: 5702f753 ("usb: gadget: udc-core: move sysfs_notify() to a workqueue") Cc: stable <stable@kernel.org> Bug: 400301689 Change-Id: Icf64956f8a17b1876388546b679cfd203d9701dc Signed-off-by:
Roy Luo <royluo@google.com> Reviewed-by:
Alan Stern <stern@rowland.harvard.edu> Reviewed-by:
Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/20250204233642.666991-1-royluo@google.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Sasha Levin <sashal@kernel.org> (cherry picked from commit 859cb45a) Signed-off-by:
wei li <sirius.liwei@honor.corp-partner.google.com>
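The essence of the fix is an ordering change, roughly as below: device_del() can still queue gadget->work (via usb_gadget_set_state() in the pull-up path), so the work is flushed only afterwards.

    /* Before the fix, flush_work() ran ahead of device_del(), leaving a window
     * in which new work could be queued against a dying device. After the fix: */
    device_del(&gadget->dev);
    flush_work(&gadget->work);      /* nothing can re-queue it past this point */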
-
Carlos Llamas authored
Adding the following symbols: - xhci_create_secondary_interrupter - xhci_disable_interrupter - xhci_enable_interrupter - xhci_initialize_ring_info - xhci_remove_secondary_interrupter - xhci_set_interrupter_moderation - xhci_stop_endpoint_sync Bug: 391779198 Change-Id: I932747120c850b93468328900db99a1eb7821f47 Signed-off-by:
Carlos Llamas <cmllamas@google.com> [Lee: Rebased to avoid merge conflict - no changes required] Signed-off-by:
Lee Jones <joneslee@google.com>
-
Selvarasu Ganesan authored
The current implementation sets the wMaxPacketSize of the bulk in/out endpoints to 1024 bytes at the end of the f_midi_bind function. However, in cases where the first MIDI bind attempt fails, consider rebinding. This scenario may encounter an f_midi_bind issue because the previous bind set the bulk endpoint's wMaxPacketSize to 1024 bytes, which exceeds ep->maxpacket_limit when the dwc3 TX/RX FIFOs are configured with a maxpacket size of 512 bytes for IN/OUT endpoints that support HS speed only. Here the term "rebind" refers to attempting to bind the MIDI function a second time in certain scenarios. The situations where rebinding is considered include: * When the first UDC write attempt fails, which may be caused by other functions bound along with MIDI. * Runtime composition change. Example: MIDI,ADB to MIDI, or MIDI to MIDI,ADB. This commit addresses the issue by resetting the wMaxPacketSize before the endpoint claim. There is no need to reset all values in the USB endpoint descriptor structure, as all members except wMaxPacketSize and bEndpointAddress have predefined values. This restores the endpoint to its expected configuration and prevents conflicts with the value of ep->maxpacket_limit. It also aligns with the approach used in other function drivers, which treat endpoint descriptors as if they were full speed before the endpoint claim. Fixes: 46decc82 ("usb: gadget: unconditionally allocate hs/ss descriptor in bind operation") Cc: stable@vger.kernel.org Signed-off-by:
Selvarasu Ganesan <selvarasu.g@samsung.com> Link: https://lore.kernel.org/r/20250118060134.927-1-selvarasu.g@samsung.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Bug: 399689221 Change-Id: Ib90cffc2a0b1a8b25042b4fa7fcad7947bdf0995 (cherry picked from commit 9e8b2141) Signed-off-by:
Meitao Gao <meitaogao@asrmicro.com>
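A sketch of the reset described above, assuming the mainline f_midi full-speed bulk descriptors bulk_in_desc/bulk_out_desc; restoring the 64-byte full-speed default before usb_ep_autoconfig() keeps the value within ep->maxpacket_limit on a rebind:

    /* In f_midi_bind(), before claiming the endpoints again: */
    bulk_in_desc.wMaxPacketSize  = cpu_to_le16(64);
    bulk_out_desc.wMaxPacketSize = cpu_to_le16(64);

    midi->in_ep  = usb_ep_autoconfig(cdev->gadget, &bulk_in_desc);
    midi->out_ep = usb_ep_autoconfig(cdev->gadget, &bulk_out_desc);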
-
- Feb 28, 2025
-
-
Mukesh Ojha authored
This reverts commit ef1134dd. Some time back, commit cfb00a35 ("arm64: jump_label: Ensure patched jump_labels are visible to all CPUs") was merged into all applicable stable branches, citing a bug where static-key patching was not synchronized among the CPUs; it fixes this by sending an IPI to all cores. KFENCE is one of the users of static keys, and it has recently been observed that, after the above commit, toggling kfence_allocation_key sends IPIs to cores that are in low-power mode, which has regressed power numbers. After disabling CONFIG_KFENCE_STATIC_KEYS we see workloads improve in the range of 1% - 10%, resulting in 1% - 4% power savings for a few audio playback, video decode and display cases, with no regression on benchmarks. Bug: 394509835 Change-Id: I8efa3280bf115c33cc957f83ccb8e578730aa5f5 Signed-off-by:
Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
-
John Scheible authored
Adding the following symbols: - dev_pm_clear_wake_irq - dev_pm_set_wake_irq - dma_buf_vmap_unlocked - dma_buf_vunmap_unlocked - kvm_iommu_cma_alloc - kvm_iommu_cma_release - lru_cache_disable - lru_disable_count - __traceiter_android_vh_typec_store_partner_src_caps - __traceiter_android_vh_typec_tcpm_log - __traceiter_android_vh_typec_tcpm_modify_src_caps - __tracepoint_android_vh_typec_store_partner_src_caps - __tracepoint_android_vh_typec_tcpm_log - __tracepoint_android_vh_typec_tcpm_modify_src_caps - ufshcd_populate_vreg - ufshcd_resume_complete - ufshcd_runtime_resume - ufshcd_runtime_suspend - ufshcd_suspend_prepare Bug: 399486531 Change-Id: Idf6a99e32cb9330968310ee0e364985cdcd0e087 Signed-off-by:
John Scheible <johnscheible@google.com>
-
Yang Yang authored
Due to 72d04bdc ("sbitmap: fix io hung due to race on sbitmap_word::cleared") directly adding spinlock_t swap_lock to struct sbitmap_word in sbitmap.h, KMI was damaged. In order to achieve the same functionality without damaging KMI, we can only allocate a block of memory of size map_nr * (sizeof(*sb->map) + sizeof(spinlock_t)) to ensure that each struct sbitmap_word receives protection from a spinlock. The actual memory layout used is as follows: ---------------------- struct sbitmap_word[0] ...................... struct sbitmap_word[n] ----------------------- spinlock_t swap_lock[0] ....................... spinlock_t swap_lock[n] ---------------------- sbitmap_word[0] corresponds to swap_lock[0], sbitmap_word[n] corresponds to swap_lock[n], and so on. Fixes: ea86ea2c ("sbitmap: ammortize cost of clearing bits") Signed-off-by:
Yang Yang <yang.yang@vivo.com> Reviewed-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Bug: 382398521 Link: https://lore.kernel.org/r/20240716082644.659566-1-yang.yang@vivo.com Change-Id: Idcab0dd5fd7c3147efd05dd6cc51757c2b0464f6 Signed-off-by:
liuyu <liuyu@allwinnertech.com>
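The layout above amounts to the sketch below: one allocation carries map_nr sbitmap_word entries followed by map_nr spinlocks, and a small helper (name illustrative) maps word i to lock i without changing the exported struct definitions:

    /* Single allocation: the words first, then their spinlocks (KMI untouched). */
    sb->map = kvzalloc_node(sb->map_nr * (sizeof(*sb->map) + sizeof(spinlock_t)),
                            flags, node);

    /* Illustrative helper: the lock protecting sb->map[index]. */
    static inline spinlock_t *sbitmap_word_lock(struct sbitmap *sb, unsigned int index)
    {
            spinlock_t *locks = (spinlock_t *)&sb->map[sb->map_nr];

            return &locks[index];
    }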
-
Yang Yang authored
Configuration for sbq: depth=64, wake_batch=6, shift=6, map_nr=1 1. There are 64 requests in progress: map->word = 0xFFFFFFFFFFFFFFFF 2. After all the 64 requests complete, and no more requests come: map->word = 0xFFFFFFFFFFFFFFFF, map->cleared = 0xFFFFFFFFFFFFFFFF 3. Now two tasks try to allocate requests: T1: T2: __blk_mq_get_tag . __sbitmap_queue_get . sbitmap_get . sbitmap_find_bit . sbitmap_find_bit_in_word . __sbitmap_get_word -> nr=-1 __blk_mq_get_tag sbitmap_deferred_clear __sbitmap_queue_get /* map->cleared=0xFFFFFFFFFFFFFFFF */ sbitmap_find_bit if (!READ_ONCE(map->cleared)) sbitmap_find_bit_in_word return false; __sbitmap_get_word -> nr=-1 mask = xchg(&map->cleared, 0) sbitmap_deferred_clear atomic_long_andnot() /* map->cleared=0 */ if (!(map->cleared)) return false; /* * map->cleared is cleared by T1 * T2 fail to acquire the tag */ 4. T2 is the sole tag waiter. When T1 puts the tag, T2 cannot be woken up due to the wake_batch being set at 6. If no more requests come, T1 will wait here indefinitely. This patch achieves two purposes: 1. Check on ->cleared and update on both ->cleared and ->word need to be done atomically, and using spinlock could be the simplest solution. 2. Add extra check in sbitmap_deferred_clear(), to identify whether ->word has free bits. Fixes: ea86ea2c ("sbitmap: ammortize cost of clearing bits") Signed-off-by:
Yang Yang <yang.yang@vivo.com> Reviewed-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240716082644.659566-1-yang.yang@vivo.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> (cherry picked from commit 72d04bdc) Signed-off-by:
liuyu <liuyu@allwinnertech.com> Change-Id: Ibab11ef6a94d4db33fae5c4b314b119abc1cabc8
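A simplified sketch of the fix: take the per-word lock so the check on ->cleared and the updates of ->cleared/->word form one atomic section, and additionally report whether ->word still has a free bit (so T2 in the trace above does not give up spuriously). The alloc_hint/wrap handling of the real patch is omitted, and the lock is passed in to match the KMI-preserving layout from the previous entry:

    static bool sbitmap_deferred_clear(struct sbitmap_word *map, spinlock_t *swap_lock,
                                       unsigned int depth)
    {
            unsigned long mask, flags;
            bool ret = false;

            spin_lock_irqsave(swap_lock, flags);
            if (!map->cleared) {
                    if (depth > 0) {
                            /* Extra check: is a free bit still present in ->word? */
                            unsigned long used = READ_ONCE(map->word);

                            ret = used != (~0UL >> (BITS_PER_LONG - depth));
                    }
                    goto out;
            }
            /* Move cleared bits back into ->word, atomically w.r.t. other clearers. */
            mask = xchg(&map->cleared, 0);
            atomic_long_andnot(mask, (atomic_long_t *)&map->word);
            ret = true;
    out:
            spin_unlock_irqrestore(swap_lock, flags);
            return ret;
    }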
-
meitaogao authored
2 function symbol(s) added 'struct backlight_device* devm_of_find_backlight(struct device*)' 'void sdhci_reset_tuning(struct sdhci_host*)' Bug: 399689222 Change-Id: I1746ed7c4b0eef2e8f5363681b556af4eb5e7dcb Signed-off-by:
meitaogao <meitaogao@asrmicro.com>
-
Barry Song authored
The try_to_unmap_one() function currently handles PMD-mapped THPs inefficiently. It first splits the PMD into PTEs, copies the dirty state from the PMD to the PTEs, iterates over the PTEs to locate the dirty state, and then marks the THP as swap-backed. This process involves unnecessary PMD splitting and redundant iteration. Instead, this functionality can be efficiently managed in __discard_anon_folio_pmd_locked(), avoiding the extra steps and improving performance. The following microbenchmark redirties folios after invoking MADV_FREE, then measures the time taken to perform memory reclamation (actually set those folios swapbacked again) on the redirtied folios. #include <stdio.h> #include <sys/mman.h> #include <string.h> #include <time.h> #define SIZE 128*1024*1024 // 128 MB int main(int argc, char *argv[]) { while(1) { volatile int *p = mmap(0, SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); memset((void *)p, 1, SIZE); madvise((void *)p, SIZE, MADV_FREE); /* redirty after MADV_FREE */ memset((void *)p, 1, SIZE); clock_t start_time = clock(); madvise((void *)p, SIZE, MADV_PAGEOUT); clock_t end_time = clock(); double elapsed_time = (double)(end_time - start_time) / CLOCKS_PER_SEC; printf("Time taken by reclamation: %f seconds\n", elapsed_time); munmap((void *)p, SIZE); } return 0; } Testing results are as below, w/o patch: ~ # ./a.out Time taken by reclamation: 0.007300 seconds Time taken by reclamation: 0.007226 seconds Time taken by reclamation: 0.007295 seconds Time taken by reclamation: 0.007731 seconds Time taken by reclamation: 0.007134 seconds Time taken by reclamation: 0.007285 seconds Time taken by reclamation: 0.007720 seconds Time taken by reclamation: 0.007128 seconds Time taken by reclamation: 0.007710 seconds Time taken by reclamation: 0.007712 seconds Time taken by reclamation: 0.007236 seconds Time taken by reclamation: 0.007690 seconds Time taken by reclamation: 0.007174 seconds Time taken by reclamation: 0.007670 seconds Time taken by reclamation: 0.007169 seconds Time taken by reclamation: 0.007305 seconds Time taken by reclamation: 0.007432 seconds Time taken by reclamation: 0.007158 seconds Time taken by reclamation: 0.007133 seconds … w/ patch ~ # ./a.out Time taken by reclamation: 0.002124 seconds Time taken by reclamation: 0.002116 seconds Time taken by reclamation: 0.002150 seconds Time taken by reclamation: 0.002261 seconds Time taken by reclamation: 0.002137 seconds Time taken by reclamation: 0.002173 seconds Time taken by reclamation: 0.002063 seconds Time taken by reclamation: 0.002088 seconds Time taken by reclamation: 0.002169 seconds Time taken by reclamation: 0.002124 seconds Time taken by reclamation: 0.002111 seconds Time taken by reclamation: 0.002224 seconds Time taken by reclamation: 0.002297 seconds Time taken by reclamation: 0.002260 seconds Time taken by reclamation: 0.002246 seconds Time taken by reclamation: 0.002272 seconds Time taken by reclamation: 0.002277 seconds Time taken by reclamation: 0.002462 seconds … This patch significantly speeds up try_to_unmap_one() by allowing it to skip redirtied THPs without splitting the PMD. Link: https://lkml.kernel.org/r/20250214093015.51024-5-21cnbao@gmail.com Change-Id: Ifaca70178abd5b22e00d6e59ed4dcff0fc5cb0b6 Signed-off-by:
Barry Song <v-songbaohua@oppo.com> Suggested-by:
Baolin Wang <baolin.wang@linux.alibaba.com> Suggested-by:
Lance Yang <ioworker0@gmail.com> Reviewed-by:
Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by:
Lance Yang <ioworker0@gmail.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chis Li <chrisl@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Gavin Shan <gshan@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kairui Song <kasong@tencent.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mauricio Faria de Oliveira <mfo@canonical.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shaoqin Huang <shahuang@redhat.com> Cc: Tangquan Zheng <zhengtangquan@oppo.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Yosry Ahmed <yosryahmed@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 76a230cb https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable) Bug: 313807618 [ Fix trivial conflicts in unmap_huge_pmd_locked() - Kalesh Singh ] [ __discard_anon_folio_pmd_locked() drop changes related to VM_DROPPABLE which doesnt' exist on 6.6 - Kalesh Singh ] Signed-off-by:
Kalesh Singh <kaleshsingh@google.com>
-
Barry Song authored
Currently, the PTEs and rmap of a large folio are removed one at a time. This is not only slow but also causes the large folio to be unnecessarily added to deferred_split, which can lead to races between the deferred_split shrinker callback and memory reclamation. This patch releases all PTEs and rmap entries in a batch. Currently, it only handles lazyfree large folios. The below microbench tries to reclaim 128MB lazyfree large folios whose sizes are 64KiB: #include <stdio.h> #include <sys/mman.h> #include <string.h> #include <time.h> #define SIZE 128*1024*1024 // 128 MB unsigned long read_split_deferred() { FILE *file = fopen("/sys/kernel/mm/transparent_hugepage" "/hugepages-64kB/stats/split_deferred", "r"); if (!file) { perror("Error opening file"); return 0; } unsigned long value; if (fscanf(file, "%lu", &value) != 1) { perror("Error reading value"); fclose(file); return 0; } fclose(file); return value; } int main(int argc, char *argv[]) { while(1) { volatile int *p = mmap(0, SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); memset((void *)p, 1, SIZE); madvise((void *)p, SIZE, MADV_FREE); clock_t start_time = clock(); unsigned long start_split = read_split_deferred(); madvise((void *)p, SIZE, MADV_PAGEOUT); clock_t end_time = clock(); unsigned long end_split = read_split_deferred(); double elapsed_time = (double)(end_time - start_time) / CLOCKS_PER_SEC; printf("Time taken by reclamation: %f seconds, split_deferred: %ld\n", elapsed_time, end_split - start_split); munmap((void *)p, SIZE); } return 0; } w/o patch: ~ # ./a.out Time taken by reclamation: 0.177418 seconds, split_deferred: 2048 Time taken by reclamation: 0.178348 seconds, split_deferred: 2048 Time taken by reclamation: 0.174525 seconds, split_deferred: 2048 Time taken by reclamation: 0.171620 seconds, split_deferred: 2048 Time taken by reclamation: 0.172241 seconds, split_deferred: 2048 Time taken by reclamation: 0.174003 seconds, split_deferred: 2048 Time taken by reclamation: 0.171058 seconds, split_deferred: 2048 Time taken by reclamation: 0.171993 seconds, split_deferred: 2048 Time taken by reclamation: 0.169829 seconds, split_deferred: 2048 Time taken by reclamation: 0.172895 seconds, split_deferred: 2048 Time taken by reclamation: 0.176063 seconds, split_deferred: 2048 Time taken by reclamation: 0.172568 seconds, split_deferred: 2048 Time taken by reclamation: 0.171185 seconds, split_deferred: 2048 Time taken by reclamation: 0.170632 seconds, split_deferred: 2048 Time taken by reclamation: 0.170208 seconds, split_deferred: 2048 Time taken by reclamation: 0.174192 seconds, split_deferred: 2048 ... 
w/ patch: ~ # ./a.out Time taken by reclamation: 0.074231 seconds, split_deferred: 0 Time taken by reclamation: 0.071026 seconds, split_deferred: 0 Time taken by reclamation: 0.072029 seconds, split_deferred: 0 Time taken by reclamation: 0.071873 seconds, split_deferred: 0 Time taken by reclamation: 0.073573 seconds, split_deferred: 0 Time taken by reclamation: 0.071906 seconds, split_deferred: 0 Time taken by reclamation: 0.073604 seconds, split_deferred: 0 Time taken by reclamation: 0.075903 seconds, split_deferred: 0 Time taken by reclamation: 0.073191 seconds, split_deferred: 0 Time taken by reclamation: 0.071228 seconds, split_deferred: 0 Time taken by reclamation: 0.071391 seconds, split_deferred: 0 Time taken by reclamation: 0.071468 seconds, split_deferred: 0 Time taken by reclamation: 0.071896 seconds, split_deferred: 0 Time taken by reclamation: 0.072508 seconds, split_deferred: 0 Time taken by reclamation: 0.071884 seconds, split_deferred: 0 Time taken by reclamation: 0.072433 seconds, split_deferred: 0 Time taken by reclamation: 0.071939 seconds, split_deferred: 0 ... Link: https://lkml.kernel.org/r/20250214093015.51024-4-21cnbao@gmail.com Change-Id: If4df73981837946621ec25247aa426c06ab7dd28 Signed-off-by:
Barry Song <v-songbaohua@oppo.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chis Li <chrisl@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Gavin Shan <gshan@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kairui Song <kasong@tencent.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Lance Yang <ioworker0@gmail.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mauricio Faria de Oliveira <mfo@canonical.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shaoqin Huang <shahuang@redhat.com> Cc: Tangquan Zheng <zhengtangquan@oppo.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Yosry Ahmed <yosryahmed@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit a0188db7 https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable) Bug: 313807618 [ Fix conflicts in try_to_unmap_one() - Kalesh Singh ] Signed-off-by:
Kalesh Singh <kaleshsingh@google.com>
-
Barry Song authored
This patch lays the groundwork for supporting batch PTE unmapping in try_to_unmap_one(). It introduces range handling for TLB batch flushing, with the range currently set to the size of PAGE_SIZE. The function __flush_tlb_range_nosync() is architecture-specific and is only used within arch/arm64. This function requires the mm structure instead of the vma structure. To allow its reuse by arch_tlbbatch_add_pending(), which operates with mm but not vma, this patch modifies the argument of __flush_tlb_range_nosync() to take mm as its parameter. Link: https://lkml.kernel.org/r/20250214093015.51024-3-21cnbao@gmail.com Change-Id: Icfca8715feda7e6298a7e2ff76be8ca9a646d8b2 Signed-off-by:
Barry Song <v-songbaohua@oppo.com> Acked-by:
Will Deacon <will@kernel.org> Reviewed-by:
Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shaoqin Huang <shahuang@redhat.com> Cc: Gavin Shan <gshan@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Lance Yang <ioworker0@gmail.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Yosry Ahmed <yosryahmed@google.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Chis Li <chrisl@kernel.org> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Kairui Song <kasong@tencent.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mauricio Faria de Oliveira <mfo@canonical.com> Cc: Tangquan Zheng <zhengtangquan@oppo.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Bug: 313807618 (cherry picked from commit e00a2e56 https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable) [ Drop changes to riscv which don't exist in 6.6. Fix trivial conflicts in arm64 tlbflush.h - Kalesh Singh ] Signed-off-by:
Kalesh Singh <kaleshsingh@google.com>
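The interface change is roughly the sketch below (arm64 tlbflush.h): the helper now takes the mm rather than the vma, so arch_tlbbatch_add_pending(), which has no vma, can call it directly. Parameter details are abbreviated from the description above.

    /* Before: tied to a vma. */
    static inline void __flush_tlb_range_nosync(struct vm_area_struct *vma,
                    unsigned long start, unsigned long end,
                    unsigned long stride, bool last_level, int tlb_level);

    /* After: takes the mm; existing callers simply pass vma->vm_mm. */
    static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
                    unsigned long start, unsigned long end,
                    unsigned long stride, bool last_level, int tlb_level);

    __flush_tlb_range_nosync(vma->vm_mm, start, end, stride, last_level, tlb_level);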
-
Barry Song authored
Patch series "mm: batched unmap lazyfree large folios during reclamation", v4. Commit 735ecdfa ("mm/vmscan: avoid split lazyfree THP during shrink_folio_list()") prevents the splitting of MADV_FREE'd THP in madvise.c. However, those folios are still added to the deferred_split list in try_to_unmap_one() because we are unmapping PTEs and removing rmap entries one by one. Firstly, this has rendered the following counter somewhat confusing, /sys/kernel/mm/transparent_hugepage/hugepages-size/stats/split_deferred The split_deferred counter was originally designed to track operations such as partial unmap or madvise of large folios. However, in practice, most split_deferred cases arise from memory reclamation of aligned lazyfree mTHPs as observed by Tangquan. This discrepancy has made the split_deferred counter highly misleading. Secondly, this approach is slow because it requires iterating through each PTE and removing the rmap one by one for a large folio. In fact, all PTEs of a pte-mapped large folio should be unmapped at once, and the entire folio should be removed from the rmap as a whole. Thirdly, it also increases the risk of a race condition where lazyfree folios are incorrectly set back to swapbacked, as a speculative folio_get may occur in the shrinker's callback. deferred_split_scan() might call folio_try_get(folio) since we have added the folio to split_deferred list while removing rmap for the 1st subpage, and while we are scanning the 2nd to nr_pages PTEs of this folio in try_to_unmap_one(), the entire mTHP could be transitioned back to swap-backed because the reference count is incremented, which can make "ref_count == 1 + map_count" within try_to_unmap_one() false. /* * The only page refs must be one from isolation * plus the rmap(s) (dropped by discard:). */ if (ref_count == 1 + map_count && (!folio_test_dirty(folio) || ... (vma->vm_flags & VM_DROPPABLE))) { dec_mm_counter(mm, MM_ANONPAGES); goto discard; } This patchset resolves the issue by marking only genuinely dirty folios as swap-backed, as suggested by David, and transitioning to batched unmapping of entire folios in try_to_unmap_one(). Consequently, the deferred_split count drops to zero, and memory reclamation performance improves significantly — reclaiming 64KiB lazyfree large folios is now 2.5x faster(The specific data is embedded in the changelog of patch 3/4). By the way, while the patchset is primarily aimed at PTE-mapped large folios, Baolin and Lance also found that try_to_unmap_one() handles lazyfree redirtied PMD-mapped large folios inefficiently — it splits the PMD into PTEs and iterates over them. This patchset removes the unnecessary splitting, enabling us to skip redirtied PMD-mapped large folios 3.5X faster during memory reclamation. (The specific data can be found in the changelog of patch 4/4). This patch (of 4): The refcount may be temporarily or long-term increased, but this does not change the fundamental nature of the folio already being lazy- freed. Therefore, we only reset 'swapbacked' when we are certain the folio is dirty and not droppable. Link: https://lkml.kernel.org/r/20250214093015.51024-1-21cnbao@gmail.com Link: https://lkml.kernel.org/r/20250214093015.51024-2-21cnbao@gmail.com Fixes: 6c8e2a25 ("mm: fix race between MADV_FREE reclaim and blkdev direct IO read") Change-Id: Ifb1d1851924ad6264caab7e0178b0f910f4b62a1 Signed-off-by:
Barry Song <v-songbaohua@oppo.com> Suggested-by:
David Hildenbrand <david@redhat.com> Acked-by:
David Hildenbrand <david@redhat.com> Reviewed-by:
Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by:
Lance Yang <ioworker0@gmail.com> Cc: Mauricio Faria de Oliveira <mfo@canonical.com> Cc: Chis Li <chrisl@kernel.org> (Google) Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Kairui Song <kasong@tencent.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Tangquan Zheng <zhengtangquan@oppo.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Gavin Shan <gshan@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Shaoqin Huang <shahuang@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Yosry Ahmed <yosryahmed@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Bug: 313807618 (cherry picked from commit 2e595b90 https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable) [ Fix conflicts in try_to_unmap_one() and drop changes for VM_DROPPABLE - Kalesh Singh ] Signed-off-by:
Kalesh Singh <kaleshsingh@google.com>
-
Andrew Morton authored
Fix used-uninitialized of `page'. Fixes: dce7d10b ("mm/madvise: optimize lazyfreeing with mTHP in madvise_free") Reported-by:
kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202406260514.SLhNM9kQ-lkp@intel.com Cc: Lance Yang <ioworker0@gmail.com> Change-Id: I35d79caf4fc6b2cabdcc435b6fa259681d3ee10f Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit d40f74ab) Bug: 313807618 Signed-off-by:
Kalesh Singh <kaleshsingh@google.com>
-
Lance Yang authored
When the user no longer requires the pages, they would use madvise(MADV_FREE) to mark the pages as lazy free. Subsequently, they typically would not re-write to that memory again. During memory reclaim, if we detect that the large folio and its PMD are both still marked as clean and there are no unexpected references (such as GUP), so we can just discard the memory lazily, improving the efficiency of memory reclamation in this case. On an Intel i5 CPU, reclaiming 1GiB of lazyfree THPs using mem_cgroup_force_empty() results in the following runtimes in seconds (shorter is better): -------------------------------------------- | Old | New | Change | -------------------------------------------- | 0.683426 | 0.049197 | -92.80% | -------------------------------------------- [ioworker0@gmail.com: minor changes per David] Link: https://lkml.kernel.org/r/20240622100057.3352-1-ioworker0@gmail.com Link: https://lkml.kernel.org/r/20240614015138.31461-4-ioworker0@gmail.com Change-Id: I716b3f00627134eb58fbaa44a8cc81fd11f52f8c Signed-off-by:
Lance Yang <ioworker0@gmail.com> Suggested-by:
Zi Yan <ziy@nvidia.com> Suggested-by:
David Hildenbrand <david@redhat.com> Cc: Bang Li <libang.li@antgroup.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: Fangrui Song <maskray@google.com> Cc: Jeff Xie <xiehuan09@gmail.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: SeongJae Park <sj@kernel.org> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 735ecdfa) Bug: 313807618 Signed-off-by:
Kalesh Singh <kaleshsingh@google.com>
-
Lance Yang authored
In preparation for supporting try_to_unmap_one() to unmap PMD-mapped folios, start the pagewalk first, then call split_huge_pmd_address() to split the folio. Link: https://lkml.kernel.org/r/20240614015138.31461-3-ioworker0@gmail.com Change-Id: I43f84f3e1d528bbacb239ad61e75e7c76487bc0d Signed-off-by:
Lance Yang <ioworker0@gmail.com> Suggested-by:
David Hildenbrand <david@redhat.com> Acked-by:
David Hildenbrand <david@redhat.com> Suggested-by:
Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by:
Zi Yan <ziy@nvidia.com> Cc: Bang Li <libang.li@antgroup.com> Cc: Barry Song <baohua@kernel.org> Cc: Fangrui Song <maskray@google.com> Cc: Jeff Xie <xiehuan09@gmail.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: SeongJae Park <sj@kernel.org> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 29e847d2) Bug: 313807618 [ Fix trivial conflict in __split_huge_pmd(); due to pmd_folio() not present in 6.6. instead use the equivalent page_folio(pmd_page(pmd)) - Kalesh Singh ] Signed-off-by:
Kalesh Singh <kaleshsingh@google.com>
-
Lance Yang authored
Patch series "Reclaim lazyfree THP without splitting", v8. This series adds support for reclaiming PMD-mapped THP marked as lazyfree without needing to first split the large folio via split_huge_pmd_address(). When the user no longer requires the pages, they would use madvise(MADV_FREE) to mark the pages as lazy free. Subsequently, they typically would not re-write to that memory again. During memory reclaim, if we detect that the large folio and its PMD are both still marked as clean and there are no unexpected references(such as GUP), so we can just discard the memory lazily, improving the efficiency of memory reclamation in this case. Performance Testing =================== On an Intel i5 CPU, reclaiming 1GiB of lazyfree THPs using mem_cgroup_force_empty() results in the following runtimes in seconds (shorter is better): -------------------------------------------- | Old | New | Change | -------------------------------------------- | 0.683426 | 0.049197 | -92.80% | -------------------------------------------- This patch (of 8): Introduce the labels walk_done and walk_abort as exit points to eliminate duplicated exit code in the pagewalk loop. Link: https://lkml.kernel.org/r/20240614015138.31461-1-ioworker0@gmail.com Link: https://lkml.kernel.org/r/20240614015138.31461-2-ioworker0@gmail.com Change-Id: I82a14587672d6cbfb977795ffb6c5b06841adcd9 Signed-off-by:
Lance Yang <ioworker0@gmail.com> Reviewed-by:
Zi Yan <ziy@nvidia.com> Reviewed-by:
Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by:
David Hildenbrand <david@redhat.com> Reviewed-by:
Barry Song <baohua@kernel.org> Cc: Bang Li <libang.li@antgroup.com> Cc: Fangrui Song <maskray@google.com> Cc: Jeff Xie <xiehuan09@gmail.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: SeongJae Park <sj@kernel.org> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 26d21b18) Bug: 313807618 Signed-off-by:
Kalesh Singh <kaleshsingh@google.com>
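The label structure introduced in try_to_unmap_one()'s pagewalk loop looks roughly like the sketch below (the conditions are placeholders): failure paths jump to walk_abort, early-success paths to walk_done, so the page_vma_mapped_walk_done()/break sequence appears only once.

    while (page_vma_mapped_walk(&pvmw)) {
            /* ... per-PTE unmap work ... */

            if (unmap_failed)               /* placeholder condition */
                    goto walk_abort;
            if (nothing_left_to_do)         /* placeholder condition */
                    goto walk_done;
            continue;

    walk_abort:
            ret = false;
    walk_done:
            page_vma_mapped_walk_done(&pvmw);
            break;
    }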
-
- Feb 27, 2025
-
-
Udipto Goswami authored
This reverts commit cf57490a. The USB_XHCI_SIDEBAND driver is currently under development in the upstream kernel. Enabling it in the Generic Kernel Image (GKI) at this stage poses potential risks. The snapshot of the driver included in android15-6.6 is an early revision and lacks several critical fixes present in the latest upstream revisions. Bug: 391779198 Change-Id: Ifc0106e3773064b0e1ec5f770f22cb6ba68c4cad Signed-off-by:
Udipto Goswami <quic_ugoswami@quicinc.com> Signed-off-by:
Srinivasarao Pathipati <quic_c_spathi@quicinc.com> Signed-off-by:
Carlos Llamas <cmllamas@google.com>
-
Carlos Llamas authored
Users of such symbols have been notified and are in agreement. 7 function symbol(s) removed 'int xhci_sideband_add_endpoint(struct xhci_sideband*, struct usb_host_endpoint*)' 'int xhci_sideband_create_interrupter(struct xhci_sideband*, int, int, bool)' 'int xhci_sideband_enable_interrupt(struct xhci_sideband*, u32)' 'struct xhci_sideband* xhci_sideband_register(struct usb_device*)' 'int xhci_sideband_remove_endpoint(struct xhci_sideband*, struct usb_host_endpoint*)' 'void xhci_sideband_remove_interrupter(struct xhci_sideband*)' 'void xhci_sideband_unregister(struct xhci_sideband*)' Bug: 394470945 Change-Id: Ie553e3ccf96def4f2e9f3deffbf498296b082325 Signed-off-by:
Carlos Llamas <cmllamas@google.com>
-
Srinivasarao Pathipati authored
This reverts commit 7c12a8c0. Reason for revert: CONFIG_USB_XHCI_SIDEBAND is being disabled in Gerrit change https://r.android.com/3464443, so revert the symbol change as well. Bug: 391779198 Change-Id: I29eeee78d8e5a8495032b587d4268766d24bebe8 Signed-off-by:
Srinivasarao Pathipati <quic_c_spathi@quicinc.com> Signed-off-by:
Carlos Llamas <cmllamas@google.com>
-
JaeHun Jung authored
This reverts commit 812c7d0e. Reason for revert: CONFIG_USB_XHCI_SIDEBAND is being disabled in change https://r.android.com/3464443, so revert the symbol change as well. Bug: 394470945 Change-Id: I8ab255f00790c7a4731fbd171a44d8b804019e3d Signed-off-by:
JaeHun Jung <jh0801.jung@samsung.com> Signed-off-by:
Carlos Llamas <cmllamas@google.com>
-
yipeng xiang authored
Whitelist the __tracepoint_android_vh_add_lazyfree_bypass symbol. 1 function symbol(s) added 'int __traceiter_android_vh_add_lazyfree_bypass(void*, struct lruvec*, struct folio*, bool*)' 1 variable symbol(s) added 'struct tracepoint __tracepoint_android_vh_add_lazyfree_bypass' Bug: 396330858 Change-Id: I89936604aa9ae5b82f7adef55df469620759b659 Signed-off-by:
yipeng xiang <yipengxiang@honor.corp-partner.google.com>
-
yipeng xiang authored
Add a vendor hook to add lazyfree folios to the LRU tail in lru_lazyfree. Bug: 396330858 Change-Id: I3d421992811fa87cf1c5c45cfe2a08e06004da83 Signed-off-by:
yipeng xiang <yipengxiang@honor.corp-partner.google.com>
-
Jaegeuk Kim authored
1. fadvise(fd1, POSIX_FADV_NOREUSE, {0,3}); 2. fadvise(fd2, POSIX_FADV_NOREUSE, {1,2}); 3. fadvise(fd3, POSIX_FADV_NOREUSE, {3,1}); 4. echo 1024 > /sys/fs/f2fs/tuning/reclaim_caches_kb This gives a way to reclaim file-backed pages by iterating all f2fs mounts until reclaiming 1MB page cache ranges, registered by #1, #2, and #3. 5. cat /sys/fs/f2fs/tuning/reclaim_caches_kb -> gives total number of registered file ranges. Bug: 390229090 Reviewed-by:
Chao Yu <chao@kernel.org> Change-Id: I58f09afe4533a1814f3646dd21f883048f891f86 Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> (cherry picked from commit a907f3a6 https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git/ dev)
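A userspace sketch of steps 1-5 above; the file path is an arbitrary example, the sysfs path and 1024 KB value come from the commit text, and the fadvise offset/length are given in bytes (the {offset,length} pairs above are read as page-sized units):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
            int fd = open("/data/example_file", O_RDONLY);          /* example path */

            /* Register a range we do not intend to reuse. */
            posix_fadvise(fd, 0, 3 * 4096, POSIX_FADV_NOREUSE);

            /* Ask f2fs to drop up to 1 MiB of such registered caches. */
            FILE *f = fopen("/sys/fs/f2fs/tuning/reclaim_caches_kb", "w");
            fprintf(f, "1024\n");
            fclose(f);

            close(fd);
            return 0;
    }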
-
Jaegeuk Kim authored
This patch records POSIX_FADV_NOREUSE ranges for users to reclaim the caches instantly off from LRU. Bug: 390229090 Reviewed-by:
Chao Yu <chao@kernel.org> Change-Id: If83a0b97df99c45464645499ed81979dacd0ac28 Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> (cherry picked from commit ef0c333c https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git/ dev)
-
Jaegeuk Kim authored
This patch adds an ioctl to give a per-file priority hint to attach REQ_PRIO. Bug: 325443469 Reviewed-by:
Chao Yu <chao@kernel.org> Change-Id: I5110e1fc295e4e1bc3a302b75b92050bfc17059b Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> (cherry picked from commit 5f95c181 https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git/ dev)
-
Jaegeuk Kim authored
In /sys/fs/f2fs/features, there's no f2fs_sb_info, so let's avoid to get the pointer. Bug: 390229090 Reviewed-by:
Chao Yu <chao@kernel.org> Change-Id: I1a3575018046187b239daa4061d05520f3c7f1b2 Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> (cherry picked from commit 21925ede https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git/ dev)
-
- Feb 26, 2025
-
-
Ram Prakash Gupta authored
Buffer allocated in sequential files are used by firmware loader, when no vendor firmware path is used, firmware uploader is modifying the memory in the redzone. ============================================================================= BUG kmalloc-4k (Tainted: G W OE ): Left Redzone overwritten ----------------------------------------------------------------------------- 0xffffff8854ae0fff-0xffffff8854ae0fff @offset=4095. First byte 0x0 instead of 0xcc Allocated in kvmalloc_node+0x194/0x2b4 age=10 cpu=2 pid=4758 __kmem_cache_alloc_node+0x2a8/0x388 __kmalloc_node+0x60/0x1e0 kvmalloc_node+0x194/0x2b4 seq_read_iter+0x8c/0x4f0 kernfs_fop_read_iter+0x70/0x1ec vfs_read+0x238/0x2d8 ksys_read+0x78/0xe8 __arm64_sys_read+0x1c/0x2c invoke_syscall+0x58/0x114 el0_svc_common+0xac/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x3c/0x74 el0t_64_sync_handler+0x68/0xbc el0t_64_sync+0x1a8/0x1ac Freed in kfree_link+0x10/0x20 age=46 cpu=1 pid=4786 __kmem_cache_free+0x268/0x358 kfree+0xa0/0x168 kfree_link+0x10/0x20 walk_component+0x90/0x128 link_path_walk+0x27c/0x3cc path_openat+0x94/0xc7c do_filp_open+0xb8/0x164 do_sys_openat2+0x84/0xf0 __arm64_sys_openat+0x70/0x9c invoke_syscall+0x58/0x114 el0_svc_common+0xac/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x3c/0x74 el0t_64_sync_handler+0x68/0xbc el0t_64_sync+0x1a8/0x1ac Redzone ffffff8854ae0ff0: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 00 redzone modified ^^ Object ffffff8854ae1000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Add check to avoid memory update in redzone when no vendor firmware path is used. Fixes: 2a46f357 ("ANDROID: firmware_loader: Add support for customer firmware paths") Bug: 395517985 Change-Id: If58a44c0c8a26f3fe58b0e37b0fcc1f0e88e28cb Signed-off-by:
Ram Prakash Gupta <quic_rampraka@quicinc.com> Signed-off-by:
Souradeep Chowdhury <quic_schowdhu@quicinc.com>
-
- Feb 25, 2025
-
-
Daniel Mentz authored
When populating a Level 1 Stream Table Descriptor, perform the required Cache Maintenance Operations for the change to become visible to the SMMU. This is only required for non-coherent SMMUs. Bug: 397554239 Fixes: 6a072b18 ("ANDROID: drivers/arm-smmu-v3-kvm: Support non-coherent SMMUs") Change-Id: I5c8391f487c14389698bdcc84e09c87c21270b95 Signed-off-by:
Daniel Mentz <danielmentz@google.com>
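The required maintenance is conceptually the sketch below: after writing the L1 stream-table descriptor, clean it to the point of coherency so a non-coherent SMMU observes the update. The variable names are illustrative and the helper follows the arm64 cacheflush API; the actual hypervisor code may use its own CMO wrapper.

    /* Publish the L1 STE descriptor, then make it visible to the SMMU. */
    WRITE_ONCE(*l1_desc, l1_desc_val);
    if (!smmu_is_coherent)
            dcache_clean_poc((unsigned long)l1_desc,
                             (unsigned long)l1_desc + sizeof(*l1_desc));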
-