  1. Mar 18, 2025
• BACKPORT: erofs: allocate more short-lived pages from reserved pool first · fdae1f4c
      Chunhai Guo authored
      
      This patch aims to allocate bvpages and short-lived compressed pages
      from the reserved pool first.
      
      After applying this patch, there are three benefits.
      
      1. It reduces the page allocation time.
       The bvpages and short-lived compressed pages account for about 4% of
      the pages allocated from the system in the multi-app launch benchmarks
      [1]. It reduces the page allocation time accordingly and lowers the
      likelihood of blockage by page allocation in low memory scenarios.
      
      2. The pages in the reserved pool will be allocated on demand.
       Currently, bvpages and short-lived compressed pages are short-lived
      pages allocated from the system, and the pages in the reserved pool all
      originate from short-lived pages. Consequently, the number of reserved
      pool pages will increase to z_erofs_rsv_nrpages over time.
       With this patch, all short-lived pages are allocated from the reserved
      pool first, so the number of reserved pool pages will only increase when
      there are not enough pages. Thus, even if z_erofs_rsv_nrpages is set to
      a large number for specific reasons, the actual number of reserved pool
      pages may remain low as per demand. In the multi-app launch benchmarks
      [1], z_erofs_rsv_nrpages is set at 256, while the number of reserved
      pool pages remains below 64.
      
      3. When erofs cache decompression is disabled
         (EROFS_ZIP_CACHE_DISABLED), all pages will *only* be allocated from
      the reserved pool for erofs. This will significantly reduce the memory
      pressure from erofs.
      
      [1] For additional details on the multi-app launch benchmarks, please
      refer to commit 0f6273ab ("erofs: add a reserved buffer pool for lz4
      decompression").
      
Signed-off-by: Chunhai Guo <guochunhai@vivo.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
      Link: https://lore.kernel.org/r/20240906121110.3701889-1-guochunhai@vivo.com
      
      
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
      
      Bug: 387202250
      Bug: 404427448
      Change-Id: Ife45adcb4c22c9d73952db1de956e1b9cda1b8c2
      (cherry picked from commit 79f504a2)
Signed-off-by: liujinbao1 <liujinbao1@xiaomi.corp-partner.google.com>
      (cherry picked from commit 6e7af99d)
  2. Mar 11, 2025
• BACKPORT: FROMGIT: cgroup/cpuset: Make cpuset hotplug processing synchronous · 7f19c751
      Kaiqian Zhu authored
      
Since commit 3a5a6d0c ("cpuset: don't nest cgroup_mutex inside
      get_online_cpus()"), cpuset hotplug was done asynchronously via a work
      function. This is to avoid recursive locking of cgroup_mutex.
      
      Since then, the cgroup locking scheme has changed quite a bit. A
      cpuset_mutex was introduced to protect cpuset specific operations.
      The cpuset_mutex is then replaced by a cpuset_rwsem. With commit
      d74b27d6 ("cgroup/cpuset: Change cpuset_rwsem and hotplug lock
      order"), cpu_hotplug_lock is acquired before cpuset_rwsem. Later on,
      cpuset_rwsem is reverted back to cpuset_mutex. All these locking changes
      allow the hotplug code to call into cpuset core directly.
      
      The following commits were also merged due to the asynchronous nature
      of cpuset hotplug processing.
      
      - commit b22afcdf ("cpu/hotplug: Cure the cpusets trainwreck")
      - commit 50e76632 ("sched/cpuset/pm: Fix cpuset vs. suspend-resume
      bugs")
      - commit 28b89b9e ("cpuset: handle race between CPU hotplug and
      cpuset_hotplug_work")
      
      Clean up all these bandages by making cpuset hotplug
      processing synchronous again with the exception that the call to
      cgroup_transfer_tasks() to transfer tasks out of an empty cgroup v1
      cpuset, if necessary, will still be done via a work function due to the
      existing cgroup_mutex -> cpu_hotplug_lock dependency. It is possible
      to reverse that dependency, but that will require updating a number of
      different cgroup controllers. This special hotplug code path should be
      rarely taken anyway.
      
      As all the cpuset states will be updated by the end of the hotplug
operation, we can revert most of the above commits except commit
50e76632 ("sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs"),
which is partially reverted. Also remove some cpus_read_lock trylock
      attempts in the cpuset partition code as they are no longer necessary
      since the cpu_hotplug_lock is now held for the whole duration of the
      cpuset hotplug code path.
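
      A minimal sketch of the resulting synchronous path and its lock
      ordering (illustrative; the function name below is a placeholder,
      not necessarily the exact helper introduced by the patch):

          /* Called directly from the CPU hotplug path, so cpu_hotplug_lock
           * is already held; cpuset_mutex nests inside it. */
          static void cpuset_handle_cpu_change(void)   /* placeholder name */
          {
                  lockdep_assert_cpus_held();
                  mutex_lock(&cpuset_mutex);
                  /* ... update effective cpus/mems of affected cpusets ... */
                  mutex_unlock(&cpuset_mutex);
                  /* Only the cgroup v1 empty-cpuset task transfer is still
                   * deferred to a workqueue, because cgroup_transfer_tasks()
                   * needs cgroup_mutex -> cpu_hotplug_lock. */
          }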
      
Signed-off-by: Waiman Long <longman@redhat.com>
Tested-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
      
      Bug: 401393559
      Bug: 402078031
(cherry picked from commit 2125c003 https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master)
      [kaiqian: Removed all the cpus_read_trylock() related functions introduced in later cpuset updates]
      Change-Id: I252e24629388a0be5746ac4cc3f475ea6767a462
Signed-off-by: Zhu Kaiqian <zhukaiqian@xiaomi.com>
  3. Mar 06, 2025
• ANDROID: GKI: Update symbol list for xiaomi · 12cad3f6
      pengzhongcui authored
      
      2 variable symbol(s) added
        'struct tracepoint __tracepoint_android_vh_tune_swappiness'
        'struct tracepoint __tracepoint_android_vh_shrink_slab_async'
      
      Bug: 399777353
      Bug: 400818928
      Bug: 401099893
      
      Change-Id: If3fb7fa00349160e5b939b53208725396237c999
Signed-off-by: pengzhongcui <pengzhongcui@xiaomi.corp-partner.google.com>
      (cherry picked from commit 8f602e19)
• ANDROID: vendor_hook: Add hook to optimize the time consumption of shrink slab · ec559a7e
      pengzhongcui authored
      
One vendor hook added:
          android_vh_do_shrink_slab_ex
      
      Add vendor hook point in do_shrink_slab to optimize for user
      experience related threads and time-consuming shrinkers.
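
      For illustration, a vendor module would typically attach to such a
      hook as sketched below; the probe arguments are an assumption (the
      real prototype is defined by this patch), only the
      register_trace_android_vh_*() registration pattern is the usual one:

          /* Hypothetical probe: skip slow shrinkers when the reclaiming
           * task is user-experience critical. Argument list is a guess. */
          static void vh_do_shrink_slab_ex(void *data, struct shrinker *shrinker,
                                           bool *skip)
          {
                  if (vendor_task_is_ux_critical(current))  /* assumed helper */
                          *skip = true;
          }

          /* from the vendor module's init */
          register_trace_android_vh_do_shrink_slab_ex(vh_do_shrink_slab_ex, NULL);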
      
      Bug: 399777353
      Bug: 400818928
      Bug: 401099893
      
      Change-Id: I63778c73f76930fe27869e33ba6cdb97d50cf543
Signed-off-by: pengzhongcui <pengzhongcui@xiaomi.corp-partner.google.com>
      (cherry picked from commit 05ab4ba8)
• ANDROID: GKI: Update symbol list for transsion · 3f0f1c71
      xiaoxiang.xiong authored
      
      74 function symbol(s) added
        'u64 __blkg_prfill_rwstat(struct seq_file*, struct blkg_policy_data*, const struct blkg_rwstat_sample*)'
        'int __percpu_counter_init_many(struct percpu_counter*, s64, gfp_t, u32, struct lock_class_key*)'
        's64 __percpu_counter_sum(struct percpu_counter*)'
        'int _atomic_dec_and_lock_irqsave(atomic_t*, spinlock_t*, unsigned long*)'
        'void add_disk_randomness(struct gendisk*)'
        'ssize_t badblocks_show(struct badblocks*, char*, int)'
        'void bdev_end_io_acct(struct block_device*, enum req_op, unsigned int, unsigned long)'
        'unsigned long bdev_start_io_acct(struct block_device*, enum req_op, unsigned long)'
        'const char* bdi_dev_name(struct backing_dev_info*)'
        'void bio_associate_blkg_from_css(struct bio*, struct cgroup_subsys_state*)'
        'struct bio* bio_split(struct bio*, int, gfp_t, struct bio_set*)'
        'void bio_uninit(struct bio*)'
        'struct gendisk* blk_mq_alloc_disk_for_queue(struct request_queue*, struct lock_class_key*)'
        'void blk_queue_required_elevator_features(struct request_queue*, unsigned int)'
        'void blkcg_print_blkgs(struct seq_file*, struct blkcg*, u64(*)(struct seq_file*, struct blkg_policy_data*, int), const struct blkcg_policy*, int, bool)'
        'int blkg_conf_prep(struct blkcg*, const struct blkcg_policy*, struct blkg_conf_ctx*)'
        'u64 blkg_prfill_rwstat(struct seq_file*, struct blkg_policy_data*, int)'
        'void blkg_rwstat_exit(struct blkg_rwstat*)'
        'int blkg_rwstat_init(struct blkg_rwstat*, gfp_t)'
        'void blkg_rwstat_recursive_sum(struct blkcg_gq*, struct blkcg_policy*, int, struct blkg_rwstat_sample*)'
        'enum scsi_pr_type block_pr_type_to_scsi(enum pr_type)'
        'int block_read_full_folio(struct folio*, get_block_t*)'
        'struct bsg_device* bsg_register_queue(struct request_queue*, struct device*, const char*, bsg_sg_io_fn*)'
        'void bsg_unregister_queue(struct bsg_device*)'
        'void call_rcu_hurry(struct callback_head*, rcu_callback_t)'
        'unsigned long clock_t_to_jiffies(unsigned long)'
        'int devcgroup_check_permission(short, u32, u32, short)'
        'bool disk_check_media_change(struct gendisk*)'
        'struct device_driver* driver_find(const char*, const struct bus_type*)'
        'blk_status_t errno_to_blk_status(int)'
        'bool folio_mark_dirty(struct folio*)'
        'struct cpumask* group_cpus_evenly(unsigned int)'
        'struct io_cq* ioc_find_get_icq(struct request_queue*)'
        'struct io_cq* ioc_lookup_icq(struct request_queue*)'
        'void* kmem_cache_alloc_node(struct kmem_cache*, gfp_t, int)'
        'void* mempool_alloc_pages(gfp_t, void*)'
        'void mempool_free_pages(void*, void*)'
        'unsigned int mmc_calc_max_discard(struct mmc_card*)'
        'int mmc_card_alternative_gpt_sector(struct mmc_card*, sector_t*)'
        'int mmc_cqe_recovery(struct mmc_host*)'
        'int mmc_cqe_start_req(struct mmc_host*, struct mmc_request*)'
        'void mmc_crypto_prepare_req(struct mmc_queue_req*)'
        'int mmc_detect_card_removed(struct mmc_host*)'
        'int mmc_erase(struct mmc_card*, unsigned int, unsigned int, unsigned int)'
        'int mmc_poll_for_busy(struct mmc_card*, unsigned int, bool, enum mmc_busy_cmd)'
        'int mmc_register_driver(struct mmc_driver*)'
        'void mmc_retune_pause(struct mmc_host*)'
        'void mmc_retune_unpause(struct mmc_host*)'
        'void mmc_run_bkops(struct mmc_card*)'
        'int mmc_sanitize(struct mmc_card*, unsigned int)'
        'int mmc_start_request(struct mmc_host*, struct mmc_request*)'
        'void mmc_unregister_driver(struct mmc_driver*)'
        'void percpu_counter_destroy_many(struct percpu_counter*, u32)'
        'bool percpu_ref_is_zero(struct percpu_ref*)'
        'void percpu_ref_kill_and_confirm(struct percpu_ref*, percpu_ref_func_t*)'
        'void percpu_ref_resurrect(struct percpu_ref*)'
        'void percpu_ref_switch_to_atomic_sync(struct percpu_ref*)'
        'void percpu_ref_switch_to_percpu(struct percpu_ref*)'
        'void put_io_context(struct io_context*)'
        'int radix_tree_preload(gfp_t)'
        'struct folio* read_cache_folio(struct address_space*, unsigned long, filler_t*, struct file*)'
        'enum scsi_disposition scsi_check_sense(struct scsi_cmnd*)'
        'void scsi_eh_finish_cmd(struct scsi_cmnd*, struct list_head*)'
        'enum pr_type scsi_pr_type_to_block(enum scsi_pr_type)'
        'int scsi_rescan_device(struct scsi_device*)'
        'const u8* scsi_sense_desc_find(const u8*, int, int)'
        'void sdev_evt_send_simple(struct scsi_device*, enum scsi_device_event, gfp_t)'
        'int thaw_super(struct super_block*, enum freeze_holder)'
        'void trace_seq_puts(struct trace_seq*, const char*)'
        'int transport_add_device(struct device*)'
        'void transport_configure_device(struct device*)'
        'void transport_destroy_device(struct device*)'
        'void transport_remove_device(struct device*)'
        'void transport_setup_device(struct device*)'
      
      2 variable symbol(s) added
        'struct cgroup_subsys io_cgrp_subsys'
        'struct static_key_true io_cgrp_subsys_on_dfl_key'
      
      Bug: 400475995
      Bug: 401190798
      Change-Id: I959e7f45641df674096da689089096bd14e4ed65
Signed-off-by: xiaoxiang.xiong <xiaoxiang.xiong@transsion.com>
      (cherry picked from commit ca0752ee)
  4. Mar 05, 2025
• ANDROID: GKI: update symbol list file for xiaomi · 89f91459
      weipengliang authored
      
      38 function symbol(s) added
        'unsigned long __alloc_pages_bulk(gfp_t, int, nodemask_t*, int, struct list_head*, struct page**)'
        'int __hwspin_trylock(struct hwspinlock*, int, unsigned long*)'
        'int __traceiter_android_vh_freq_table_limits(void*, struct cpufreq_policy*, unsigned int, unsigned int)'
        'int __traceiter_cma_alloc_busy_retry(void*, const char*, unsigned long, const struct page*, unsigned long, unsigned int)'
        'int __traceiter_cma_alloc_finish(void*, const char*, unsigned long, const struct page*, unsigned long, unsigned int, int)'
        'int __traceiter_cma_alloc_start(void*, const char*, unsigned long, unsigned int)'
        'int __traceiter_cma_release(void*, const char*, unsigned long, const struct page*, unsigned long)'
        'void arch_wb_cache_pmem(void*, size_t)'
        'int blk_crypto_init_key(struct blk_crypto_key*, const u8*, size_t, enum blk_crypto_key_type, enum blk_crypto_mode_num, unsigned int, unsigned int)'
        'int blk_crypto_start_using_key(struct block_device*, const struct blk_crypto_key*)'
        'unsigned int cpumask_any_distribute(const struct cpumask*)'
        'void devfreq_get_freq_range(struct devfreq*, unsigned long*, unsigned long*)'
        'int device_property_read_u64_array(const struct device*, const char*, u64*, size_t)'
        'int devm_rproc_add(struct device*, struct rproc*)'
        'struct rproc* devm_rproc_alloc(struct device*, const char*, const struct rproc_ops*, const char*, int)'
        'int dma_fence_signal_timestamp(struct dma_fence*, ktime_t)'
        'int dw_pcie_link_up(struct dw_pcie*)'
        'int fwnode_irq_get(const struct fwnode_handle*, unsigned int)'
        'int gether_get_host_addr_cdc(struct net_device*, char*, int)'
        'int gpio_request_array(const struct gpio*, size_t)'
        'int hwspin_lock_get_id(struct hwspinlock*)'
        'struct device_node* of_get_next_cpu_node(struct device_node*)'
        'const char* pci_speed_string(enum pci_bus_speed)'
        'int pcie_capability_write_word(struct pci_dev*, int, u16)'
        'struct pinctrl_gpio_range* pinctrl_find_gpio_range_from_pin(struct pinctrl_dev*, unsigned int)'
        'int probe_irq_off(unsigned long)'
        'unsigned long probe_irq_on()'
        'int proc_do_large_bitmap(struct ctl_table*, int, void*, size_t*, loff_t*)'
        'struct pwm_device* pwm_request_from_chip(struct pwm_chip*, unsigned int, const char*)'
        'struct sys_off_handler* register_sys_off_handler(enum sys_off_mode, int, int(*)(struct sys_off_data*), void*)'
        'int rproc_detach(struct rproc*)'
        'void* snd_usb_find_csint_desc(void*, int, void*, u8)'
        'const struct audioformat* snd_usb_find_format(struct list_head*, snd_pcm_format_t, unsigned int, unsigned int, bool, struct snd_usb_substream*)'
        'depot_stack_handle_t stack_depot_save(unsigned long*, unsigned int, gfp_t)'
        'void tcp_get_info(struct sock*, struct tcp_info*)'
        'void uart_xchar_out(struct uart_port*, int)'
        'int usb_pipe_type_check(struct usb_device*, unsigned int)'
        'const char* usb_state_string(enum usb_device_state)'
      
      5 variable symbol(s) added
        'struct tracepoint __tracepoint_android_vh_freq_table_limits'
        'struct tracepoint __tracepoint_cma_alloc_busy_retry'
        'struct tracepoint __tracepoint_cma_alloc_finish'
        'struct tracepoint __tracepoint_cma_alloc_start'
        'struct tracepoint __tracepoint_cma_release'
      
      Bug: 395131250
      Bug: 400566736
      
      Change-Id: Idab764db85e4711cbcf544ef4268a3e8b7d6dd41
Signed-off-by: weipengliang <weipengliang@xiaomi.com>
      (cherry picked from commit fcbb7926)
  5. Mar 04, 2025
  6. Mar 03, 2025
  7. Mar 01, 2025
• UPSTREAM: f2fs: modify f2fs_is_checkpoint_ready logic to allow more data to be written with the CP disable · f44dcf72
  Qi Han authored
      
      When the free segment is used up during CP disable, many write or
      ioctl operations will get ENOSPC error codes, even if there are
      still many blocks available. We can reproduce it in the following
      steps:
      
      dd if=/dev/zero of=f2fs.img bs=1M count=65
      mkfs.f2fs -f f2fs.img
      mount f2fs.img f2fs_dir -o checkpoint=disable:10%
      cd f2fs_dir
      i=1 ; while [[ $i -lt 50 ]] ; do (file_name=./2M_file$i ; dd \
      if=/dev/random of=$file_name bs=1M count=2); i=$((i+1)); done
      sync
      i=1 ; while [[ $i -lt 50 ]] ; do (file_name=./2M_file$i ; truncate \
      -s 1K $file_name); i=$((i+1)); done
      sync
      dd if=/dev/zero of=./file bs=1M count=20
      
In f2fs_need_SSR(), SSR allocation is allowed when CP is disabled,
so in f2fs_is_checkpoint_ready() we can also check the number of
invalid blocks when there are not enough free segments, and return
ENOSPC only if the number of invalid blocks is insufficient as well.
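
      Roughly, the intended check looks like the sketch below (helper
      names are placeholders for illustration, not the exact f2fs
      functions):

          /* f2fs_is_checkpoint_ready() sketch: with checkpoint disabled,
           * only fail when neither free segments nor SSR-reclaimable
           * (invalid) blocks are available. */
          if (unlikely(is_sbi_flag_set(sbi, SBI_CP_DISABLED))) {
                  if (!has_enough_free_segments(sbi) &&    /* assumed helper */
                      !has_enough_invalid_blocks(sbi))     /* assumed helper */
                          return -ENOSPC;
          }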
      
      Change-Id: I96cec738b6b4da05c76132e7b6c71ff9c4c63daf
Signed-off-by: Qi Han <hanqi@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      (cherry picked from commit 84b5bb8b)
      (cherry picked from commit c3fe4328)
      Bug: 399286786
  8. Feb 28, 2025
• ANDROID: GKI: update symbol list for xiaomi · 49901fed
      Fei authored
      
      3 function symbol(s) added
        'int __traceiter_android_vh_free_mod_mem(void*, const struct module*)'
        'int __traceiter_android_vh_set_mod_perm_after_init(void*, const struct module*)'
        'int __traceiter_android_vh_set_mod_perm_before_init(void*, const struct module*)'
      
      3 variable symbol(s) added
        'struct tracepoint __tracepoint_android_vh_free_mod_mem'
        'struct tracepoint __tracepoint_android_vh_set_mod_perm_after_init'
        'struct tracepoint __tracepoint_android_vh_set_mod_perm_before_init'
      
      Bug: 373794466
      Bug: 399785745
      Change-Id: I9e76336db92e7b2b8ae2894ee92e45580e7e650d
Signed-off-by: Fei <xuefei7@xiaomi.com>
• ANDROID: module: Add vendor hooks · 6bc6743e
      Fei authored
      
Add vendor hooks for module init, so we can get memory type and address
info, then use this info to set the corresponding memory access
attribute in the EL2 stage-2 page table. This gives enhanced security
protection, as long as the stage-2 page table is not corrupted.
For releasing modules, the corresponding page table attributes should be
destroyed and restored.
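
      As a sketch, a vendor module could attach to these hooks using the
      probe signature implied by the __traceiter_* prototypes in the
      companion symbol-list change (void * plus const struct module *);
      the handler body and module name below are illustrative only:

          /* Illustrative stub: record the module's section layout so the
           * EL2 hypervisor can set matching stage-2 permissions. */
          static void vh_set_mod_perm_before_init(void *data,
                                                  const struct module *mod)
          {
                  /* ... pass the module's memory ranges to the EL2 side ... */
          }

          static int __init vendor_mod_prot_init(void)  /* hypothetical module */
          {
                  return register_trace_android_vh_set_mod_perm_before_init(
                                  vh_set_mod_perm_before_init, NULL);
          }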
      
      Bug: 373794466
      Bug: 399785745
      Change-Id: Ieccb3bdd1041dfe41a9c808a91cc19f04389e826
Signed-off-by: xuefei7 <xuefei7@xiaomi.com>
• BACKPORT: mm/thp: fix deferred split unqueue naming and locking · b346d6be
      Hugh Dickins authored
      Recent changes are putting more pressure on THP deferred split queues:
      under load revealing long-standing races, causing list_del corruptions,
      "Bad page state"s and worse (I keep BUGs in both of those, so usually
      don't get to see how badly they end up without).  The relevant recent
      changes being 6.8's mTHP, 6.10's mTHP swapout, and 6.12's mTHP swapin,
      improved swap allocation, and underused THP splitting.
      
      Before fixing locking: rename misleading folio_undo_large_rmappable(),
      which does not undo large_rmappable, to folio_unqueue_deferred_split(),
      which is what it does.  But that and its out-of-line __callee are mm
      internals of very limited usability: add comment and WARN_ON_ONCEs to
      check usage; and return a bool to say if a deferred split was unqueued,
      which can then be used in WARN_ON_ONCEs around safety checks (sparing
      callers the arcane conditionals in __folio_unqueue_deferred_split()).
      
      Just omit the folio_unqueue_deferred_split() from free_unref_folios(), all
      of whose callers now call it beforehand (and if any forget then bad_page()
      will tell) - except for its caller put_pages_list(), which itself no
      longer has any callers (and will be deleted separately).
      
      Swapout: mem_cgroup_swapout() has been resetting folio->memcg_data 0
      without checking and unqueueing a THP folio from deferred split list;
      which is unfortunate, since the split_queue_lock depends on the memcg
      (when memcg is enabled); so swapout has been unqueueing such THPs later,
      when freeing the folio, using the pgdat's lock instead: potentially
      corrupting the memcg's list.  __remove_mapping() has frozen refcount to 0
      here, so no problem with calling folio_unqueue_deferred_split() before
      resetting memcg_data.
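
      In code terms, the swapout fix amounts to the following ordering in
      mem_cgroup_swapout() (a sketch based on the description above;
      surrounding details omitted):

          /* Unqueue while folio->memcg_data still identifies the memcg
           * (and hence the correct split_queue_lock); refcount is frozen
           * to 0 here, so the folio cannot be requeued concurrently. */
          folio_unqueue_deferred_split(folio);
          folio->memcg_data = 0;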
      
      That goes back to 5.4 commit 87eaceb3 ("mm: thp: make deferred split
      shrinker memcg aware"): which included a check on swapcache before adding
      to deferred queue, but no check on deferred queue before adding THP to
      swapcache.  That worked fine with the usual sequence of events in reclaim
      (though there were a couple of rare ways in which a THP on deferred queue
      could have been swapped out), but 6.12 commit dafff3f4 ("mm: split
      underused THPs") avoids splitting underused THPs in reclaim, which makes
      swapcache THPs on deferred queue commonplace.
      
      Keep the check on swapcache before adding to deferred queue?  Yes: it is
      no longer essential, but preserves the existing behaviour, and is likely
      to be a worthwhile optimization (vmstat showed much more traffic on the
      queue under swapping load if the check was removed); update its comment.
      
      Memcg-v1 move (deprecated): mem_cgroup_move_account() has been changing
      folio->memcg_data without checking and unqueueing a THP folio from the
      deferred list, sometimes corrupting "from" memcg's list, like swapout.
      Refcount is non-zero here, so folio_unqueue_deferred_split() can only be
      used in a WARN_ON_ONCE to validate the fix, which must be done earlier:
mem_cgroup_move_charge_pte_range() first tries to split the THP (splitting
of course unqueues), or skips it if that fails.  Not ideal, but moving
      charge has been requested, and khugepaged should repair the THP later:
      nobody wants new custom unqueueing code just for this deprecated case.
      
      The 87eaceb3 commit did have the code to move from one deferred list
      to another (but was not conscious of its unsafety while refcount non-0);
      but that was removed by 5.6 commit fac0516b ("mm: thp: don't need care
      deferred split queue in memcg charge move path"), which argued that the
      existence of a PMD mapping guarantees that the THP cannot be on a deferred
      list.  As above, false in rare cases, and now commonly false.
      
      Backport to 6.11 should be straightforward.  Earlier backports must take
      care that other _deferred_list fixes and dependencies are included.  There
      is not a strong case for backports, but they can fix cornercases.
      
      Link: https://lkml.kernel.org/r/8dc111ae-f6db-2da7-b25c-7a20b1effe3b@google.com
      
      
      Fixes: 87eaceb3 ("mm: thp: make deferred split shrinker memcg aware")
      Fixes: dafff3f4 ("mm: split underused THPs")
      Change-Id: I86d7fcd68ca35171b679c76ad2a1e21584417fc6
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
      Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
      Cc: Barry Song <baohua@kernel.org>
      Cc: Chris Li <chrisl@kernel.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Nhat Pham <nphamcs@gmail.com>
      Cc: Ryan Roberts <ryan.roberts@arm.com>
      Cc: Shakeel Butt <shakeel.butt@linux.dev>
      Cc: Usama Arif <usamaarif642@gmail.com>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Zi Yan <ziy@nvidia.com>
      Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      (cherry picked from commit f8f931bb)
      Bug: 378967818
      Bug: 399794577
      [ Fix conflict in mem_cgroup_move_account() in file memcontrol-v1.c
      and trivial conflict with renaming of function folio_unqueue_deferred_split()
      - yan Chang ]
Signed-off-by: yan Chang <changyan1@xiaomi.com>
• UPSTREAM: mm: refactor folio_undo_large_rmappable() · f7e5a83d
      Kefeng Wang authored
      commit 593a10da upstream.
      
Folios of order <= 1 are not on the deferred list; the order check was
added to folio_undo_large_rmappable() by commit 8897277a ("mm: support
order-1 folios in the page cache"), but the small-folio (order 0) check
is repeated on each call of folio_undo_large_rmappable(), so keep only
the folio_order() check inside the function.
      
      In addition, move all the checks into header file to save a function call
      for non-large-rmappable or empty deferred_list folio.
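
      The resulting header-level wrapper looks roughly like this (a sketch
      of the shape described above; the out-of-line
      __folio_undo_large_rmappable() keeps the actual queue manipulation):

          /* mm/internal.h sketch: cheap checks inline, real work out of line */
          static inline void folio_undo_large_rmappable(struct folio *folio)
          {
                  if (folio_order(folio) <= 1 ||
                      !folio_test_large_rmappable(folio))
                          return;
                  /* folios never queued can skip the call entirely */
                  if (data_race(list_empty(&folio->_deferred_list)))
                          return;
                  __folio_undo_large_rmappable(folio);
          }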
      
      Link: https://lkml.kernel.org/r/20240521130315.46072-1-wangkefeng.wang@huawei.com
      
      
      Change-Id: I1d9811de36061b7df2cab9589e6bb5d6237d73fd
      Bug: 399794577
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Lance Yang <ioworker0@gmail.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      [ Upstream commit itself does not apply cleanly, because there
        are fewer calls to folio_undo_large_rmappable() in this tree. ]
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      (cherry picked from commit eb6b6d3e)
Signed-off-by: yan Chang <changyan1@xiaomi.com>
• UPSTREAM: mm: always initialise folio->_deferred_list · 9d0c1e5b
      Matthew Wilcox (Oracle) authored
      commit b7b098cf upstream.
      
      Patch series "Various significant MM patches".
      
      These patches all interact in annoying ways which make it tricky to send
      them out in any way other than a big batch, even though there's not really
      an overarching theme to connect them.
      
      The big effects of this patch series are:
      
       - folio_test_hugetlb() becomes reliable, even when called without a
         page reference
       - We free up PG_slab, and we could always use more page flags
       - We no longer need to check PageSlab before calling page_mapcount()
      
      This patch (of 9):
      
      For compound pages which are at least order-2 (and hence have a
      deferred_list), initialise it and then we can check at free that the page
      is not part of a deferred list.  We recently found this useful to rule out
      a source of corruption.
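
      Conceptually the change boils down to the sketch below (the exact
      call sites in the prep and free paths differ):

          /* prep_compound_head() sketch: order >= 2 folios carry a
           * _deferred_list, so initialise it up front */
          if (order > 1)
                  INIT_LIST_HEAD(&folio->_deferred_list);

          /* free-path sketch: a folio still queued for deferred split
           * at free time indicates corruption */
          if (folio_order(folio) > 1 && !list_empty(&folio->_deferred_list))
                  bad_page(page, "still on deferred list");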
      
      [peterx@redhat.com: always initialise folio->_deferred_list]
        Link: https://lkml.kernel.org/r/20240417211836.2742593-2-peterx@redhat.com
      Link: https://lkml.kernel.org/r/20240321142448.1645400-1-willy@infradead.org
      Link: https://lkml.kernel.org/r/20240321142448.1645400-2-willy@infradead.org
      
      
      Change-Id: Ib1a3574f8ff6b19f24af4704e9dde290c26bfc55
      Bug: 399794577
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Miaohe Lin <linmiaohe@huawei.com>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      [ Include three small changes from the upstream commit, for backport safety:
        replace list_del() by list_del_init() in split_huge_page_to_list(),
        like c010d47f ("mm: thp: split huge page to any lower order pages");
        replace list_del() by list_del_init() in folio_undo_large_rmappable(), like
        9bcef597 ("mm: memcg: fix split queue list crash when large folio migration");
        keep __free_pages() instead of folio_put() in __update_and_free_hugetlb_folio(). ]
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      (cherry picked from commit 0275e402)
      [ Fix conflict in split_huge_page_to_list()
      to ignore the function folio_ref_freeze() - yan Chang ]
Signed-off-by: yan Chang <changyan1@xiaomi.com>
• UPSTREAM: mm: support order-1 folios in the page cache · 2b700d75
      Matthew Wilcox (Oracle) authored
      commit 8897277a upstream.
      
      Folios of order 1 have no space to store the deferred list.  This is not a
      problem for the page cache as file-backed folios are never placed on the
      deferred list.  All we need to do is prevent the core MM from touching the
      deferred list for order 1 folios and remove the code which prevented us
      from allocating order 1 folios.
      
      Link: https://lore.kernel.org/linux-mm/90344ea7-4eec-47ee-5996-0c22f42d6a6a@google.com/
      Link: https://lkml.kernel.org/r/20240226205534.1603748-3-zi.yan@sent.com
      
      
      Bug: 399794577
      Change-Id: Ibaabee8a7dfd37adb407ee8e3861d301156f7aa5
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Zi Yan <ziy@nvidia.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Michal Koutny <mkoutny@suse.com>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Ryan Roberts <ryan.roberts@arm.com>
      Cc: Yang Shi <shy828301@gmail.com>
      Cc: Yu Zhao <yuzhao@google.com>
      Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      (cherry picked from commit e8769509)
      [ Fix conflict in split_huge_page_to_list()
      to ignore the function folio_ref_freeze(), and
      delete the filter of order1 in function page_cache_ra_order()
      - yan Chang ]
Signed-off-by: yan Chang <changyan1@xiaomi.com>
• UPSTREAM: mm/readahead: do not allow order-1 folio · 74b7aab7
      Ryan Roberts authored
      commit ec056cef upstream.
      
      The THP machinery does not support order-1 folios because it requires meta
      data spanning the first 3 `struct page`s.  So order-2 is the smallest
      large folio that we can safely create.
      
      There was a theoretical bug whereby if ra->size was 2 or 3 pages (due to
      the device-specific bdi->ra_pages being set that way), we could end up
      with order = 1.  Fix this by unconditionally checking if the preferred
      order is 1 and if so, set it to 0.  Previously this was done in a few
      specific places, but with this refactoring it is done just once,
      unconditionally, at the end of the calculation.
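
      The fix itself is a single clamp at the end of the order calculation
      in page_cache_ra_order(), roughly (variable name per this sketch):

          /* order-1 folios are not supported by the THP machinery */
          if (new_order == 1)
                  new_order = 0;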
      
      This is a theoretical bug found during review of the code; I have no
      evidence to suggest this manifests in the real world (I expect all
      device-specific ra_pages values are much bigger than 3).
      Bug: 399794577
      Link: https://lkml.kernel.org/r/20231201161045.3962614-1-ryan.roberts@arm.com
      
      
      Change-Id: I5b024e995d3b85954cfb35d7df1c2fdcc9be9e16
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      (cherry picked from commit 2ad2067e)
Signed-off-by: yan Chang <changyan1@xiaomi.com>
• UPSTREAM: mm: add page_rmappable_folio() wrapper · 1c515437
      Hugh Dickins authored
      commit 23e48832 upstream.
      
      folio_prep_large_rmappable() is being used repeatedly along with a
      conversion from page to folio, a check non-NULL, a check order > 1: wrap
      it all up into struct folio *page_rmappable_folio(struct page *).
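
      The wrapper is roughly (a sketch following the description above):

          static inline struct folio *page_rmappable_folio(struct page *page)
          {
                  struct folio *folio = (struct folio *)page;

                  if (folio && folio_order(folio) > 1)
                          folio_prep_large_rmappable(folio);
                  return folio;
          }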
      
      Link: https://lkml.kernel.org/r/8d92c6cf-eebe-748-e29c-c8ab224c741@google.com
      
      
      Change-Id: Ide07d3577fc7ab6ee3ec8c0680dacfc5d22822c8
Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Nhat Pham <nphamcs@gmail.com>
      Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Tejun heo <tj@kernel.org>
      Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
      Cc: Yang Shi <shy828301@gmail.com>
      Cc: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      (cherry picked from commit bc899023)
Signed-off-by: yan Chang <changyan1@xiaomi.com>
      Bug: 399794577
  9. Feb 27, 2025
  10. Feb 25, 2025
  11. Feb 21, 2025