  1. Nov 27, 2023
    • mm/mempool: replace kmap_atomic() with kmap_local_page() · 564a4e83
      Fabio M. De Francesco authored
      kmap_atomic() has been deprecated in favor of kmap_local_page().
      
      Therefore, replace kmap_atomic() with kmap_local_page().
      
      kmap_atomic() is implemented like a kmap_local_page() which also disables
      page-faults and preemption (the latter only in !PREEMPT_RT kernels).  The
      kernel virtual addresses returned by these two APIs are only valid in the
      context of the callers (i.e., they cannot be handed to other threads).
      
      With kmap_local_page() the mappings are per thread and CPU local like in
      kmap_atomic(); however, they can handle page-faults and can be called from
      any context (including interrupts).  The tasks that call kmap_local_page()
      can be preempted and, when they are scheduled to run again, the kernel
      virtual addresses are restored and are still valid.
      
      The code blocks between the mappings and un-mappings don't rely on the
      above-mentioned side effects of kmap_atomic(), so a mere replacement
      of the old API with the new one is all that they require (i.e., there is
      no need to explicitly call pagefault_disable() and/or preempt_disable()).
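
      For illustration, a minimal sketch of the conversion pattern (not the
      exact mm/mempool.c hunk) looks like this:

        void *addr;

        /* Before: kmap_atomic() also disables page-faults and preemption. */
        addr = kmap_atomic(page);
        memset(addr, 0, PAGE_SIZE);
        kunmap_atomic(addr);

        /* After: a per-thread, CPU-local mapping; the task may fault and be
         * preempted between the map and the unmap. */
        addr = kmap_local_page(page);
        memset(addr, 0, PAGE_SIZE);
        kunmap_local(addr);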
      
      Link: https://lkml.kernel.org/r/20231120142640.7077-1-fabio.maria.de.francesco@linux.intel.com
      
      
      Signed-off-by: Fabio M. De Francesco <fabio.maria.de.francesco@linux.intel.com>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      564a4e83
  2. Nov 24, 2023
  3. Nov 30, 2022
  4. Jun 17, 2022
  5. Oct 18, 2021
  6. Jun 04, 2021
  7. May 05, 2021
  8. Apr 30, 2021
  9. Feb 24, 2021
  10. Dec 22, 2020
  11. Oct 14, 2020
  12. Aug 15, 2020
    • mm/mempool: fix a data race in mempool_free() · abe1de42
      Qian Cai authored
      
      mempool_t pool.curr_nr could be accessed concurrently as noticed by
      KCSAN,
      
       BUG: KCSAN: data-race in mempool_free / remove_element
      
       write to 0xffffffffa937638c of 4 bytes by task 6359 on cpu 113:
        remove_element+0x4a/0x1c0
        remove_element at mm/mempool.c:132
        mempool_alloc+0x102/0x210
        (inlined by) mempool_alloc at mm/mempool.c:399
        bio_alloc_bioset+0x106/0x2c0
        get_swap_bio+0x49/0x230
        __swap_writepage+0x680/0xc30
        swap_writepage+0x9c/0xf0
        pageout+0x33e/0xae0
        shrink_page_list+0x1f57/0x2870
        shrink_inactive_list+0x316/0x880
        shrink_lruvec+0x8dc/0x1380
        shrink_node+0x317/0xd80
        do_try_to_free_pages+0x1f7/0xa10
        try_to_free_pages+0x26c/0x5e0
        __alloc_pages_slowpath+0x458/0x1290
        <snip>
      
       read to 0xffffffffa937638c of 4 bytes by interrupt on cpu 64:
        mempool_free+0x3e/0x150
        mempool_free at mm/mempool.c:492
        bio_free+0x192/0x280
        bio_put+0x91/0xd0
        end_swap_bio_write+0x1d8/0x280
        bio_endio+0x2c2/0x5b0
        dec_pending+0x22b/0x440 [dm_mod]
        clone_endio+0xe4/0x2c0 [dm_mod]
        bio_endio+0x2c2/0x5b0
        blk_update_request+0x217/0x940
        scsi_end_request+0x6b/0x4d0
        scsi_io_completion+0xb7/0x7e0
        scsi_finish_command+0x223/0x310
        scsi_softirq_done+0x1d5/0x210
        blk_mq_complete_request+0x224/0x250
        scsi_mq_done+0xc2/0x250
        pqi_raid_io_complete+0x5a/0x70 [smartpqi]
        pqi_irq_handler+0x150/0x1410 [smartpqi]
        __handle_irq_event_percpu+0x90/0x540
        handle_irq_event_percpu+0x49/0xd0
        handle_irq_event+0x85/0xca
        handle_edge_irq+0x13f/0x3e0
        do_IRQ+0x86/0x190
        <snip>
      
      The write is done under pool->lock, but the read is lockless.  Even
      though commit 5b990546 ("mempool: fix and document synchronization and
      memory barrier usage") introduced an smp_wmb() and smp_rmb() pair to
      improve the situation, that pair does not protect the lockless read
      from data races (such as a torn load) which could lead to a logic bug,
      so fix it by adding READ_ONCE() for the read.
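
      After the change, the lockless fast path in mempool_free() reads
      roughly as follows (simplified sketch, not the whole function):

        if (unlikely(READ_ONCE(pool->curr_nr) < pool->min_nr)) {
                spin_lock_irqsave(&pool->lock, flags);
                /* Re-check under pool->lock before adding the element back. */
                if (likely(pool->curr_nr < pool->min_nr)) {
                        add_element(pool, element);
                        spin_unlock_irqrestore(&pool->lock, flags);
                        wake_up(&pool->wait);
                        return;
                }
                spin_unlock_irqrestore(&pool->lock, flags);
        }
        pool->free(element, pool->pool_data);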
      
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Marco Elver <elver@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Link: http://lkml.kernel.org/r/1581446384-2131-1-git-send-email-cai@lca.pw
      
      
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      abe1de42
  13. Mar 06, 2019
  14. Aug 22, 2018
  15. Aug 17, 2018
  16. May 14, 2018
  17. Feb 07, 2018
  18. Nov 16, 2017
  19. Nov 02, 2017
    • License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman authored
      
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information in it,
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side-by-side results from the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      The criteria used to select files for SPDX license identifier tagging were:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when neither scanner could find any license traces, the file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there were new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In the initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
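
      For reference, the resulting tag takes one of two comment forms
      depending on the file type, e.g.:

        // SPDX-License-Identifier: GPL-2.0        (in .c source files)
        /* SPDX-License-Identifier: GPL-2.0 */     (in headers and other files)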
      
      Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2441318
  20. Jun 20, 2017
    • sched/wait: Rename wait_queue_t => wait_queue_entry_t · ac6424b9
      Ingo Molnar authored
      
      Rename:
      
      	wait_queue_t		=>	wait_queue_entry_t
      
      'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue",
      but in reality it's a queue *entry*. The 'real' queue is the wait queue head,
      which had to carry the name.
      
      Start sorting this out by renaming it to 'wait_queue_entry_t'.
      
      This also allows the real structure name 'struct __wait_queue' to
      lose its double underscore and become 'struct wait_queue_entry',
      which is the more canonical nomenclature for such data types.
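
      In code the rename is purely mechanical, e.g.:

        /* Before */
        wait_queue_t wait;

        /* After */
        wait_queue_entry_t wait;           /* the renamed typedef */
        struct wait_queue_entry entry;     /* formerly struct __wait_queue */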
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ac6424b9
  21. Jul 28, 2016
    • Revert "mm, mempool: only set __GFP_NOMEMALLOC if there are free elements" · 4e390b2b
      Michal Hocko authored
      This reverts commit f9054c70 ("mm, mempool: only set __GFP_NOMEMALLOC
      if there are free elements").
      
      There has been a report about the OOM killer being invoked when swapping
      out to a dm-crypt device.  The primary reason seems to be that the swap-out
      IO managed to completely deplete memory reserves.  Ondrej was able to
      bisect and explained the issue by pointing to f9054c70 ("mm,
      mempool: only set __GFP_NOMEMALLOC if there are free elements").
      
      The reason is that the swapout path is not throttled properly because
      the md-raid layer needs to allocate from the generic_make_request path
      which means it allocates from the PF_MEMALLOC context.  dm layer uses
      mempool_alloc in order to guarantee a forward progress which used to
      inhibit access to memory reserves when using page allocator.  This has
      changed by f9054c70 ("mm, mempool: only set __GFP_NOMEMALLOC if
      there are free elements") which has dropped the __GFP_NOMEMALLOC
      protection when the memory pool is depleted.
      
      If we are running out of memory and the only way forward to free memory
      is to perform swapout we just keep consuming memory reserves rather than
      throttling the mempool allocations and allowing the pending IO to
      complete up to a moment when the memory is depleted completely and there
      is no way forward but invoking the OOM killer.  This is less than
      optimal.
      
      The original intention of f9054c70 was to help with the OOM
      situations where the oom victim depends on mempool allocation to make a
      forward progress.  David has mentioned the following backtrace:
      
        schedule
        schedule_timeout
        io_schedule_timeout
        mempool_alloc
        __split_and_process_bio
        dm_request
        generic_make_request
        submit_bio
        mpage_readpages
        ext4_readpages
        __do_page_cache_readahead
        ra_submit
        filemap_fault
        handle_mm_fault
        __do_page_fault
        do_page_fault
        page_fault
      
      We do not know more about why the mempool is depleted without being
      replenished in time, though.  In any case the dm layer shouldn't depend
      on any allocations outside of the dedicated pools so a forward progress
      should be guaranteed.  If this is not the case then the dm should be
      fixed rather than papering over the problem and postponing it to later
      by accessing more memory reserves.
      
      mempools are a mechanism to maintain dedicated memory reserves to
      guarantee forward progress.  Allowing them unbounded access to the
      page allocator memory reserves is going against the whole purpose of
      this mechanism.
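
      With the revert, mempool_alloc() again applies its conservative gfp
      policy unconditionally; a simplified sketch of the mask adjustment:

        gfp_mask |= __GFP_NOMEMALLOC;   /* don't dip into emergency reserves */
        gfp_mask |= __GFP_NORETRY;      /* don't loop in the page allocator */
        gfp_mask |= __GFP_NOWARN;       /* failed allocations are expected */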
      
      Bisected by Ondrej Kozina.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Link: http://lkml.kernel.org/r/20160721145309.GR26379@dhcp22.suse.cz
      
      
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Reported-by: Ondrej Kozina <okozina@redhat.com>
      Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: NeilBrown <neilb@suse.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mikulas Patocka <mpatocka@redhat.com>
      Cc: Ondrej Kozina <okozina@redhat.com>
      Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4e390b2b
  22. Jun 25, 2016
  23. May 21, 2016
    • mm: kasan: initial memory quarantine implementation · 55834c59
      Alexander Potapenko authored
      Quarantine isolates freed objects in a separate queue.  The objects are
      returned to the allocator later, which helps to detect use-after-free
      errors.
      
      When the object is freed, its state changes from KASAN_STATE_ALLOC to
      KASAN_STATE_QUARANTINE.  The object is poisoned and put into quarantine
      instead of being returned to the allocator, therefore every subsequent
      access to that object triggers a KASAN error, and the error handler is
      able to say where the object has been allocated and deallocated.
      
      When it's time for the object to leave quarantine, its state becomes
      KASAN_STATE_FREE and it's returned to the allocator.  From now on the
      allocator may reuse it for another allocation.  Before that happens,
      it's still possible to detect a use-after free on that object (it
      retains the allocation/deallocation stacks).
      
      When the allocator reuses this object, the shadow is unpoisoned and the
      old allocation/deallocation stacks are wiped.  Therefore a use of this
      object, even an incorrect one, won't trigger an ASan warning.
      
      Without the quarantine, it's not guaranteed that objects aren't reused
      immediately; that's why the probability of catching a use-after-free is
      lower than with the quarantine in place.
      
      Freed objects are first added to per-cpu quarantine queues.  When a
      cache is destroyed or memory shrinking is requested, the objects are
      moved into the global quarantine queue.  Whenever a kmalloc call allows
      memory reclaiming, the oldest objects are popped out of the global queue
      until the total size of objects in quarantine is less than 3/4 of the
      maximum quarantine size (which is a fraction of installed physical
      memory).
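
      A simplified sketch of the drain policy described above (the helper
      names here are hypothetical, not the actual mm/kasan/quarantine.c
      code):

        static void quarantine_reduce_sketch(void)
        {
                /* Pop the oldest objects until the quarantine shrinks below
                 * 3/4 of its size limit. */
                while (quarantine_size > quarantine_max_size * 3 / 4) {
                        void *object = dequeue_oldest(&global_quarantine);

                        return_to_allocator(object);
                }
        }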
      
      As long as an object remains in the quarantine, KASAN is able to report
      accesses to it, so the chance of reporting a use-after-free is
      increased.  Once the object leaves quarantine, the allocator may reuse
      it, in which case the object is unpoisoned and KASAN can't detect
      incorrect accesses to it.
      
      Right now quarantine support is only enabled in the SLAB allocator.
      Unification of KASAN features in SLAB and SLUB will be done later.
      
      This patch is based on the "mm: kasan: quarantine" patch originally
      prepared by Dmitry Chernenkov.  A number of improvements have been
      suggested by Andrey Ryabinin.
      
      [glider@google.com: v9]
        Link: http://lkml.kernel.org/r/1462987130-144092-1-git-send-email-glider@google.com
      
      
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrey Konovalov <adech.fo@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Konstantin Serebryany <kcc@google.com>
      Cc: Dmitry Chernenkov <dmitryc@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      55834c59
  24. Mar 25, 2016
    • mm, kasan: add GFP flags to KASAN API · 505f5dcb
      Alexander Potapenko authored
      
      Add GFP flags to KASAN hooks for future patches to use.
      
      This patch is based on the "mm: kasan: unified support for SLUB and SLAB
      allocators" patch originally prepared by Dmitry Chernenkov.
      
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrey Konovalov <adech.fo@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Konstantin Serebryany <kcc@google.com>
      Cc: Dmitry Chernenkov <dmitryc@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      505f5dcb
  25. Mar 17, 2016
    • mm, mempool: only set __GFP_NOMEMALLOC if there are free elements · f9054c70
      David Rientjes authored
      
      If an oom killed thread calls mempool_alloc(), it is possible that it'll
      loop forever if there are no elements on the freelist since
      __GFP_NOMEMALLOC prevents it from accessing needed memory reserves in
      oom conditions.
      
      Only set __GFP_NOMEMALLOC if there are elements on the freelist.  If
      there are no free elements, allow allocations without the bit set so
      that memory reserves can be accessed if needed.
      
      Additionally, using mempool_alloc() with __GFP_NOMEMALLOC is not
      supported since the implementation can loop forever without accessing
      memory reserves when needed.
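
      The effect on mempool_alloc()'s gfp handling is roughly (simplified
      sketch):

        gfp_mask |= __GFP_NORETRY | __GFP_NOWARN;
        if (likely(pool->curr_nr))
                gfp_mask |= __GFP_NOMEMALLOC;   /* forbid reserves while free elements remain */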
      
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f9054c70
  26. Mar 12, 2016
  27. Nov 07, 2015
    • mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd · d0164adc
      Mel Gorman authored
      
      __GFP_WAIT has been used to identify atomic context in callers that hold
      spinlocks or are in interrupts.  They are expected to be high priority and
      have access to one of two watermarks lower than "min" which can be referred
      to as the "atomic reserve".  __GFP_HIGH users get access to the first
      lower watermark and can be called the "high priority reserve".
      
      Over time, callers had a requirement to not block when fallback options
      were available.  Some have abused __GFP_WAIT leading to a situation where
      an optimistic allocation with a fallback option can access atomic
      reserves.
      
      This patch uses __GFP_ATOMIC to identify callers that are truly atomic,
      cannot sleep and have no alternative.  High priority users continue to use
      __GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
      are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM identifies
      callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
      redefined as a caller that is willing to enter direct reclaim and wake
      kswapd for background reclaim.
      
      This patch then converts a number of sites
      
      o __GFP_ATOMIC is used by callers that are high priority and have memory
        pools for those requests. GFP_ATOMIC uses this flag.
      
      o Callers that have a limited mempool to guarantee forward progress clear
        __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
        into this category where kswapd will still be woken but atomic reserves
        are not used as there is a one-entry mempool to guarantee progress.
      
       o Callers that are checking whether they are non-blocking should use the
         helper gfpflags_allow_blocking() where possible (see the sketch after
         this list), because checking for __GFP_WAIT, as was done historically,
         can now trigger false positives. Some exceptions such as dm-crypt.c
         exist where the code intent is clearer if __GFP_DIRECT_RECLAIM is used
         instead of the helper due to flag manipulations.
      
      o Callers that built their own GFP flags instead of starting with GFP_KERNEL
        and friends now also need to specify __GFP_KSWAPD_RECLAIM.
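
      A minimal sketch of the recommended check (do_work_that_may_sleep() is
      a placeholder):

        /* Historically: misleading once __GFP_WAIT is redefined. */
        if (gfp_mask & __GFP_WAIT)
                do_work_that_may_sleep();

        /* Preferred helper: */
        if (gfpflags_allow_blocking(gfp_mask))
                do_work_that_may_sleep();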
      
      The first key hazard to watch out for is callers that removed __GFP_WAIT
      and were depending on access to atomic reserves for inconspicuous reasons.
      In some cases it may be appropriate for them to use __GFP_HIGH.
      
      The second key hazard is callers that assembled their own combination of
      GFP flags instead of starting with something like GFP_KERNEL.  They may
      now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
      if it's missed in most cases as other activity will wake kswapd.
      
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d0164adc
  28. Sep 08, 2015
  29. Apr 15, 2015
    • mm/mempool.c: kasan: poison mempool elements · 92393615
      Andrey Ryabinin authored
      
      Mempools keep allocated objects in reserve for situations when an ordinary
      allocation cannot be satisfied.  These objects shouldn't be accessed
      before they leave the pool.
      
      This patch poisons elements when they enter the pool and unpoisons them
      when they leave it.  This lets KASan detect use-after-free of mempool
      elements.
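
      A sketch of the idea (the kasan_*_element() helpers are illustrative
      names for the hooks described above):

        static void add_element(mempool_t *pool, void *element)
        {
                BUG_ON(pool->curr_nr >= pool->min_nr);
                kasan_poison_element(pool, element);    /* off limits while pooled */
                pool->elements[pool->curr_nr++] = element;
        }

        static void *remove_element(mempool_t *pool)
        {
                void *element = pool->elements[--pool->curr_nr];

                kasan_unpoison_element(pool, element);  /* usable again */
                return element;
        }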
      
      Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
      Tested-by: David Rientjes <rientjes@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dmitry Chernenkov <drcheren@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      92393615
    • mm, mempool: poison elements backed by slab allocator · bdfedb76
      David Rientjes authored
      
      Mempools keep elements in a reserved pool for contexts in which allocation
      may not be possible.  When an element is allocated from the reserved pool,
      its memory contents are the same as when it was added to the reserved pool.
      
      Because of this, elements lack any free poisoning to detect use-after-free
      errors.
      
      This patch adds free poisoning for elements backed by the slab allocator.
      This is possible because the mempool layer knows the object size of each
      element.
      
      When an element is added to the reserved pool, it is poisoned with
      POISON_FREE.  When it is removed from the reserved pool, the contents are
      checked for POISON_FREE.  If there is a mismatch, a warning is emitted to
      the kernel log.
      
      This is only effective for configs with CONFIG_DEBUG_SLAB or
      CONFIG_SLUB_DEBUG_ON.
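
      Conceptually, poisoning a slab-backed element on its way into the pool
      looks roughly like this (simplified sketch, not the exact code):

        static void poison_element_sketch(void *element)
        {
                size_t size = ksize(element);
                u8 *obj = element;

                memset(obj, POISON_FREE, size - 1);
                obj[size - 1] = POISON_END;
                /* On removal from the pool the same bytes are verified and a
                 * mismatch is reported to the kernel log. */
        }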
      
      [fabio.estevam@freescale.com: use '%zu' for printing 'size_t' variable]
      [arnd@arndb.de: add missing include]
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Dave Kleikamp <shaggy@kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Cc: Mikulas Patocka <mpatocka@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bdfedb76
    • mm, mempool: disallow mempools based on slab caches with constructors · e244c9e6
      David Rientjes authored
      
      All occurrences of mempools based on slab caches with object constructors
      have been removed from the tree, so disallow creating them.
      
      We can only dereference mem->ctor in mm/mempool.c; doing so in
      include/linux/mempool.h would require including mm/slab.h there.  So
      simply note the restriction, just like the comment restricting usage of
      __GFP_ZERO, and warn on kernels with CONFIG_DEBUG_VM if such a mempool
      is allocated from.
      
      We don't want to incur this check on every element allocation, so use
      VM_BUG_ON().
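
      The check boils down to something like this in the slab-backed
      allocation path (sketch):

        static void *mempool_alloc_slab(gfp_t gfp_mask, void *pool_data)
        {
                struct kmem_cache *mem = pool_data;

                VM_BUG_ON(mem->ctor);   /* compiled out unless CONFIG_DEBUG_VM */
                return kmem_cache_alloc(mem, gfp_mask);
        }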
      
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Dave Kleikamp <shaggy@kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Cc: Mikulas Patocka <mpatocka@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e244c9e6
  30. Apr 14, 2015
  31. Jun 06, 2014
  32. Jun 04, 2014
  33. Apr 07, 2014
  34. Sep 11, 2013
  35. Jun 25, 2012
  36. Jan 11, 2012
    • mempool: fix first round failure behavior · 1ebb7044
      Tejun Heo authored
      
      mempool modifies gfp_mask so that the backing allocator doesn't try too
      hard or trigger warning message when there's pool to fall back on.  In
      addition, for the first try, it removes __GFP_WAIT and IO, so that it
      doesn't trigger reclaim or wait when allocation can be fulfilled from
      pool; however, when that allocation fails and pool is empty too, it waits
      for the pool to be replenished before retrying.
      
      An allocation which could have succeeded after a bit of reclaim now has to
      wait on the reserved items, and it's not that mempool doesn't retry with
      __GFP_WAIT and IO.  It just does that *after* someone returns an element,
      pointlessly delaying things.
      
      Fix it by retrying immediately if the first round of allocation attempts
      w/o __GFP_WAIT and IO fails.
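
      A simplified sketch of the resulting flow in mempool_alloc() (not the
      full function):

        gfp_t gfp_temp = gfp_mask & ~(__GFP_WAIT | __GFP_IO);

        repeat_alloc:
                element = pool->alloc(gfp_temp, pool->pool_data);
                if (likely(element))
                        return element;

                /* ... try to take an element from the reserved pool here ... */

                if (gfp_temp != gfp_mask) {
                        /* First round failed: retry immediately with the
                         * original mask instead of sleeping on the pool. */
                        gfp_temp = gfp_mask;
                        goto repeat_alloc;
                }

                /* Otherwise wait for an element to be returned, then retry. */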
      
      [akpm@linux-foundation.org: shorten the lock hold time]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1ebb7044
    • mempool: drop unnecessary and incorrect BUG_ON() from mempool_destroy() · 0565d317
      Tejun Heo authored
      
      mempool_destroy() is a thin wrapper around free_pool().  The only thing it
      adds is BUG_ON(pool->curr_nr != pool->min_nr).  The intention seems to be
      to enforce that all allocated elements are freed; however, the BUG_ON()
      can't achieve that (it doesn't know anything about objects above min_nr)
      and is incorrect, as mempool_resize() is allowed to leave the pool extended
      but not filled.  Furthermore, panicking is way worse than any memory leak
      and there are better debug tools to track memory leaks.
      
      Drop the BUG_ON() from mempool_destroy() and, as that leaves the function
      identical to free_pool(), replace it.
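
      After the change mempool_destroy() is just the element teardown,
      roughly:

        void mempool_destroy(mempool_t *pool)
        {
                while (pool->curr_nr) {
                        void *element = remove_element(pool);

                        pool->free(element, pool->pool_data);
                }
                kfree(pool->elements);
                kfree(pool);
        }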
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0565d317