- May 06, 2023
-
-
Matthew Wilcox authored
Smatch reports that filemap_fault() was missed in the conversion of __filemap_get_folio() error returns from NULL to ERR_PTR. Fixes: 66dabbb6 ("mm: return an ERR_PTR from __filemap_get_folio") Reported-by:
Dan Carpenter <dan.carpenter@linaro.org> Reported-by:
<syzbot+48011b86c8ea329af1b9@syzkaller.appspotmail.com> Reported-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Apr 06, 2023
-
-
Christoph Hellwig authored
Instead of returning NULL for all errors, distinguish between: - no entry found and not asked to allocate (-ENOENT) - failed to allocate memory (-ENOMEM) - would block (-EAGAIN) so that callers don't have to guess the error based on the passed in flags. Also pass the error through the direct callers: filemap_get_folio, filemap_lock_folio, filemap_grab_folio and filemap_get_incore_folio. [hch@lst.de: fix null-pointer deref] Link: https://lkml.kernel.org/r/20230310070023.GA13563@lst.de Link: https://lkml.kernel.org/r/20230310043137.GA1624890@u2004 Link: https://lkml.kernel.org/r/20230307143410.28031-8-hch@lst.de Signed-off-by:
Christoph Hellwig <hch@lst.de> Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> [nilfs2] Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
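The error-pointer convention described in this commit can be illustrated with a minimal userspace sketch. The helpers below are simplified stand-ins for the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); `demo_get_folio()`, `demo_caller()` and the `lookup_mode` enum are hypothetical names invented for this example and are not the kernel's __filemap_get_folio().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define DEMO_MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* The top DEMO_MAX_ERRNO addresses are reserved for error codes. */
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct folio { unsigned long index; };

/* Hypothetical outcomes, used only to drive this sketch. */
enum lookup_mode { LOOKUP_MISS, LOOKUP_NOMEM, LOOKUP_WOULD_BLOCK, LOOKUP_HIT };

static struct folio demo_cached = { .index = 7 };

/* Returns a folio or an encoded error, never NULL. */
struct folio *demo_get_folio(enum lookup_mode mode)
{
	switch (mode) {
	case LOOKUP_MISS:
		return ERR_PTR(-ENOENT);	/* not found, not asked to allocate */
	case LOOKUP_NOMEM:
		return ERR_PTR(-ENOMEM);	/* allocation failed */
	case LOOKUP_WOULD_BLOCK:
		return ERR_PTR(-EAGAIN);	/* would have to block */
	default:
		return &demo_cached;
	}
}

/* A caller checks IS_ERR() rather than comparing against NULL. */
int demo_caller(enum lookup_mode mode)
{
	struct folio *folio = demo_get_folio(mode);

	if (IS_ERR(folio))
		return (int)PTR_ERR(folio);
	return 0;
}

void demo_check_errors(void)
{
	assert(demo_caller(LOOKUP_MISS) == -ENOENT);
	assert(demo_caller(LOOKUP_HIT) == 0);
}
```

A caller that still tests the result against NULL would treat every encoded error as a valid folio, which is the kind of missed conversion the May 06, 2023 fix above addresses for filemap_fault().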
-
Christoph Hellwig authored
FGP_ENTRY is unused now, so remove it. Link: https://lkml.kernel.org/r/20230307143410.28031-7-hch@lst.de Signed-off-by:
Christoph Hellwig <hch@lst.de> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Christoph Hellwig authored
mapping_get_entry is useful for page cache API users that need to know about xa_value internals. Rename it and make it available in pagemap.h. Link: https://lkml.kernel.org/r/20230307143410.28031-3-hch@lst.de Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- Feb 20, 2023
-
-
David Howells authored
filemap_splice_read() and direct_splice_read() should be exported. Signed-off-by:
David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Jens Axboe <axboe@kernel.dk> cc: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: David Hildenbrand <david@redhat.com> cc: John Hubbard <jhubbard@nvidia.com> cc: linux-cifs@vger.kernel.org cc: linux-mm@kvack.org cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by:
Steve French <stfrench@microsoft.com>
-
David Howells authored
Provide a function to do splice read from a buffered file, pulling the folios out of the pagecache directly by calling filemap_get_pages() to do any required reading and then pasting the returned folios into the pipe. A helper function is provided to do the actual folio pasting and will handle multipage folios by splicing as many of the relevant subpages as will fit into the pipe. The code is loosely based on filemap_read() and might belong in mm/filemap.c with that as it needs to use filemap_get_pages(). Signed-off-by:
David Howells <dhowells@redhat.com> Reviewed-by:
Jens Axboe <axboe@kernel.dk> cc: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: David Hildenbrand <david@redhat.com> cc: John Hubbard <jhubbard@nvidia.com> cc: linux-mm@kvack.org cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by:
Steve French <stfrench@microsoft.com>
-
David Howells authored
filemap_get_pages() and a number of functions that it calls take an iterator to provide two things: the number of bytes to be got from the file specified and whether partially uptodate pages are allowed. Change these functions so that this information is passed in directly. This allows it to be called without having an iterator to hand. Signed-off-by:
David Howells <dhowells@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Jens Axboe <axboe@kernel.dk> cc: Christoph Hellwig <hch@lst.de> cc: Matthew Wilcox <willy@infradead.org> cc: Al Viro <viro@zeniv.linux.org.uk> cc: David Hildenbrand <david@redhat.com> cc: John Hubbard <jhubbard@nvidia.com> cc: linux-mm@kvack.org cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by:
Steve French <stfrench@microsoft.com>
-
- Feb 17, 2023
-
-
Qian Yingjin authored
I was running traces of the read code against a RAID storage system to understand why read requests were being misaligned against the underlying RAID strips. I found that the page end offset calculation in filemap_get_read_batch() was off by one. When a read is submitted with end offset 1048575, it calculates an end page of 256 for the read when it should be 255. "last_index" is the index of the page beyond the end of the read and it should be skipped when getting a batch of pages for read in @filemap_get_read_batch(). The below simple patch fixes the problem. This code was introduced in kernel 5.12. Link: https://lkml.kernel.org/r/20230208022400.28962-1-coolqyj@163.com Fixes: cbd59c48 ("mm/filemap: use head pages in generic_file_buffered_read") Signed-off-by:
Qian Yingjin <qian@ddn.com> Reviewed-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
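The arithmetic behind this off-by-one can be checked with a small userspace sketch. It assumes 4 KiB pages (a shift of 12), and the function names are hypothetical; they only reproduce the two index calculations the message contrasts.

```c
#include <assert.h>

/* Assumption for this sketch: 4 KiB pages. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (1ULL << DEMO_PAGE_SHIFT)

/* Index of the last page the read actually touches (count must be > 0). */
unsigned long long demo_last_page(unsigned long long pos,
				  unsigned long long count)
{
	return (pos + count - 1) >> DEMO_PAGE_SHIFT;
}

/* Index of the first page beyond the read, i.e. a "last_index" style value. */
unsigned long long demo_index_past_end(unsigned long long pos,
				       unsigned long long count)
{
	return (pos + count + DEMO_PAGE_SIZE - 1) >> DEMO_PAGE_SHIFT;
}

void demo_check_indexes(void)
{
	/* A read of bytes 0..1048575 covers pages 0..255, not 0..256. */
	assert(demo_last_page(0, 1048576) == 255);
	assert(demo_index_past_end(0, 1048576) == 256);
}
```

Using the "past the end" value as an inclusive upper bound pulls in one extra page, which matches the misaligned requests described in the message.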
-
- Feb 10, 2023
-
-
Matthew Wilcox (Oracle) authored
This is like read_cache_page_gfp() except it returns the folio instead of the precise page. Link: https://lkml.kernel.org/r/20230206162520.4029022-1-willy@infradead.org Signed-off-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Charan Teja Kalla <quic_charante@quicinc.com> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Mark Hemment <markhemm@googlemail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Pavankumar Kondeti <quic_pkondeti@quicinc.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Inline the work of __vma_adjust() into vma_merge(). This reduces code size and has the added benefits of the comments for the cases being located with the code. Change the comments referencing vma_adjust() accordingly. [Liam.Howlett@oracle.com: fix vma_merge() offset when expanding the next vma] Link: https://lkml.kernel.org/r/20230130195713.2881766-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20230120162650.984577-49-Liam.Howlett@oracle.com Signed-off-by:
Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- Feb 03, 2023
-
-
Matthew Wilcox (Oracle) authored
The folio isn't returned from this function, so this is an entirely internal change. Link: https://lkml.kernel.org/r/20230116193941.2148487-3-willy@infradead.org Signed-off-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by:
William Kucharski <william.kucharski@oracle.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Patch series "Some more filemap folio conversions". Three more places which could easily be converted to folios. The third one fixes a minor bug in readahead_expand(), but it's only a performance bug and there are few users of readahead_expand(), so I don't think it's worth backporting. This patch (of 3): Save a few calls to compound_head(). We specify exactly which page from the folio to use by passing in start_pgoff, which means this will work for a folio which is larger than PMD size. The rest of the VM isn't prepared for that yet, but now this function is. Link: https://lkml.kernel.org/r/20230116193941.2148487-1-willy@infradead.org Link: https://lkml.kernel.org/r/20230116193941.2148487-2-willy@infradead.org Signed-off-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by:
William Kucharski <william.kucharski@oracle.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Vishal Moola (Oracle) authored
All callers to find_get_pages_range_tag(), find_get_pages_tag(), pagevec_lookup_range_tag(), and pagevec_lookup_tag() have been removed. Link: https://lkml.kernel.org/r/20230104211448.4804-24-vishal.moola@gmail.com Signed-off-by:
Vishal Moola (Oracle) <vishal.moola@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Vishal Moola (Oracle) authored
Convert function to use folios. This is in preparation for the removal of find_get_pages_range_tag(). This change removes 2 calls to compound_head(). Link: https://lkml.kernel.org/r/20230104211448.4804-4-vishal.moola@gmail.com Signed-off-by:
Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Vishal Moola (Oracle) authored
This is the equivalent of find_get_pages_range_tag(), except for folios instead of pages. One notable difference is that filemap_get_folios_tag() does not take in a maximum pages argument. It instead tries to fill a folio batch and stops either once full (15 folios) or on reaching the end of the search range. The new function supports large folios; the initial function did not, since none of its callers use large folios. Link: https://lkml.kernel.org/r/20230104211448.4804-3-vishal.moola@gmail.com Signed-off-by:
Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- Dec 12, 2022
-
-
Vishal Moola (Oracle) authored
Patch series "Removing the lru_cache_add() wrapper". This patchset replaces all calls of lru_cache_add() with the folio equivalent: folio_add_lru(). This allows us to get rid of the wrapper. The series passes xfstests and the userfaultfd selftests. This patch (of 5): Eliminates 7 calls to compound_head(). Link: https://lkml.kernel.org/r/20221101175326.13265-1-vishal.moola@gmail.com Link: https://lkml.kernel.org/r/20221101175326.13265-2-vishal.moola@gmail.com Signed-off-by:
Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Brian Foster authored
Patch series "filemap: skip write and wait if end offset precedes start", v2. A fix for the odd write and wait behavior described in the patch 1 commit log. Technically patch 1 could simply remove the check rather than lift it into the callers, but this seemed a bit more user friendly to me. Patch 2 is appended after observation that fadvise() interacted poorly with the v1 patch. This is no longer a problem with v2, making patch 2 purely a cleanup. This series survived both fstests and ltp regression runs without observable problems. I had (end < start) warning checks in each relevant function, with fadvise() being the only caller that triggered them. That said, I dropped the warnings after testing because there seemed too much potential for noise from the various other callers. This patch (of 2): A call to file[map]_write_and_wait_range() with an end offset that precedes the start offset but happens to land in the same page can trigger writeback submission but fails to wait on the submitted page. Writeback submission occurs because __filemap_fdatawrite_range() passes both offsets down into write_cache_pages(), which rounds down to page indexes before it starts processing writeback. However, __filemap_fdatawait_range() immediately returns if the byte-granular end offset precedes the start offset. This behavior was observed in the form of unpredictable latency from a frequent write and wait call with incorrect parameters. The behavior gave the impression that the fdatawait path might occasionally fail to wait on writeback, but further investigation showed the latency was from write_cache_pages() waiting on writeback state to clear for a page already under writeback. Therefore, this indicated that fdatawait actually never waits on writeback in this particular situation. The byte granular check in __filemap_fdatawait_range() goes all the way back to the old wait_on_page_writeback() helper.
It originally used page offsets and so would have waited in this problematic case. That changed to byte granularity file offsets in commit 94004ed7 ("kill wait_on_page_writeback_range"), which subtly changed this behavior. The check itself has become somewhat redundant since the error checking code that used to follow the wait loop (at the time of the aforementioned commit) has now been removed and lifted into the higher level callers. Therefore, we can restore historical fdatawait behavior by simply removing the check. Since the current fdatawait behavior has been in place for quite some time and is consistent with other interfaces that use file offsets, instead lift the check into the file[map]_write_and_wait_range() helpers to provide consistent behavior between the write and wait. Link: https://lkml.kernel.org/r/20221128155632.3950447-1-bfoster@redhat.com Link: https://lkml.kernel.org/r/20221128155632.3950447-2-bfoster@redhat.com Signed-off-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
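The mismatch between the page-granular write side and the byte-granular wait side can be modelled in a few lines of userspace C. This is a sketch under stated assumptions: 4 KiB pages, and hypothetical `demo_*` functions that only record whether each step would run; none of them are kernel functions.

```c
#include <assert.h>

/* Assumption for this sketch: 4 KiB pages. */
#define DEMO_PAGE_SHIFT 12

struct demo_result {
	int submitted;	/* writeback was submitted */
	int waited;	/* the caller waited on writeback */
};

/* The write side rounds byte offsets down to page indexes. */
static int demo_submit(long long start, long long end)
{
	return (start >> DEMO_PAGE_SHIFT) <= (end >> DEMO_PAGE_SHIFT);
}

/* The wait side returns immediately if end precedes start in bytes. */
static int demo_wait(long long start, long long end)
{
	return !(end < start);
}

/* Old behavior: the two halves can disagree. */
struct demo_result demo_old_write_and_wait(long long start, long long end)
{
	struct demo_result r;

	r.submitted = demo_submit(start, end);
	r.waited = demo_wait(start, end);
	return r;
}

/* New behavior: the check is lifted into the combined helper. */
struct demo_result demo_new_write_and_wait(long long start, long long end)
{
	struct demo_result r = { 0, 0 };

	if (end < start)
		return r;
	r.submitted = demo_submit(start, end);
	r.waited = demo_wait(start, end);
	return r;
}

void demo_check_ranges(void)
{
	/* end precedes start, but both land in page 3 */
	struct demo_result r = demo_old_write_and_wait(3 * 4096 + 100,
							3 * 4096 + 50);

	assert(r.submitted == 1 && r.waited == 0);
}
```

With the check lifted into the combined helper, a reversed range does nothing at all, so the write and the wait stay consistent with each other.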
-
- Nov 09, 2022
-
-
Vishal Moola (Oracle) authored
Initially, find_get_entries() was being passed in the start offset as a value. That left the calculation of the offset to the callers. This led to complexity in the callers trying to keep track of the index. Now find_get_entries() takes in a pointer to the start offset and updates the value to be directly after the last entry found. If no entry is found, the offset is not changed. This gets rid of multiple hacky calculations that kept track of the start offset. Link: https://lkml.kernel.org/r/20221017161800.2003-3-vishal.moola@gmail.com Signed-off-by:
Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Vishal Moola (Oracle) authored
Patch series "Rework find_get_entries() and find_lock_entries()", v3. Originally the callers of find_get_entries() and find_lock_entries() were keeping track of the start index themselves as they traverse the search range. This resulted in hacky code such as in shmem_undo_range(): index = folio->index + folio_nr_pages(folio) - 1; where the - 1 is only present to stay in the right spot after incrementing index later. This sort of calculation was also being done on every folio despite not even using index later within that function. These patches change find_get_entries() and find_lock_entries() to calculate the new index instead of leaving it to the callers so we can avoid all these complications. This patch (of 2): Initially, find_lock_entries() was being passed in the start offset as a value. That left the calculation of the offset to the callers. This led to complexity in the callers trying to keep track of the index. Now find_lock_entries() takes in a pointer to the start offset and updates the value to be directly after the last entry found. If no entry is found, the offset is not changed. This gets rid of multiple hacky calculations that kept track of the start offset. Link: https://lkml.kernel.org/r/20221017161800.2003-1-vishal.moola@gmail.com Link: https://lkml.kernel.org/r/20221017161800.2003-2-vishal.moola@gmail.com Signed-off-by:
Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
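The calling convention introduced by these two patches, where the lookup takes a pointer to the start index and advances it past the last entry found, might look roughly like the following userspace sketch. `struct demo_entry` and `demo_find_entries()` are hypothetical, and the "cache" is just a sorted array rather than an XArray.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: a run of nr pages starting at index. */
struct demo_entry {
	unsigned long index;
	unsigned long nr;
};

/*
 * Collect up to max entries overlapping [*start, end] from a sorted array.
 * On success *start is moved to just past the last entry found; if nothing
 * is found, *start is left unchanged.
 */
size_t demo_find_entries(const struct demo_entry *cache, size_t n,
			 unsigned long *start, unsigned long end,
			 const struct demo_entry **out, size_t max)
{
	size_t found = 0;

	for (size_t i = 0; i < n && found < max; i++) {
		const struct demo_entry *e = &cache[i];

		if (e->index + e->nr <= *start)
			continue;	/* entirely before the search range */
		if (e->index > end)
			break;		/* past the search range */
		out[found++] = e;
	}
	if (found) {
		const struct demo_entry *last = out[found - 1];

		*start = last->index + last->nr;
	}
	return found;
}

void demo_check_walk(void)
{
	const struct demo_entry cache[] = { {0, 1}, {1, 4}, {8, 1} };
	const struct demo_entry *out[2];
	unsigned long start = 0;

	/* Callers simply loop; no "index + nr - 1" bookkeeping is needed. */
	while (demo_find_entries(cache, 3, &start, 100, out, 2))
		;
	assert(start == 9);
}
```

Because the helper owns the index update, a caller's loop body no longer needs to compute the next start position for every entry it visits.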
-
- Oct 03, 2022
-
-
Alexander Potapenko authored
Functions implementing the a_ops->write_end() interface accept the `void *fsdata` parameter that is supposed to be initialized by the corresponding a_ops->write_begin() (which accepts `void **fsdata`). However not all a_ops->write_begin() implementations initialize `fsdata` unconditionally, so it may get passed uninitialized to a_ops->write_end(), resulting in undefined behavior. Fix this by initializing fsdata with NULL before the call to write_begin(), rather than doing so in all possible a_ops implementations. This patch covers only the following cases found by running x86 KMSAN under syzkaller: - generic_perform_write() - cont_expand_zero() and generic_cont_expand_simple() - page_symlink() Other cases of passing uninitialized fsdata may persist in the codebase. Link: https://lkml.kernel.org/r/20220915150417.722975-43-glider@google.com Signed-off-by:
Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
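The fix pattern in this commit, initializing the cookie in the caller rather than trusting every write_begin() implementation to set it, can be sketched in userspace as follows. `struct demo_aops` and the `demo_*` functions are hypothetical and much simpler than the real address_space_operations callbacks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced version of the two callbacks. */
struct demo_aops {
	int (*write_begin)(void **fsdata);
	int (*write_end)(void *fsdata);
};

/* A write_begin() that never stores anything through fsdata. */
int demo_lazy_write_begin(void **fsdata)
{
	(void)fsdata;
	return 0;
}

/* Reports whether it was handed a well-defined NULL cookie. */
int demo_write_end(void *fsdata)
{
	return fsdata == NULL;
}

int demo_perform_write(const struct demo_aops *ops)
{
	void *fsdata = NULL;	/* initialized before write_begin() runs */
	int err = ops->write_begin(&fsdata);

	if (err)
		return err;
	return ops->write_end(fsdata);
}

void demo_check_fsdata(void)
{
	struct demo_aops ops = { demo_lazy_write_begin, demo_write_end };

	assert(demo_perform_write(&ops) == 1);
}
```

Without the initializer, the value passed to write_end() would be indeterminate whenever write_begin() leaves the cookie alone.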
-
Ke Sun authored
It's only used in mm/filemap.c, since commit <ffa65753> ("mm/migrate.c: rework migration_entry_wait() to not take a pageref"). Make it static. Link: https://lkml.kernel.org/r/20220914021738.3228011-1-sunke@kylinos.cn Signed-off-by:
Ke Sun <sunke@kylinos.cn> Reported-by:
k2ci <kernel-bot@kylinos.cn> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Vishal Moola (Oracle) authored
Removes 3 calls to compound_head(). Link: https://lkml.kernel.org/r/20220905214557.868606-1-vishal.moola@gmail.com Signed-off-by:
Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- Sep 27, 2022
-
-
Yang Yang authored
Once upon a time, we only supported accounting thrashing of page cache. Then Joonsoo introduced workingset detection for anonymous pages and we gained the ability to account thrashing of them[1]. For page cache thrashing accounting, there is no suitable place to do it at the fs level like swap_readpage(). So we have to do it in folio_wait_bit_common(). Then for anonymous pages thrashing accounting, we have to do it in both swap_readpage() and folio_wait_bit_common(). This is like PSI, so we should let thrashing accounting support re-entrance detection. This patch prepares for complete thrashing accounting, and is based on patch "filemap: make the accounting of thrashing more consistent". [1] commit aae466b0 ("mm/swap: implement workingset detection for anonymous LRU") Link: https://lkml.kernel.org/r/20220815071134.74551-1-yang.yang29@zte.com.cn Signed-off-by:
Yang Yang <yang.yang29@zte.com.cn> Signed-off-by:
CGEL ZTE <cgel.zte@gmail.com> Reviewed-by:
Ran Xiaokai <ran.xiaokai@zte.com.cn> Reviewed-by:
wangyong <wang.yong12@zte.com.cn> Acked-by:
Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
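Re-entrance detection of the kind this message asks for can be sketched with a per-task flag: a nested start sees the flag already set and the matching end then does nothing. This is a hypothetical userspace model; `struct demo_task` and both functions are invented for illustration and do not reproduce the delayacct implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task state for this sketch. */
struct demo_task {
	bool in_thrashing;	/* an accounting interval is open */
	unsigned int intervals;	/* completed intervals */
};

/* Open an interval unless one is already open; report nesting to the caller. */
void demo_thrashing_start(struct demo_task *t, bool *nested)
{
	*nested = t->in_thrashing;
	if (*nested)
		return;
	t->in_thrashing = true;
}

/* Close the interval only if this call site was the one that opened it. */
void demo_thrashing_end(struct demo_task *t, bool *nested)
{
	if (*nested)
		return;
	t->in_thrashing = false;
	t->intervals++;
}

void demo_check_nesting(void)
{
	struct demo_task t = { false, 0 };
	bool outer, inner;

	demo_thrashing_start(&t, &outer);	/* e.g. the swap read path */
	demo_thrashing_start(&t, &inner);	/* e.g. the folio wait path */
	demo_thrashing_end(&t, &inner);
	demo_thrashing_end(&t, &outer);
	assert(t.intervals == 1);		/* counted once, not twice */
}
```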
-
Yang Yang authored
Once upon a time, we only supported accounting thrashing of page cache. Then Joonsoo introduced workingset detection for anonymous pages and we gained the ability to account thrashing of them[1]. So let delayacct account both the thrashing of page cache and anonymous pages; this makes the code more consistent and simpler. [1] commit aae466b0 ("mm/swap: implement workingset detection for anonymous LRU") Link: https://lkml.kernel.org/r/20220805033838.1714674-1-yang.yang29@zte.com.cn Signed-off-by:
Yang Yang <yang.yang29@zte.com.cn> Signed-off-by:
CGEL ZTE <cgel.zte@gmail.com> Acked-by:
Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: David Hildenbrand <david@redhat.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- Sep 20, 2022
-
-
Christoph Hellwig authored
PSI tries to account for the cost of bringing back in pages discarded by the MM LRU management. Currently the prime place for that is hooked into the bio submission path, which is a rather bad place: - it does not actually account I/O for non-block file systems, of which we have many - it adds overhead and a layering violation to the block layer Add the accounting into the two places in the core MM code that read pages into an address space by calling into ->read_folio and ->readahead so that the entire file system operations are covered, to broaden the coverage and allow removing the accounting in the block layer going forward. As psi_memstall_enter can deal with nested calls this will not lead to double accounting even while the bio annotations are still present. Signed-off-by:
Christoph Hellwig <hch@lst.de> Acked-by:
Johannes Weiner <hannes@cmpxchg.org> Link: https://lore.kernel.org/r/20220915094200.139713-2-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Sep 12, 2022
-
-
Vishal Moola (Oracle) authored
All callers of find_get_pages_contig() have been removed, so it is no longer needed. Link: https://lkml.kernel.org/r/20220824004023.77310-8-vishal.moola@gmail.com Signed-off-by:
Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <clm@fb.com> Cc: David Sterba <dsterba@suse.com> Cc: David Sterba <dsterb@suse.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Vishal Moola (Oracle) authored
Patch series "Convert to filemap_get_folios_contig()", v3. This patch series replaces find_get_pages_contig() with filemap_get_folios_contig(). This patch (of 7): This function is meant to replace find_get_pages_contig(). Unlike find_get_pages_contig(), filemap_get_folios_contig() no longer takes in a target number of pages to find - It returns up to 15 contiguous folios. To be more consistent with filemap_get_folios(), filemap_get_folios_contig() now also updates the start index passed in, and takes an end index. Link: https://lkml.kernel.org/r/20220824004023.77310-1-vishal.moola@gmail.com Link: https://lkml.kernel.org/r/20220824004023.77310-2-vishal.moola@gmail.com Signed-off-by:
Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: David Sterba <dsterba@suse.com> Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: David Sterba <dsterb@suse.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Shaoqin Huang authored
Replace three calls to compound_head() with one. Link: https://lkml.kernel.org/r/20220809023256.178194-1-shaoqin.huang@intel.com Signed-off-by:
Shaoqin Huang <shaoqin.huang@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- Jul 30, 2022
-
-
Miaohe Lin authored
Restructure the logic in filemap_write_and_wait_range to simplify the code and make it more consistent with file_write_and_wait_range. No functional change intended. Link: https://lkml.kernel.org/r/20220627132351.55680-1-linmiaohe@huawei.com Signed-off-by:
Miaohe Lin <linmiaohe@huawei.com> Reviewed-by:
Muchun Song <songmuchun@bytedance.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- Jul 25, 2022
-
-
Jens Axboe authored
If we're creating a page cache page with FGP_CREAT but FGP_NOWAIT is set, we should dial back the gfp flags to avoid frivolous blocking which is trivial to hit in low memory conditions: [ 10.117661] __schedule+0x8c/0x550 [ 10.118305] schedule+0x58/0xa0 [ 10.118897] schedule_timeout+0x30/0xdc [ 10.119610] __wait_for_common+0x88/0x114 [ 10.120348] wait_for_completion+0x1c/0x24 [ 10.121103] __flush_work.isra.0+0x16c/0x19c [ 10.121896] flush_work+0xc/0x14 [ 10.122496] __drain_all_pages+0x144/0x218 [ 10.123267] drain_all_pages+0x10/0x18 [ 10.123941] __alloc_pages+0x464/0x9e4 [ 10.124633] __folio_alloc+0x18/0x3c [ 10.125294] __filemap_get_folio+0x17c/0x204 [ 10.126084] iomap_write_begin+0xf8/0x428 [ 10.126829] iomap_file_buffered_write+0x144/0x24c [ 10.127710] xfs_file_buffered_write+0xe8/0x248 [ 10.128553] xfs_file_write_iter+0xa8/0x120 [ 10.129324] io_write+0x16c/0x38c [ 10.129940] io_issue_sqe+0x70/0x1cc [ 10.130617] io_queue_sqe+0x18/0xfc [ 10.131277] io_submit_sqes+0x5d4/0x600 [ 10.131946] __arm64_sys_io_uring_enter+0x224/0x600 [ 10.132752] invoke_syscall.constprop.0+0x70/0xc0 [ 10.133616] do_el0_svc+0xd0/0x118 [ 10.134238] el0_svc+0x78/0xa0 Clear IO, FS, and reclaim flags and mark the allocation as GFP_NOWAIT and add __GFP_NOWARN to avoid polluting dmesg with pointless allocation failures. A caller with FGP_NOWAIT must be expected to handle the resulting -EAGAIN return and retry from a suitable context without NOWAIT set. Reviewed-by:
Shakeel Butt <shakeelb@google.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
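The flag adjustment described here is a mask-and-set operation, sketched below in userspace C. The bit values are made up for this example, and the composition of `DEMO_GFP_KERNEL` and `DEMO_GFP_NOWAIT` is an assumption modelled on the message's wording (IO, FS and reclaim cleared; a non-blocking allocation that does not warn on failure); consult the kernel headers for the real definitions.

```c
#include <assert.h>

/* Made-up bit values; the real gfp bits live in the kernel headers. */
#define DEMO_GFP_IO             0x01u
#define DEMO_GFP_FS             0x02u
#define DEMO_GFP_DIRECT_RECLAIM 0x04u
#define DEMO_GFP_KSWAPD_RECLAIM 0x08u
#define DEMO_GFP_NOWARN         0x10u
#define DEMO_GFP_HIGHMEM        0x20u	/* an unrelated modifier */

/* Assumed compositions for this sketch only. */
#define DEMO_GFP_KERNEL (DEMO_GFP_IO | DEMO_GFP_FS | \
			 DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_KSWAPD_RECLAIM)
#define DEMO_GFP_NOWAIT DEMO_GFP_KSWAPD_RECLAIM

/* Dial back a blocking mask for a caller that must not wait. */
unsigned int demo_nowait_gfp(unsigned int gfp)
{
	gfp &= ~DEMO_GFP_KERNEL;			/* clear IO, FS, reclaim */
	gfp |= DEMO_GFP_NOWAIT | DEMO_GFP_NOWARN;	/* non-blocking, quiet */
	return gfp;
}

void demo_check_gfp(void)
{
	unsigned int gfp = demo_nowait_gfp(DEMO_GFP_KERNEL | DEMO_GFP_HIGHMEM);

	assert(!(gfp & DEMO_GFP_DIRECT_RECLAIM));	/* can no longer block */
	assert(gfp & DEMO_GFP_HIGHMEM);			/* other bits survive */
}
```

An allocation made with the reduced mask may fail where the original would have waited, which is why the message requires such callers to handle -EAGAIN.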
-
- Jun 29, 2022
-
-
Matthew Wilcox (Oracle) authored
By passing ->read_folio to filemap_read_folio(), we can use filemap_read_folio() in do_read_cache_folio(). Signed-off-by:
Matthew Wilcox (Oracle) <willy@infradead.org>
-
Matthew Wilcox (Oracle) authored
If the call to filler() returns AOP_TRUNCATED_PAGE, we need to retry the page cache lookup. Signed-off-by:
Matthew Wilcox (Oracle) <willy@infradead.org>
-
Matthew Wilcox (Oracle) authored
No functionality change intended; this simply moves code around to disentangle the function a little. Signed-off-by:
Matthew Wilcox (Oracle) <willy@infradead.org>
-
Matthew Wilcox (Oracle) authored
All callers of find_get_pages_range(), pagevec_lookup_range() and pagevec_lookup() have now been removed. Signed-off-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by:
Christoph Hellwig <hch@lst.de> Acked-by:
Christian Brauner (Microsoft) <brauner@kernel.org>
-
Matthew Wilcox (Oracle) authored
This is the equivalent of find_get_pages() but fills a folio_batch instead of an array of pages. Signed-off-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by:
Christoph Hellwig <hch@lst.de> Acked-by:
Christian Brauner (Microsoft) <brauner@kernel.org>
-
Matthew Wilcox (Oracle) authored
These functions have no more users, so delete them. Signed-off-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by:
Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by:
Muchun Song <songmuchun@bytedance.com>
-
- Jun 20, 2022
-
-
Matthew Wilcox (Oracle) authored
If a read races with an invalidation followed by another read, it is possible for a folio to be replaced with a higher-order folio. If that happens, we'll see a sibling entry for the new folio in the next iteration of the loop. This manifests as a NULL pointer dereference while holding the RCU read lock. Handle this by simply returning. The next call will find the new folio and handle it correctly. The other ways of handling this rare race are more complex and it's just not worth it. Reported-by:
Dave Chinner <david@fromorbit.com> Reported-by:
Brian Foster <bfoster@redhat.com> Debugged-by:
Brian Foster <bfoster@redhat.com> Tested-by:
Brian Foster <bfoster@redhat.com> Reviewed-by:
Brian Foster <bfoster@redhat.com> Fixes: cbd59c48 ("mm/filemap: use head pages in generic_file_buffered_read") Cc: stable@vger.kernel.org Signed-off-by:
Matthew Wilcox (Oracle) <willy@infradead.org>
-
Matthew Wilcox (Oracle) authored
We had an off-by-one error which meant that we never marked the first page in a read as accessed. This was visible as a slowdown when re-reading a file as pages were being evicted from cache too soon. In reviewing this code, we noticed a second bug where a multi-page folio would be marked as accessed multiple times when doing reads that were less than the size of the folio. Abstract the comparison of whether two file positions are in the same folio into a new function, fixing both of these bugs. Reported-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by:
Matthew Wilcox (Oracle) <willy@infradead.org>
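The "same folio" comparison this commit abstracts can be sketched as a shift-and-compare on two file positions. The version below is a simplified userspace model that takes the folio's size as a shift instead of a folio pointer; both function names are hypothetical, and the code relies on folios being naturally aligned to their size.

```c
#include <assert.h>
#include <stdbool.h>

/* Two positions share a folio if they agree above the folio's size bits. */
bool demo_pos_same_folio(long long pos1, long long pos2,
			 unsigned int folio_shift)
{
	return (pos1 >> folio_shift) == (pos2 >> folio_shift);
}

/* Mark accessed only when this read starts in a different folio. */
bool demo_should_mark_accessed(long long prev_pos, long long pos,
			       unsigned int folio_shift)
{
	return !demo_pos_same_folio(prev_pos, pos, folio_shift);
}

void demo_check_same_folio(void)
{
	/* 16 KiB folio: small sequential reads inside it count once. */
	assert(!demo_should_mark_accessed(0, 4096, 14));
	/* Crossing into the next folio marks that folio as accessed. */
	assert(demo_should_mark_accessed(16383, 16384, 14));
}
```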
-
- Jun 09, 2022
-
-
Matthew Wilcox (Oracle) authored
After we have unlocked the mmap_lock for I/O, the file is pinned, but the VMA is not. Checking this flag after that can be a use-after-free. It's not a terribly interesting use-after-free as it can only read one bit, and it's used to decide whether to read 2MB or 4MB. But it upsets the automated tools and it's generally bad practice anyway, so let's fix it. Reported-by:
<syzbot+5b96d55e5b54924c77ad@syzkaller.appspotmail.com> Fixes: 4687fdbb ("mm/filemap: Support VM_HUGEPAGE for file mappings") Cc: stable@vger.kernel.org Signed-off-by:
Matthew Wilcox (Oracle) <willy@infradead.org>
-
- May 13, 2022
-
-
Peter Xu authored
This patch still does not use pte marker in any way, however it teaches the core mm about the pte marker idea. For example, handle_pte_marker() is introduced that will parse and handle all the pte marker faults. Many of the places are more about commenting it up - so that we know there's the possibility of pte marker showing up, and why we don't need special code for the cases. [peterx@redhat.com: userfaultfd.c needs swapops.h] Link: https://lkml.kernel.org/r/YmRlVj3cdizYJsr0@xz-m1.local Link: https://lkml.kernel.org/r/20220405014833.14015-1-peterx@redhat.com Signed-off-by:
Peter Xu <peterx@redhat.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Nadav Amit <nadav.amit@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-