  1. Jul 24, 2024
    • Linus Torvalds's avatar
      Merge tag 'phy-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy · c33ffdb7
      Linus Torvalds authored
      Pull phy updates from Vinod Koul:
       "New Support
         - Samsung Exynos gs101 drd combo phy
         - Qualcomm SC8180x USB uniphy, IPQ9574 QMP PCIe phy
         - Airoha EN7581 PCIe phy
         - Freescale i.MX8Q HSIO SerDes phy
         - Starfive jh7110 dphy tx
      
        Updates:
         - Resume support for j721e-wiz driver
         - Updates to Exynos usbdrd driver
         - Support for optional power domains in g12a usb2-phy driver
         - Debugfs support and updates to zynqmp driver"
      
      * tag 'phy-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (56 commits)
        phy: airoha: Add dtime and Rx AEQ IO registers
        dt-bindings: phy: airoha: Add dtime and Rx AEQ IO registers
        dt-bindings: phy: rockchip-emmc-phy: Convert to dtschema
        dt-bindings: phy: qcom,qmp-usb: fix spelling error
        phy: exynos5-usbdrd: support Exynos USBDRD 3.1 combo phy (HS & SS)
        phy: exynos5-usbdrd: convert Vbus supplies to regulator_bulk
        phy: exynos5-usbdrd: convert (phy) register access clock to clk_bulk
        phy: exynos5-usbdrd: convert core clocks to clk_bulk
        phy: exynos5-usbdrd: support isolating HS and SS ports independently
        dt-bindings: phy: samsung,usb3-drd-phy: add gs101 compatible
        phy: core: Fix documentation of of_phy_get
        phy: starfive: Correct the dphy configure process
        phy: zynqmp: Add debugfs support
        phy: zynqmp: Take the phy mutex in xlate
        phy: zynqmp: Only wait for PLL lock "primary" instances
        phy: zynqmp: Store instance instead of type
        phy: zynqmp: Enable reference clock correctly
        phy: cadence-torrent: Check return value on register read
        phy: Fix the cacography in phy-exynos5250-usb2.c
        phy: phy-rockchip-samsung-hdptx: Select CONFIG_MFD_SYSCON
        ...
      c33ffdb7
    • Linus Torvalds's avatar
      Merge tag 'soundwire-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire · ad7b0b7b
      Linus Torvalds authored
      Pull soundwire updates from Vinod Koul:
      
       - Simplification across subsystem using cleanup.h
      
       - Support for debugfs to read/write commands
      
       - Few Intel and Qualcomm driver updates
      
      * tag 'soundwire-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
        soundwire: debugfs: simplify with cleanup.h
        soundwire: cadence: simplify with cleanup.h
        soundwire: intel_ace2x: simplify with cleanup.h
        soundwire: intel_ace2x: simplify return path in hw_params
        soundwire: intel: simplify with cleanup.h
        soundwire: intel: simplify return path in hw_params
        soundwire: amd_init: simplify with cleanup.h
        soundwire: amd: simplify with cleanup.h
        soundwire: amd: simplify return path in hw_params
        soundwire: intel_auxdevice: start the bus at default frequency
        soundwire: intel_auxdevice: add cs42l43 codec to wake_capable_list
        drivers:soundwire: qcom: cleanup port maask calculations
        soundwire: bus: simplify by using local slave->prop
        soundwire: generic_bandwidth_allocation: change port_bo parameter to pointer
        soundwire: Intel: clarify Copyright information
        soundwire: intel_ace2.x: add AC timing extensions for PantherLake
        soundwire: bus: add stream refcount
        soundwire: debugfs: add interface to read/write commands
      ad7b0b7b
    • Linus Torvalds's avatar
      Merge tag 'dmaengine-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine · 7a46b17d
      Linus Torvalds authored
      Pull dmaengine updates from Vinod Koul:
       "New support:
      
         - New dmaengine_prep_peripheral_dma_vec() to support transfers using
           dma vectors and documentation and user in AXI dma
      
         - STMicro STM32 DMA3 support and new capabilities of cyclic dma
      
        Updates:
      
         - Yaml conversion for Freescale imx dma and qdma bindings,
           sprd sc9860 dma binding
      
         - Altera msgdma updates for descriptor management"
      
      * tag 'dmaengine-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (35 commits)
        dt-bindings: fsl-qdma: fix interrupts 'if' check logic
        dt-bindings: dma: sprd,sc9860-dma: convert to YAML
        dmaengine: fsl-dpaa2-qdma: add missing MODULE_DESCRIPTION() macro
        dmaengine: ti: add missing MODULE_DESCRIPTION() macros
        dmaengine: ti: cppi41: add missing MODULE_DESCRIPTION() macro
        dmaengine: virt-dma: add missing MODULE_DESCRIPTION() macro
        dmaengine: ti: k3-udma: Fix BCHAN count with UHC and HC channels
        dmaengine: sh: rz-dmac: Fix lockdep assert warning
        dmaengine: qcom: gpi: clean up the IRQ disable/enable in gpi_reset_chan()
        dmaengine: fsl-edma: change the memory access from local into remote mode in i.MX 8QM
        dmaengine: qcom: gpi: remove unused struct 'reg_info'
        dmaengine: moxart-dma: remove unused struct 'moxart_filter_data'
        dt-bindings: fsl-qdma: Convert to yaml format
        dmaengine: fsl-edma: remove redundant "idle" field from fsl_chan
        dmaengine: fsl-edma: request per-channel IRQ only when channel is allocated
        dmaengine: stm32-dma3: defer channel registration to specify channel name
        dmaengine: add channel device name to channel registration
        dmaengine: stm32-dma3: improve residue granularity
        dmaengine: stm32-dma3: add device_pause and device_resume ops
        dmaengine: stm32-dma3: add DMA_MEMCPY capability
        ...
      7a46b17d
    • Linus Torvalds's avatar
      Merge tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random · 7a3fad30
      Linus Torvalds authored
      Pull random number generator updates from Jason Donenfeld:
       "This adds getrandom() support to the vDSO.
      
        First, it adds a new kind of mapping to mmap(2), MAP_DROPPABLE, which
        lets the kernel zero out pages anytime under memory pressure, which
        enables allocating memory that never gets swapped to disk but also
        doesn't count as being mlocked.
      
        Then, the vDSO implementation of getrandom() is introduced in a
        generic manner and hooked into random.c.
      
        Next, this is implemented on x86. (Also, though it's not ready for
        this pull, somebody has begun an arm64 implementation already)
      
        Finally, two vDSO selftests are added.
      
        There are also two housekeeping cleanup commits"
      
      * tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
        MAINTAINERS: add random.h headers to RNG subsection
        random: note that RNDGETPOOL was removed in 2.6.9-rc2
        selftests/vDSO: add tests for vgetrandom
        x86: vdso: Wire up getrandom() vDSO implementation
        random: introduce generic vDSO getrandom() implementation
        mm: add MAP_DROPPABLE for designating always lazily freeable mappings
      7a3fad30
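As a rough illustration of the MAP_DROPPABLE mapping type described in the random pull above, the following userspace sketch requests a droppable anonymous mapping and falls back to an ordinary private mapping when the kernel rejects the flag. The helper name `alloc_droppable` is hypothetical, and the fallback `#define` assumes the value 0x08 from the 6.11 uapi headers for systems whose libc headers do not yet provide it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Assumed value from the Linux 6.11 uapi headers; older libc headers may lack it. */
#ifndef MAP_DROPPABLE
#define MAP_DROPPABLE 0x08
#endif

/*
 * Hypothetical helper: map `len` bytes of anonymous memory that the kernel
 * may zero under memory pressure. Falls back to a plain private mapping on
 * kernels that reject MAP_DROPPABLE. Sets *is_droppable to 1 or 0 and
 * returns NULL if both attempts fail.
 */
void *alloc_droppable(size_t len, int *is_droppable)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_ANONYMOUS | MAP_DROPPABLE, -1, 0);

	*is_droppable = (p != MAP_FAILED);
	if (p == MAP_FAILED)
		p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}
```

Anything stored in such a mapping can read back as zero at any time, so it only suits data that can be regenerated, such as the random state the vDSO getrandom() keeps.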
    • Steve French's avatar
      smb3: add four dynamic tracepoints for copy_file_range and reflink · 9a819f13
      Steve French authored
      
      Add more dynamic tracepoints to help debug copy_file_range (copychunk)
      and clone_range ("duplicate extents").  These are tracepoints for
      entering the function and completing without error. For example:
      
        "trace-cmd record -e smb3_copychunk_enter -e smb3_copychunk_done"
      
      or
      
        "trace-cmd record -e smb3_clone_enter -e smb3_clone_done"
      
      Here is sample output:
      
             TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
               | |         |   |||||     |         |
             cp-5964    [005] .....  2176.168977: smb3_clone_enter:
               xid=17 sid=0xeb275be4 tid=0x7ffa7cdb source fid=0x1ed02e15
               source offset=0x0 target fid=0x1ed02e15 target offset=0x0
               len=0xa0000
             cp-5964    [005] .....  2176.170668: smb3_clone_done:
               xid=17 sid=0xeb275be4 tid=0x7ffa7cdb source fid=0x1ed02e15
               source offset=0x0 target fid=0x1ed02e15 target offset=0x0
               len=0xa0000
      
      Signed-off-by: Steve French <stfrench@microsoft.com>
      9a819f13
    • Steve French's avatar
      smb3: add dynamic tracepoints for copy_file_range and reflink errors · da96efc4
      Steve French authored
      
      There are cases where debugging clone_range ("smb2_duplicate_extents"
      function) and copy_range ("smb2_copychunk_range") can be helpful.
      Add dynamic tracepoints for any errors in these two routines, e.g.:
      
        "trace-cmd record -e smb3_copychunk_err -e smb3_clone_err"
      
      Signed-off-by: Steve French <stfrench@microsoft.com>
      da96efc4
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.11-rc1.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · d1e9a63d
      Linus Torvalds authored
      Pull vfs fixes from Christian Brauner:
       "VFS:
      
         - The new 64bit mount ids start after the old mount id, i.e., at the
           first non-32 bit value. However, we started counting one id too
           late and thus lost 4294967296 as the first valid id. Fix that.
      
         - Update a few comments on some vfs_*() creation helpers.
      
         - Move copying of the xattr name out from the locks required to start
           a filesystem write.
      
         - Extend the filelock lock UAF fix to the compat code as well.
      
         - Now that we added the ability to look up an inode under RCU it's
           possible that lockless hash lookup can find and lock an inode after
           it gets I_FREEING set. It then waits until inode teardown in
           evict() is finished.
      
           The flag however is still set after evict() has woken up all
           waiters. If the inode lock is taken late enough on the waiting side
           after hash removal and wakeup happened the waiting thread will
           never be woken.
      
           Before RCU based lookup this was synchronized via the
           inode_hash_lock. But since unhashing requires the inode lock as
           well we can check whether the inode is unhashed while holding inode
           lock even without holding inode_hash_lock.
      
        pidfd:
      
         - The nsproxy structure contains nearly all of the namespaces
           associated with a task. When a namespace type isn't supported
           nsproxy might contain a NULL pointer or always point to the initial
           namespace type. The logic isn't consistent. So when deriving
           namespace fds we need to ensure that the namespace type is
           supported.
      
           First, so that we don't risk dereferencing NULL pointers. The
           correct bigger fix would be to change all namespaces to always set
           a valid namespace pointer in struct nsproxy independent of whether
           or not it is compiled in. But that requires quite a few changes.
      
           Second, so that we don't allow deriving namespace fds when the
           namespace type doesn't exist and thus when they couldn't also be
           derived via /proc/self/ns/.
      
         - Add missing selftests for the new pidfd ioctls to derive namespace
           fds. This simply extends the already existing testsuite.
      
        netfs:
      
         - Fix debug logging and fix kconfig variable name so it actually
           works.
      
         - Fix writeback that goes both to the server and cache. The streams
           are only activated once a subreq is added. When a server write
           happens the subreq doesn't need to have finished by the time the
           cache write is started. If the server write has already finished by
           the time the cache write is about to start the cache write will
           operate on a folio that might already have been reused. Fix this by
           preactivating the cache write.
      
         - Limit cachefiles subreq size for cache writes to MAX_RW_COUNT"
      
      * tag 'vfs-6.11-rc1.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        inode: clarify what's locked
        vfs: Fix potential circular locking through setxattr() and removexattr()
        filelock: Fix fcntl/close race recovery compat path
        fs: use all available ids
        cachefiles: Set the max subreq size for cache writes to MAX_RW_COUNT
        netfs: Fix writeback that needs to go to both server and cache
        pidfs: add selftests for new namespace ioctls
        pidfs: handle kernels without namespaces cleanly
        pidfs: when time ns disabled add check for ioctl
        vfs: correct the comments of vfs_*() helpers
        vfs: handle __wait_on_freeing_inode() and evict() race
        netfs: Rename CONFIG_FSCACHE_DEBUG to CONFIG_NETFS_DEBUG
        netfs: Revert "netfs: Switch debug logging to pr_debug()"
      d1e9a63d
    • Linus Torvalds's avatar
      hostfs: fix folio conversion · e44be002
      Linus Torvalds authored
      
      Commit e3ec0fe9 ("hostfs: Convert hostfs_read_folio() to use a
      folio") simplified hostfs_read_folio(), but in the process of converting
      to using folios natively also mis-used the folio_zero_tail() function
      due to the very confusing API of that function.
      
      Very arguably it's folio_zero_tail() API itself that is buggy, since it
      would make more sense (and the documentation kind of implies) that the
      third argument would be the pointer to the beginning of the folio
      buffer.
      
      But no, the third argument to folio_zero_tail() is where we should start
      zeroing the tail (even if we already also pass in the offset separately
      as the second argument).
      
      So fix the hostfs caller, and we can leave any folio_zero_tail() sanity
      cleanup for later.
      
      Reported-and-tested-by: Maciej Żenczykowski <maze@google.com>
      Fixes: e3ec0fe9 ("hostfs: Convert hostfs_read_folio() to use a folio")
      Link: https://lore.kernel.org/all/CANP3RGceNzwdb7w=vPf5=7BCid5HVQDmz1K5kC9JG42+HVAh_g@mail.gmail.com/
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Christian Brauner <brauner@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e44be002
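The argument convention that the hostfs fix above trips over can be modelled in userspace. This is only a sketch: `zero_tail` and `finish_short_read` are hypothetical stand-ins, not the kernel's folio_zero_tail(), and they operate on a plain byte buffer rather than a folio. The point it shows is that the third argument must already point at the place where zeroing starts, even though the offset is also passed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in, NOT the kernel's folio_zero_tail():
 * zero the bytes of a `size`-byte buffer from `offset` to the end.
 * `tail` must already point at buffer + offset, mirroring the convention
 * the commit message describes.
 */
void zero_tail(size_t size, size_t offset, char *tail)
{
	assert(offset <= size);
	memset(tail, 0, size - offset);
}

/* Treat `buf` as if a read returned only `got` bytes, then zero the rest. */
void finish_short_read(char *buf, size_t size, size_t got)
{
	/* Correct: pass the address where zeroing starts. */
	zero_tail(size, got, buf + got);
	/* Passing `buf` here would zero the start of the buffer, not its tail. */
}
```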
    • Ilya Dryomov's avatar
      rbd: don't assume rbd_is_lock_owner() for exclusive mappings · 5c9e62c1
      Ilya Dryomov authored
      
      Expanding on the previous commit, assuming that rbd_is_lock_owner()
      always returns true (i.e. that we are either in RBD_LOCK_STATE_LOCKED
      or RBD_LOCK_STATE_QUIESCING) if the mapping is exclusive is wrong too.
      In case ceph_cls_set_cookie() fails, the lock would be temporarily
      released even if the mapping is exclusive, meaning that we can end up
      even in RBD_LOCK_STATE_UNLOCKED.
      
      IOW, exclusive mappings are really "just" about disabling automatic
      lock transitions (as documented in the man page), not about grabbing
      the lock and holding on to it whatever it takes.
      
      Cc: stable@vger.kernel.org
      Fixes: 637cd060 ("rbd: new exclusive lock wait/wake code")
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      5c9e62c1
    • Christian Brauner's avatar
      xattr: use simple helper to copy xattr name · 7d82086d
      Christian Brauner authored
      
      to avoid pointless code duplication.
      
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      7d82086d
    • Christian Brauner's avatar
      inode: clarify what's locked · f5e5e97c
      Christian Brauner authored
      
      In __wait_on_freeing_inode() we warn in case the inode_hash_lock is held
      but the inode is unhashed. We then release the inode_lock. So using
      "locked" as parameter name is confusing. Use is_inode_hash_locked as
      parameter name instead.
      
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      f5e5e97c
    • Ilya Dryomov's avatar
      rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive mappings · fabc4d45
      Ilya Dryomov authored
      
      Every time a watch is reestablished after getting lost, we need to
      update the cookie which involves quiescing exclusive lock.  For this,
      we transition from RBD_LOCK_STATE_LOCKED to RBD_LOCK_STATE_QUIESCING
      roughly for the duration of rbd_reacquire_lock() call.  If the mapping
      is exclusive and I/O happens to arrive in this time window, it's failed
      with EROFS (later translated to EIO) based on the wrong assumption in
      rbd_img_exclusive_lock() -- "lock got released?" check there stopped
      making sense with commit a2b1da09 ("rbd: lock should be quiesced on
      reacquire").
      
      To make it worse, any such I/O is added to the acquiring list before
      EROFS is returned and this sets up for violating rbd_lock_del_request()
      precondition that the request is either on the running list or not on
      any list at all -- see commit ded080c8 ("rbd: don't move requests
      to the running list on errors").  rbd_lock_del_request() ends up
      processing these requests as if they were on the running list which
      screws up quiescing_wait completion counter and ultimately leads to
      
          rbd_assert(!completion_done(&rbd_dev->quiescing_wait));
      
      being triggered on the next watch error.
      
      Cc: stable@vger.kernel.org # 06ef84c4: rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait
      Cc: stable@vger.kernel.org
      Fixes: 637cd060 ("rbd: new exclusive lock wait/wake code")
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      fabc4d45
    • Ilya Dryomov's avatar
      rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait · 06ef84c4
      Ilya Dryomov authored
      
      ... to RBD_LOCK_STATE_QUIESCING and quiescing_wait to recognize that
      this state and the associated completion are backing rbd_quiesce_lock(),
      which isn't specific to releasing the lock.
      
      While exclusive lock does get quiesced before it's released, it also
      gets quiesced before an attempt to update the cookie is made and there
      the lock is not released as long as ceph_cls_set_cookie() succeeds.
      
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      06ef84c4
    • David Howells's avatar
      vfs: Fix potential circular locking through setxattr() and removexattr() · c3a5e3e8
      David Howells authored
      
      When using cachefiles, lockdep may emit something similar to the circular
      locking dependency notice below.  The problem appears to stem from the
      following:
      
       (1) Cachefiles manipulates xattrs on the files in its cache when called
           from ->writepages().
      
       (2) The setxattr() and removexattr() system call handlers get the name
           (and value) from userspace after taking the sb_writers lock, putting
           accesses of the vma->vm_lock and mm->mmap_lock inside of that.
      
       (3) The afs filesystem uses a per-inode lock to prevent multiple
           revalidation RPCs and in writeback vs truncate to prevent parallel
           operations from deadlocking against the server on one side and local
           page locks on the other.
      
      Fix this by moving the getting of the name and value in {set,remove}xattr()
      outside of the sb_writers lock.  This also has the minor benefits that we
      don't need to reget these in the event of a retry and we never try to take
      the sb_writers lock in the event we can't pull the name and value into the
      kernel.
      
      Alternative approaches that might fix this include moving the dispatch of a
      write to the cache off to a workqueue or trying to do without the
      validation lock in afs.  Note that this might also affect other filesystems
      that use netfslib and/or cachefiles.
      
       ======================================================
       WARNING: possible circular locking dependency detected
       6.10.0-build2+ #956 Not tainted
       ------------------------------------------------------
       fsstress/6050 is trying to acquire lock:
       ffff888138fd82f0 (mapping.invalidate_lock#3){++++}-{3:3}, at: filemap_fault+0x26e/0x8b0
      
       but task is already holding lock:
       ffff888113f26d18 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x165/0x250
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #4 (&vma->vm_lock->lock){++++}-{3:3}:
              __lock_acquire+0xaf0/0xd80
              lock_acquire.part.0+0x103/0x280
              down_write+0x3b/0x50
              vma_start_write+0x6b/0xa0
              vma_link+0xcc/0x140
              insert_vm_struct+0xb7/0xf0
              alloc_bprm+0x2c1/0x390
              kernel_execve+0x65/0x1a0
              call_usermodehelper_exec_async+0x14d/0x190
              ret_from_fork+0x24/0x40
              ret_from_fork_asm+0x1a/0x30
      
       -> #3 (&mm->mmap_lock){++++}-{3:3}:
              __lock_acquire+0xaf0/0xd80
              lock_acquire.part.0+0x103/0x280
              __might_fault+0x7c/0xb0
              strncpy_from_user+0x25/0x160
              removexattr+0x7f/0x100
              __do_sys_fremovexattr+0x7e/0xb0
              do_syscall_64+0x9f/0x100
              entry_SYSCALL_64_after_hwframe+0x76/0x7e
      
       -> #2 (sb_writers#14){.+.+}-{0:0}:
              __lock_acquire+0xaf0/0xd80
              lock_acquire.part.0+0x103/0x280
              percpu_down_read+0x3c/0x90
              vfs_iocb_iter_write+0xe9/0x1d0
              __cachefiles_write+0x367/0x430
              cachefiles_issue_write+0x299/0x2f0
              netfs_advance_write+0x117/0x140
              netfs_write_folio.isra.0+0x5ca/0x6e0
              netfs_writepages+0x230/0x2f0
              afs_writepages+0x4d/0x70
              do_writepages+0x1e8/0x3e0
              filemap_fdatawrite_wbc+0x84/0xa0
              __filemap_fdatawrite_range+0xa8/0xf0
              file_write_and_wait_range+0x59/0x90
              afs_release+0x10f/0x270
              __fput+0x25f/0x3d0
              __do_sys_close+0x43/0x70
              do_syscall_64+0x9f/0x100
              entry_SYSCALL_64_after_hwframe+0x76/0x7e
      
       -> #1 (&vnode->validate_lock){++++}-{3:3}:
              __lock_acquire+0xaf0/0xd80
              lock_acquire.part.0+0x103/0x280
              down_read+0x95/0x200
              afs_writepages+0x37/0x70
              do_writepages+0x1e8/0x3e0
              filemap_fdatawrite_wbc+0x84/0xa0
              filemap_invalidate_inode+0x167/0x1e0
              netfs_unbuffered_write_iter+0x1bd/0x2d0
              vfs_write+0x22e/0x320
              ksys_write+0xbc/0x130
              do_syscall_64+0x9f/0x100
              entry_SYSCALL_64_after_hwframe+0x76/0x7e
      
       -> #0 (mapping.invalidate_lock#3){++++}-{3:3}:
              check_noncircular+0x119/0x160
              check_prev_add+0x195/0x430
              __lock_acquire+0xaf0/0xd80
              lock_acquire.part.0+0x103/0x280
              down_read+0x95/0x200
              filemap_fault+0x26e/0x8b0
              __do_fault+0x57/0xd0
              do_pte_missing+0x23b/0x320
              __handle_mm_fault+0x2d4/0x320
              handle_mm_fault+0x14f/0x260
              do_user_addr_fault+0x2a2/0x500
              exc_page_fault+0x71/0x90
              asm_exc_page_fault+0x22/0x30
      
       other info that might help us debug this:
      
       Chain exists of:
         mapping.invalidate_lock#3 --> &mm->mmap_lock --> &vma->vm_lock->lock
      
        Possible unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         rlock(&vma->vm_lock->lock);
                                      lock(&mm->mmap_lock);
                                      lock(&vma->vm_lock->lock);
         rlock(mapping.invalidate_lock#3);
      
        *** DEADLOCK ***
      
       1 lock held by fsstress/6050:
        #0: ffff888113f26d18 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x165/0x250
      
       stack backtrace:
       CPU: 0 PID: 6050 Comm: fsstress Not tainted 6.10.0-build2+ #956
       Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
       Call Trace:
        <TASK>
        dump_stack_lvl+0x57/0x80
        check_noncircular+0x119/0x160
        ? queued_spin_lock_slowpath+0x4be/0x510
        ? __pfx_check_noncircular+0x10/0x10
        ? __pfx_queued_spin_lock_slowpath+0x10/0x10
        ? mark_lock+0x47/0x160
        ? init_chain_block+0x9c/0xc0
        ? add_chain_block+0x84/0xf0
        check_prev_add+0x195/0x430
        __lock_acquire+0xaf0/0xd80
        ? __pfx___lock_acquire+0x10/0x10
        ? __lock_release.isra.0+0x13b/0x230
        lock_acquire.part.0+0x103/0x280
        ? filemap_fault+0x26e/0x8b0
        ? __pfx_lock_acquire.part.0+0x10/0x10
        ? rcu_is_watching+0x34/0x60
        ? lock_acquire+0xd7/0x120
        down_read+0x95/0x200
        ? filemap_fault+0x26e/0x8b0
        ? __pfx_down_read+0x10/0x10
        ? __filemap_get_folio+0x25/0x1a0
        filemap_fault+0x26e/0x8b0
        ? __pfx_filemap_fault+0x10/0x10
        ? find_held_lock+0x7c/0x90
        ? __pfx___lock_release.isra.0+0x10/0x10
        ? __pte_offset_map+0x99/0x110
        __do_fault+0x57/0xd0
        do_pte_missing+0x23b/0x320
        __handle_mm_fault+0x2d4/0x320
        ? __pfx___handle_mm_fault+0x10/0x10
        handle_mm_fault+0x14f/0x260
        do_user_addr_fault+0x2a2/0x500
        exc_page_fault+0x71/0x90
        asm_exc_page_fault+0x22/0x30
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      Link: https://lore.kernel.org/r/2136178.1721725194@warthog.procyon.org.uk
      cc: Alexander Viro <viro@zeniv.linux.org.uk>
      cc: Christian Brauner <brauner@kernel.org>
      cc: Jan Kara <jack@suse.cz>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: Gao Xiang <xiang@kernel.org>
      cc: Matthew Wilcox <willy@infradead.org>
      cc: netfs@lists.linux.dev
      cc: linux-erofs@lists.ozlabs.org
      cc: linux-fsdevel@vger.kernel.org
      [brauner: fix minor issues]
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      c3a5e3e8
    • Jann Horn's avatar
      filelock: Fix fcntl/close race recovery compat path · f8138f2a
      Jann Horn authored
      When I wrote commit 3cad1bc0 ("filelock: Remove locks reliably when
      fcntl/close race is detected"), I missed that there are two copies of the
      code I was patching: The normal version, and the version for 64-bit offsets
      on 32-bit kernels.
      Thanks to Greg KH for stumbling over this while doing the stable
      backport...
      
      Apply exactly the same fix to the compat path for 32-bit kernels.
      
      Fixes: c293621b ("[PATCH] stale POSIX lock handling")
      Cc: stable@kernel.org
      Link: https://bugs.chromium.org/p/project-zero/issues/detail?id=2563
      Signed-off-by: Jann Horn <jannh@google.com>
      Link: https://lore.kernel.org/r/20240723-fs-lock-recover-compatfix-v1-1-148096719529@google.com
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      f8138f2a
    • Christian Brauner's avatar
      fs: use all available ids · 8eac5358
      Christian Brauner authored
      The counter is unconditionally incremented for each mount allocation.
      If we set it to 1ULL << 32 we're losing 4294967296 as the first valid
      non-32 bit mount id.
      
      Link: https://lore.kernel.org/r/20240719-work-mount-namespace-v1-1-834113cab0d2@kernel.org
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      8eac5358
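The off-by-one described in "fs: use all available ids" above can be reproduced with plain integer arithmetic. The sketch below is a simplified model, not the kernel code: `next_mount_id` and both `MNT_ID_START_*` macros are hypothetical names, and it assumes the counter is incremented first and the incremented value is then used as the id.

```c
#include <stdint.h>

/* Hypothetical model: increment the counter, then use the new value as the id. */
uint64_t next_mount_id(uint64_t *counter)
{
	return ++(*counter);
}

/* Starting value that skips 4294967296, as the commit message describes. */
#define MNT_ID_START_OLD   (1ULL << 32)
/* Starting one lower makes the first id handed out exactly 1ULL << 32. */
#define MNT_ID_START_FIXED ((1ULL << 32) - 1)
```

With the old start the first id is 4294967297; with the lowered start it is 4294967296, the first value that does not fit in 32 bits.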
    • David Howells's avatar
      cachefiles: Set the max subreq size for cache writes to MAX_RW_COUNT · 51d37982
      David Howells authored
      
      Set the maximum size of a subrequest that writes to cachefiles to be
      MAX_RW_COUNT so that we don't overrun the maximum write we can make to the
      backing filesystem.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      Link: https://lore.kernel.org/r/1599005.1721398742@warthog.procyon.org.uk
      cc: Jeff Layton <jlayton@kernel.org>
      cc: netfs@lists.linux.dev
      cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      51d37982
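To give a feel for the limit the cachefiles commit above refers to, here is a small sketch of the cap's effect on a requested write length. It assumes 4 KiB pages and models MAX_RW_COUNT as INT_MAX rounded down to a page boundary; the `SKETCH_*` macros and `clamp_cache_write` are hypothetical names, and the actual commit sets a maximum subrequest size rather than calling a clamp helper like this one.

```c
#include <limits.h>
#include <stddef.h>

/* Assumed 4 KiB pages; the kernel derives these from its own PAGE_SIZE. */
#define SKETCH_PAGE_SIZE    4096UL
#define SKETCH_PAGE_MASK    (~(SKETCH_PAGE_SIZE - 1))
/* Modelled on the kernel's MAX_RW_COUNT: INT_MAX rounded down to a page. */
#define SKETCH_MAX_RW_COUNT ((size_t)INT_MAX & SKETCH_PAGE_MASK)

/* Hypothetical helper: never ask the backing filesystem for more than the cap. */
size_t clamp_cache_write(size_t len)
{
	return len < SKETCH_MAX_RW_COUNT ? len : SKETCH_MAX_RW_COUNT;
}
```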
    • David Howells's avatar
      netfs: Fix writeback that needs to go to both server and cache · 212be98a
      David Howells authored
      
      When netfslib is performing writeback (i.e. ->writepages), it maintains two
      parallel streams of writes, one to the server and one to the cache, but it
      doesn't mark either stream of writes as active until it gets some data that
      needs to be written to that stream.
      
      This is done because some folios will only be written to the cache
      (e.g. copying to the cache on read is done by marking the folios and
      letting writeback do the actual work) and sometimes we'll only be writing
      to the server (e.g. if there's no cache).
      
      Now, since we don't actually dispatch uploads and cache writes in parallel,
      but rather flip between the streams, depending on which has the lowest
      so-far-issued offset, and don't wait for the subreqs to finish before
      flipping, we can end up in a situation where, say, we issue a write to the
      server and this completes before we start the write to the cache.
      
      But because we only activate a stream when we first add a subreq to it, the
      result collection code may run before we manage to activate the stream -
      resulting in the folio being cleaned and having the writeback-in-progress
      mark removed.  At this point, the folio no longer belongs to us.
      
      This is only really a problem for folios that need to be written to both
      streams - and in that case, the upload to the server is started first,
      followed by the write to the cache - and the cache write may see a bad
      folio.
      
      Fix this by activating the cache stream up front if there's a cache
      available.  If there's a cache, then all data is going to be written to it.
      
      Fixes: 288ace2f ("netfs: New writeback implementation")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Link: https://lore.kernel.org/r/1599053.1721398818@warthog.procyon.org.uk
      cc: Jeff Layton <jlayton@kernel.org>
      cc: netfs@lists.linux.dev
      cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      212be98a
    • Christian Brauner's avatar
      pidfs: add selftests for new namespace ioctls · 1bb8dce5
      Christian Brauner authored
      Add selftests to verify that deriving namespace file descriptors from
      pidfd file descriptors works correctly.
      
      Link: https://lore.kernel.org/r/20240722-work-pidfs-69dbea91edab@brauner
      
      
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      1bb8dce5
    • Christian Brauner's avatar
      pidfs: handle kernels without namespaces cleanly · 9b3e1504
      Christian Brauner authored
      The nsproxy structure contains nearly all of the namespaces associated
      with a task. When a given namespace type is not supported by this kernel,
      the rules for whether the corresponding pointer in struct nsproxy is NULL
      or always init_<ns_type>_ns differ per namespace. Ideally, that wouldn't be
      the case and for all namespace types we'd always set it to
      init_<ns_type>_ns when the corresponding namespace type isn't supported.
      
      Make sure we handle all namespaces where the pointer in struct nsproxy
      can be NULL when the namespace type isn't supported.
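
The kind of check described above can be sketched as follows. This is a minimal illustration, not the pidfs code: `struct ns_common_model`, `get_ns_checked()` and the choice of `-EOPNOTSUPP` as the error are assumptions made for the example.

```c
#include <stddef.h>
#include <errno.h>

/* Hypothetical stand-in for a namespace object; not the kernel definition. */
struct ns_common_model {
	int inum;
};

/*
 * Hand back the namespace only if the pointer is non-NULL. A NULL pointer
 * models a namespace type that this kernel does not support, so the caller
 * gets an error instead of dereferencing NULL later on.
 */
static int get_ns_checked(struct ns_common_model *ns,
			  struct ns_common_model **out)
{
	if (!ns)
		return -EOPNOTSUPP;  /* assumed error code for "not supported" */
	*out = ns;
	return 0;
}
```

A caller would run this check on each namespace pointer it takes from the proxy structure before trying to open a file descriptor for it.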
      
      Link: https://lore.kernel.org/r/20240722-work-pidfs-e6a83030f63e@brauner
      
      
      Fixes: 5b08bd40 ("pidfs: allow retrieval of namespace file descriptors") # mainline only
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      9b3e1504
    • Edward Adam Davis's avatar
      pidfs: when time ns disabled add check for ioctl · f60d38cb
      Edward Adam Davis authored
      
      syzbot calls pidfd_ioctl() with cmd "PIDFD_GET_TIME_NAMESPACE" while
      CONFIG_TIME_NS is disabled. Since time_ns is NULL in that case, this
      causes a NULL pointer dereference in open_namespace().
      
      Fixes: 5b08bd40 ("pidfs: allow retrieval of namespace file descriptors") # mainline only
      Reported-and-tested-by: <syzbot+34a0ee986f61f15da35d@syzkaller.appspotmail.com>
      Closes: https://syzkaller.appspot.com/bug?extid=34a0ee986f61f15da35d
      
      
      Signed-off-by: Edward Adam Davis <eadavis@qq.com>
      Link: https://lore.kernel.org/r/tencent_7FAE8DB725EE0DD69236DDABDDDE195E4F07@qq.com
      
      
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      f60d38cb
    • Congjie Zhou's avatar
      vfs: correct the comments of vfs_*() helpers · b40c8e7a
      Congjie Zhou authored
      
      Correct the comments of the vfs_*() helpers in fs/namei.c, including:
      1. vfs_create()
      2. vfs_mknod()
      3. vfs_mkdir()
      4. vfs_rmdir()
      5. vfs_symlink()
      
      All of them come from the same commit:
      6521f891 "namei: prepare for idmapped mounts"
      
      In these helpers, @dentry is actually the dentry of the child rather
      than of the base (parent) directory, so the description of @dir has to
      be updated to match the corrected description of @dentry.
      
      Signed-off-by: Congjie Zhou <zcjie0802@qq.com>
      Link: https://lore.kernel.org/r/tencent_2FCF6CC9E10DC8A27AE58A5A0FE4FCE96D0A@qq.com
      
      
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      b40c8e7a
    • Mateusz Guzik's avatar
      vfs: handle __wait_on_freeing_inode() and evict() race · 5bc9ad78
      Mateusz Guzik authored
      Lockless hash lookup can find and lock the inode after it gets the
      I_FREEING flag set, at which point it blocks waiting for teardown in
      evict() to finish.
      
      However, the flag is still set even after evict() wakes up all waiters.
      
      This results in a race where if the inode lock is taken late enough, it
      can happen after both hash removal and wakeups, meaning there is nobody
      to wake the racing thread up.
      
      This worked prior to RCU-based lookup because the entire ordeal was
      synchronized with the inode hash lock.
      
      Since unhashing requires the inode lock, we can safely check whether it
      happened after acquiring it.
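
The decision described above can be modelled in a few lines. This is a sketch under stated assumptions: `struct inode_model`, `enum lookup_action` and `lookup_decide()` are hypothetical names, and the real code works with inode state flags and hash-list membership rather than two booleans.

```c
#include <stdbool.h>

/* Hypothetical model of the inode state that matters for this race. */
struct inode_model {
	bool freeing;  /* stands in for I_FREEING; stays set after the wakeups */
	bool hashed;   /* cleared by teardown while holding the inode lock */
};

enum lookup_action {
	LOOKUP_USE,    /* inode is live and can be used */
	LOOKUP_WAIT,   /* teardown in progress: sleep until woken */
	LOOKUP_RETRY,  /* teardown already finished: repeat the lookup */
};

/* Decide what a lookup should do; the caller is assumed to hold the inode lock. */
static enum lookup_action lookup_decide(const struct inode_model *inode)
{
	if (!inode->freeing)
		return LOOKUP_USE;
	if (!inode->hashed)
		return LOOKUP_RETRY;  /* wakeups were already sent; sleeping would hang */
	return LOOKUP_WAIT;
}
```

Because the hashed state only changes under the inode lock in this model, checking it after taking the lock distinguishes "teardown still running" from "teardown finished and nobody left to wake us".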
      
      Link: https://lore.kernel.org/v9fs/20240717102458.649b60be@kernel.org/
      
      
      Reported-by: Dominique Martinet <asmadeus@codewreck.org>
      Fixes: 7180f8d9 ("vfs: add rcu-based find_inode variants for iget ops")
      Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
      Link: https://lore.kernel.org/r/20240718151838.611807-1-mjguzik@gmail.com
      
      
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      5bc9ad78
    • David Howells's avatar
      netfs: Rename CONFIG_FSCACHE_DEBUG to CONFIG_NETFS_DEBUG · fcad9336
      David Howells authored
      
      CONFIG_FSCACHE_DEBUG should have been renamed to CONFIG_NETFS_DEBUG, so do
      that now.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      Link: https://lore.kernel.org/r/1410796.1721333406@warthog.procyon.org.uk
      
      
      cc: Uwe Kleine-König <ukleinek@kernel.org>
      cc: Christian Brauner <brauner@kernel.org>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: netfs@lists.linux.dev
      cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      fcad9336
    • David Howells's avatar
      netfs: Revert "netfs: Switch debug logging to pr_debug()" · a9d47a50
      David Howells authored
      
      Revert commit 163eae0f to get back the
      original operation of the debugging macros.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      Link: https://lore.kernel.org/r/20240608151352.22860-2-ukleinek@kernel.org
      Link: https://lore.kernel.org/r/1410685.1721333252@warthog.procyon.org.uk
      
      
      cc: Uwe Kleine-König <ukleinek@kernel.org>
      cc: Christian Brauner <brauner@kernel.org>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: netfs@lists.linux.dev
      cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      a9d47a50
    • Linus Torvalds's avatar
      Merge tag 'perf-tools-fixes-for-v6.11-2024-07-23' of... · 786c8248
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v6.11-2024-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
      
      Pull perf tools fixes from Namhyung Kim:
       "Two fixes for building perf and other tools:
      
         - Fix breakage in tracing tools due to pkg-config for
           libtrace{event,fs}
      
         - Fix build of perf when libunwind is used"
      
      * tag 'perf-tools-fixes-for-v6.11-2024-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
        perf dso: Fix build when libunwind is enabled
        tools/latency: Use pkg-config in lib_setup of Makefile.config
        tools/rtla: Use pkg-config in lib_setup of Makefile.config
        tools/verification: Use pkg-config in lib_setup of Makefile.config
        tools: Make pkg-config dependency checks usable by other tools
        perf build: Warn if libtracefs is not found
      786c8248
    • Linus Torvalds's avatar
      Merge tag 'execve-v6.11-rc1-fix1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · e9e96979
      Linus Torvalds authored
      Pull execve fix from Kees Cook:
       "This moves the exec and binfmt_elf tests out of your way and into the
        tests/ subdirectory, following the newly ratified KUnit naming
        conventions. :)"
      
      * tag 'execve-v6.11-rc1-fix1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        execve: Move KUnit tests to tests/ subdirectory
      e9e96979
  2. Jul 23, 2024
    • Steve French's avatar
      cifs: mount with "unix" mount option for SMB1 incorrectly handled · 0e314e45
      Steve French authored
      
      By default we negotiate CIFS Unix Extensions for SMB1 mounts to
      Samba (and they work if the user does not specify "unix" or "posix" or
      "linux" on mount), and we properly handle the case where a user turns
      them off with the "nounix" mount parm.  But with the changes to the mount
      API we broke cases where the user explicitly specifies the "unix" option
      (or equivalently "linux" or "posix") on mount with vers=1.0 to Samba or
      other servers which support the CIFS Unix Extensions.  Such mounts
      failed with:
      
       "mount error(95): Operation not supported"
      
      and logged:
      
       "CIFS: VFS: Check vers= mount option. SMB3.11 disabled but required for POSIX extensions"
      
      even though CIFS Unix Extensions are supported for vers=1.0.  This patch fixes
      the case where the user specifies both "unix" (or equivalently "posix" or
      "linux") and "vers=1.0" on mount to a server which supports the
      CIFS Unix Extensions.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: David Howells <dhowell@redhat.com>
      Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      0e314e45
    • Steve French's avatar
      cifs: fix reconnect with SMB1 UNIX Extensions · a214384c
      Steve French authored
      
      When mounting with the SMB1 Unix Extensions (e.g. mounts
      to Samba with vers=1.0), reconnects no longer reset the
      Unix Extensions (SetFSInfo SET_FILE_UNIX_BASIC) after tcon so most
      operations (e.g. stat, ls, open, statfs) will fail continuously
      with:
              "Operation not supported"
      if the connection ever resets (e.g. due to a brief network disconnect).
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      a214384c
    • Linus Torvalds's avatar
      Merge tag 'f2fs-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs · 5ad7ff87
      Linus Torvalds authored
      Pull f2fs updates from Jaegeuk Kim:
       "A pretty small update including mostly minor bug fixes in zoned
        storage along with the large section support.
      
        Enhancements:
         - add support for FS_IOC_GETFSSYSFSPATH
         - enable atgc dynamically if conditions are met
         - use new ioprio Macro to get ckpt thread ioprio level
         - remove unreachable lazytime mount option parsing
      
        Bug fixes:
         - fix null reference error when checking end of zone
         - fix start segno of large section
         - fix to cover read extent cache access with lock
         - don't dirty inode for readonly filesystem
         - allocate a new section if curseg is not the first seg in its zone
         - only fragment segment in the same section
         - truncate preallocated blocks in f2fs_file_open()
         - fix to avoid use SSR allocate when do defragment
         - fix to force buffered IO on inline_data inode
      
        And some minor code clean-ups and sanity checks"
      
      * tag 'f2fs-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (26 commits)
        f2fs: clean up addrs_per_{inode,block}()
        f2fs: clean up F2FS_I()
        f2fs: use meta inode for GC of COW file
        f2fs: use meta inode for GC of atomic file
        f2fs: only fragment segment in the same section
        f2fs: fix to update user block counts in block_operations()
        f2fs: remove unreachable lazytime mount option parsing
        f2fs: fix null reference error when checking end of zone
        f2fs: fix start segno of large section
        f2fs: remove redundant sanity check in sanity_check_inode()
        f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid
        f2fs: fix to use mnt_{want,drop}_write_file replace file_{start,end}_wrtie
        f2fs: clean up set REQ_RAHEAD given rac
        f2fs: enable atgc dynamically if conditions are met
        f2fs: fix to truncate preallocated blocks in f2fs_file_open()
        f2fs: fix to cover read extent cache access with lock
        f2fs: fix return value of f2fs_convert_inline_inode()
        f2fs: use new ioprio Macro to get ckpt thread ioprio level
        f2fs: fix to don't dirty inode for readonly filesystem
        f2fs: fix to avoid use SSR allocate when do defragment
        ...
      5ad7ff87
    • Linus Torvalds's avatar
      Merge tag 'jfs-6.11' of github.com:kleikamp/linux-shaggy · 371c1414
      Linus Torvalds authored
      Pull jfs updates from David Kleikamp:
       "Folio conversion from Matthew Wilcox and a few various fixes"
      
      * tag 'jfs-6.11' of github.com:kleikamp/linux-shaggy:
        jfs: don't walk off the end of ealist
        jfs: Fix shift-out-of-bounds in dbDiscardAG
        jfs: Fix array-index-out-of-bounds in diFree
        jfs: fix null ptr deref in dtInsertEntry
        jfs: Remove use of folio error flag
        fs: Remove i_blocks_per_page
        jfs: Change metapage->page to metapage->folio
        jfs: Convert force_metapage to use a folio
        jfs: Convert inc_io to take a folio
        jfs: Convert page_to_mp to folio_to_mp
        jfs; Convert __invalidate_metapages to use a folio
        jfs: Convert dec_io to take a folio
        jfs: Convert drop_metapage and remove_metapage to take a folio
        jfs; Convert release_metapage to use a folio
        jfs: Convert insert_metapage() to take a folio
        jfs: Convert __get_metapage to use a folio
        jfs: Convert metapage_writepage to metapage_write_folio
        jfs: Convert metapage_read_folio to use folio APIs
      371c1414
    • Linus Torvalds's avatar
      Merge tag 'kbuild-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild · ca83c61c
      Linus Torvalds authored
      Pull Kbuild updates from Masahiro Yamada:
      
       - Remove tristate choice support from Kconfig
      
       - Stop using the PROVIDE() directive in the linker script
      
       - Reduce the number of links for the combination of CONFIG_KALLSYMS and
         CONFIG_DEBUG_INFO_BTF
      
       - Enable the warning for symbol reference to .exit.* sections by
         default
      
       - Fix warnings in RPM package builds
      
       - Improve scripts/make_fit.py to generate a FIT image with separate
         base DTB and overlays
      
       - Improve choice value calculation in Kconfig
      
       - Fix conditional prompt behavior in choice in Kconfig
      
       - Remove support for the uncommon EMAIL environment variable in Debian
         package builds
      
       - Remove support for the uncommon "name <email>" form for the DEBEMAIL
         environment variable
      
       - Raise the minimum supported GNU Make version to 4.0
      
       - Remove stale code for the absolute kallsyms
      
       - Move header files commonly used for host programs to scripts/include/
      
       - Introduce the pacman-pkg target to generate a pacman package used in
         Arch Linux
      
       - Clean up Kconfig
      
      * tag 'kbuild-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (65 commits)
        kbuild: doc: gcc to CC change
        kallsyms: change sym_entry::percpu_absolute to bool type
        kallsyms: unify seq and start_pos fields of struct sym_entry
        kallsyms: add more original symbol type/name in comment lines
        kallsyms: use \t instead of a tab in printf()
        kallsyms: avoid repeated calculation of array size for markers
        kbuild: add script and target to generate pacman package
        modpost: use generic macros for hash table implementation
        kbuild: move some helper headers from scripts/kconfig/ to scripts/include/
        Makefile: add comment to discourage tools/* addition for kernel builds
        kbuild: clean up scripts/remove-stale-files
        kconfig: recursive checks drop file/lineno
        kbuild: rpm-pkg: introduce a simple changelog section for kernel.spec
        kallsyms: get rid of code for absolute kallsyms
        kbuild: Create INSTALL_PATH directory if it does not exist
        kbuild: Abort make on install failures
        kconfig: remove 'e1' and 'e2' macros from expression deduplication
        kconfig: remove SYMBOL_CHOICEVAL flag
        kconfig: add const qualifiers to several function arguments
        kconfig: call expr_eliminate_yn() at least once in expr_eliminate_dups()
        ...
      ca83c61c