Skip to content
Snippets Groups Projects
  1. Oct 18, 2021
  2. Oct 17, 2021
  3. Oct 16, 2021
  4. Oct 04, 2021
  5. Oct 02, 2021
  6. Sep 28, 2021
  7. Sep 24, 2021
  8. Sep 15, 2021
    • Li Jinlin's avatar
      blk-cgroup: fix UAF by grabbing blkcg lock before destroying blkg pd · 858560b2
      Li Jinlin authored
      
      KASAN reports a use-after-free report when doing fuzz test:
      
      [693354.104835] ==================================================================
      [693354.105094] BUG: KASAN: use-after-free in bfq_io_set_weight_legacy+0xd3/0x160
      [693354.105336] Read of size 4 at addr ffff888be0a35664 by task sh/1453338
      
      [693354.105607] CPU: 41 PID: 1453338 Comm: sh Kdump: loaded Not tainted 4.18.0-147
      [693354.105610] Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 0.81 07/02/2018
      [693354.105612] Call Trace:
      [693354.105621]  dump_stack+0xf1/0x19b
      [693354.105626]  ? show_regs_print_info+0x5/0x5
      [693354.105634]  ? printk+0x9c/0xc3
      [693354.105638]  ? cpumask_weight+0x1f/0x1f
      [693354.105648]  print_address_description+0x70/0x360
      [693354.105654]  kasan_report+0x1b2/0x330
      [693354.105659]  ? bfq_io_set_weight_legacy+0xd3/0x160
      [693354.105665]  ? bfq_io_set_weight_legacy+0xd3/0x160
      [693354.105670]  bfq_io_set_weight_legacy+0xd3/0x160
      [693354.105675]  ? bfq_cpd_init+0x20/0x20
      [693354.105683]  cgroup_file_write+0x3aa/0x510
      [693354.105693]  ? ___slab_alloc+0x507/0x540
      [693354.105698]  ? cgroup_file_poll+0x60/0x60
      [693354.105702]  ? 0xffffffff89600000
      [693354.105708]  ? usercopy_abort+0x90/0x90
      [693354.105716]  ? mutex_lock+0xef/0x180
      [693354.105726]  kernfs_fop_write+0x1ab/0x280
      [693354.105732]  ? cgroup_file_poll+0x60/0x60
      [693354.105738]  vfs_write+0xe7/0x230
      [693354.105744]  ksys_write+0xb0/0x140
      [693354.105749]  ? __ia32_sys_read+0x50/0x50
      [693354.105760]  do_syscall_64+0x112/0x370
      [693354.105766]  ? syscall_return_slowpath+0x260/0x260
      [693354.105772]  ? do_page_fault+0x9b/0x270
      [693354.105779]  ? prepare_exit_to_usermode+0xf9/0x1a0
      [693354.105784]  ? enter_from_user_mode+0x30/0x30
      [693354.105793]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      
      [693354.105875] Allocated by task 1453337:
      [693354.106001]  kasan_kmalloc+0xa0/0xd0
      [693354.106006]  kmem_cache_alloc_node_trace+0x108/0x220
      [693354.106010]  bfq_pd_alloc+0x96/0x120
      [693354.106015]  blkcg_activate_policy+0x1b7/0x2b0
      [693354.106020]  bfq_create_group_hierarchy+0x1e/0x80
      [693354.106026]  bfq_init_queue+0x678/0x8c0
      [693354.106031]  blk_mq_init_sched+0x1f8/0x460
      [693354.106037]  elevator_switch_mq+0xe1/0x240
      [693354.106041]  elevator_switch+0x25/0x40
      [693354.106045]  elv_iosched_store+0x1a1/0x230
      [693354.106049]  queue_attr_store+0x78/0xb0
      [693354.106053]  kernfs_fop_write+0x1ab/0x280
      [693354.106056]  vfs_write+0xe7/0x230
      [693354.106060]  ksys_write+0xb0/0x140
      [693354.106064]  do_syscall_64+0x112/0x370
      [693354.106069]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      
      [693354.106114] Freed by task 1453336:
      [693354.106225]  __kasan_slab_free+0x130/0x180
      [693354.106229]  kfree+0x90/0x1b0
      [693354.106233]  blkcg_deactivate_policy+0x12c/0x220
      [693354.106238]  bfq_exit_queue+0xf5/0x110
      [693354.106241]  blk_mq_exit_sched+0x104/0x130
      [693354.106245]  __elevator_exit+0x45/0x60
      [693354.106249]  elevator_switch_mq+0xd6/0x240
      [693354.106253]  elevator_switch+0x25/0x40
      [693354.106257]  elv_iosched_store+0x1a1/0x230
      [693354.106261]  queue_attr_store+0x78/0xb0
      [693354.106264]  kernfs_fop_write+0x1ab/0x280
      [693354.106268]  vfs_write+0xe7/0x230
      [693354.106271]  ksys_write+0xb0/0x140
      [693354.106275]  do_syscall_64+0x112/0x370
      [693354.106280]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      
      [693354.106329] The buggy address belongs to the object at ffff888be0a35580
                       which belongs to the cache kmalloc-1k of size 1024
      [693354.106736] The buggy address is located 228 bytes inside of
                       1024-byte region [ffff888be0a35580, ffff888be0a35980)
      [693354.107114] The buggy address belongs to the page:
      [693354.107273] page:ffffea002f828c00 count:1 mapcount:0 mapping:ffff888107c17080 index:0x0 compound_mapcount: 0
      [693354.107606] flags: 0x17ffffc0008100(slab|head)
      [693354.107760] raw: 0017ffffc0008100 ffffea002fcbc808 ffffea0030bd3a08 ffff888107c17080
      [693354.108020] raw: 0000000000000000 00000000001c001c 00000001ffffffff 0000000000000000
      [693354.108278] page dumped because: kasan: bad access detected
      
      [693354.108511] Memory state around the buggy address:
      [693354.108671]  ffff888be0a35500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [693354.116396]  ffff888be0a35580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [693354.124473] >ffff888be0a35600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [693354.132421]                                                        ^
      [693354.140284]  ffff888be0a35680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [693354.147912]  ffff888be0a35700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [693354.155281] ==================================================================
      
      blkgs are protected by both queue and blkcg locks and holding
      either should stabilize them. However, the path of destroying
      blkg policy data is only protected by queue lock in
      blkcg_activate_policy()/blkcg_deactivate_policy(). Other tasks
      can get the blkg policy data before the blkg policy data is
      destroyed, and use it after destroyed, which will result in a
      use-after-free.
      
      CPU0                             CPU1
      blkcg_deactivate_policy
        spin_lock_irq(&q->queue_lock)
                                       bfq_io_set_weight_legacy
                                         spin_lock_irq(&blkcg->lock)
                                         blkg_to_bfqg(blkg)
                                           pd_to_bfqg(blkg->pd[pol->plid])
                                           ^^^^^^blkg->pd[pol->plid] != NULL
                                                 bfqg != NULL
        pol->pd_free_fn(blkg->pd[pol->plid])
          pd_to_bfqg(blkg->pd[pol->plid])
          bfqg_put(bfqg)
            kfree(bfqg)
        blkg->pd[pol->plid] = NULL
        spin_unlock_irq(q->queue_lock);
                                         bfq_group_set_weight(bfqg, val, 0)
                                           bfqg->entity.new_weight
                                           ^^^^^^trigger uaf here
                                         spin_unlock_irq(&blkcg->lock);
      
      Fix by grabbing the matching blkcg lock before trying to
      destroy blkg policy data.
      
      Suggested-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarLi Jinlin <lijinlin3@huawei.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Link: https://lore.kernel.org/r/20210914042605.3260596-1-lijinlin3@huawei.com
      
      
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      858560b2
    • Yanfei Xu's avatar
      blkcg: fix memory leak in blk_iolatency_init · 6f5ddde4
      Yanfei Xu authored
      
      BUG: memory leak
      unreferenced object 0xffff888129acdb80 (size 96):
        comm "syz-executor.1", pid 12661, jiffies 4294962682 (age 15.220s)
        hex dump (first 32 bytes):
          20 47 c9 85 ff ff ff ff 20 d4 8e 29 81 88 ff ff   G...... ..)....
          01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff82264ec8>] kmalloc include/linux/slab.h:591 [inline]
          [<ffffffff82264ec8>] kzalloc include/linux/slab.h:721 [inline]
          [<ffffffff82264ec8>] blk_iolatency_init+0x28/0x190 block/blk-iolatency.c:724
          [<ffffffff8225b8c4>] blkcg_init_queue+0xb4/0x1c0 block/blk-cgroup.c:1185
          [<ffffffff822253da>] blk_alloc_queue+0x22a/0x2e0 block/blk-core.c:566
          [<ffffffff8223b175>] blk_mq_init_queue_data block/blk-mq.c:3100 [inline]
          [<ffffffff8223b175>] __blk_mq_alloc_disk+0x25/0xd0 block/blk-mq.c:3124
          [<ffffffff826a9303>] loop_add+0x1c3/0x360 drivers/block/loop.c:2344
          [<ffffffff826a966e>] loop_control_get_free drivers/block/loop.c:2501 [inline]
          [<ffffffff826a966e>] loop_control_ioctl+0x17e/0x2e0 drivers/block/loop.c:2516
          [<ffffffff81597eec>] vfs_ioctl fs/ioctl.c:51 [inline]
          [<ffffffff81597eec>] __do_sys_ioctl fs/ioctl.c:874 [inline]
          [<ffffffff81597eec>] __se_sys_ioctl fs/ioctl.c:860 [inline]
          [<ffffffff81597eec>] __x64_sys_ioctl+0xfc/0x140 fs/ioctl.c:860
          [<ffffffff843fa745>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
          [<ffffffff843fa745>] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
          [<ffffffff84600068>] entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Once blk_throtl_init() queue init failed, blkcg_iolatency_exit() will
      not be invoked for cleanup. That leads a memory leak. Swap the
      blk_throtl_init() and blk_iolatency_init() calls can solve this.
      
      Reported-by: default avatar <syzbot+01321b15cc98e6bf96d6@syzkaller.appspotmail.com>
      Fixes: 19688d7f (block/blk-cgroup: Swap the blk_throtl_init() and blk_iolatency_init() calls)
      Signed-off-by: default avatarYanfei Xu <yanfei.xu@windriver.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Link: https://lore.kernel.org/r/20210915072426.4022924-1-yanfei.xu@windriver.com
      
      
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      6f5ddde4
    • Lihong Kou's avatar
      block: flush the integrity workqueue in blk_integrity_unregister · 3df49967
      Lihong Kou authored
      
      When the integrity profile is unregistered there can still be integrity
      reads queued up which could see a NULL verify_fn as shown by the race
      window below:
      
      CPU0                                    CPU1
        process_one_work                      nvme_validate_ns
          bio_integrity_verify_fn                nvme_update_ns_info
      	                                     nvme_update_disk_info
      	                                       blk_integrity_unregister
                                                     ---set queue->integrity as 0
      	bio_integrity_process
      	--access bi->profile->verify_fn(bi is a pointer of queue->integity)
      
      Before calling blk_integrity_unregister in nvme_update_disk_info, we must
      make sure that there is no work item in the kintegrityd_wq. Just call
      blk_flush_integrity to flush the work queue so the bug can be resolved.
      
      Signed-off-by: default avatarLihong Kou <koulihong@huawei.com>
      [hch: split up and shortened the changelog]
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Link: https://lore.kernel.org/r/20210914070657.87677-3-hch@lst.de
      
      
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      3df49967
    • Christoph Hellwig's avatar
      block: check if a profile is actually registered in blk_integrity_unregister · 783a40a1
      Christoph Hellwig authored
      
      While clearing the profile itself is harmless, we really should not clear
      the stable writes flag if it wasn't set due to a registered integrity
      profile.
      
      Reported-by: default avatarLihong Kou <koulihong@huawei.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Link: https://lore.kernel.org/r/20210914070657.87677-2-hch@lst.de
      
      
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      783a40a1
  9. Sep 14, 2021
  10. Sep 13, 2021
    • Ming Lei's avatar
      blk-mq: avoid to iterate over stale request · 67f3b2f8
      Ming Lei authored
      
      blk-mq can't run allocating driver tag and updating ->rqs[tag]
      atomically, meantime blk-mq doesn't clear ->rqs[tag] after the driver
      tag is released.
      
      So there is chance to iterating over one stale request just after the
      tag is allocated and before updating ->rqs[tag].
      
      scsi_host_busy_iter() calls scsi_host_check_in_flight() to count scsi
      in-flight requests after scsi host is blocked, so no new scsi command can
      be marked as SCMD_STATE_INFLIGHT. However, driver tag allocation still can
      be run by blk-mq core. One request is marked as SCMD_STATE_INFLIGHT,
      but this request may have been kept in another slot of ->rqs[], meantime
      the slot can be allocated out but ->rqs[] isn't updated yet. Then this
      in-flight request is counted twice as SCMD_STATE_INFLIGHT. This way causes
      trouble in handling scsi error.
      
      Fixes the issue by not iterating over stale request.
      
      Cc: linux-scsi@vger.kernel.org
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Reported-by: default avatarluojiaxing <luojiaxing@huawei.com>
      Signed-off-by: default avatarMing Lei <ming.lei@redhat.com>
      Link: https://lore.kernel.org/r/20210906065003.439019-1-ming.lei@redhat.com
      
      
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      67f3b2f8
Loading