Skip to content
Snippets Groups Projects
  1. Oct 12, 2022
  2. Oct 05, 2022
    • Greg Kroah-Hartman's avatar
    • Chris Wilson's avatar
      drm/i915/gem: Really move i915_gem_context.link under ref protection · 713fa3e4
      Chris Wilson authored
      
      commit d119888b upstream.
      
      i915_perf assumes that it can use the i915_gem_context reference to
      protect its i915->gem.contexts.list iteration. However, this requires
      that we do not remove the context from the list until after we drop the
      final reference and release the struct. If, as currently, we remove the
      context from the list during context_close(), the link.next pointer may
      be poisoned while we are holding the context reference and cause a GPF:
      
      [ 4070.573157] i915 0000:00:02.0: [drm:i915_perf_open_ioctl [i915]] filtering on ctx_id=0x1fffff ctx_id_mask=0x1fffff
      [ 4070.574881] general protection fault, probably for non-canonical address 0xdead000000000100: 0000 [#1] PREEMPT SMP
      [ 4070.574897] CPU: 1 PID: 284392 Comm: amd_performance Tainted: G            E     5.17.9 #180
      [ 4070.574903] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
      [ 4070.574907] RIP: 0010:oa_configure_all_contexts.isra.0+0x222/0x350 [i915]
      [ 4070.574982] Code: 08 e8 32 6e 10 e1 4d 8b 6d 50 b8 ff ff ff ff 49 83 ed 50 f0 41 0f c1 04 24 83 f8 01 0f 84 e3 00 00 00 85 c0 0f 8e fa 00 00 00 <49> 8b 45 50 48 8d 70 b0 49 8d 45 50 48 39 44 24 10 0f 85 34 fe ff
      [ 4070.574990] RSP: 0018:ffffc90002077b78 EFLAGS: 00010202
      [ 4070.574995] RAX: 0000000000000002 RBX: 0000000000000002 RCX: 0000000000000000
      [ 4070.575000] RDX: 0000000000000001 RSI: ffffc90002077b20 RDI: ffff88810ddc7c68
      [ 4070.575004] RBP: 0000000000000001 R08: ffff888103242648 R09: fffffffffffffffc
      [ 4070.575008] R10: ffffffff82c50bc0 R11: 0000000000025c80 R12: ffff888101bf1860
      [ 4070.575012] R13: dead0000000000b0 R14: ffffc90002077c04 R15: ffff88810be5cabc
      [ 4070.575016] FS:  00007f1ed50c0780(0000) GS:ffff88885ec80000(0000) knlGS:0000000000000000
      [ 4070.575021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 4070.575025] CR2: 00007f1ed5590280 CR3: 000000010ef6f005 CR4: 00000000003706e0
      [ 4070.575029] Call Trace:
      [ 4070.575033]  <TASK>
      [ 4070.575037]  lrc_configure_all_contexts+0x13e/0x150 [i915]
      [ 4070.575103]  gen8_enable_metric_set+0x4d/0x90 [i915]
      [ 4070.575164]  i915_perf_open_ioctl+0xbc0/0x1500 [i915]
      [ 4070.575224]  ? asm_common_interrupt+0x1e/0x40
      [ 4070.575232]  ? i915_oa_init_reg_state+0x110/0x110 [i915]
      [ 4070.575290]  drm_ioctl_kernel+0x85/0x110
      [ 4070.575296]  ? update_load_avg+0x5f/0x5e0
      [ 4070.575302]  drm_ioctl+0x1d3/0x370
      [ 4070.575307]  ? i915_oa_init_reg_state+0x110/0x110 [i915]
      [ 4070.575382]  ? gen8_gt_irq_handler+0x46/0x130 [i915]
      [ 4070.575445]  __x64_sys_ioctl+0x3c4/0x8d0
      [ 4070.575451]  ? __do_softirq+0xaa/0x1d2
      [ 4070.575456]  do_syscall_64+0x35/0x80
      [ 4070.575461]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [ 4070.575467] RIP: 0033:0x7f1ed5c10397
      [ 4070.575471] Code: 3c 1c e8 1c ff ff ff 85 c0 79 87 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 da 0d 00 f7 d8 64 89 01 48
      [ 4070.575478] RSP: 002b:00007ffd65c8d7a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [ 4070.575484] RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00007f1ed5c10397
      [ 4070.575488] RDX: 00007ffd65c8d7c0 RSI: 0000000040106476 RDI: 0000000000000006
      [ 4070.575492] RBP: 00005620972f9c60 R08: 000000000000000a R09: 0000000000000005
      [ 4070.575496] R10: 000000000000000d R11: 0000000000000246 R12: 000000000000000a
      [ 4070.575500] R13: 000000000000000d R14: 0000000000000000 R15: 00007ffd65c8d7c0
      [ 4070.575505]  </TASK>
      [ 4070.575507] Modules linked in: nls_ascii(E) nls_cp437(E) vfat(E) fat(E) i915(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) aesni_intel(E) crypto_simd(E) intel_gtt(E) cryptd(E) ttm(E) rapl(E) intel_cstate(E) drm_kms_helper(E) cfbfillrect(E) syscopyarea(E) cfbimgblt(E) intel_uncore(E) sysfillrect(E) mei_me(E) sysimgblt(E) i2c_i801(E) fb_sys_fops(E) mei(E) intel_pch_thermal(E) i2c_smbus(E) cfbcopyarea(E) video(E) button(E) efivarfs(E) autofs4(E)
      [ 4070.575549] ---[ end trace 0000000000000000 ]---
      
      v3: fix incorrect syntax of spin_lock() replacing spin_lock_irqsave()
      
      v2: irqsave not required in a worker, neither conversion to irq safe
          elsewhere (Tvrtko),
        - perf: it's safe to call gen8_configure_context() even if context has
          been closed, no need to check,
        - drop unrelated cleanup (Andi, Tvrtko)
      
      Reported-by: default avatarMark Janes <mark.janes@intel.com>
      Closes: https://gitlab.freedesktop.org/drm/intel/issues/6222
      
      
      References: a4e7ccda ("drm/i915: Move context management under GEM")
      Fixes: f8246cf4 ("drm/i915/gem: Drop free_work for GEM contexts")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: default avatarAndi Shyti <andi.shyti@linux.intel.com>
      Signed-off-by: default avatarJanusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: <stable@vger.kernel.org> # v5.12+
      Signed-off-by: default avatarAndi Shyti <andi.shyti@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220916092403.201355-3-janusz.krzysztofik@linux.intel.com
      
      
      (cherry picked from commit ad3aa7c31efa5a09b0dba42e66cfdf77e0db7dc2)
      Signed-off-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
      [janusz: backport]
      Signed-off-by: default avatarJanusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      713fa3e4
    • Nadav Amit's avatar
      x86/alternative: Fix race in try_get_desc() · a00ed4e5
      Nadav Amit authored
      
      commit efd608fa upstream.
      
      I encountered some occasional crashes of poke_int3_handler() when
      kprobes are set, while accessing desc->vec.
      
      The text poke mechanism claims to have an RCU-like behavior, but it
      does not appear that there is any quiescent state to ensure that
      nobody holds reference to desc. As a result, the following race
      appears to be possible, which can lead to memory corruption.
      
        CPU0					CPU1
        ----					----
        text_poke_bp_batch()
        -> smp_store_release(&bp_desc, &desc)
      
        [ notice that desc is on
          the stack			]
      
      					poke_int3_handler()
      
      					[ int3 might be kprobe's
      					  so sync events are do not
      					  help ]
      
      					-> try_get_desc(descp=&bp_desc)
      					   desc = __READ_ONCE(bp_desc)
      
      					   if (!desc) [false, success]
        WRITE_ONCE(bp_desc, NULL);
        atomic_dec_and_test(&desc.refs)
      
        [ success, desc space on the stack
          is being reused and might have
          non-zero value. ]
      					arch_atomic_inc_not_zero(&desc->refs)
      
      					[ might succeed since desc points to
      					  stack memory that was freed and might
      					  be reused. ]
      
      Fix this issue with small backportable patch. Instead of trying to
      make RCU-like behavior for bp_desc, just eliminate the unnecessary
      level of indirection of bp_desc, and hold the whole descriptor as a
      global.  Anyhow, there is only a single descriptor at any given
      moment.
      
      Fixes: 1f676247 ("x86/alternatives: Implement a better poke_int3_handler() completion scheme")
      Signed-off-by: default avatarNadav Amit <namit@vmware.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: stable@kernel.org
      Link: https://lkml.kernel.org/r/20220920224743.3089-1-namit@vmware.com
      
      
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a00ed4e5
Loading