  1. Mar 15, 2025
  2. Mar 14, 2025
  3. Mar 13, 2025
    • Merge tag 'amd-drm-fixes-6.14-2025-03-12' of... · 385b6432
      Dave Airlie authored
      Merge tag 'amd-drm-fixes-6.14-2025-03-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
      
      amd-drm-fixes-6.14-2025-03-12:
      
      amdgpu:
      - GC 12.x DCC fix
      - DC DCE 6.x fix
      - Hibernation fix
      - HPD fix
      - Backlight fixes
      - Color depth fix
      - UAF fix in hdcp_work
      - VCE 2.x fix
      - GC 12.x PTE fix
      
      amdkfd:
      - Queue eviction fix
      
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Alex Deucher <alexander.deucher@amd.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20250312190931.216506-1-alexander.deucher@amd.com
      385b6432
    • x86/vmware: Parse MP tables for SEV-SNP enabled guests under VMware hypervisors · a2ab2552
      Ajay Kaher authored
      
      Under VMware hypervisors, SEV-SNP enabled VMs are fundamentally able to boot
      without UEFI, but this regressed a year ago due to:
      
        0f4a1e80 ("x86/sev: Skip ROM range scans and validation for SEV-SNP guests")
      
      In this case, mpparse_find_mptable() has to be called to parse the MP
      tables, which contain the necessary boot information.
      
      [ mingo: Updated the changelog. ]
      
      Fixes: 0f4a1e80 ("x86/sev: Skip ROM range scans and validation for SEV-SNP guests")
      Co-developed-by: Ye Li <ye.li@broadcom.com>
      Signed-off-by: Ye Li <ye.li@broadcom.com>
      Signed-off-by: Ajay Kaher <ajay.kaher@broadcom.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Tested-by: Ye Li <ye.li@broadcom.com>
      Reviewed-by: Kevin Loughlin <kevinloughlin@google.com>
      Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
      Link: https://lore.kernel.org/r/20250313173111.10918-1-ajay.kaher@broadcom.com
      a2ab2552
    • Merge tag 'net-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 4003c9e7
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from netfilter, bluetooth and wireless.
      
        No known regressions outstanding.
      
        Current release - regressions:
      
         - wifi: nl80211: fix assoc link handling
      
         - eth: lan78xx: sanitize return values of register read/write
           functions
      
        Current release - new code bugs:
      
         - ethtool: tsinfo: fix dump command
      
         - bluetooth: btusb: configure altsetting for HCI_USER_CHANNEL
      
         - eth: mlx5: DR, use the right action structs for STEv3
      
        Previous releases - regressions:
      
         - netfilter: nf_tables: make destruction work queue pernet
      
         - gre: fix IPv6 link-local address generation.
      
         - wifi: iwlwifi: fix TSO preparation
      
         - bluetooth: revert "bluetooth: hci_core: fix sleeping function
           called from invalid context"
      
         - ovs: revert "openvswitch: switch to per-action label counting in
           conntrack"
      
         - eth:
             - ice: fix switchdev slow-path in LAG
             - bonding: fix incorrect MAC address setting to receive NS
               messages
      
        Previous releases - always broken:
      
         - core: prevent TX of unreadable skbs
      
         - sched: prevent creation of classes with TC_H_ROOT
      
         - netfilter: nft_exthdr: fix offset with ipv4_find_option()
      
         - wifi: cfg80211: cancel wiphy_work before freeing wiphy
      
         - mctp: copy headers if cloned
      
         - phy: nxp-c45-tja11xx: add errata for TJA112XA/B
      
         - eth:
             - bnxt: fix kernel panic in the bnxt_get_queue_stats{rx | tx}
             - mlx5: bridge, fix the crash caused by LAG state check"
      
      * tag 'net-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits)
        net: mana: cleanup mana struct after debugfs_remove()
        net/mlx5e: Prevent bridge link show failure for non-eswitch-allowed devices
        net/mlx5: Bridge, fix the crash caused by LAG state check
        net/mlx5: Lag, Check shared fdb before creating MultiPort E-Switch
        net/mlx5: Fix incorrect IRQ pool usage when releasing IRQs
        net/mlx5: HWS, Rightsize bwc matcher priority
        net/mlx5: DR, use the right action structs for STEv3
        Revert "openvswitch: switch to per-action label counting in conntrack"
        net: openvswitch: remove misbehaving actions length check
        selftests: Add IPv6 link-local address generation tests for GRE devices.
        gre: Fix IPv6 link-local address generation.
        netfilter: nft_exthdr: fix offset with ipv4_find_option()
        selftests/tc-testing: Add a test case for DRR class with TC_H_ROOT
        net_sched: Prevent creation of classes with TC_H_ROOT
        ipvs: prevent integer overflow in do_ip_vs_get_ctl()
        selftests: netfilter: skip br_netfilter queue tests if kernel is tainted
        netfilter: nf_conncount: Fully initialize struct nf_conncount_tuple in insert_tree()
        wifi: mac80211: fix MPDU length parsing for EHT 5/6 GHz
        qlcnic: fix memory leak issues in qlcnic_sriov_common.c
        rtase: Fix improper release of ring list entries in rtase_sw_reset
        ...
      4003c9e7
    • dm-flakey: Fix memory corruption in optional corrupt_bio_byte feature · 57e9417f
      Kent Overstreet authored
      
      Fix memory corruption due to an incorrect parameter being passed to bio_init().
      
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org	# v6.5+
      Fixes: 1d9a9438 ("dm flakey: clone pages on write bio before corrupting them")
      57e9417f
    • Merge tag 'vfs-6.14-rc7.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs · 8f7617f4
      Linus Torvalds authored
      Pull vfs fixes from Christian Brauner:
      
       - Bring in an RCU pathwalk fix for afs. This is brought in as a merge
         from the vfs-6.15.shared.afs branch that needs this commit and other
         trees already depend on it.
      
       - Fix vboxsf unterminated string handling.
      
      * tag 'vfs-6.14-rc7.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
        vboxsf: Add __nonstring annotations for unterminated strings
        afs: Fix afs_atcell_get_link() to handle RCU pathwalk
      8f7617f4
    • bcachefs: bch2_get_random_u64_below() · 9c18ea7f
      Kent Overstreet authored
      
      steal the (clever) algorithm from get_random_u32_below()
      
      this fixes a bug where we were passing roundup_pow_of_two() a 64 bit
      number - we're squaring device latencies now:
      
      [  +1.681698] ------------[ cut here ]------------
      [  +0.000010] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
      [  +0.000011] shift exponent 64 is too large for 64-bit type 'long unsigned int'
      [  +0.000011] CPU: 1 UID: 0 PID: 196 Comm: kworker/u32:13 Not tainted 6.14.0-rc6-dave+ #10
      [  +0.000012] Hardware name: ASUS System Product Name/PRIME B460I-PLUS, BIOS 1301 07/13/2021
      [  +0.000005] Workqueue: events_unbound __bch2_read_endio [bcachefs]
      [  +0.000354] Call Trace:
      [  +0.000005]  <TASK>
      [  +0.000007]  dump_stack_lvl+0x5d/0x80
      [  +0.000018]  ubsan_epilogue+0x5/0x30
      [  +0.000008]  __ubsan_handle_shift_out_of_bounds.cold+0x61/0xe6
      [  +0.000011]  bch2_rand_range.cold+0x17/0x20 [bcachefs]
      [  +0.000231]  bch2_bkey_pick_read_device+0x547/0x920 [bcachefs]
      [  +0.000229]  __bch2_read_extent+0x1e4/0x18e0 [bcachefs]
      [  +0.000241]  ? bch2_btree_iter_peek_slot+0x3df/0x800 [bcachefs]
      [  +0.000180]  ? bch2_read_retry_nodecode+0x270/0x330 [bcachefs]
      [  +0.000230]  bch2_read_retry_nodecode+0x270/0x330 [bcachefs]
      [  +0.000230]  bch2_rbio_retry+0x1fa/0x600 [bcachefs]
      [  +0.000224]  ? bch2_printbuf_make_room+0x71/0xb0 [bcachefs]
      [  +0.000243]  ? bch2_read_csum_err+0x4a4/0x610 [bcachefs]
      [  +0.000278]  bch2_read_csum_err+0x4a4/0x610 [bcachefs]
      [  +0.000227]  ? __bch2_read_endio+0x58b/0x870 [bcachefs]
      [  +0.000220]  __bch2_read_endio+0x58b/0x870 [bcachefs]
      [  +0.000268]  ? try_to_wake_up+0x31c/0x7f0
      [  +0.000011]  ? process_one_work+0x176/0x330
      [  +0.000008]  process_one_work+0x176/0x330
      [  +0.000008]  worker_thread+0x252/0x390
      [  +0.000008]  ? __pfx_worker_thread+0x10/0x10
      [  +0.000006]  kthread+0xec/0x230
      [  +0.000011]  ? __pfx_kthread+0x10/0x10
      [  +0.000009]  ret_from_fork+0x31/0x50
      [  +0.000009]  ? __pfx_kthread+0x10/0x10
      [  +0.000008]  ret_from_fork_asm+0x1a/0x30
      [  +0.000012]  </TASK>
      [  +0.000046] ---[ end trace ]---
      
      Reported-by: Roland Vet <vet.roland@protonmail.com>
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
      9c18ea7f
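
The bounded-random technique named in the bcachefs commit above can be sketched in userspace C. This is a minimal illustration of the multiply-and-reject approach that get_random_u32_below() is known for, widened to 64 bits; it is not the bcachefs code. All demo_* names are hypothetical, the splitmix64 generator merely stands in for a kernel entropy source, and `unsigned __int128` assumes GCC or Clang.

```c
#include <stdint.h>

/* Hypothetical stand-in for a kernel entropy source: one splitmix64 step. */
static uint64_t demo_state = 0x9e3779b97f4a7c15ULL;

static uint64_t demo_random_u64(void)
{
	uint64_t z = (demo_state += 0x9e3779b97f4a7c15ULL);

	z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
	z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
	return z ^ (z >> 31);
}

/*
 * Return a uniformly distributed value in [0, ceil). A ceil of 0 yields 0.
 * The 128-bit product ceil * rand is split into a high half (the result)
 * and a low half (used only to detect and reject biased draws), so no
 * power-of-two rounding and no 64-bit shift is ever needed.
 */
static uint64_t demo_random_u64_below(uint64_t ceil)
{
	unsigned __int128 mult;

	if (ceil == 0)
		return 0;

	mult = (unsigned __int128)ceil * demo_random_u64();
	if ((uint64_t)mult < ceil) {
		/* 2^64 mod ceil, computed with unsigned wraparound. */
		uint64_t bound = -ceil % ceil;

		while ((uint64_t)mult < bound)
			mult = (unsigned __int128)ceil * demo_random_u64();
	}
	return (uint64_t)(mult >> 64);
}
```

Calling demo_random_u64_below(10) returns a value from 0 to 9. Because the result is taken from the high half of a wide multiplication, the function works for any 64-bit ceiling, including values above 2^63 where rounding up to a power of two would overflow.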
    • bcachefs: target_congested -> get_random_u32_below() · 69a5a13a
      Kent Overstreet authored
      
      get_random_u32_below() has a better algorithm than bch2_rand_range(),
      it just didn't exist at the time.
      
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
      69a5a13a
    • Merge tag 'nvme-6.14-2025-03-13' of git://git.infradead.org/nvme into block-6.14 · a9381351
      Jens Axboe authored
      Pull NVMe fixes from Keith:
      
      "nvme fixes for Linux 6.14
      
       - Concurrent pci error and hotplug handling fix (Keith)
       - Endpoint function fixes (Damien)"
      
      * tag 'nvme-6.14-2025-03-13' of git://git.infradead.org/nvme:
        nvmet: pci-epf: Do not add an IRQ vector if not needed
        nvmet: pci-epf: Set NVMET_PCI_EPF_Q_LIVE when a queue is fully created
        nvme-pci: fix stuck reset on concurrent DPC and HP
      a9381351
    • Merge tag 'nf-25-03-13' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 2409fa66
      Paolo Abeni authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS fixes for net
      
      The following patchset contains Netfilter/IPVS fixes for net:
      
      1) Missing initialization of cpu and jiffies32 fields in conncount,
         from Kohei Enju.
      
      2) Skip several tests in case kernel is tainted, otherwise tests bogusly
         report failure too as they also check for tainted kernel,
         from Florian Westphal.
      
      3) Fix a hypothetical integer overflow in do_ip_vs_get_ctl() leading
         to bogus error logs, from Dan Carpenter.
      
      4) Fix incorrect offset in ipv4 option match in nft_exthdr, from
         Alexey Kashavkin.
      
      netfilter pull request 25-03-13
      
      * tag 'nf-25-03-13' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nft_exthdr: fix offset with ipv4_find_option()
        ipvs: prevent integer overflow in do_ip_vs_get_ctl()
        selftests: netfilter: skip br_netfilter queue tests if kernel is tainted
        netfilter: nf_conncount: Fully initialize struct nf_conncount_tuple in insert_tree()
      ====================
      
      Link: https://patch.msgid.link/20250313095636.2186-1-pablo@netfilter.org
      
      
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      2409fa66
    • platform/x86/amd: pmf: Fix missing hidden options for Smart PC · 4490fe97
      Mario Limonciello authored
      
      amd_pmf_get_slider_info() checks the current profile to report the
      correct value to the TA inputs.  If hidden options are in use, the
      wrong values will be reported to the TA.
      
      Add the two compat options PLATFORM_PROFILE_BALANCED_PERFORMANCE and
      PLATFORM_PROFILE_QUIET for this use.
      
      Reported-by: Yijun Shen <Yijun.Shen@dell.com>
      Fixes: 9a43102d ("platform/x86/amd: pmf: Add balanced-performance to hidden choices")
      Fixes: 44e94fec ("platform/x86/amd: pmf: Add 'quiet' to hidden choices")
      Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
      Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
      Link: https://lore.kernel.org/r/20250306034402.50478-1-superm1@kernel.org

      Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
      4490fe97
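
The kind of fix described in the pmf commit above, a profile-to-value switch that must also cover the hidden compat options, might look roughly like the sketch below. Every demo_* identifier is hypothetical, and the grouping of each hidden option with a neighbouring profile is an assumption made for illustration; the commit message only states that the two options must be handled.

```c
#include <errno.h>

/* Hypothetical profile and slider enums, modelled loosely on the commit text. */
enum demo_profile {
	DEMO_PROFILE_LOW_POWER,
	DEMO_PROFILE_QUIET,                /* hidden compat option */
	DEMO_PROFILE_BALANCED,
	DEMO_PROFILE_BALANCED_PERFORMANCE, /* hidden compat option */
	DEMO_PROFILE_PERFORMANCE,
};

enum demo_ta_slider {
	DEMO_TA_BEST_BATTERY,
	DEMO_TA_BETTER_PERFORMANCE,
	DEMO_TA_BEST_PERFORMANCE,
};

/*
 * Translate the current profile into the value reported to the TA.
 * Assumption: each hidden option shares the value of its closest
 * visible profile. Unknown profiles are rejected with -EINVAL.
 */
static int demo_profile_to_slider(enum demo_profile profile,
				  enum demo_ta_slider *out)
{
	switch (profile) {
	case DEMO_PROFILE_PERFORMANCE:
	case DEMO_PROFILE_BALANCED_PERFORMANCE:
		*out = DEMO_TA_BEST_PERFORMANCE;
		return 0;
	case DEMO_PROFILE_BALANCED:
		*out = DEMO_TA_BETTER_PERFORMANCE;
		return 0;
	case DEMO_PROFILE_QUIET:
	case DEMO_PROFILE_LOW_POWER:
		*out = DEMO_TA_BEST_BATTERY;
		return 0;
	default:
		return -EINVAL;
	}
}
```

Without the two extra case labels, a system running one of the hidden profiles would fall through to the error path, which corresponds to the wrong values the commit message mentions.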
    • net: mana: cleanup mana struct after debugfs_remove() · 3e64bb2a
      Shradha Gupta authored
      
      When hibernation is triggered on a MANA VM, mana_gd_suspend() and
      mana_gd_resume() are called as part of hibernate_snapshot(). If HWC
      creation fails during this mana_gd_resume(), the mana_port_debugfs
      pointer is not reinitialized and ends up pointing to an older,
      already cleaned-up dentry.
      Later in the hibernation path, power_down() triggers mana_gd_shutdown().
      This call, unaware of the failure in resume, tries to clean up the
      already cleaned-up mana_port_debugfs value and hits the following bug:
      
      [  191.359296] mana 7870:00:00.0: Shutdown was called
      [  191.359918] BUG: kernel NULL pointer dereference, address: 0000000000000098
      [  191.360584] #PF: supervisor write access in kernel mode
      [  191.361125] #PF: error_code(0x0002) - not-present page
      [  191.361727] PGD 1080ea067 P4D 0
      [  191.362172] Oops: Oops: 0002 [#1] SMP NOPTI
      [  191.362606] CPU: 11 UID: 0 PID: 1674 Comm: bash Not tainted 6.14.0-rc5+ #2
      [  191.363292] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 11/21/2024
      [  191.364124] RIP: 0010:down_write+0x19/0x50
      [  191.364537] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb e8 de cd ff ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 16 65 48 8b 05 88 24 4c 6a 48 89 43 08 48 8b 5d
      [  191.365867] RSP: 0000:ff45fbe0c1c037b8 EFLAGS: 00010246
      [  191.366350] RAX: 0000000000000000 RBX: 0000000000000098 RCX: ffffff8100000000
      [  191.366951] RDX: 0000000000000001 RSI: 0000000000000064 RDI: 0000000000000098
      [  191.367600] RBP: ff45fbe0c1c037c0 R08: 0000000000000000 R09: 0000000000000001
      [  191.368225] R10: ff45fbe0d2b01000 R11: 0000000000000008 R12: 0000000000000000
      [  191.368874] R13: 000000000000000b R14: ff43dc27509d67c0 R15: 0000000000000020
      [  191.369549] FS:  00007dbc5001e740(0000) GS:ff43dc663f380000(0000) knlGS:0000000000000000
      [  191.370213] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  191.370830] CR2: 0000000000000098 CR3: 0000000168e8e002 CR4: 0000000000b73ef0
      [  191.371557] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  191.372192] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
      [  191.372906] Call Trace:
      [  191.373262]  <TASK>
      [  191.373621]  ? show_regs+0x64/0x70
      [  191.374040]  ? __die+0x24/0x70
      [  191.374468]  ? page_fault_oops+0x290/0x5b0
      [  191.374875]  ? do_user_addr_fault+0x448/0x800
      [  191.375357]  ? exc_page_fault+0x7a/0x160
      [  191.375971]  ? asm_exc_page_fault+0x27/0x30
      [  191.376416]  ? down_write+0x19/0x50
      [  191.376832]  ? down_write+0x12/0x50
      [  191.377232]  simple_recursive_removal+0x4a/0x2a0
      [  191.377679]  ? __pfx_remove_one+0x10/0x10
      [  191.378088]  debugfs_remove+0x44/0x70
      [  191.378530]  mana_detach+0x17c/0x4f0
      [  191.378950]  ? __flush_work+0x1e2/0x3b0
      [  191.379362]  ? __cond_resched+0x1a/0x50
      [  191.379787]  mana_remove+0xf2/0x1a0
      [  191.380193]  mana_gd_shutdown+0x3b/0x70
      [  191.380642]  pci_device_shutdown+0x3a/0x80
      [  191.381063]  device_shutdown+0x13e/0x230
      [  191.381480]  kernel_power_off+0x35/0x80
      [  191.381890]  hibernate+0x3c6/0x470
      [  191.382312]  state_store+0xcb/0xd0
      [  191.382734]  kobj_attr_store+0x12/0x30
      [  191.383211]  sysfs_kf_write+0x3e/0x50
      [  191.383640]  kernfs_fop_write_iter+0x140/0x1d0
      [  191.384106]  vfs_write+0x271/0x440
      [  191.384521]  ksys_write+0x72/0xf0
      [  191.384924]  __x64_sys_write+0x19/0x20
      [  191.385313]  x64_sys_call+0x2b0/0x20b0
      [  191.385736]  do_syscall_64+0x79/0x150
      [  191.386146]  ? __mod_memcg_lruvec_state+0xe7/0x240
      [  191.386676]  ? __lruvec_stat_mod_folio+0x79/0xb0
      [  191.387124]  ? __pfx_lru_add+0x10/0x10
      [  191.387515]  ? queued_spin_unlock+0x9/0x10
      [  191.387937]  ? do_anonymous_page+0x33c/0xa00
      [  191.388374]  ? __handle_mm_fault+0xcf3/0x1210
      [  191.388805]  ? __count_memcg_events+0xbe/0x180
      [  191.389235]  ? handle_mm_fault+0xae/0x300
      [  191.389588]  ? do_user_addr_fault+0x559/0x800
      [  191.390027]  ? irqentry_exit_to_user_mode+0x43/0x230
      [  191.390525]  ? irqentry_exit+0x1d/0x30
      [  191.390879]  ? exc_page_fault+0x86/0x160
      [  191.391235]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
      [  191.391745] RIP: 0033:0x7dbc4ff1c574
      [  191.392111] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d d5 ea 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
      [  191.393412] RSP: 002b:00007ffd95a23ab8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
      [  191.393990] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007dbc4ff1c574
      [  191.394594] RDX: 0000000000000005 RSI: 00005a6eeadb0ce0 RDI: 0000000000000001
      [  191.395215] RBP: 00007ffd95a23ae0 R08: 00007dbc50003b20 R09: 0000000000000000
      [  191.395805] R10: 0000000000000001 R11: 0000000000000202 R12: 0000000000000005
      [  191.396404] R13: 00005a6eeadb0ce0 R14: 00007dbc500045c0 R15: 00007dbc50001ee0
      [  191.396987]  </TASK>
      
      To fix this, we explicitly set such mana debugfs variables to NULL after
      debugfs_remove() is called.
      
      Fixes: 6607c17c ("net: mana: Enable debugfs files for MANA device")
      Cc: stable@vger.kernel.org
      Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
      Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
      Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
      Link: https://patch.msgid.link/1741688260-28922-1-git-send-email-shradhagupta@linux.microsoft.com

      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      3e64bb2a
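
The fix pattern in the mana commit above, clearing a pointer right after the object it refers to has been removed, can be shown with a small userspace sketch. The demo_* types and functions are hypothetical stand-ins, with free() playing the role of debugfs_remove(); the only behaviour borrowed from the kernel is that the remove helper tolerates a NULL argument.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for a dentry and the per-port driver state. */
struct demo_dentry {
	int unused;
};

struct demo_port {
	struct demo_dentry *debugfs_dir;
};

/* Counts real removals so the effect of the NULL reset is observable. */
static int demo_remove_calls;

/* Like debugfs_remove(), do nothing when handed NULL. */
static void demo_debugfs_remove(struct demo_dentry *d)
{
	if (!d)
		return;
	demo_remove_calls++;
	free(d);
}

static int demo_port_attach(struct demo_port *p)
{
	p->debugfs_dir = calloc(1, sizeof(*p->debugfs_dir));
	return p->debugfs_dir ? 0 : -1;
}

/*
 * Teardown that is safe to run twice: once the entry is removed, the
 * pointer is reset so a later call sees NULL instead of a stale address.
 */
static void demo_port_detach(struct demo_port *p)
{
	demo_debugfs_remove(p->debugfs_dir);
	p->debugfs_dir = NULL;
}
```

If a resume step fails between two detach calls and never re-creates the entry, the second demo_port_detach() becomes a no-op instead of touching freed memory, which is the situation the shutdown path ran into.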
    • Merge branch 'mlx5-misc-fixes-2025-03-10' · e1af35d6
      Paolo Abeni authored
      Tariq Toukan says:
      
      ====================
      mlx5 misc fixes 2025-03-10
      
      This patchset provides misc bug fixes from the team to the mlx5 core and
      Eth drivers.
      ====================
      
      Link: https://patch.msgid.link/1741644104-97767-1-git-send-email-tariqt@nvidia.com
      
      
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      e1af35d6
    • net/mlx5e: Prevent bridge link show failure for non-eswitch-allowed devices · e92df790
      Carolina Jubran authored
      
      mlx5_eswitch_get_vepa returns -EPERM if the device lacks
      eswitch_manager capability, blocking mlx5e_bridge_getlink from
      retrieving VEPA mode. Since mlx5e_bridge_getlink implements
      ndo_bridge_getlink, returning -EPERM causes bridge link show to fail
      instead of skipping devices without this capability.
      
      To avoid this, return -EOPNOTSUPP from mlx5e_bridge_getlink when
      mlx5_eswitch_get_vepa fails, ensuring the command continues processing
      other devices while ignoring those without the necessary capability.
      
      Fixes: 4b89251d ("net/mlx5: Support ndo bridge_setlink and getlink")
      Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
      Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
      Link: https://patch.msgid.link/1741644104-97767-7-git-send-email-tariqt@nvidia.com

      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      e92df790
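
Why the choice of error code matters in the mlx5e commit above can be illustrated with a simplified dump loop. This is a sketch under stated assumptions, not the rtnetlink or mlx5 code: the demo_* names are hypothetical, and the loop assumes that the caller treats -EOPNOTSUPP as "skip this device" while any other error aborts the whole dump.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical device model: only eswitch managers may query VEPA mode. */
struct demo_dev {
	int is_eswitch_manager;
	int vepa;
};

/* Low-level query: refuses devices without the capability. */
static int demo_get_vepa(const struct demo_dev *dev, int *vepa)
{
	if (!dev->is_eswitch_manager)
		return -EPERM;
	*vepa = dev->vepa;
	return 0;
}

/* Per-device getlink hook: report any query failure as "not supported". */
static int demo_getlink(const struct demo_dev *dev, int *vepa)
{
	if (demo_get_vepa(dev, vepa))
		return -EOPNOTSUPP;
	return 0;
}

/*
 * Dump loop: devices answering -EOPNOTSUPP are skipped, any other error
 * ends the dump. Returns the number of devices reported or a negative errno.
 */
static int demo_dump_links(const struct demo_dev *devs, size_t n)
{
	int reported = 0;

	for (size_t i = 0; i < n; i++) {
		int vepa;
		int err = demo_getlink(&devs[i], &vepa);

		if (err == -EOPNOTSUPP)
			continue;
		if (err)
			return err;
		reported++;
	}
	return reported;
}
```

Had demo_getlink() passed -EPERM through unchanged, the first device lacking the capability would have ended the loop with an error, hiding every device listed after it.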
    • net/mlx5: Bridge, fix the crash caused by LAG state check · 4b8eeed4
      Jianbo Liu authored
      
      When a LAG device is removed from a bridge, a NETDEV_CHANGEUPPER event
      is triggered. The driver finds the lower devices (PFs) in order to flush
      all the offloaded entries. mlx5_lag_is_shared_fdb() is also checked, and
      it returns false if one of the PFs is unloaded. In that case,
      mlx5_esw_bridge_lag_rep_get() and its caller return NULL instead of
      the alive PF, and the flush is skipped.
      
      Besides, the bridge fdb entry's lastuse is updated in mlx5 bridge
      event handler. But this SWITCHDEV_FDB_ADD_TO_BRIDGE event can be
      ignored in this case because the upper interface for bond is deleted,
      and the entry will never be aged because lastuse is never updated.
      
      To make things worse, as the entry is alive, mlx5 bridge workqueue
      keeps sending that event, which is then handled by kernel bridge
      notifier. It causes the following crash when accessing the passed bond
      netdev which is already destroyed.
      
      To fix this issue, remove such checks. The LAG state is already checked
      in commit 15f8f168 ("net/mlx5: Bridge, verify LAG state when adding
      bond to bridge"); the driver still needs to skip offload if the LAG
      enters an invalid state after initialization.
      
       Oops: stack segment: 0000 [#1] SMP
       CPU: 3 UID: 0 PID: 23695 Comm: kworker/u40:3 Tainted: G           OE      6.11.0_mlnx #1
       Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
       Workqueue: mlx5_bridge_wq mlx5_esw_bridge_update_work [mlx5_core]
       RIP: 0010:br_switchdev_event+0x2c/0x110 [bridge]
       Code: 44 00 00 48 8b 02 48 f7 00 00 02 00 00 74 69 41 54 55 53 48 83 ec 08 48 8b a8 08 01 00 00 48 85 ed 74 4a 48 83 fe 02 48 89 d3 <4c> 8b 65 00 74 23 76 49 48 83 fe 05 74 7e 48 83 fe 06 75 2f 0f b7
       RSP: 0018:ffffc900092cfda0 EFLAGS: 00010297
       RAX: ffff888123bfe000 RBX: ffffc900092cfe08 RCX: 00000000ffffffff
       RDX: ffffc900092cfe08 RSI: 0000000000000001 RDI: ffffffffa0c585f0
       RBP: 6669746f6e690a30 R08: 0000000000000000 R09: ffff888123ae92c8
       R10: 0000000000000000 R11: fefefefefefefeff R12: ffff888123ae9c60
       R13: 0000000000000001 R14: ffffc900092cfe08 R15: 0000000000000000
       FS:  0000000000000000(0000) GS:ffff88852c980000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007f15914c8734 CR3: 0000000002830005 CR4: 0000000000770ef0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       PKRU: 55555554
       Call Trace:
        <TASK>
        ? __die_body+0x1a/0x60
        ? die+0x38/0x60
        ? do_trap+0x10b/0x120
        ? do_error_trap+0x64/0xa0
        ? exc_stack_segment+0x33/0x50
        ? asm_exc_stack_segment+0x22/0x30
        ? br_switchdev_event+0x2c/0x110 [bridge]
        ? sched_balance_newidle.isra.149+0x248/0x390
        notifier_call_chain+0x4b/0xa0
        atomic_notifier_call_chain+0x16/0x20
        mlx5_esw_bridge_update+0xec/0x170 [mlx5_core]
        mlx5_esw_bridge_update_work+0x19/0x40 [mlx5_core]
        process_scheduled_works+0x81/0x390
        worker_thread+0x106/0x250
        ? bh_worker+0x110/0x110
        kthread+0xb7/0xe0
        ? kthread_park+0x80/0x80
        ret_from_fork+0x2d/0x50
        ? kthread_park+0x80/0x80
        ret_from_fork_asm+0x11/0x20
        </TASK>
      
      Fixes: ff9b7521 ("net/mlx5: Bridge, support LAG")
      Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
      Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
      Link: https://patch.msgid.link/1741644104-97767-6-git-send-email-tariqt@nvidia.com

      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      4b8eeed4
    • net/mlx5: Lag, Check shared fdb before creating MultiPort E-Switch · 32966984
      Shay Drory authored
      
      Currently, MultiPort E-Switch requests creation of a LAG with shared
      FDB without checking that the LAG supports shared FDB.
      Add the check.
      
      Fixes: a32327a3 ("net/mlx5: Lag, Control MultiPort E-Switch single FDB mode")
      Signed-off-by: Shay Drory <shayd@nvidia.com>
      Reviewed-by: Mark Bloch <mbloch@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
      Link: https://patch.msgid.link/1741644104-97767-5-git-send-email-tariqt@nvidia.com

      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      32966984
    • net/mlx5: Fix incorrect IRQ pool usage when releasing IRQs · 32d2724d
      Shay Drory authored
      
      mlx5_irq_pool_get() is a getter for completion IRQ pool only.
      However, after the cited commit, mlx5_irq_pool_get() is called during
      ctrl IRQ release flow to retrieve the pool, resulting in the use of an
      incorrect IRQ pool.
      
      Hence, use the newly introduced mlx5_irq_get_pool() getter to retrieve
      the correct IRQ pool based on the IRQ itself. While at it, rename
      mlx5_irq_pool_get() to mlx5_irq_table_get_comp_irq_pool() which
      accurately reflects its purpose and improves code readability.
      
      Fixes: 0477d516 ("net/mlx5: Expose SFs IRQs")
      Signed-off-by: Shay Drory <shayd@nvidia.com>
      Reviewed-by: Maher Sanalla <msanalla@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
      Link: https://patch.msgid.link/1741644104-97767-4-git-send-email-tariqt@nvidia.com

      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      32d2724d
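
The idea in the IRQ pool commit above, looking up the owning pool from the IRQ itself instead of from a getter that only knows one pool, can be sketched as follows. All demo_* structures and functions are hypothetical; the real mlx5 data structures are not shown in the commit message and are not reproduced here.

```c
#include <stddef.h>

/* Hypothetical pool with a simple usage counter. */
struct demo_irq_pool {
	const char *name;
	int in_use;
};

/* Each IRQ remembers the pool it was allocated from. */
struct demo_irq {
	struct demo_irq_pool *pool;
	int index;
};

struct demo_irq_table {
	struct demo_irq_pool ctrl_pool;
	struct demo_irq_pool comp_pool;
};

/* Getter whose name says exactly which pool it returns. */
static struct demo_irq_pool *
demo_irq_table_get_comp_irq_pool(struct demo_irq_table *table)
{
	return &table->comp_pool;
}

/* Getter that is correct for any IRQ, whatever pool it belongs to. */
static struct demo_irq_pool *demo_irq_get_pool(const struct demo_irq *irq)
{
	return irq->pool;
}

static void demo_irq_request(struct demo_irq *irq,
			     struct demo_irq_pool *pool, int index)
{
	irq->pool = pool;
	irq->index = index;
	pool->in_use++;
}

/* Release against the IRQ's own pool, never against a fixed one. */
static void demo_irq_release(struct demo_irq *irq)
{
	demo_irq_get_pool(irq)->in_use--;
	irq->pool = NULL;
}
```

Releasing a control IRQ through demo_irq_release() decrements the control pool's counter and leaves the completion pool untouched, whereas a release path that always called the completion-pool getter would corrupt the accounting of the wrong pool.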