  1. Jan 23, 2025
    • nvmet: propagate npwg topology · cce9254a
      Luis Chamberlain authored
      
      [ Upstream commit b579d6fd ]
      
      Ensure we propagate npwg to the target as well, instead of
      assuming it is the same as the logical blocks per physical block.
      
      This ensures that devices with large IUs have that information
      properly propagated to the target.
      
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • i2c: rcar: fix NACK handling when being a target · 75505de0
      Wolfram Sang authored
      
      [ Upstream commit 093f70c1 ]
      
      When this controller is a target, the NACK handling had two issues.
      First, the return value from the backend was not checked on the initial
      WRITE_REQUESTED. So, the driver failed to send a NACK in this case.
      Also, the NACK always arrives one byte late on the bus, even in the
      WRITE_RECEIVED case. This seems to be a HW issue. We should then not
      rely on the backend to correctly NACK the superfluous byte as well. Fix
      both issues by introducing a flag which gets set whenever the backend
      requests a NACK and keep sending it until we get a STOP condition.
      
      Fixes: de20d185 ("i2c: rcar: add slave support")
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • i2c: mux: demux-pinctrl: check initial mux selection, too · 53336f33
      Wolfram Sang authored
      
      [ Upstream commit ca89f733 ]
      
      When misconfigured, the initial setup of the current mux channel can
      fail, too. It must be checked as well.
      
      Fixes: 50a5ba87 ("i2c: mux: demux-pinctrl: add driver")
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • Revert "mtd: spi-nor: core: replace dummy buswidth from addr to data" · 4c833c36
      Pratyush Yadav authored
      
      [ Upstream commit d15638bf ]
      
      This reverts commit 98d1fb94.
      
      The commit uses data nbits instead of addr nbits for dummy phase. This
      causes a regression for all boards where spi-tx-bus-width is smaller
      than spi-rx-bus-width. It is a common pattern for boards to have
      spi-tx-bus-width == 1 and spi-rx-bus-width > 1. The regression causes
      all reads with a dummy phase to become unavailable for such boards,
      leading to a usually slower 0-dummy-cycle read being selected.
      
      Most controllers' supports_op hooks call spi_mem_default_supports_op().
      In spi_mem_default_supports_op(), spi_mem_check_buswidth() is called to
      check if the buswidths for the op can actually be supported by the
      board's wiring. This wiring information comes from (among other things)
      the spi-{tx,rx}-bus-width DT properties. Based on these properties,
      SPI_TX_* or SPI_RX_* flags are set by of_spi_parse_dt().
      spi_mem_check_buswidth() then uses these flags to make the decision
      whether an op can be supported by the board's wiring (in a way,
      indirectly checking against spi-{rx,tx}-bus-width).
      
      Now the tricky bit here is that spi_mem_check_buswidth() does:
      
      	if (op->dummy.nbytes &&
      	    spi_check_buswidth_req(mem, op->dummy.buswidth, true))
      		return false;
      
      The true argument to spi_check_buswidth_req() means the op is treated as
      a TX op. For a board that has say 1-bit TX and 4-bit RX, a 4-bit dummy
      TX is considered as unsupported, and the op gets rejected.
      
      The commit being reverted uses the data buswidth for dummy buswidth. So
      for reads, the RX buswidth gets used for the dummy phase, uncovering
      this issue. In reality, a dummy phase is neither RX nor TX. As the name
      suggests, these are just dummy cycles that send or receive no data, and
      thus don't really need to have any buswidth at all.
      
      Ideally, dummy phases should not be checked against the board's wiring
      capabilities at all, and should only be sanity-checked for having a sane
      buswidth value. Since we are now at rc7 and such a change might
      introduce many unexpected bugs, revert the commit for now. It can be
      sent out later along with the spi_mem_check_buswidth() fix.
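
The rejection described above can be modeled with a minimal sketch. The struct and function names here (board_wiring, width_supported, dummy_phase_supported) are hypothetical stand-ins, not the spi-mem API; the only behavior taken from the text is that the dummy phase is checked as if it were a TX phase.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the wiring given by spi-{tx,rx}-bus-width. */
struct board_wiring {
    unsigned int tx_width;
    unsigned int rx_width;
};

/* A phase is supported if its width fits the wiring in that direction. */
static bool width_supported(const struct board_wiring *b,
                            unsigned int width, bool tx)
{
    assert(b != NULL);
    return width <= (tx ? b->tx_width : b->rx_width);
}

/* Mirrors the quoted check: a non-empty dummy phase is treated as TX. */
static bool dummy_phase_supported(const struct board_wiring *b,
                                  unsigned int dummy_nbytes,
                                  unsigned int dummy_width)
{
    if (dummy_nbytes && !width_supported(b, dummy_width, true))
        return false;
    return true;
}
```

With a board modeled as 1-bit TX and 4-bit RX, a 4-bit dummy phase is rejected while a 1-bit dummy phase passes, which is why taking the dummy buswidth from the data phase of a read makes the op unavailable on such boards.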
      
      Fixes: 98d1fb94 ("mtd: spi-nor: core: replace dummy buswidth from addr to data")
      Reported-by: Alexander Stein <alexander.stein@ew.tq-group.com>
      Closes: https://lore.kernel.org/linux-mtd/3342163.44csPzL39Z@steina-w/
      Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
      Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org>
      Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • hwmon: (tmp513) Fix division of negative numbers · 79fe53ed
      David Lechner authored
      
      [ Upstream commit e2c68cea ]
      
      Fix several issues with division of negative numbers in the tmp513
      driver.
      
      The docs on the DIV_ROUND_CLOSEST macro explain that dividing a negative
      value by an unsigned type is undefined behavior. The driver was doing
      this in several places, e.g. data->shunt_uohms has type u32. The
      actual "undefined" behavior is that both values are converted to
      unsigned before the division, for example:
      
          int ret = DIV_ROUND_CLOSEST(-100, 3U);
      
      results in ret == 1431655732 instead of -33.
      
      Furthermore the MILLI macro has a type of unsigned long. Multiplying a
      signed long by an unsigned long results in an unsigned long.
      
      So, we need to cast both MILLI and data->shunt_uohms to long when
      using the DIV_ROUND_CLOSEST macro.
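
The conversion problem can be reproduced outside the kernel with a minimal sketch. The two helpers below are hypothetical stand-ins, not the DIV_ROUND_CLOSEST macro itself, and the sketch assumes a platform where int and unsigned int are 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mixed-sign version: d is unsigned, so x is converted to unsigned
 * before the addition and the division (usual arithmetic conversions).
 */
static int32_t div_closest_mixed(int32_t x, uint32_t d)
{
    return (int32_t)((x + d / 2) / d);
}

/* Signed version: both operands are long, so the sign of x survives. */
static long div_closest_signed(long x, long d)
{
    assert(d > 0); /* this sketch only handles positive divisors */
    return x >= 0 ? (x + d / 2) / d : (x - d / 2) / d;
}
```

Calling div_closest_mixed(-100, 3U) gives 1431655732, the value quoted above, while div_closest_signed(-100, (long)3U) gives -33. Casting the unsigned operand to long at the call site is the same idea as the fix described in this commit.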
      
      Fixes: f07f9d24 ("hwmon: (tmp513) Use SI constants from units.h")
      Fixes: 59dfa75e ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.")
      Signed-off-by: David Lechner <dlechner@baylibre.com>
      Link: https://lore.kernel.org/r/20250114-fix-si-prefix-macro-sign-bugs-v1-1-696fd8d10f00@baylibre.com
      [groeck: Drop some continuation lines]
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • drm/v3d: Ensure job pointer is set to NULL after job completion · 2a1c88f7
      Maíra Canal authored
      
      [ Upstream commit e4b5ccd3 ]
      
      After a job completes, the corresponding pointer in the device must
      be set to NULL. Failing to do so triggers a warning when unloading
      the driver, as it appears the job is still active. To prevent this,
      assign the job pointer to NULL after completing the job, indicating
      the job has finished.
      
      Fixes: 14d1d190 ("drm/v3d: Remove the bad signaled() implementation.")
      Signed-off-by: Maíra Canal <mcanal@igalia.com>
      Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20250113154741.67520-1-mcanal@igalia.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net/mlx5: Clear port select structure when fail to create · efc92a26
      Mark Zhang authored
      
      [ Upstream commit 5641e82c ]
      
      Clear the port select structure on error so that no stale values are
      left after the definers are destroyed. This is needed because
      mlx5_lag_destroy_definers() always tries to destroy all lag definers in
      the tt_map, so in the flow below lag definers get double-destroyed and
      cause a kernel crash:
      
        mlx5_lag_port_sel_create()
          mlx5_lag_create_definers()
            mlx5_lag_create_definer()     <- Failed on tt 1
              mlx5_lag_destroy_definers() <- definers[tt=0] gets destroyed
        mlx5_lag_port_sel_create()
          mlx5_lag_create_definers()
            mlx5_lag_create_definer()     <- Failed on tt 0
              mlx5_lag_destroy_definers() <- definers[tt=0] gets double-destroyed
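
The double-destroy flow above can be modeled in a few lines. This is a sketch under stated assumptions: struct port_sel, NUM_TT, destroy_definers() and port_sel_create() are hypothetical stand-ins for the mlx5 structures, and the fail_at parameter exists only to force a failure at a chosen tt.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NUM_TT 4 /* hypothetical number of traffic types */

struct port_sel {
    int *definers[NUM_TT]; /* stand-in for the real definer objects */
};

/* Frees every definer but leaves the (now stale) pointers in place. */
static void destroy_definers(struct port_sel *ps)
{
    for (int i = 0; i < NUM_TT; i++)
        free(ps->definers[i]);
}

/* Creates definers in order; fail_at forces a failure at that index. */
static int port_sel_create(struct port_sel *ps, int fail_at)
{
    assert(ps != NULL);
    for (int i = 0; i < NUM_TT; i++) {
        if (i == fail_at)
            goto err;
        ps->definers[i] = malloc(sizeof(int));
        if (!ps->definers[i])
            goto err;
    }
    return 0;
err:
    destroy_definers(ps);
    memset(ps, 0, sizeof(*ps)); /* clear on error: no stale pointers */
    return -1;
}
```

Without the memset(), a first call failing at index 1 followed by a second call failing at index 0 would free definers[0] twice, which is the shape of the crash reported below.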
      
       Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
       Mem abort info:
         ESR = 0x0000000096000005
         EC = 0x25: DABT (current EL), IL = 32 bits
         SET = 0, FnV = 0
         EA = 0, S1PTW = 0
         FSC = 0x05: level 1 translation fault
       Data abort info:
         ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
         CM = 0, WnR = 0, TnD = 0, TagAccess = 0
         GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
       user pgtable: 64k pages, 48-bit VAs, pgdp=0000000112ce2e00
       [0000000000000008] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
       Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
       Modules linked in: iptable_raw bonding ip_gre ip6_gre gre ip6_tunnel tunnel6 geneve ip6_udp_tunnel udp_tunnel ipip tunnel4 ip_tunnel rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_ib(OE) ib_uverbs(OE) mlx5_fwctl(OE) fwctl(OE) mlx5_core(OE) mlxdevm(OE) ib_core(OE) mlxfw(OE) memtrack(OE) mlx_compat(OE) openvswitch nsh nf_conncount psample xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc netconsole overlay efi_pstore sch_fq_codel zram ip_tables crct10dif_ce qemu_fw_cfg fuse ipv6 crc_ccitt [last unloaded: mlx_compat(OE)]
        CPU: 3 UID: 0 PID: 217 Comm: kworker/u53:2 Tainted: G           OE      6.11.0+ #2
        Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
        Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
        Workqueue: mlx5_lag mlx5_do_bond_work [mlx5_core]
        pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
        pc : mlx5_del_flow_rules+0x24/0x2c0 [mlx5_core]
        lr : mlx5_lag_destroy_definer+0x54/0x100 [mlx5_core]
        sp : ffff800085fafb00
        x29: ffff800085fafb00 x28: ffff0000da0c8000 x27: 0000000000000000
        x26: ffff0000da0c8000 x25: ffff0000da0c8000 x24: ffff0000da0c8000
        x23: ffff0000c31f81a0 x22: 0400000000000000 x21: ffff0000da0c8000
        x20: 0000000000000000 x19: 0000000000000001 x18: 0000000000000000
        x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff8b0c9350
        x14: 0000000000000000 x13: ffff800081390d18 x12: ffff800081dc3cc0
        x11: 0000000000000001 x10: 0000000000000b10 x9 : ffff80007ab7304c
        x8 : ffff0000d00711f0 x7 : 0000000000000004 x6 : 0000000000000190
        x5 : ffff00027edb3010 x4 : 0000000000000000 x3 : 0000000000000000
        x2 : ffff0000d39b8000 x1 : ffff0000d39b8000 x0 : 0400000000000000
        Call trace:
         mlx5_del_flow_rules+0x24/0x2c0 [mlx5_core]
         mlx5_lag_destroy_definer+0x54/0x100 [mlx5_core]
         mlx5_lag_destroy_definers+0xa0/0x108 [mlx5_core]
         mlx5_lag_port_sel_create+0x2d4/0x6f8 [mlx5_core]
         mlx5_activate_lag+0x60c/0x6f8 [mlx5_core]
         mlx5_do_bond_work+0x284/0x5c8 [mlx5_core]
         process_one_work+0x170/0x3e0
         worker_thread+0x2d8/0x3e0
         kthread+0x11c/0x128
         ret_from_fork+0x10/0x20
        Code: a9025bf5 aa0003f6 a90363f7 f90023f9 (f9400400)
        ---[ end trace 0000000000000000 ]---
      
      Fixes: dc48516e ("net/mlx5: Lag, add support to create definers for LAG")
      Signed-off-by: Mark Zhang <markzhang@nvidia.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Reviewed-by: Mark Bloch <mbloch@nvidia.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net/mlx5: Fix RDMA TX steering prio · edb43b46
      Patrisious Haddad authored
      
      [ Upstream commit c08d3e62 ]
      
      User-added steering rules at RDMA_TX were being added to the first prio,
      which is the counters prio.
      Fix that so that they are correctly added to the BYPASS_PRIO instead.
      
      Fixes: 24670b1a ("net/mlx5: Add support for RDMA TX steering")
      Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
      Reviewed-by: Mark Bloch <mbloch@nvidia.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: xilinx: axienet: Fix IRQ coalescing packet count overflow · 207c81e2
      Sean Anderson authored
      
      [ Upstream commit c17ff476 ]
      
      If coalesce_count is greater than 255 it will not fit in the register and
      will overflow. This can be reproduced by running
      
          # ethtool -C ethX rx-frames 256
      
      which will result in a timeout of 0us instead. Fix this by checking for
      invalid values and reporting an error.
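
The truncation and the check can be sketched as below. The helper names and COALESCE_COUNT_MAX are hypothetical; the only facts taken from the text are that values above 255 do not fit the register field and that the fix rejects them. Returning -EINVAL is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define COALESCE_COUNT_MAX 255u /* largest value the 8-bit field holds */

/* Unchecked packing: anything above 255 silently wraps (256 becomes 0). */
static uint8_t pack_unchecked(unsigned int count)
{
    return (uint8_t)count;
}

/* Checked packing: reject values that do not fit instead of wrapping. */
static int pack_checked(unsigned int count, uint8_t *field)
{
    assert(field != NULL);
    if (count > COALESCE_COUNT_MAX)
        return -EINVAL;
    *field = (uint8_t)count;
    return 0;
}
```

In this model pack_unchecked(256) yields 0, while pack_checked(256, &field) returns an error and leaves the field untouched.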
      
      Fixes: 8a3b7a25 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
      Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
      Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
      Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
      Link: https://patch.msgid.link/20250113163001.2335235-1-sean.anderson@linux.dev
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • nfp: bpf: prevent integer overflow in nfp_bpf_event_output() · c385389a
      Dan Carpenter authored
      
      [ Upstream commit 16ebb6f5 ]
      
      The "sizeof(struct cmsg_bpf_event) + pkt_size + data_size" math could
      potentially have an integer wrapping bug on 32-bit systems.  Check for
      this and return an error.
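
A wrap-safe three-term addition can be sketched as follows. This is not the check used in the actual patch; add3_u32() is a hypothetical helper that uses uint32_t to model a 32-bit size_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Computes a + b + c into *sum. Returns false, leaving *sum untouched,
 * if any intermediate result would wrap around UINT32_MAX.
 */
static bool add3_u32(uint32_t a, uint32_t b, uint32_t c, uint32_t *sum)
{
    assert(sum != NULL);
    if (b > UINT32_MAX - a)
        return false;
    a += b;
    if (c > UINT32_MAX - a)
        return false;
    *sum = a + c;
    return true;
}
```

Here a would play the role of the header size, b of pkt_size and c of data_size; a caller would return an error when the helper reports a wrap instead of comparing a wrapped total against the buffer length.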
      
      Fixes: 9816dd35 ("nfp: bpf: perf event output helpers support")
      Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
      Link: https://patch.msgid.link/6074805b-e78d-4b8a-bf05-e929b5377c28@stanley.mountain
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • gtp: Destroy device along with udp socket's netns dismantle. · efec287c
      Kuniyuki Iwashima authored
      
      [ Upstream commit eb28fd76 ]
      
      gtp_newlink() links the device to a list in dev_net(dev) instead of
      src_net, where a udp tunnel socket is created.
      
      Even when src_net is removed, the device stays alive on dev_net(dev).
      Then, removing src_net triggers the splat below. [0]
      
      In this example, gtp0 is created in ns2, and the udp socket is created
      in ns1.
      
        ip netns add ns1
        ip netns add ns2
        ip -n ns1 link add netns ns2 name gtp0 type gtp role sgsn
        ip netns del ns1
      
      Let's link the device to the socket's netns instead.
      
      Now, gtp_net_exit_batch_rtnl() needs another netdev iteration to remove
      all gtp devices in the netns.
      
      [0]:
      ref_tracker: net notrefcnt@000000003d6e7d05 has 1/2 users at
           sk_alloc (./include/net/net_namespace.h:345 net/core/sock.c:2236)
           inet_create (net/ipv4/af_inet.c:326 net/ipv4/af_inet.c:252)
           __sock_create (net/socket.c:1558)
           udp_sock_create4 (net/ipv4/udp_tunnel_core.c:18)
           gtp_create_sock (./include/net/udp_tunnel.h:59 drivers/net/gtp.c:1423)
           gtp_create_sockets (drivers/net/gtp.c:1447)
           gtp_newlink (drivers/net/gtp.c:1507)
           rtnl_newlink (net/core/rtnetlink.c:3786 net/core/rtnetlink.c:3897 net/core/rtnetlink.c:4012)
           rtnetlink_rcv_msg (net/core/rtnetlink.c:6922)
           netlink_rcv_skb (net/netlink/af_netlink.c:2542)
           netlink_unicast (net/netlink/af_netlink.c:1321 net/netlink/af_netlink.c:1347)
           netlink_sendmsg (net/netlink/af_netlink.c:1891)
           ____sys_sendmsg (net/socket.c:711 net/socket.c:726 net/socket.c:2583)
           ___sys_sendmsg (net/socket.c:2639)
           __sys_sendmsg (net/socket.c:2669)
           do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
      
      WARNING: CPU: 1 PID: 60 at lib/ref_tracker.c:179 ref_tracker_dir_exit (lib/ref_tracker.c:179)
      Modules linked in:
      CPU: 1 UID: 0 PID: 60 Comm: kworker/u16:2 Not tainted 6.13.0-rc5-00147-g4c1224501e9d #5
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      Workqueue: netns cleanup_net
      RIP: 0010:ref_tracker_dir_exit (lib/ref_tracker.c:179)
      Code: 00 00 00 fc ff df 4d 8b 26 49 bd 00 01 00 00 00 00 ad de 4c 39 f5 0f 85 df 00 00 00 48 8b 74 24 08 48 89 df e8 a5 cc 12 02 90 <0f> 0b 90 48 8d 6b 44 be 04 00 00 00 48 89 ef e8 80 de 67 ff 48 89
      RSP: 0018:ff11000009a07b60 EFLAGS: 00010286
      RAX: 0000000000002bd3 RBX: ff1100000f4e1aa0 RCX: 1ffffffff0e40ac6
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff8423ee3c
      RBP: ff1100000f4e1af0 R08: 0000000000000001 R09: fffffbfff0e395ae
      R10: 0000000000000001 R11: 0000000000036001 R12: ff1100000f4e1af0
      R13: dead000000000100 R14: ff1100000f4e1af0 R15: dffffc0000000000
      FS:  0000000000000000(0000) GS:ff1100006ce80000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f9b2464bd98 CR3: 0000000005286005 CR4: 0000000000771ef0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
      PKRU: 55555554
      Call Trace:
       <TASK>
       ? __warn (kernel/panic.c:748)
       ? ref_tracker_dir_exit (lib/ref_tracker.c:179)
       ? report_bug (lib/bug.c:201 lib/bug.c:219)
       ? handle_bug (arch/x86/kernel/traps.c:285)
       ? exc_invalid_op (arch/x86/kernel/traps.c:309 (discriminator 1))
       ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621)
       ? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/irqflags.h:42 ./arch/x86/include/asm/irqflags.h:97 ./arch/x86/include/asm/irqflags.h:155 ./include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194)
       ? ref_tracker_dir_exit (lib/ref_tracker.c:179)
       ? __pfx_ref_tracker_dir_exit (lib/ref_tracker.c:158)
       ? kfree (mm/slub.c:4613 mm/slub.c:4761)
       net_free (net/core/net_namespace.c:476 net/core/net_namespace.c:467)
       cleanup_net (net/core/net_namespace.c:664 (discriminator 3))
       process_one_work (kernel/workqueue.c:3229)
       worker_thread (kernel/workqueue.c:3304 kernel/workqueue.c:3391)
       kthread (kernel/kthread.c:389)
       ret_from_fork (arch/x86/kernel/process.c:147)
       ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
       </TASK>
      
      Fixes: 459aa660 ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
      Reported-by: Xiao Liang <shaw.leon@gmail.com>
      Closes: https://lore.kernel.org/netdev/20250104125732.17335-1-shaw.leon@gmail.com/
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • gtp: Use for_each_netdev_rcu() in gtp_genl_dump_pdp(). · c91e6946
      Kuniyuki Iwashima authored
      
      [ Upstream commit 46841c70 ]
      
      gtp_newlink() links the gtp device to a list in dev_net(dev).
      
      However, even after the gtp device is moved to another netns,
      it stays on the list but should be invisible.
      
      Let's use for_each_netdev_rcu() for netdev traversal in
      gtp_genl_dump_pdp().
      
      Note that gtp_dev_list is no longer used under RCU, so list
      helpers are converted to the non-RCU variant.
      
      Fixes: 459aa660 ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
      Reported-by: Xiao Liang <shaw.leon@gmail.com>
      Closes: https://lore.kernel.org/netdev/CABAhCOQdBL6h9M2C+kd+bGivRJ9Q72JUxW+-gur0nub_=PmFPA@mail.gmail.com/
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • gtp: use exit_batch_rtnl() method · a3fdd5f3
      Eric Dumazet authored
      
      [ Upstream commit 6eedda01 ]
      
      exit_batch_rtnl() is called while RTNL is held,
      and devices to be unregistered can be queued in the dev_kill_list.
      
      This saves one rtnl_lock()/rtnl_unlock() pair per netns
      and one unregister_netdevice_many() call per netns.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Antoine Tenart <atenart@kernel.org>
      Link: https://lore.kernel.org/r/20240206144313.2050392-8-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Stable-dep-of: 46841c70 ("gtp: Use for_each_netdev_rcu() in gtp_genl_dump_pdp().")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: add exit_batch_rtnl() method · 760f415e
      Eric Dumazet authored
      
      [ Upstream commit fd4f101e ]
      
      Many (struct pernet_operations)->exit_batch() methods have
      to acquire rtnl.
      
      In the presence of rtnl mutex pressure, this makes cleanup_net()
      very slow.
      
      This patch adds a new exit_batch_rtnl() method to reduce the
      number of rtnl acquisitions from cleanup_net().
      
      exit_batch_rtnl() handlers are called while rtnl is locked,
      and devices to be killed can be queued in a list provided
      as their second argument.
      
      A single unregister_netdevice_many() is called right
      before rtnl is released.
      
      exit_batch_rtnl() handlers are called before ->exit() and
      ->exit_batch() handlers.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Antoine Tenart <atenart@kernel.org>
      Link: https://lore.kernel.org/r/20240206144313.2050392-2-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Stable-dep-of: 46841c70 ("gtp: Use for_each_netdev_rcu() in gtp_genl_dump_pdp().")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • pktgen: Avoid out-of-bounds access in get_imix_entries · e5d24a70
      Artem Chernyshev authored
      
      [ Upstream commit 76201b59 ]
      
      Passing a sufficient amount of imix entries leads to invalid access to the
      pkt_dev->imix_entries array because of the incorrect boundary check.
      
      UBSAN: array-index-out-of-bounds in net/core/pktgen.c:874:24
      index 20 is out of range for type 'imix_pkt [20]'
      CPU: 2 PID: 1210 Comm: bash Not tainted 6.10.0-rc1 #121
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
      Call Trace:
      <TASK>
      dump_stack_lvl lib/dump_stack.c:117
      __ubsan_handle_out_of_bounds lib/ubsan.c:429
      get_imix_entries net/core/pktgen.c:874
      pktgen_if_write net/core/pktgen.c:1063
      pde_write fs/proc/inode.c:334
      proc_reg_write fs/proc/inode.c:346
      vfs_write fs/read_write.c:593
      ksys_write fs/read_write.c:644
      do_syscall_64 arch/x86/entry/common.c:83
      entry_SYSCALL_64_after_hwframe arch/x86/entry/entry_64.S:130
      
      Found by Linux Verification Center (linuxtesting.org) with SVACE.
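
The boundary check described above can be sketched as follows. The struct and function are hypothetical, the capacity of 20 is taken from the UBSAN report, and the -E2BIG return value is an assumption of this sketch. The point is that the index is tested before the write, so all 20 slots can be filled but a 21st entry is refused.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_ENTRIES 20 /* matches 'imix_pkt [20]' in the report */

struct entry_table {
    unsigned int n;                 /* number of entries stored so far */
    unsigned int size[MAX_ENTRIES]; /* stand-in for the imix entries */
};

/* Appends one entry, checking the index before touching the array. */
static int table_append(struct entry_table *t, unsigned int size)
{
    assert(t != NULL);
    if (t->n >= MAX_ENTRIES)
        return -E2BIG;
    t->size[t->n++] = size;
    return 0;
}
```

A check placed after the write, or one that compares with > instead of >=, would let index 20 through, which is the access UBSAN flags.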
      
      Fixes: 52a62f86 ("pktgen: Parse internet mix (imix) input")
      Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
      [ fp: allow to fill the array completely; minor changelog cleanup ]
      Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • openvswitch: fix lockup on tx to unregistering netdev with carrier · ea9e9903
      Ilya Maximets authored
      
      [ Upstream commit 47e55e4b ]
      
      The commit in the Fixes tag attempted to fix the issue in the following
      sequence of calls:
      
          do_output
          -> ovs_vport_send
             -> dev_queue_xmit
                -> __dev_queue_xmit
                   -> netdev_core_pick_tx
                      -> skb_tx_hash
      
      When a device is unregistering, 'dev->real_num_tx_queues' goes to
      zero and the 'while (unlikely(hash >= qcount))' loop inside
      'skb_tx_hash' becomes infinite, locking up the core forever.
      
      But unfortunately, checking just the carrier status is not enough to
      fix the issue, because some devices may still be in unregistering
      state while reporting carrier status OK.
      
      One example of such a device is net/dummy.  It sets carrier ON
      on start, but it doesn't implement .ndo_stop to set the carrier off.
      And it makes sense, because dummy doesn't really have a carrier.
      Therefore, while this device is unregistering, it's still easy to hit
      the infinite loop in the skb_tx_hash() from the OVS datapath.  There
      might be other drivers that do the same, but dummy by itself is
      important for the OVS ecosystem, because it is frequently used as a
      packet sink for tcpdump while debugging OVS deployments.  And when the
      issue is hit, the only way to recover is to reboot.
      
      Fix that by also checking if the device is running.  The running
      state is handled by the net core during unregistering, so it covers
      unregistering case better, and we don't really need to send packets
      to devices that are not running anyway.
      
      While only checking the running state might be enough, the carrier
      check is preserved.  The running and the carrier states seem disjoint
      throughout the code and different drivers.  And other core functions
      like __dev_direct_xmit() check both before attempting to transmit
      a packet.  So, it seems safer to check both flags in OVS as well.
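
The lockup and the guard can be modeled with a small sketch. struct fake_dev, pick_queue() and output() are hypothetical stand-ins for the net_device state, skb_tx_hash() and the OVS send path; only the loop shape and the "running and carrier" condition come from the text.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the device state that matters here. */
struct fake_dev {
    bool running;
    bool carrier_ok;
    unsigned int real_num_tx_queues;
};

/* Same shape as the loop quoted above: reduce hash below qcount. */
static int pick_queue(const struct fake_dev *dev, unsigned int hash)
{
    unsigned int qcount = dev->real_num_tx_queues;

    assert(qcount > 0); /* with qcount == 0 the loop would never exit */
    while (hash >= qcount)
        hash -= qcount;
    return (int)hash;
}

/* Guarded send: drop (-1) unless the device is running with carrier. */
static int output(const struct fake_dev *dev, unsigned int hash)
{
    if (!(dev->running && dev->carrier_ok))
        return -1;
    return pick_queue(dev, hash);
}
```

A device modeled as { running = false, carrier_ok = true, real_num_tx_queues = 0 } corresponds to the unregistering net/dummy case: the carrier check alone would let it through to the loop, while the combined check drops the packet first.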
      
      Fixes: 066b8678 ("net: openvswitch: fix race on port output")
      Reported-by: Friedrich Weber <f.weber@proxmox.com>
      Closes: https://mail.openvswitch.org/pipermail/ovs-discuss/2025-January/053423.html
      Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
      Tested-by: Friedrich Weber <f.weber@proxmox.com>
      Reviewed-by: Aaron Conole <aconole@redhat.com>
      Link: https://patch.msgid.link/20250109122225.4034688-1-i.maximets@ovn.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bpf: Fix bpf_sk_select_reuseport() memory leak · d0a3b3d1
      Michal Luczaj authored
      
      [ Upstream commit b3af6092 ]
      
      As pointed out in the original comment, lookup in sockmap can return a TCP
      ESTABLISHED socket. Such TCP socket may have had SO_ATTACH_REUSEPORT_EBPF
      set before it was ESTABLISHED. In other words, a non-NULL sk_reuseport_cb
      does not imply a non-refcounted socket.
      
      Drop sk's reference in both error paths.
      
      unreferenced object 0xffff888101911800 (size 2048):
        comm "test_progs", pid 44109, jiffies 4297131437
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          80 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace (crc 9336483b):
          __kmalloc_noprof+0x3bf/0x560
          __reuseport_alloc+0x1d/0x40
          reuseport_alloc+0xca/0x150
          reuseport_attach_prog+0x87/0x140
          sk_reuseport_attach_bpf+0xc8/0x100
          sk_setsockopt+0x1181/0x1990
          do_sock_setsockopt+0x12b/0x160
          __sys_setsockopt+0x7b/0xc0
          __x64_sys_setsockopt+0x1b/0x30
          do_syscall_64+0x93/0x180
          entry_SYSCALL_64_after_hwframe+0x76/0x7e
      
      Fixes: 64d85290 ("bpf: Allow bpf_map_lookup_elem for SOCKMAP and SOCKHASH")
      Signed-off-by: Michal Luczaj <mhal@rbox.co>
      Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org>
      Link: https://patch.msgid.link/20250110-reuseport-memleak-v1-1-fa1ddab0adfe@rbox.co
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field() · 07524817
      Sudheer Kumar Doredla authored
      
      [ Upstream commit 03d120f2 ]
      
      CPSW ALE has 75-bit ALE entries stored across three 32-bit words.
      The cpsw_ale_get_field() and cpsw_ale_set_field() functions support
      ALE field entries spanning up to two words at the most.
      
      The cpsw_ale_get_field() and cpsw_ale_set_field() functions work as
      expected when an ALE field spans word1 and word2, but fail when
      an ALE field spans word2 and word3.
      
      For example, while reading an ALE field that spans word2 and word3
      (i.e. bits 62 to 64), the word3 data is shifted to an incorrect position
      because the index becomes zero while flipping.
      The same issue occurs when setting an ALE entry.
      
      This issue has not been seen in practice but will be an issue in the future
      if the driver supports accessing ALE fields spanning word2 and word3.
      
      Fix the methods to handle getting/setting fields spanning up to two words.
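
Reading a field that straddles two 32-bit words can be sketched as below. This is not the driver's code: get_field() is a hypothetical helper, and it assumes words[0] holds bits 0 to 31, whereas the driver flips the word index. The point is only that the high word must be shifted by the number of bits taken from the low word, whichever pair of words is involved.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extracts `bits` bits (1..32) starting at bit `start` from a 96-bit
 * entry stored as three 32-bit words, lowest bits in words[0].
 */
static uint32_t get_field(const uint32_t words[3], unsigned int start,
                          unsigned int bits)
{
    unsigned int idx = start / 32;
    unsigned int off = start % 32;
    uint64_t v;

    assert(bits >= 1 && bits <= 32 && start + bits <= 96);

    v = words[idx] >> off;
    if (off + bits > 32) /* field continues into the next word */
        v |= (uint64_t)words[idx + 1] << (32 - off);

    return (uint32_t)(v & (bits == 32 ? 0xFFFFFFFFu : (1u << bits) - 1));
}
```

With bits 62, 63 and 64 set, get_field(words, 62, 3) returns 7; the same call pattern works for a field spanning the first and second words, e.g. get_field(words, 31, 2).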
      
      Fixes: b685f1a5 ("net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()/cpsw_ale_set_field()")
      Signed-off-by: Sudheer Kumar Doredla <s-doredla@ti.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Reviewed-by: Roger Quadros <rogerq@kernel.org>
      Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
      Link: https://patch.msgid.link/20250108172433.311694-1-s-doredla@ti.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  2. Jan 19, 2025
  3. Jan 17, 2025