- Aug 31, 2022
-
-
Kuniyuki Iwashima authored
[ Upstream commit 7de6d09f ] While reading sysctl_optmem_max, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by:
Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Martin KaFai Lau authored
[ Upstream commit 9e838b02 ] sk_storage_charge() is the only user of omem_charge(). This patch simplifies it by folding omem_charge() into sk_storage_charge(). Signed-off-by:
Martin KaFai Lau <kafai@fb.com> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Song Liu <songliubraving@fb.com> Acked-by:
KP Singh <kpsingh@google.com> Link: https://lore.kernel.org/bpf/20201112211301.2586255-1-kafai@fb.com Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Kuniyuki Iwashima authored
[ Upstream commit 6bae8ceb ] While reading rs->interval and rs->burst, they can be changed concurrently via sysctl (e.g. net_ratelimit_state). Thus, we need to add READ_ONCE() to their readers. Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by:
Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Kuniyuki Iwashima authored
[ Upstream commit 61adf447 ] While reading netdev_tstamp_prequeue, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 3b098e2d ("net: Consistent skb timestamping") Signed-off-by:
Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Kuniyuki Iwashima authored
[ Upstream commit 5dcd08cd ] While reading netdev_max_backlog, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. While at it, we remove the unnecessary spaces in the doc. Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by:
Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Kuniyuki Iwashima authored
[ Upstream commit bf955b5a ] While reading weight_p, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Also, dev_[rt]x_weight can be read/written at the same time. So, we need to use READ_ONCE() and WRITE_ONCE() for its access. Moreover, to use the same weight_p while changing dev_[rt]x_weight, we add a mutex in proc_do_dev_weight(). Fixes: 3d48b53f ("net: dev_weight: TX/RX orthogonality") Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by:
Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Kuniyuki Iwashima authored
[ Upstream commit 1227c177 ] While reading sysctl_[rw]mem_(max|default), they can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by:
Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Kuniyuki Iwashima authored
[ Upstream commit 02739545 ] While reading these sysctl variables, they can be changed concurrently. Thus, we need to add READ_ONCE() to their readers. - .sysctl_rmem - .sysctl_rwmem - .sysctl_rmem_offset - .sysctl_wmem_offset - sysctl_tcp_rmem[1, 2] - sysctl_tcp_wmem[1, 2] - sysctl_decnet_rmem[1] - sysctl_decnet_wmem[1] - sysctl_tipc_rmem[1] Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by:
Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Eric Dumazet authored
[ Upstream commit 240bfd13 ] tcp_grow_window() is using skb->len/skb->truesize to increase tp->rcv_ssthresh which has a direct impact on advertized window sizes. We added TCP coalescing in linux-3.4 & linux-3.5: Instead of storing skbs with one or two MSS in receive queue (or OFO queue), we try to append segments together to reduce memory overhead. High performance network drivers tend to cook skb with 3 parts : 1) sk_buff structure (256 bytes) 2) skb->head contains room to copy headers as needed, and skb_shared_info 3) page fragment(s) containing the ~1514 bytes frame (or more depending on MTU) Once coalesced into a previous skb, 1) and 2) are freed. We can therefore tweak the way we compute len/truesize ratio knowing that skb->truesize is inflated by 1) and 2) soon to be freed. This is done only for in-order skb, or skb coalesced into OFO queue. The result is that low rate flows no longer pay the memory price of having low GRO aggregation factor. Same result for drivers not using GRO. This is critical to allow a big enough receiver window, typically tcp_rmem[2] / 2. We have been using this at Google for about 5 years, it is due time to make it upstream. Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by:
Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Pablo Neira Ayuso authored
[ Upstream commit e02f0d39 ] Update nft_data_init() to report EINVAL if chain is already bound. Fixes: d0e2c7de ("netfilter: nf_tables: add NFT_CHAIN_BINDING") Reported-by:
Gwangun Jung <exsociety@gmail.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Pablo Neira Ayuso authored
[ Upstream commit f323ef3a ] Extend struct nft_data_desc to add a flag field that specifies nft_data_init() is being called for set element data. Use it to disallow jump to implicit chain from set element, only jump to chain via immediate expression is allowed. Fixes: d0e2c7de ("netfilter: nf_tables: add NFT_CHAIN_BINDING") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Pablo Neira Ayuso authored
[ Upstream commit 341b6941 ] Instead of parsing the data and then validate that type and length are correct, pass a description of the expected data so it can be validated upfront before parsing it to bail out earlier. This patch adds a new .size field to specify the maximum size of the data area. The .len field is optional and it is used as an input/output field, it provides the specific length of the expected data in the input path. If then .len field is not specified, then obtained length from the netlink attribute is stored. This is required by cmp, bitwise, range and immediate, which provide no netlink attribute that describes the data length. The immediate expression uses the destination register type to infer the expected data type. Relying on opencoded validation of the expected data might lead to subtle bugs as described in 7e6bc1f6 ("netfilter: nf_tables: stricter validation of element data"). Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jeremy Sowden authored
[ Upstream commit 00bd4352 ] Replace two labels (`err1` and `err2`) with more informative ones. Signed-off-by:
Jeremy Sowden <jeremy@azazel.net> Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Pablo Neira Ayuso authored
[ Upstream commit 23f68d46 ] Allow up to 16-byte comparisons with a new cmp fast version. Use two 64-bit words and calculate the mask representing the bits to be compared. Make sure the comparison is 64-bit aligned and avoid out-of-bound memory access on registers. Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Pablo Neira Ayuso authored
[ Upstream commit 4765473f ] Add function to consolidate verdict tracing. Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Colin Ian King authored
[ Upstream commit 626899a0 ] The variable err is being assigned a value that is never read, the same error number is being returned at the error return path via label err1. Clean up the code by removing the assignment. Addresses-Coverity: ("Unused value") Signed-off-by:
Colin Ian King <colin.king@canonical.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Pablo Neira Ayuso authored
[ Upstream commit 01e4092d ] Only allow to use this expression from NFPROTO_NETDEV family. Fixes: af308b94 ("netfilter: nf_tables: add tunnel support") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Pablo Neira Ayuso authored
[ Upstream commit 5f3b7aae ] As it was originally intended, restrict extension to supported families. Fixes: b96af92d ("netfilter: nf_tables: implement Passive OS fingerprint module in nft_osf") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Pablo Neira Ayuso authored
[ Upstream commit 43eb8949 ] Error might occur later in the nf_tables_addchain() codepath, enable static key only after transaction has been created. Fixes: 9f08ea84 ("netfilter: nf_tables: keep chain counters away from hot path") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Pablo Neira Ayuso authored
[ Upstream commit 7044ab28 ] Instead report ERANGE if csum_offset is too long, and EOPNOTSUPP if type is not support. Fixes: 7ec3f7b4 ("netfilter: nft_payload: add packet mangling support") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Pablo Neira Ayuso authored
[ Upstream commit 94254f99 ] Instead of offset and length are truncation to u8, report ERANGE. Fixes: 96518518 ("netfilter: add nftables") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Vikas Gupta authored
[ Upstream commit 09a89cc5 ] There are 2 issues: 1. We should decrement hw_resc->max_nqs instead of hw_resc->max_irqs with the number of NQs assigned to the VFs. The IRQs are fixed on each function and cannot be re-assigned. Only the NQs are being assigned to the VFs. 2. vf_msix is the total number of NQs to be assigned to the VFs. So we should decrement vf_msix from hw_resc->max_nqs. Fixes: b16b6891 ("bnxt_en: Add SR-IOV support for 57500 chips.") Signed-off-by:
Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by:
Michael Chan <michael.chan@broadcom.com> Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Florian Westphal authored
[ Upstream commit 7997eff8 ] Harshit Mogalapalli says: In ebt_do_table() function dereferencing 'private->hook_entry[hook]' can lead to NULL pointer dereference. [..] Kernel panic: general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f] [..] RIP: 0010:ebt_do_table+0x1dc/0x1ce0 Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 5c 16 00 00 48 b8 00 00 00 00 00 fc ff df 49 8b 6c df 08 48 8d 7d 2c 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 88 [..] Call Trace: nf_hook_slow+0xb1/0x170 __br_forward+0x289/0x730 maybe_deliver+0x24b/0x380 br_flood+0xc6/0x390 br_dev_xmit+0xa2e/0x12c0 For some reason ebtables rejects blobs that provide entry points that are not supported by the table, but what it should instead reject is the opposite: blobs that DO NOT provide an entry point supported by the table. t->valid_hooks is the bitmask of hooks (input, forward ...) that will see packets. Providing an entry point that is not support is harmless (never called/used), but the inverse isn't: it results in a crash because the ebtables traverser doesn't expect a NULL blob for a location its receiving packets for. Instead of fixing all the individual checks, do what iptables is doing and reject all blobs that differ from the expected hooks. Fixes: 1da177e4 ("Linux-2.6.12-rc2") Reported-by:
Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Reported-by:
syzkaller <syzkaller@googlegroups.com> Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Maciej Żenczykowski authored
[ Upstream commit 4b2e3a17 ] Looks to have been left out in an oversight. Cc: Mahesh Bandewar <maheshb@google.com> Cc: Sainath Grandhi <sainath.grandhi@intel.com> Fixes: 235a9d89 ('ipvtap: IP-VLAN based tap driver') Signed-off-by:
Maciej Żenczykowski <maze@google.com> Link: https://lore.kernel.org/r/20220821130808.12143-1-zenczykowski@gmail.com Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jonathan Toppins authored
[ Upstream commit d745b506 ] This is caused by the global variable ad_ticks_per_sec being zero as demonstrated by the reproducer script discussed below. This causes all timer values in __ad_timer_to_ticks to be zero, resulting in the periodic timer to never fire. To reproduce: Run the script in `tools/testing/selftests/drivers/net/bonding/bond-break-lacpdu-tx.sh` which puts bonding into a state where it never transmits LACPDUs. line 44: ip link add fbond type bond mode 4 miimon 200 \ xmit_hash_policy 1 ad_actor_sys_prio 65535 lacp_rate fast setting bond param: ad_actor_sys_prio given: params.ad_actor_system = 0 call stack: bond_option_ad_actor_sys_prio() -> bond_3ad_update_ad_actor_settings() -> set ad.system.sys_priority = bond->params.ad_actor_sys_prio -> ad.system.sys_mac_addr = bond->dev->dev_addr; because params.ad_actor_system == 0 results: ad.system.sys_mac_addr = bond->dev->dev_addr line 48: ip link set fbond address 52:54:00:3B:7C:A6 setting bond MAC addr call stack: bond->dev->dev_addr = new_mac line 52: ip link set fbond type bond ad_actor_sys_prio 65535 setting bond param: ad_actor_sys_prio given: params.ad_actor_system = 0 call stack: bond_option_ad_actor_sys_prio() -> bond_3ad_update_ad_actor_settings() -> set ad.system.sys_priority = bond->params.ad_actor_sys_prio -> ad.system.sys_mac_addr = bond->dev->dev_addr; because params.ad_actor_system == 0 results: ad.system.sys_mac_addr = bond->dev->dev_addr line 60: ip link set veth1-bond down master fbond given: params.ad_actor_system = 0 params.mode = BOND_MODE_8023AD ad.system.sys_mac_addr == bond->dev->dev_addr call stack: bond_enslave -> bond_3ad_initialize(); because first slave -> if ad.system.sys_mac_addr != bond->dev->dev_addr return results: Nothing is run in bond_3ad_initialize() because dev_addr equals sys_mac_addr leaving the global ad_ticks_per_sec zero as it is never initialized anywhere else. The if check around the contents of bond_3ad_initialize() is no longer needed due to commit 5ee14e6d ("bonding: 3ad: apply ad_actor settings changes immediately") which sets ad.system.sys_mac_addr if any one of the bonding parameters whos set function calls bond_3ad_update_ad_actor_settings(). This is because if ad.system.sys_mac_addr is zero it will be set to the current bond mac address, this causes the if check to never be true. Fixes: 5ee14e6d ("bonding: 3ad: apply ad_actor settings changes immediately") Signed-off-by:
Jonathan Toppins <jtoppins@redhat.com> Acked-by:
Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Sergei Antonov authored
[ Upstream commit 0ee7828d ] Since priv->rx_mapping[i] is maped in moxart_mac_open(), we should unmap it from moxart_mac_stop(). Fixes 2 warnings. 1. During error unwinding in moxart_mac_probe(): "goto init_fail;", then moxart_mac_free_memory() calls dma_unmap_single() with priv->rx_mapping[i] pointers zeroed. WARNING: CPU: 0 PID: 1 at kernel/dma/debug.c:963 check_unmap+0x704/0x980 DMA-API: moxart-ethernet 92000000.mac: device driver tries to free DMA memory it has not allocated [device address=0x0000000000000000] [size=1600 bytes] CPU: 0 PID: 1 Comm: swapper Not tainted 5.19.0+ #60 Hardware name: Generic DT based system unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x34/0x44 dump_stack_lvl from __warn+0xbc/0x1f0 __warn from warn_slowpath_fmt+0x94/0xc8 warn_slowpath_fmt from check_unmap+0x704/0x980 check_unmap from debug_dma_unmap_page+0x8c/0x9c debug_dma_unmap_page from moxart_mac_free_memory+0x3c/0xa8 moxart_mac_free_memory from moxart_mac_probe+0x190/0x218 moxart_mac_probe from platform_probe+0x48/0x88 platform_probe from really_probe+0xc0/0x2e4 2. After commands: ip link set dev eth0 down ip link set dev eth0 up WARNING: CPU: 0 PID: 55 at kernel/dma/debug.c:570 add_dma_entry+0x204/0x2ec DMA-API: moxart-ethernet 92000000.mac: cacheline tracking EEXIST, overlapping mappings aren't supported CPU: 0 PID: 55 Comm: ip Not tainted 5.19.0+ #57 Hardware name: Generic DT based system unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x34/0x44 dump_stack_lvl from __warn+0xbc/0x1f0 __warn from warn_slowpath_fmt+0x94/0xc8 warn_slowpath_fmt from add_dma_entry+0x204/0x2ec add_dma_entry from dma_map_page_attrs+0x110/0x328 dma_map_page_attrs from moxart_mac_open+0x134/0x320 moxart_mac_open from __dev_open+0x11c/0x1ec __dev_open from __dev_change_flags+0x194/0x22c __dev_change_flags from dev_change_flags+0x14/0x44 dev_change_flags from devinet_ioctl+0x6d4/0x93c devinet_ioctl from inet_ioctl+0x1ac/0x25c v1 -> v2: Extraneous change removed. Fixes: 6c821bd9 ("net: Add MOXA ART SoCs ethernet driver") Signed-off-by:
Sergei Antonov <saproj@gmail.com> Reviewed-by:
Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220819110519.1230877-1-saproj@gmail.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Alex Elder authored
[ Upstream commit b8d43803 ] In ipa_smem_init(), a Qualcomm SMEM region is allocated (if needed) and then its virtual address is fetched using qcom_smem_get(). The physical address associated with that region is also fetched. The physical address is adjusted so that it is page-aligned, and an attempt is made to update the size of the region to compensate for any non-zero adjustment. But that adjustment isn't done properly. The physical address is aligned twice, and as a result the size is never actually adjusted. Fix this by *not* aligning the "addr" local variable, and instead making the "phys" local variable be the adjusted "addr" value. Fixes: a0036bb4 ("net: ipa: define SMEM memory region for IPA") Signed-off-by:
Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20220818134206.567618-1-elder@linaro.org Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Vlad Buslov authored
[ Upstream commit f37044fd ] When querying mlx5 non-uplink representors capabilities with ethtool rx-vlan-offload is marked as "off [fixed]". However, it is actually always enabled because mlx5e_params->vlan_strip_disable is 0 by default when initializing struct mlx5e_params instance. Fix the issue by explicitly setting the vlan_strip_disable to 'true' for non-uplink representors. Fixes: cb67b832 ("net/mlx5e: Introduce SRIOV VF representors") Signed-off-by:
Vlad Buslov <vladbu@nvidia.com> Reviewed-by:
Roi Dayan <roid@nvidia.com> Signed-off-by:
Saeed Mahameed <saeedm@nvidia.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Maciej Fijalkowski authored
[ Upstream commit 5a42f112 ] Fix the following scenario: 1. ethtool -L $IFACE rx 8 tx 96 2. xdpsock -q 10 -t -z Above refers to a case where user would like to attach XSK socket in txonly mode at a queue id that does not have a corresponding Rx queue. At this moment ice's XSK logic is tightly bound to act on a "queue pair", e.g. both Tx and Rx queues at a given queue id are disabled/enabled and both of them will get XSK pool assigned, which is broken for the presented queue configuration. This results in the splat included at the bottom, which is basically an OOB access to Rx ring array. To fix this, allow using the ids only in scope of "combined" queues reported by ethtool. However, logic should be rewritten to allow such configurations later on, which would end up as a complete rewrite of the control path, so let us go with this temporary fix. [420160.558008] BUG: kernel NULL pointer dereference, address: 0000000000000082 [420160.566359] #PF: supervisor read access in kernel mode [420160.572657] #PF: error_code(0x0000) - not-present page [420160.579002] PGD 0 P4D 0 [420160.582756] Oops: 0000 [#1] PREEMPT SMP NOPTI [420160.588396] CPU: 10 PID: 21232 Comm: xdpsock Tainted: G OE 5.19.0-rc7+ #10 [420160.597893] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 [420160.609894] RIP: 0010:ice_xsk_pool_setup+0x44/0x7d0 [ice] [420160.616968] Code: f3 48 83 ec 40 48 8b 4f 20 48 8b 3f 65 48 8b 04 25 28 00 00 00 48 89 44 24 38 31 c0 48 8d 04 ed 00 00 00 00 48 01 c1 48 8b 11 <0f> b7 92 82 00 00 00 48 85 d2 0f 84 2d 75 00 00 48 8d 72 ff 48 85 [420160.639421] RSP: 0018:ffffc9002d2afd48 EFLAGS: 00010282 [420160.646650] RAX: 0000000000000050 RBX: ffff88811d8bdd00 RCX: ffff888112c14ff8 [420160.655893] RDX: 0000000000000000 RSI: ffff88811d8bdd00 RDI: ffff888109861000 [420160.665166] RBP: 000000000000000a R08: 000000000000000a R09: 0000000000000000 [420160.674493] R10: 000000000000889f R11: 0000000000000000 R12: 000000000000000a [420160.683833] R13: 000000000000000a R14: 0000000000000000 R15: ffff888117611828 [420160.693211] FS: 00007fa869fc1f80(0000) GS:ffff8897e0880000(0000) knlGS:0000000000000000 [420160.703645] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [420160.711783] CR2: 0000000000000082 CR3: 00000001d076c001 CR4: 00000000007706e0 [420160.721399] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [420160.731045] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [420160.740707] PKRU: 55555554 [420160.745960] Call Trace: [420160.750962] <TASK> [420160.755597] ? kmalloc_large_node+0x79/0x90 [420160.762703] ? __kmalloc_node+0x3f5/0x4b0 [420160.769341] xp_assign_dev+0xfd/0x210 [420160.775661] ? shmem_file_read_iter+0x29a/0x420 [420160.782896] xsk_bind+0x152/0x490 [420160.788943] __sys_bind+0xd0/0x100 [420160.795097] ? exit_to_user_mode_prepare+0x20/0x120 [420160.802801] __x64_sys_bind+0x16/0x20 [420160.809298] do_syscall_64+0x38/0x90 [420160.815741] entry_SYSCALL_64_after_hwframe+0x63/0xcd [420160.823731] RIP: 0033:0x7fa86a0dd2fb [420160.830264] Code: c3 66 0f 1f 44 00 00 48 8b 15 69 8b 0c 00 f7 d8 64 89 02 b8 ff ff ff ff eb bc 0f 1f 44 00 00 f3 0f 1e fa b8 31 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3d 8b 0c 00 f7 d8 64 89 01 48 [420160.855410] RSP: 002b:00007ffc1146f618 EFLAGS: 00000246 ORIG_RAX: 0000000000000031 [420160.866366] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa86a0dd2fb [420160.876957] RDX: 0000000000000010 RSI: 00007ffc1146f680 RDI: 0000000000000003 [420160.887604] RBP: 000055d7113a0520 R08: 00007fa868fb8000 R09: 0000000080000000 [420160.898293] R10: 0000000000008001 R11: 0000000000000246 R12: 000055d7113a04e0 [420160.909038] R13: 000055d7113a0320 R14: 000000000000000a R15: 0000000000000000 [420160.919817] </TASK> [420160.925659] Modules linked in: ice(OE) af_packet binfmt_misc nls_iso8859_1 ipmi_ssif intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp mei_me coretemp ioatdma mei ipmi_si wmi ipmi_msghandler acpi_pad acpi_power_meter ip_tables x_tables autofs4 ixgbe i40e crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd ahci mdio dca libahci lpc_ich [last unloaded: ice] [420160.977576] CR2: 0000000000000082 [420160.985037] ---[ end trace 0000000000000000 ]--- [420161.097724] RIP: 0010:ice_xsk_pool_setup+0x44/0x7d0 [ice] [420161.107341] Code: f3 48 83 ec 40 48 8b 4f 20 48 8b 3f 65 48 8b 04 25 28 00 00 00 48 89 44 24 38 31 c0 48 8d 04 ed 00 00 00 00 48 01 c1 48 8b 11 <0f> b7 92 82 00 00 00 48 85 d2 0f 84 2d 75 00 00 48 8d 72 ff 48 85 [420161.134741] RSP: 0018:ffffc9002d2afd48 EFLAGS: 00010282 [420161.144274] RAX: 0000000000000050 RBX: ffff88811d8bdd00 RCX: ffff888112c14ff8 [420161.155690] RDX: 0000000000000000 RSI: ffff88811d8bdd00 RDI: ffff888109861000 [420161.168088] RBP: 000000000000000a R08: 000000000000000a R09: 0000000000000000 [420161.179295] R10: 000000000000889f R11: 0000000000000000 R12: 000000000000000a [420161.190420] R13: 000000000000000a R14: 0000000000000000 R15: ffff888117611828 [420161.201505] FS: 00007fa869fc1f80(0000) GS:ffff8897e0880000(0000) knlGS:0000000000000000 [420161.213628] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [420161.223413] CR2: 0000000000000082 CR3: 00000001d076c001 CR4: 00000000007706e0 [420161.234653] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [420161.245893] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [420161.257052] PKRU: 55555554 Fixes: 2d4238f5 ("ice: Add support for AF_XDP") Signed-off-by:
Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by:
George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Maciej Fijalkowski authored
[ Upstream commit 296f13ff ] With the upcoming introduction of batching to XSK data path, performance wise it will be the best to have the ring descriptor count to be aligned to power of 2. Check if ring sizes that user is going to attach the XSK socket fulfill the condition above. For Tx side, although check is being done against the Tx queue and in the end the socket will be attached to the XDP queue, it is fine since XDP queues get the ring->count setting from Tx queues. Suggested-by:
Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by:
Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Reviewed-by:
Alexander Lobakin <alexandr.lobakin@intel.com> Acked-by:
Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20220125160446.78976-3-maciej.fijalkowski@intel.com Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Duoming Zhou authored
[ Upstream commit f1e941db ] When the pn532 uart device is detaching, the pn532_uart_remove() is called. But there are no functions in pn532_uart_remove() that could delete the cmd_timeout timer, which will cause use-after-free bugs. The process is shown below: (thread 1) | (thread 2) | pn532_uart_send_frame pn532_uart_remove | mod_timer(&pn532->cmd_timeout,...) ... | (wait a time) kfree(pn532) //FREE | pn532_cmd_timeout | pn532_uart_send_frame | pn532->... //USE This patch adds del_timer_sync() in pn532_uart_remove() in order to prevent the use-after-free bugs. What's more, the pn53x_unregister_nfc() is well synchronized, it sets nfc_dev->shutting_down to true and there are no syscalls could restart the cmd_timeout timer. Fixes: c656aa4c ("nfc: pn533: add UART phy driver") Signed-off-by:
Duoming Zhou <duoming@zju.edu.cn> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Bernard Pidoux authored
[ Upstream commit 3c53cd65 ] Commit 3b3fd068 added NULL check for `rose_loopback_neigh->dev` in rose_loopback_timer() but omitted to check rose_loopback_neigh->loopback. It thus prevents *all* rose connect. The reason is that a special rose_neigh loopback has a NULL device. /proc/net/rose_neigh illustrates it via rose_neigh_show() function : [...] seq_printf(seq, "%05d %-9s %-4s %3d %3d %3s %3s %3lu %3lu", rose_neigh->number, (rose_neigh->loopback) ? "RSLOOP-0" : ax2asc(buf, &rose_neigh->callsign), rose_neigh->dev ? rose_neigh->dev->name : "???", rose_neigh->count, /proc/net/rose_neigh displays special rose_loopback_neigh->loopback as callsign RSLOOP-0: addr callsign dev count use mode restart t0 tf digipeaters 00001 RSLOOP-0 ??? 1 2 DCE yes 0 0 By checking rose_loopback_neigh->loopback, rose_rx_call_request() is called even in case rose_loopback_neigh->dev is NULL. This repairs rose connections. Verification with rose client application FPAC: FPAC-Node v 4.1.3 (built Aug 5 2022) for LINUX (help = h) F6BVP-4 (Commands = ?) : u Users - AX.25 Level 2 sessions : Port Callsign Callsign AX.25 state ROSE state NetRom status axudp F6BVP-5 -> F6BVP-9 Connected Connected --------- Fixes: 3b3fd068 ("rose: Fix Null pointer dereference in rose_send_frame()") Signed-off-by:
Bernard Pidoux <f6bvp@free.fr> Suggested-by:
Francois Romieu <romieu@fr.zoreil.com> Cc: Thomas DL9SAU Osterried <thomas@osterried.de> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Peter Xu authored
[ Upstream commit efd41493 ] These bits should only be valid when the ptes are present. Introducing two booleans for it and set it to false when !pte_present() for both pte and pmd accountings. The bug is found during code reading and no real world issue reported, but logically such an error can cause incorrect readings for either smaps or smaps_rollup output on quite a few fields. For example, it could cause over-estimate on values like Shared_Dirty, Private_Dirty, Referenced. Or it could also cause under-estimate on values like LazyFree, Shared_Clean, Private_Clean. Link: https://lkml.kernel.org/r/20220805160003.58929-1-peterx@redhat.com Fixes: b1d4d9e0 ("proc/smaps: carefully handle migration entries") Fixes: c94b6923 ("/proc/PID/smaps: Add PMD migration entry parsing") Signed-off-by:
Peter Xu <peterx@redhat.com> Reviewed-by:
Vlastimil Babka <vbabka@suse.cz> Reviewed-by:
David Hildenbrand <david@redhat.com> Reviewed-by:
Yang Shi <shy828301@gmail.com> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Huang Ying <ying.huang@intel.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Miaohe Lin authored
[ Upstream commit a44f89dc ] It's more recommended to use helper function migration_entry_to_page() to get the page via migration entry. We can also enjoy the PageLocked() check there. Link: https://lkml.kernel.org/r/20210318122722.13135-7-linmiaohe@huawei.com Signed-off-by:
Miaohe Lin <linmiaohe@huawei.com> Reviewed-by:
Peter Xu <peterx@redhat.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michel Lespinasse <walken@google.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Thomas Hellstrm (Intel) <thomas_os@shipmail.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: William Kucharski <william.kucharski@oracle.com> Cc: Yang Shi <yang.shi@linux.alibaba.com> Cc: yuleixzhang <yulei.kernel@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Trond Myklebust authored
[ Upstream commit ed06fce0 ] Fix up a case in call_encode() where we're failing to set task->tk_rpc_status when an RPC level error occurred. Fixes: 9c5948c2 ("SUNRPC: task should be exit if encode return EKEYEXPIRED more times") Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Olga Kornievskaia authored
[ Upstream commit fcfc8be1 ] A destination server while doing a COPY shouldn't accept using the passed in filehandle if its not a regular filehandle. If alloc_file_pseudo() has failed, we need to decrement a reference on the newly created inode, otherwise it leaks. Reported-by:
Al Viro <viro@zeniv.linux.org.uk> Fixes: ec4b0925 ("NFS: inter ssc open") Signed-off-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Trond Myklebust authored
[ Upstream commit 156cd285 ] The preferred behaviour is always to allocate struct nfs_fattr from the slab. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Nikolay Aleksandrov authored
[ Upstream commit 17ecd4a4 ] When we try to transmit an skb with metadata_dst attached (i.e. dst->dev == NULL) through xfrm interface we can hit a null pointer dereference[1] in xfrmi_xmit2() -> xfrm_lookup_with_ifid() due to the check for a loopback skb device when there's no policy which dereferences dst->dev unconditionally. Not having dst->dev can be interepreted as it not being a loopback device, so just add a check for a null dst_orig->dev. With this fix xfrm interface's Tx error counters go up as usual. [1] net-next calltrace captured via netconsole: BUG: kernel NULL pointer dereference, address: 00000000000000c0 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP CPU: 1 PID: 7231 Comm: ping Kdump: loaded Not tainted 5.19.0+ #24 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014 RIP: 0010:xfrm_lookup_with_ifid+0x5eb/0xa60 Code: 8d 74 24 38 e8 26 a4 37 00 48 89 c1 e9 12 fc ff ff 49 63 ed 41 83 fd be 0f 85 be 01 00 00 41 be ff ff ff ff 45 31 ed 48 8b 03 <f6> 80 c0 00 00 00 08 75 0f 41 80 bc 24 19 0d 00 00 01 0f 84 1e 02 RSP: 0018:ffffb0db82c679f0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffffd0db7fcad430 RCX: ffffb0db82c67a10 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb0db82c67a80 RBP: ffffb0db82c67a80 R08: ffffb0db82c67a14 R09: 0000000000000000 R10: 0000000000000000 R11: ffff8fa449667dc8 R12: ffffffff966db880 R13: 0000000000000000 R14: 00000000ffffffff R15: 0000000000000000 FS: 00007ff35c83f000(0000) GS:ffff8fa478480000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000c0 CR3: 000000001ebb7000 CR4: 0000000000350ee0 Call Trace: <TASK> xfrmi_xmit+0xde/0x460 ? tcf_bpf_act+0x13d/0x2a0 dev_hard_start_xmit+0x72/0x1e0 __dev_queue_xmit+0x251/0xd30 ip_finish_output2+0x140/0x550 ip_push_pending_frames+0x56/0x80 raw_sendmsg+0x663/0x10a0 ? try_charge_memcg+0x3fd/0x7a0 ? __mod_memcg_lruvec_state+0x93/0x110 ? sock_sendmsg+0x30/0x40 sock_sendmsg+0x30/0x40 __sys_sendto+0xeb/0x130 ? handle_mm_fault+0xae/0x280 ? do_user_addr_fault+0x1e7/0x680 ? kvm_read_and_reset_apf_flags+0x3b/0x50 __x64_sys_sendto+0x20/0x30 do_syscall_64+0x34/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7ff35cac1366 Code: eb 0b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 72 c3 90 55 48 83 ec 30 44 89 4c 24 2c 4c 89 RSP: 002b:00007fff738e4028 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007fff738e57b0 RCX: 00007ff35cac1366 RDX: 0000000000000040 RSI: 0000557164e4b450 RDI: 0000000000000003 RBP: 0000557164e4b450 R08: 00007fff738e7a2c R09: 0000000000000010 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040 R13: 00007fff738e5770 R14: 00007fff738e4030 R15: 0000001d00000001 </TASK> Modules linked in: netconsole veth br_netfilter bridge bonding virtio_net [last unloaded: netconsole] CR2: 00000000000000c0 CC: Steffen Klassert <steffen.klassert@secunet.com> CC: Daniel Borkmann <daniel@iogearbox.net> Fixes: 2d151d39 ("xfrm: Add possibility to set the default to block if we have no policy") Signed-off-by:
Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Herbert Xu authored
[ Upstream commit ba953a9d ] When namespace support was added to xfrm/afkey, it caused the previously single-threaded call to xfrm_probe_algs to become multi-threaded. This is buggy and needs to be fixed with a mutex. Reported-by:
Abhishek Shah <abhishek.shah@columbia.edu> Fixes: 283bc9f3 ("xfrm: Namespacify xfrm state/policy locks") Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Antony Antony authored
[ Upstream commit 6aa811ac ] x->lastused was not cloned in xfrm_do_migrate. Add it to clone during migrate. Fixes: 80c9abaa ("[XFRM]: Extension for dynamic update of endpoint address(es)") Signed-off-by:
Antony Antony <antony.antony@secunet.com> Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-