commit 77ba2b9b46f8acead2606759e8196b7076eaeeea Author: Greg Kroah-Hartman Date: Fri Jul 29 17:14:20 2022 +0200 Linux 5.4.208 Link: https://lore.kernel.org/r/20220727161008.993711844@linuxfoundation.org Tested-by: Florian Fainelli Tested-by: Linux Kernel Functional Testing Tested-by: Jon Hunter Tested-by: Sudip Mukherjee Tested-by: Shuah Khan Tested-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman commit ca5762c5896e315a8bc29499a6b826394d86ea19 Author: Jan Beulich Date: Tue Jun 7 17:00:53 2022 +0200 x86: drop bogus "cc" clobber from __try_cmpxchg_user_asm() commit 1df931d95f4dc1c11db1123e85d4e08156e46ef9 upstream. As noted (and fixed) a couple of times in the past, "=@cc" outputs and clobbering of "cc" don't work well together. The compiler appears to mean to reject such, but doesn't - in its upstream form - quite manage to yet for "cc". Furthermore two similar macros don't clobber "cc", and clobbering "cc" is pointless in asm()-s for x86 anyway - the compiler always assumes status flags to be clobbered there. Fixes: 989b5db215a2 ("x86/uaccess: Implement macros for CMPXCHG on user addresses") Signed-off-by: Jan Beulich Message-Id: <485c0c0b-a3a7-0b7c-5264-7d00c01de032@suse.com> Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit f88d8c18822963d75a2b85d0ea6bb5d791dd95bf Author: Jose Alonso Date: Mon Jun 13 15:32:44 2022 -0300 net: usb: ax88179_178a needs FLAG_SEND_ZLP commit 36a15e1cb134c0395261ba1940762703f778438c upstream. The extra byte inserted by usbnet.c when (length % dev->maxpacket == 0) is causing problems to device. This patch sets FLAG_SEND_ZLP to avoid this. Tested with: 0b95:1790 ASIX Electronics Corp. AX88179 Gigabit Ethernet Problems observed: ====================================================================== 1) Using ssh/sshfs. The remote sshd daemon can abort with the message: "message authentication code incorrect" This happens because the tcp message sent is corrupted during the USB "Bulk out". The device calculate the tcp checksum and send a valid tcp message to the remote sshd. Then the encryption detects the error and aborts. 2) NETDEV WATCHDOG: ... (ax88179_178a): transmit queue 0 timed out 3) Stop normal work without any log message. The "Bulk in" continue receiving packets normally. The host sends "Bulk out" and the device responds with -ECONNRESET. (The netusb.c code tx_complete ignore -ECONNRESET) Under normal conditions these errors take days to happen and in intense usage take hours. A test with ping gives packet loss, showing that something is wrong: ping -4 -s 462 {destination} # 462 = 512 - 42 - 8 Not all packets fail. My guess is that the device tries to find another packet starting at the extra byte and will fail or not depending on the next bytes (old buffer content). ====================================================================== Signed-off-by: Jose Alonso Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f7785092cb7f022f59ebdaa181651f7c877df132 Author: Jiri Slaby Date: Thu Jul 7 10:25:58 2022 +0200 tty: use new tty_insert_flip_string_and_push_buffer() in pty_write() commit a501ab75e7624d133a5a3c7ec010687c8b961d23 upstream. There is a race in pty_write(). pty_write() can be called in parallel with e.g. ioctl(TIOCSTI) or ioctl(TCXONC) which also inserts chars to the buffer. Provided, tty_flip_buffer_push() in pty_write() is called outside the lock, it can commit inconsistent tail. This can lead to out of bounds writes and other issues. See the Link below. To fix this, we have to introduce a new helper called tty_insert_flip_string_and_push_buffer(). It does both tty_insert_flip_string() and tty_flip_buffer_commit() under the port lock. It also calls queue_work(), but outside the lock. See 71a174b39f10 (pty: do tty_flip_buffer_push without port->lock in pty_write) for the reasons. Keep the helper internal-only (in drivers' tty.h). It is not intended to be used widely. Link: https://seclists.org/oss-sec/2022/q2/155 Fixes: 71a174b39f10 (pty: do tty_flip_buffer_push without port->lock in pty_write) Cc: 一只狗 Cc: Dan Carpenter Suggested-by: Hillf Danton Signed-off-by: Jiri Slaby Link: https://lore.kernel.org/r/20220707082558.9250-2-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman commit 815d936e92f9012da625ebb42936c63893765a34 Author: Jiri Slaby Date: Thu Jul 7 10:25:57 2022 +0200 tty: extract tty_flip_buffer_commit() from tty_flip_buffer_push() commit 716b10580283fda66f2b88140e3964f8a7f9da89 upstream. We will need this new helper in the next patch. Cc: Hillf Danton Cc: 一只狗 Cc: Dan Carpenter Signed-off-by: Jiri Slaby Link: https://lore.kernel.org/r/20220707082558.9250-1-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman commit 2ea77b0b6d22ad993af113e34d09998d2546e733 Author: Jiri Slaby Date: Mon Nov 22 12:16:48 2021 +0100 tty: drop tty_schedule_flip() commit 5db96ef23bda6c2a61a51693c85b78b52d03f654 upstream. Since commit a9c3f68f3cd8d (tty: Fix low_latency BUG) in 2014, tty_flip_buffer_push() is only a wrapper to tty_schedule_flip(). All users were converted in the previous patches, so remove tty_schedule_flip() completely while inlining its body into tty_flip_buffer_push(). One less exported function. Reviewed-by: Johan Hovold Signed-off-by: Jiri Slaby Link: https://lore.kernel.org/r/20211122111648.30379-4-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman commit f20912215c9ca3403b2dbbbeac26e98663defb67 Author: Jiri Slaby Date: Mon Nov 22 12:16:47 2021 +0100 tty: the rest, stop using tty_schedule_flip() commit b68b914494df4f79b4e9b58953110574af1cb7a2 upstream. Since commit a9c3f68f3cd8d (tty: Fix low_latency BUG) in 2014, tty_flip_buffer_push() is only a wrapper to tty_schedule_flip(). We are going to remove the latter (as it is used less), so call the former in the rest of the users. Cc: Richard Henderson Cc: Ivan Kokshaysky Cc: Matt Turner Cc: William Hubbs Cc: Chris Brannon Cc: Kirk Reiser Cc: Samuel Thibault Cc: Heiko Carstens Cc: Vasily Gorbik Cc: Christian Borntraeger Cc: Alexander Gordeev Reviewed-by: Johan Hovold Signed-off-by: Jiri Slaby Link: https://lore.kernel.org/r/20211122111648.30379-3-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman commit aa60c0cce8b4599921e341c3bb7fb326e2779388 Author: Jiri Slaby Date: Mon Nov 22 12:16:46 2021 +0100 tty: drivers/tty/, stop using tty_schedule_flip() commit 5f6a85158ccacc3f09744b3aafe8b11ab3b6c6f6 upstream. Since commit a9c3f68f3cd8d (tty: Fix low_latency BUG) in 2014, tty_flip_buffer_push() is only a wrapper to tty_schedule_flip(). We are going to remove the latter (as it is used less), so call the former in drivers/tty/. Cc: Vladimir Zapolskiy Reviewed-by: Johan Hovold Signed-off-by: Jiri Slaby Link: https://lore.kernel.org/r/20211122111648.30379-2-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman commit 126137a53d7e9cc1614f0ec7347af1630b7fbcac Author: Luiz Augusto von Dentz Date: Mon Feb 14 17:59:38 2022 -0800 Bluetooth: Fix bt_skb_sendmmsg not allocating partial chunks commit 29fb608396d6a62c1b85acc421ad7a4399085b9f upstream. Since bt_skb_sendmmsg can be used with the likes of SOCK_STREAM it shall return the partial chunks it could allocate instead of freeing everything as otherwise it can cause problems like bellow. Fixes: 81be03e026dc ("Bluetooth: RFCOMM: Replace use of memcpy_from_msg with bt_skb_sendmmsg") Reported-by: Paul Menzel Link: https://lore.kernel.org/r/d7206e12-1b99-c3be-84f4-df22af427ef5@molgen.mpg.de BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215594 Signed-off-by: Luiz Augusto von Dentz Tested-by: Paul Menzel (Nokia N9 (MeeGo/Harmattan) Signed-off-by: Marcel Holtmann Cc: Harshit Mogalapalli Signed-off-by: Greg Kroah-Hartman commit 836b47e6436bcab6e17f9b0cbf887b03f1fcb3d8 Author: Luiz Augusto von Dentz Date: Thu Sep 16 13:10:49 2021 -0700 Bluetooth: SCO: Fix sco_send_frame returning skb->len commit 037ce005af6b8a3e40ee07c6e9266c8997e6a4d6 upstream. The skb in modified by hci_send_sco which pushes SCO headers thus changing skb->len causing sco_sock_sendmsg to fail. Fixes: 0771cbb3b97d ("Bluetooth: SCO: Replace use of memcpy_from_msg with bt_skb_sendmsg") Tested-by: Tedd Ho-Jeong An Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Marcel Holtmann Cc: Harshit Mogalapalli Signed-off-by: Greg Kroah-Hartman commit aa2d34cab3e656e1738c9c4cf0c47f29baaf3679 Author: Luiz Augusto von Dentz Date: Thu Sep 16 13:10:48 2021 -0700 Bluetooth: Fix passing NULL to PTR_ERR commit 266191aa8d14b84958aaeb5e96ee4e97839e3d87 upstream. Passing NULL to PTR_ERR will result in 0 (success), also since the likes of bt_skb_sendmsg does never return NULL it is safe to replace the instances of IS_ERR_OR_NULL with IS_ERR when checking its return. Reported-by: Dan Carpenter Tested-by: Tedd Ho-Jeong An Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Marcel Holtmann Cc: Harshit Mogalapalli Signed-off-by: Greg Kroah-Hartman commit 10bacb891722e040b89c2f34d1dcfe63344a6561 Author: Luiz Augusto von Dentz Date: Fri Sep 3 15:27:32 2021 -0700 Bluetooth: RFCOMM: Replace use of memcpy_from_msg with bt_skb_sendmmsg commit 81be03e026dc0c16dc1c64e088b2a53b73caa895 upstream. This makes use of bt_skb_sendmmsg instead using memcpy_from_msg which is not considered safe to be used when lock_sock is held. Also make rfcomm_dlc_send handle skb with fragments and queue them all atomically. Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Marcel Holtmann Cc: Harshit Mogalapalli Signed-off-by: Greg Kroah-Hartman commit bf46574d46555aaf4811b29777ac4a23d4ecfc11 Author: Luiz Augusto von Dentz Date: Fri Sep 3 15:27:31 2021 -0700 Bluetooth: SCO: Replace use of memcpy_from_msg with bt_skb_sendmsg commit 0771cbb3b97d3c1d68eecd7f00055f599954c34e upstream. This makes use of bt_skb_sendmsg instead of allocating a different buffer to be used with memcpy_from_msg which cause one extra copy. Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Marcel Holtmann Cc: Harshit Mogalapalli Signed-off-by: Greg Kroah-Hartman commit f00b06003b11e54c9a8df422e0db5aa38d7aeecc Author: Luiz Augusto von Dentz Date: Fri Sep 3 15:27:30 2021 -0700 Bluetooth: Add bt_skb_sendmmsg helper commit 97e4e80299844bb5f6ce5a7540742ffbffae3d97 upstream. This works similarly to bt_skb_sendmsg but can split the msg into multiple skb fragments which is useful for stream sockets. Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Marcel Holtmann Cc: Harshit Mogalapalli Signed-off-by: Greg Kroah-Hartman commit 55bf99849be0fca0c5ea0ee43d87c85880e84981 Author: Luiz Augusto von Dentz Date: Fri Sep 3 15:27:29 2021 -0700 Bluetooth: Add bt_skb_sendmsg helper commit 38f64f650dc0e44c146ff88d15a7339efa325918 upstream. bt_skb_sendmsg helps takes care of allocation the skb and copying the the contents of msg over to the skb while checking for possible errors so it should be safe to call it without holding lock_sock. Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Marcel Holtmann Cc: Harshit Mogalapalli Signed-off-by: Greg Kroah-Hartman commit 015af30d373d33548c9afcffbbaaf266459731de Author: Takashi Iwai Date: Fri Dec 18 15:56:24 2020 +0100 ALSA: memalloc: Align buffer allocations in page size commit 5c1733e33c888a3cb7f576564d8ad543d5ad4a9e upstream. Currently the standard memory allocator (snd_dma_malloc_pages*()) passes the byte size to allocate as is. Most of the backends allocates real pages, hence the actual allocations are aligned in page size. However, the genalloc doesn't seem assuring the size alignment, hence it may result in the access outside the buffer when the whole memory pages are exposed via mmap. For avoiding such inconsistencies, this patch makes the allocation size always to be aligned in page size. Note that, after this change, snd_dma_buffer.bytes field contains the aligned size, not the originally requested size. This value is also used for releasing the pages in return. Reviewed-by: Lars-Peter Clausen Link: https://lore.kernel.org/r/20201218145625.2045-2-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 352affc31e269ee6c3ec1c4d2fe65b72b7367df6 Author: Peter Zijlstra Date: Wed Nov 10 11:01:03 2021 +0100 bitfield.h: Fix "type of reg too small for mask" test [ Upstream commit bff8c3848e071d387d8b0784dc91fa49cd563774 ] The test: 'mask > (typeof(_reg))~0ull' only works correctly when both sides are unsigned, consider: - 0xff000000 vs (int)~0ull - 0x000000ff vs (int)~0ull Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Josh Poimboeuf Link: https://lore.kernel.org/r/20211110101324.950210584@infradead.org Signed-off-by: Sasha Levin commit 0a0fbbd6cb654f748d87c0db10c31741101f2ab2 Author: Thomas Gleixner Date: Wed Sep 8 15:29:15 2021 +0200 x86/mce: Deduplicate exception handling [ Upstream commit e42404afc4ca856c48f1e05752541faa3587c472 ] Prepare code for further simplification. No functional change. Signed-off-by: Thomas Gleixner Signed-off-by: Borislav Petkov Link: https://lkml.kernel.org/r/20210908132525.096452100@linutronix.de Signed-off-by: Sasha Levin commit b524137fa1d8cd9aabb65fbf34da10e24f8fc9d7 Author: Michel Lespinasse Date: Mon Jun 8 21:33:14 2020 -0700 mmap locking API: initial implementation as rwsem wrappers [ Upstream commit 9740ca4e95b43b91a4a848694a20d01ba6818f7b ] This patch series adds a new mmap locking API replacing the existing mmap_sem lock and unlocks. Initially the API is just implemente in terms of inlined rwsem calls, so it doesn't provide any new functionality. There are two justifications for the new API: - At first, it provides an easy hooking point to instrument mmap_sem locking latencies independently of any other rwsems. - In the future, it may be a starting point for replacing the rwsem implementation with a different one, such as range locks. This is something that is being explored, even though there is no wide concensus about this possible direction yet. (see https://patchwork.kernel.org/cover/11401483/) This patch (of 12): This change wraps the existing mmap_sem related rwsem calls into a new mmap locking API. There are two justifications for the new API: - At first, it provides an easy hooking point to instrument mmap_sem locking latencies independently of any other rwsems. - In the future, it may be a starting point for replacing the rwsem implementation with a different one, such as range locks. Signed-off-by: Michel Lespinasse Signed-off-by: Andrew Morton Reviewed-by: Daniel Jordan Reviewed-by: Davidlohr Bueso Reviewed-by: Laurent Dufour Reviewed-by: Vlastimil Babka Cc: Peter Zijlstra Cc: Matthew Wilcox Cc: Liam Howlett Cc: Jerome Glisse Cc: David Rientjes Cc: Hugh Dickins Cc: Ying Han Cc: Jason Gunthorpe Cc: John Hubbard Cc: Michel Lespinasse Link: http://lkml.kernel.org/r/20200520052908.204642-1-walken@google.com Link: http://lkml.kernel.org/r/20200520052908.204642-2-walken@google.com Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 592a1c6066dd3a29df417bc1cee756b653ec06c9 Author: Peter Zijlstra Date: Wed Feb 2 00:49:42 2022 +0000 x86/uaccess: Implement macros for CMPXCHG on user addresses [ Upstream commit 989b5db215a2f22f89d730b607b071d964780f10 ] Add support for CMPXCHG loops on userspace addresses. Provide both an "unsafe" version for tight loops that do their own uaccess begin/end, as well as a "safe" version for use cases where the CMPXCHG is not buried in a loop, e.g. KVM will resume the guest instead of looping when emulation of a guest atomic accesses fails the CMPXCHG. Provide 8-byte versions for 32-bit kernels so that KVM can do CMPXCHG on guest PAE PTEs, which are accessed via userspace addresses. Guard the asm_volatile_goto() variation with CC_HAS_ASM_GOTO_TIED_OUTPUT, the "+m" constraint fails on some compilers that otherwise support CC_HAS_ASM_GOTO_OUTPUT. Cc: stable@vger.kernel.org Signed-off-by: Peter Zijlstra (Intel) Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Message-Id: <20220202004945.2540433-3-seanjc@google.com> Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 1d778b54a5c003b1786d7dc46976d30746a66913 Author: Al Viro Date: Sat Feb 15 11:46:30 2020 -0500 x86: get rid of small constant size cases in raw_copy_{to,from}_user() [ Upstream commit 4b842e4e25b12951fa10dedb4bc16bc47e3b850c ] Very few call sites where that would be triggered remain, and none of those is anywhere near hot enough to bother. Signed-off-by: Al Viro Signed-off-by: Sasha Levin commit d0d583484d2ed9f5903edbbfa7e2a68f78b950b0 Author: Will Deacon Date: Thu Nov 21 11:59:00 2019 +0000 locking/refcount: Consolidate implementations of refcount_t [ Upstream commit fb041bb7c0a918b95c6889fc965cdc4a75b4c0ca ] The generic implementation of refcount_t should be good enough for everybody, so remove ARCH_HAS_REFCOUNT and REFCOUNT_FULL entirely, leaving the generic implementation enabled unconditionally. Signed-off-by: Will Deacon Reviewed-by: Ard Biesheuvel Acked-by: Kees Cook Tested-by: Hanjun Guo Cc: Ard Biesheuvel Cc: Elena Reshetova Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: https://lkml.kernel.org/r/20191121115902.2551-9-will@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit dab787c73f6e38d8e7ed3c1e683385e8f0fe28a2 Author: Will Deacon Date: Thu Nov 21 11:58:59 2019 +0000 locking/refcount: Consolidate REFCOUNT_{MAX,SATURATED} definitions [ Upstream commit 65b008552469f1c37f5e06e0016924502e40b4f5 ] The definitions of REFCOUNT_MAX and REFCOUNT_SATURATED are the same, regardless of CONFIG_REFCOUNT_FULL, so consolidate them into a single pair of definitions. Signed-off-by: Will Deacon Reviewed-by: Ard Biesheuvel Reviewed-by: Kees Cook Tested-by: Hanjun Guo Cc: Ard Biesheuvel Cc: Elena Reshetova Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: https://lkml.kernel.org/r/20191121115902.2551-8-will@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 0d3182fbe689e3808c03b6cde6be98237f9e0a4a Author: Will Deacon Date: Thu Nov 21 11:58:58 2019 +0000 locking/refcount: Move saturation warnings out of line [ Upstream commit 1eb085d94256aaa69b00cf5a86e3c5f5bb2bc460 ] Having the refcount saturation and warnings inline bloats the text, despite the fact that these paths should never be executed in normal operation. Move the refcount saturation and warnings out of line to reduce the image size when refcount_t checking is enabled. Relative to an x86_64 defconfig, the sizes reported by bloat-o-meter are: # defconfig+REFCOUNT_FULL, inline saturation (i.e. before this patch) Total: Before=14762076, After=14915442, chg +1.04% # defconfig+REFCOUNT_FULL, out-of-line saturation (i.e. after this patch) Total: Before=14762076, After=14835497, chg +0.50% A side-effect of this change is that we now only get one warning per refcount saturation type, rather than one per problematic call-site. Signed-off-by: Will Deacon Reviewed-by: Ard Biesheuvel Reviewed-by: Kees Cook Tested-by: Hanjun Guo Cc: Ard Biesheuvel Cc: Elena Reshetova Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: https://lkml.kernel.org/r/20191121115902.2551-7-will@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 809554147d609163cfbaf815c443c575b538a7ef Author: Will Deacon Date: Thu Nov 21 11:58:57 2019 +0000 locking/refcount: Improve performance of generic REFCOUNT_FULL code [ Upstream commit dcb786493f3e48da3272b710028d42ec608cfda1 ] Rewrite the generic REFCOUNT_FULL implementation so that the saturation point is moved to INT_MIN / 2. This allows us to defer the sanity checks until after the atomic operation, which removes many uses of cmpxchg() in favour of atomic_fetch_{add,sub}(). Some crude perf results obtained from lkdtm show substantially less overhead, despite the checking: $ perf stat -r 3 -B -- echo {ATOMIC,REFCOUNT}_TIMING >/sys/kernel/debug/provoke-crash/DIRECT # arm64 ATOMIC_TIMING: 46.50451 +- 0.00134 seconds time elapsed ( +- 0.00% ) REFCOUNT_TIMING (REFCOUNT_FULL, mainline): 77.57522 +- 0.00982 seconds time elapsed ( +- 0.01% ) REFCOUNT_TIMING (REFCOUNT_FULL, this series): 48.7181 +- 0.0256 seconds time elapsed ( +- 0.05% ) # x86 ATOMIC_TIMING: 31.6225 +- 0.0776 seconds time elapsed ( +- 0.25% ) REFCOUNT_TIMING (!REFCOUNT_FULL, mainline/x86 asm): 31.6689 +- 0.0901 seconds time elapsed ( +- 0.28% ) REFCOUNT_TIMING (REFCOUNT_FULL, mainline): 53.203 +- 0.138 seconds time elapsed ( +- 0.26% ) REFCOUNT_TIMING (REFCOUNT_FULL, this series): 31.7408 +- 0.0486 seconds time elapsed ( +- 0.15% ) Signed-off-by: Will Deacon Reviewed-by: Ard Biesheuvel Reviewed-by: Kees Cook Tested-by: Hanjun Guo Tested-by: Jan Glauber Cc: Ard Biesheuvel Cc: Elena Reshetova Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: https://lkml.kernel.org/r/20191121115902.2551-6-will@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 9c9269977f03ab9c448c8b71581a951e0eb4fb7b Author: Will Deacon Date: Thu Nov 21 11:58:56 2019 +0000 locking/refcount: Move the bulk of the REFCOUNT_FULL implementation into the header [ Upstream commit 77e9971c79c29542ab7dd4140f9343bf2ff36158 ] In an effort to improve performance of the REFCOUNT_FULL implementation, move the bulk of its functions into linux/refcount.h. This allows them to be inlined in the same way as if they had been provided via CONFIG_ARCH_HAS_REFCOUNT. Signed-off-by: Will Deacon Reviewed-by: Ard Biesheuvel Reviewed-by: Kees Cook Tested-by: Hanjun Guo Cc: Ard Biesheuvel Cc: Elena Reshetova Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: https://lkml.kernel.org/r/20191121115902.2551-5-will@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 04bff7d7b8081c4bb2e8171be31d33df297eee5b Author: Will Deacon Date: Thu Nov 21 11:58:55 2019 +0000 locking/refcount: Remove unused refcount_*_checked() variants [ Upstream commit 7221762c48c6bbbcc6cc51d8b803c06930215e34 ] The full-fat refcount implementation is exposed via a set of functions suffixed with "_checked()", the idea being that code can choose to use the more expensive, yet more secure implementation on a case-by-case basis. In reality, this hasn't happened, so with a grand total of zero users, let's remove the checked variants for now by simply dropping the suffix and predicating the out-of-line functions on CONFIG_REFCOUNT_FULL=y. Signed-off-by: Will Deacon Reviewed-by: Ard Biesheuvel Reviewed-by: Kees Cook Tested-by: Hanjun Guo Cc: Ard Biesheuvel Cc: Elena Reshetova Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: https://lkml.kernel.org/r/20191121115902.2551-4-will@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 513b19a43becee5f7af6d283bb9d3d241a8a21a8 Author: Will Deacon Date: Thu Nov 21 11:58:54 2019 +0000 locking/refcount: Ensure integer operands are treated as signed [ Upstream commit 97a1420adf0cdf0cf6f41bab0b2acf658c96b94b ] In preparation for changing the saturation point of REFCOUNT_FULL to INT_MIN/2, change the type of integer operands passed into the API from 'unsigned int' to 'int' so that we can avoid casting during comparisons when we don't want to fall foul of C integral conversion rules for signed and unsigned types. Since the kernel is compiled with '-fno-strict-overflow', we don't need to worry about the UB introduced by signed overflow here. Furthermore, we're already making heavy use of the atomic_t API, which operates exclusively on signed types. Signed-off-by: Will Deacon Reviewed-by: Ard Biesheuvel Reviewed-by: Kees Cook Tested-by: Hanjun Guo Cc: Ard Biesheuvel Cc: Elena Reshetova Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: https://lkml.kernel.org/r/20191121115902.2551-3-will@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 68b4ee68e8c8800cf8d6b61cc74b4031a0742a4c Author: Will Deacon Date: Thu Nov 21 11:58:53 2019 +0000 locking/refcount: Define constants for saturation and max refcount values [ Upstream commit 23e6b169c9917fbd77534f8c5f378cb073f548bd ] The REFCOUNT_FULL implementation uses a different saturation point than the x86 implementation, which means that the shared refcount code in lib/refcount.c (e.g. refcount_dec_not_one()) needs to be aware of the difference. Rather than duplicate the definitions from the lkdtm driver, instead move them into and update all references accordingly. Signed-off-by: Will Deacon Reviewed-by: Ard Biesheuvel Reviewed-by: Kees Cook Tested-by: Hanjun Guo Cc: Ard Biesheuvel Cc: Elena Reshetova Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: https://lkml.kernel.org/r/20191121115902.2551-2-will@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 3f71d0e292eb144ea1a5524e503ede6a33a86696 Author: GUO Zihua Date: Thu Apr 7 10:16:19 2022 +0800 ima: remove the IMA_TEMPLATE Kconfig option [ Upstream commit 891163adf180bc369b2f11c9dfce6d2758d2a5bd ] The original 'ima' measurement list template contains a hash, defined as 20 bytes, and a null terminated pathname, limited to 255 characters. Other measurement list templates permit both larger hashes and longer pathnames. When the "ima" template is configured as the default, a new measurement list template (ima_template=) must be specified before specifying a larger hash algorithm (ima_hash=) on the boot command line. To avoid this boot command line ordering issue, remove the legacy "ima" template configuration option, allowing it to still be specified on the boot command line. The root cause of this issue is that during the processing of ima_hash, we would try to check whether the hash algorithm is compatible with the template. If the template is not set at the moment we do the check, we check the algorithm against the configured default template. If the default template is "ima", then we reject any hash algorithm other than sha1 and md5. For example, if the compiled default template is "ima", and the default algorithm is sha1 (which is the current default). In the cmdline, we put in "ima_hash=sha256 ima_template=ima-ng". The expected behavior would be that ima starts with ima-ng as the template and sha256 as the hash algorithm. However, during the processing of "ima_hash=", "ima_template=" has not been processed yet, and hash_setup would check the configured hash algorithm against the compiled default: ima, and reject sha256. So at the end, the hash algorithm that is actually used will be sha1. With template "ima" removed from the configured default, we ensure that the default tempalte would at least be "ima-ng" which allows for basically any hash algorithm. This change would not break the algorithm compatibility checks for IMA. Fixes: 4286587dccd43 ("ima: add Kconfig default measurement list template") Signed-off-by: GUO Zihua Cc: Signed-off-by: Mimi Zohar Signed-off-by: Sasha Levin commit bc7581e36d40891110179251e6447a9483d6f40e Author: Alexander Aring Date: Wed Apr 6 13:34:16 2022 -0400 dlm: fix pending remove if msg allocation fails [ Upstream commit ba58995909b5098ca4003af65b0ccd5a8d13dd25 ] This patch unsets ls_remove_len and ls_remove_name if a message allocation of a remove messages fails. In this case we never send a remove message out but set the per ls ls_remove_len ls_remove_name variable for a pending remove. Unset those variable should indicate possible waiters in wait_pending_remove() that no pending remove is going on at this moment. Cc: stable@vger.kernel.org Signed-off-by: Alexander Aring Signed-off-by: David Teigland Signed-off-by: Sasha Levin commit 4f1d21c77b15b6eb60f2929bca33e34f95383b24 Author: Eric Dumazet Date: Thu Jul 7 12:39:00 2022 +0000 bpf: Make sure mac_header was set before using it commit 0326195f523a549e0a9d7fd44c70b26fd7265090 upstream. Classic BPF has a way to load bytes starting from the mac header. Some skbs do not have a mac header, and skb_mac_header() in this case is returning a pointer that 65535 bytes after skb->head. Existing range check in bpf_internal_load_pointer_neg_helper() was properly kicking and no illegal access was happening. New sanity check in skb_mac_header() is firing, so we need to avoid it. WARNING: CPU: 1 PID: 28990 at include/linux/skbuff.h:2785 skb_mac_header include/linux/skbuff.h:2785 [inline] WARNING: CPU: 1 PID: 28990 at include/linux/skbuff.h:2785 bpf_internal_load_pointer_neg_helper+0x1b1/0x1c0 kernel/bpf/core.c:74 Modules linked in: CPU: 1 PID: 28990 Comm: syz-executor.0 Not tainted 5.19.0-rc4-syzkaller-00865-g4874fb9484be #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/29/2022 RIP: 0010:skb_mac_header include/linux/skbuff.h:2785 [inline] RIP: 0010:bpf_internal_load_pointer_neg_helper+0x1b1/0x1c0 kernel/bpf/core.c:74 Code: ff ff 45 31 f6 e9 5a ff ff ff e8 aa 27 40 00 e9 3b ff ff ff e8 90 27 40 00 e9 df fe ff ff e8 86 27 40 00 eb 9e e8 2f 2c f3 ff <0f> 0b eb b1 e8 96 27 40 00 e9 79 fe ff ff 90 41 57 41 56 41 55 41 RSP: 0018:ffffc9000309f668 EFLAGS: 00010216 RAX: 0000000000000118 RBX: ffffffffffeff00c RCX: ffffc9000e417000 RDX: 0000000000040000 RSI: ffffffff81873f21 RDI: 0000000000000003 RBP: ffff8880842878c0 R08: 0000000000000003 R09: 000000000000ffff R10: 000000000000ffff R11: 0000000000000001 R12: 0000000000000004 R13: ffff88803ac56c00 R14: 000000000000ffff R15: dffffc0000000000 FS: 00007f5c88a16700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdaa9f6c058 CR3: 000000003a82c000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ____bpf_skb_load_helper_32 net/core/filter.c:276 [inline] bpf_skb_load_helper_32+0x191/0x220 net/core/filter.c:264 Fixes: f9aefd6b2aa3 ("net: warn if mac header was not set") Reported-by: syzbot Signed-off-by: Eric Dumazet Signed-off-by: Daniel Borkmann Link: https://lore.kernel.org/bpf/20220707123900.945305-1-edumazet@google.com Signed-off-by: Greg Kroah-Hartman commit a1f8765f68bc9bf5744b365bb9f5e0b6db93edfe Author: Wang Cheng Date: Thu May 19 14:08:54 2022 -0700 mm/mempolicy: fix uninit-value in mpol_rebind_policy() commit 018160ad314d75b1409129b2247b614a9f35894c upstream. mpol_set_nodemask()(mm/mempolicy.c) does not set up nodemask when pol->mode is MPOL_LOCAL. Check pol->mode before access pol->w.cpuset_mems_allowed in mpol_rebind_policy()(mm/mempolicy.c). BUG: KMSAN: uninit-value in mpol_rebind_policy mm/mempolicy.c:352 [inline] BUG: KMSAN: uninit-value in mpol_rebind_task+0x2ac/0x2c0 mm/mempolicy.c:368 mpol_rebind_policy mm/mempolicy.c:352 [inline] mpol_rebind_task+0x2ac/0x2c0 mm/mempolicy.c:368 cpuset_change_task_nodemask kernel/cgroup/cpuset.c:1711 [inline] cpuset_attach+0x787/0x15e0 kernel/cgroup/cpuset.c:2278 cgroup_migrate_execute+0x1023/0x1d20 kernel/cgroup/cgroup.c:2515 cgroup_migrate kernel/cgroup/cgroup.c:2771 [inline] cgroup_attach_task+0x540/0x8b0 kernel/cgroup/cgroup.c:2804 __cgroup1_procs_write+0x5cc/0x7a0 kernel/cgroup/cgroup-v1.c:520 cgroup1_tasks_write+0x94/0xb0 kernel/cgroup/cgroup-v1.c:539 cgroup_file_write+0x4c2/0x9e0 kernel/cgroup/cgroup.c:3852 kernfs_fop_write_iter+0x66a/0x9f0 fs/kernfs/file.c:296 call_write_iter include/linux/fs.h:2162 [inline] new_sync_write fs/read_write.c:503 [inline] vfs_write+0x1318/0x2030 fs/read_write.c:590 ksys_write+0x28b/0x510 fs/read_write.c:643 __do_sys_write fs/read_write.c:655 [inline] __se_sys_write fs/read_write.c:652 [inline] __x64_sys_write+0xdb/0x120 fs/read_write.c:652 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x44/0xae Uninit was created at: slab_post_alloc_hook mm/slab.h:524 [inline] slab_alloc_node mm/slub.c:3251 [inline] slab_alloc mm/slub.c:3259 [inline] kmem_cache_alloc+0x902/0x11c0 mm/slub.c:3264 mpol_new mm/mempolicy.c:293 [inline] do_set_mempolicy+0x421/0xb70 mm/mempolicy.c:853 kernel_set_mempolicy mm/mempolicy.c:1504 [inline] __do_sys_set_mempolicy mm/mempolicy.c:1510 [inline] __se_sys_set_mempolicy+0x44c/0xb60 mm/mempolicy.c:1507 __x64_sys_set_mempolicy+0xd8/0x110 mm/mempolicy.c:1507 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x44/0xae KMSAN: uninit-value in mpol_rebind_task (2) https://syzkaller.appspot.com/bug?id=d6eb90f952c2a5de9ea718a1b873c55cb13b59dc This patch seems to fix below bug too. KMSAN: uninit-value in mpol_rebind_mm (2) https://syzkaller.appspot.com/bug?id=f2fecd0d7013f54ec4162f60743a2b28df40926b The uninit-value is pol->w.cpuset_mems_allowed in mpol_rebind_policy(). When syzkaller reproducer runs to the beginning of mpol_new(), mpol_new() mm/mempolicy.c do_mbind() mm/mempolicy.c kernel_mbind() mm/mempolicy.c `mode` is 1(MPOL_PREFERRED), nodes_empty(*nodes) is `true` and `flags` is 0. Then mode = MPOL_LOCAL; ... policy->mode = mode; policy->flags = flags; will be executed. So in mpol_set_nodemask(), mpol_set_nodemask() mm/mempolicy.c do_mbind() kernel_mbind() pol->mode is 4 (MPOL_LOCAL), that `nodemask` in `pol` is not initialized, which will be accessed in mpol_rebind_policy(). Link: https://lkml.kernel.org/r/20220512123428.fq3wofedp6oiotd4@ppc.localdomain Signed-off-by: Wang Cheng Reported-by: Tested-by: Cc: David Rientjes Cc: Vlastimil Babka Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 76668d2a2f367d25ff448e6d7087406af7d7bb2b Author: Marc Kleine-Budde Date: Tue Jul 19 09:22:35 2022 +0200 spi: bcm2835: bcm2835_spi_handle_err(): fix NULL pointer deref for non DMA transfers commit 4ceaa684459d414992acbefb4e4c31f2dfc50641 upstream. In case a IRQ based transfer times out the bcm2835_spi_handle_err() function is called. Since commit 1513ceee70f2 ("spi: bcm2835: Drop dma_pending flag") the TX and RX DMA transfers are unconditionally canceled, leading to NULL pointer derefs if ctlr->dma_tx or ctlr->dma_rx are not set. Fix the NULL pointer deref by checking that ctlr->dma_tx and ctlr->dma_rx are valid pointers before accessing them. Fixes: 1513ceee70f2 ("spi: bcm2835: Drop dma_pending flag") Cc: Lukas Wunner Signed-off-by: Marc Kleine-Budde Link: https://lore.kernel.org/r/20220719072234.2782764-1-mkl@pengutronix.de Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 50a1d3d097503a90cf84ebe120afcde37e9c33b3 Author: Kuniyuki Iwashima Date: Mon Jul 18 10:26:53 2022 -0700 tcp: Fix data-races around sysctl_tcp_max_reordering. [ Upstream commit a11e5b3e7a59fde1a90b0eaeaa82320495cf8cae ] While reading sysctl_tcp_max_reordering, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: dca145ffaa8d ("tcp: allow for bigger reordering level") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit c64b99819de44c83dc04960a98c76eb552a7b841 Author: Kuniyuki Iwashima Date: Mon Jul 18 10:26:51 2022 -0700 tcp: Fix a data-race around sysctl_tcp_rfc1337. [ Upstream commit 0b484c91911e758e53656d570de58c2ed81ec6f2 ] While reading sysctl_tcp_rfc1337, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 6cc566df6806fcc583511be0ef073a209346ef36 Author: Kuniyuki Iwashima Date: Mon Jul 18 10:26:50 2022 -0700 tcp: Fix a data-race around sysctl_tcp_stdurg. [ Upstream commit 4e08ed41cb1194009fc1a916a59ce3ed4afd77cd ] While reading sysctl_tcp_stdurg, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 7f68bed16c7bae60ebce659010e67a85c9dca4f0 Author: Kuniyuki Iwashima Date: Mon Jul 18 10:26:49 2022 -0700 tcp: Fix a data-race around sysctl_tcp_retrans_collapse. [ Upstream commit 1a63cb91f0c2fcdeced6d6edee8d1d886583d139 ] While reading sysctl_tcp_retrans_collapse, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 369d99c2b89f54473adcf9acdf40ea562b5a6e0e Author: Kuniyuki Iwashima Date: Mon Jul 18 10:26:48 2022 -0700 tcp: Fix data-races around sysctl_tcp_slow_start_after_idle. [ Upstream commit 4845b5713ab18a1bb6e31d1fbb4d600240b8b691 ] While reading sysctl_tcp_slow_start_after_idle, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 35089bb203f4 ("[TCP]: Add tcp_slow_start_after_idle sysctl.") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 492f3713b282c0e67e951cd804edd22eccc25412 Author: Kuniyuki Iwashima Date: Mon Jul 18 10:26:47 2022 -0700 tcp: Fix a data-race around sysctl_tcp_thin_linear_timeouts. [ Upstream commit 7c6f2a86ca590d5187a073d987e9599985fb1c7c ] While reading sysctl_tcp_thin_linear_timeouts, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 36e31b0af587 ("net: TCP thin linear timeouts") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 92c35113c63306091df9211375eebd0abd8c2160 Author: Kuniyuki Iwashima Date: Mon Jul 18 10:26:46 2022 -0700 tcp: Fix data-races around sysctl_tcp_recovery. [ Upstream commit e7d2ef837e14a971a05f60ea08c47f3fed1a36e4 ] While reading sysctl_tcp_recovery, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 4f41b1c58a32 ("tcp: use RACK to detect losses") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 83767fe800a311370330d4ec83aa76093b744a80 Author: Kuniyuki Iwashima Date: Mon Jul 18 10:26:45 2022 -0700 tcp: Fix a data-race around sysctl_tcp_early_retrans. [ Upstream commit 52e65865deb6a36718a463030500f16530eaab74 ] While reading sysctl_tcp_early_retrans, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: eed530b6c676 ("tcp: early retransmit") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 795aee11fda4d73615bc56b9111052f32da64830 Author: Kuniyuki Iwashima Date: Mon Jul 18 10:26:44 2022 -0700 tcp: Fix data-races around sysctl knobs related to SYN option. [ Upstream commit 3666f666e99600518ab20982af04a078bbdad277 ] While reading these knobs, they can be changed concurrently. Thus, we need to add READ_ONCE() to their readers. - tcp_sack - tcp_window_scaling - tcp_timestamps Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f39b03bd727a8fea62e82f10fe2e0d753b9930ff Author: Kuniyuki Iwashima Date: Mon Jul 18 10:26:43 2022 -0700 udp: Fix a data-race around sysctl_udp_l3mdev_accept. [ Upstream commit 3d72bb4188c708bb16758c60822fc4dda7a95174 ] While reading sysctl_udp_l3mdev_accept, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 63a6fff353d0 ("net: Avoid receiving packets with an l3mdev on unbound UDP sockets") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 6727f39e99e0f545d815edebb6c94228485427ec Author: Kuniyuki Iwashima Date: Mon Jul 18 10:26:39 2022 -0700 ipv4: Fix a data-race around sysctl_fib_multipath_use_neigh. [ Upstream commit 87507bcb4f5de16bb419e9509d874f4db6c0ad0f ] While reading sysctl_fib_multipath_use_neigh, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: a6db4494d218 ("net: ipv4: Consider failed nexthops in multipath routes") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit a8569f76df7ec5b4b51155c57523a0b356db5741 Author: Hristo Venev Date: Sat Jul 16 11:51:34 2022 +0300 be2net: Fix buffer overflow in be_get_module_eeprom [ Upstream commit d7241f679a59cfe27f92cb5c6272cb429fb1f7ec ] be_cmd_read_port_transceiver_data assumes that it is given a buffer that is at least PAGE_DATA_LEN long, or twice that if the module supports SFF 8472. However, this is not always the case. Fix this by passing the desired offset and length to be_cmd_read_port_transceiver_data so that we only copy the bytes once. Fixes: e36edd9d26cf ("be2net: add ethtool "-m" option support") Signed-off-by: Hristo Venev Link: https://lore.kernel.org/r/20220716085134.6095-1-hristo@venev.name Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 91d6aa19dd72ed5389f2310382a97cb49a69af1e Author: Haibo Chen Date: Mon Jul 18 16:31:41 2022 +0800 gpio: pca953x: only use single read/write for No AI mode [ Upstream commit db8edaa09d7461ec08672a92a2eef63d5882bb79 ] For the device use NO AI mode(not support auto address increment), only use the single read/write when config the regmap. We meet issue on PCA9557PW on i.MX8QXP/DXL evk board, this device do not support AI mode, but when do the regmap sync, regmap will sync 3 byte data to register 1, logically this means write first data to register 1, write second data to register 2, write third data to register 3. But this device do not support AI mode, finally, these three data write only into register 1 one by one. the reault is the value of register 1 alway equal to the latest data, here is the third data, no operation happened on register 2 and register 3. This is not what we expect. Fixes: 49427232764d ("gpio: pca953x: Perform basic regmap conversion") Signed-off-by: Haibo Chen Reviewed-by: Andy Shevchenko Signed-off-by: Bartosz Golaszewski Signed-off-by: Sasha Levin commit 031af9e617a6f51075d97e56fc9e712c7dde2508 Author: Piotr Skajewski Date: Fri Jul 15 14:44:56 2022 -0700 ixgbe: Add locking to prevent panic when setting sriov_numvfs to zero [ Upstream commit 1e53834ce541d4fe271cdcca7703e50be0a44f8a ] It is possible to disable VFs while the PF driver is processing requests from the VF driver. This can result in a panic. BUG: unable to handle kernel paging request at 000000000000106c PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 8 PID: 0 Comm: swapper/8 Kdump: loaded Tainted: G I --------- - Hardware name: Dell Inc. PowerEdge R740/06WXJT, BIOS 2.8.2 08/27/2020 RIP: 0010:ixgbe_msg_task+0x4c8/0x1690 [ixgbe] Code: 00 00 48 8d 04 40 48 c1 e0 05 89 7c 24 24 89 fd 48 89 44 24 10 83 ff 01 0f 84 b8 04 00 00 4c 8b 64 24 10 4d 03 a5 48 22 00 00 <41> 80 7c 24 4c 00 0f 84 8a 03 00 00 0f b7 c7 83 f8 08 0f 84 8f 0a RSP: 0018:ffffb337869f8df8 EFLAGS: 00010002 RAX: 0000000000001020 RBX: 0000000000000000 RCX: 000000000000002b RDX: 0000000000000002 RSI: 0000000000000008 RDI: 0000000000000006 RBP: 0000000000000006 R08: 0000000000000002 R09: 0000000000029780 R10: 00006957d8f42832 R11: 0000000000000000 R12: 0000000000001020 R13: ffff8a00e8978ac0 R14: 000000000000002b R15: ffff8a00e8979c80 FS: 0000000000000000(0000) GS:ffff8a07dfd00000(0000) knlGS:00000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000106c CR3: 0000000063e10004 CR4: 00000000007726e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: ? ttwu_do_wakeup+0x19/0x140 ? try_to_wake_up+0x1cd/0x550 ? ixgbevf_update_xcast_mode+0x71/0xc0 [ixgbevf] ixgbe_msix_other+0x17e/0x310 [ixgbe] __handle_irq_event_percpu+0x40/0x180 handle_irq_event_percpu+0x30/0x80 handle_irq_event+0x36/0x53 handle_edge_irq+0x82/0x190 handle_irq+0x1c/0x30 do_IRQ+0x49/0xd0 common_interrupt+0xf/0xf This can be eventually be reproduced with the following script: while : do echo 63 > /sys/class/net//device/sriov_numvfs sleep 1 echo 0 > /sys/class/net//device/sriov_numvfs sleep 1 done Add lock when disabling SR-IOV to prevent process VF mailbox communication. Fixes: d773d1310625 ("ixgbe: Fix memory leak when SR-IOV VFs are direct assigned") Signed-off-by: Piotr Skajewski Tested-by: Marek Szlosek Signed-off-by: Tony Nguyen Link: https://lore.kernel.org/r/20220715214456.2968711-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 55a2a28b32854d3768f0fb8dfe0c3e37460513db Author: Dawid Lukwinski Date: Fri Jul 15 14:45:41 2022 -0700 i40e: Fix erroneous adapter reinitialization during recovery process [ Upstream commit f838a63369818faadec4ad1736cfbd20ab5da00e ] Fix an issue when driver incorrectly detects state of recovery process and erroneously reinitializes interrupts, which results in a kernel error and call trace message. The issue was caused by a combination of two factors: 1. Assuming the EMP reset issued after completing firmware recovery means the whole recovery process is complete. 2. Erroneous reinitialization of interrupt vector after detecting the above mentioned EMP reset. Fixes (1) by changing how recovery state change is detected and (2) by adjusting the conditional expression to ensure using proper interrupt reinitialization method, depending on the situation. Fixes: 4ff0ee1af016 ("i40e: Introduce recovery mode support") Signed-off-by: Dawid Lukwinski Signed-off-by: Jan Sokolowski Tested-by: Konrad Jankowski Signed-off-by: Tony Nguyen Link: https://lore.kernel.org/r/20220715214542.2968762-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit d88d59faf4e6f9cc4767664206afdb999b10ec77 Author: Przemyslaw Patynowski Date: Fri Jun 24 17:33:01 2022 -0700 iavf: Fix handling of dummy receive descriptors [ Upstream commit a9f49e0060301a9bfebeca76739158d0cf91cdf6 ] Fix memory leak caused by not handling dummy receive descriptor properly. iavf_get_rx_buffer now sets the rx_buffer return value for dummy receive descriptors. Without this patch, when the hardware writes a dummy descriptor, iavf would not free the page allocated for the previous receive buffer. This is an unlikely event but can still happen. [Jesse: massaged commit message] Fixes: efa14c398582 ("iavf: allow null RX descriptors") Signed-off-by: Przemyslaw Patynowski Signed-off-by: Jesse Brandeburg Tested-by: Konrad Jankowski Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit 25d53d858a6c0b89a6e69e376c2a57c4f4c2c8cc Author: Kuniyuki Iwashima Date: Fri Jul 15 10:17:54 2022 -0700 tcp: Fix data-races around sysctl_tcp_fastopen. [ Upstream commit 5a54213318c43f4009ae158347aa6016e3b9b55a ] While reading sysctl_tcp_fastopen, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 2100c8d2d9db ("net-tcp: Fast Open base") Signed-off-by: Kuniyuki Iwashima Acked-by: Yuchung Cheng Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 78420d8e46dfd15dcbfce566de9fd4d0f218304a Author: Kuniyuki Iwashima Date: Fri Jul 15 10:17:53 2022 -0700 tcp: Fix data-races around sysctl_max_syn_backlog. [ Upstream commit 79539f34743d3e14cc1fa6577d326a82cc64d62f ] While reading sysctl_max_syn_backlog, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit dc58e68d1e26f8da477b0ddad575cb2d7cf7e38f Author: Kuniyuki Iwashima Date: Fri Jul 15 10:17:52 2022 -0700 tcp: Fix a data-race around sysctl_tcp_tw_reuse. [ Upstream commit cbfc6495586a3f09f6f07d9fb3c7cafe807e3c55 ] While reading sysctl_tcp_tw_reuse, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit e9362a993886613ef0284c2a4911c6017c97d803 Author: Kuniyuki Iwashima Date: Fri Jul 15 10:17:51 2022 -0700 tcp: Fix a data-race around sysctl_tcp_notsent_lowat. [ Upstream commit 55be873695ed8912eb77ff46d1d1cadf028bd0f3 ] While reading sysctl_tcp_notsent_lowat, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: c9bee3b7fdec ("tcp: TCP_NOTSENT_LOWAT socket option") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit b0d9f04c870e46e6447fde854094b16159c79222 Author: Kuniyuki Iwashima Date: Fri Jul 15 10:17:50 2022 -0700 tcp: Fix data-races around some timeout sysctl knobs. [ Upstream commit 39e24435a776e9de5c6dd188836cf2523547804b ] While reading these sysctl knobs, they can be changed concurrently. Thus, we need to add READ_ONCE() to their readers. - tcp_retries1 - tcp_retries2 - tcp_orphan_retries - tcp_fin_timeout Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit ea309c467dac4f8b0a9ab01424ea7e9e29073ca7 Author: Kuniyuki Iwashima Date: Fri Jul 15 10:17:49 2022 -0700 tcp: Fix data-races around sysctl_tcp_reordering. [ Upstream commit 46778cd16e6a5ad1b2e3a91f6c057c907379418e ] While reading sysctl_tcp_reordering, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit b222de2560ab56ca30452d35b7cc0fdd7be98ed2 Author: Kuniyuki Iwashima Date: Fri Jul 15 10:17:47 2022 -0700 tcp: Fix data-races around sysctl_tcp_syncookies. [ Upstream commit f2e383b5bb6bbc60a0b94b87b3e49a2b1aefd11e ] While reading sysctl_tcp_syncookies, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit ff55c025e64730114b6e43041f7ecd013d9fe3ca Author: Kuniyuki Iwashima Date: Fri Jul 15 10:17:42 2022 -0700 igmp: Fix a data-race around sysctl_igmp_max_memberships. [ Upstream commit 6305d821e3b9b5379d348528e5b5faf316383bc2 ] While reading sysctl_igmp_max_memberships, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 1656ecaddf90e2a070ec2d2404cdae3edf80faca Author: Kuniyuki Iwashima Date: Fri Jul 15 10:17:41 2022 -0700 igmp: Fix data-races around sysctl_igmp_llm_reports. [ Upstream commit f6da2267e71106474fbc0943dc24928b9cb79119 ] While reading sysctl_igmp_llm_reports, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. This test can be packed into a helper, so such changes will be in the follow-up series after net is merged into net-next. if (ipv4_is_local_multicast(pmc->multiaddr) && !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) Fixes: df2cf4a78e48 ("IGMP: Inhibit reports for local multicast groups") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 2aad2c5745ecb59819bcf6181cd0153a1cf680c0 Author: Tariq Toukan Date: Fri Jul 15 11:42:16 2022 +0300 net/tls: Fix race in TLS device down flow [ Upstream commit f08d8c1bb97c48f24a82afaa2fd8c140f8d3da8b ] Socket destruction flow and tls_device_down function sync against each other using tls_device_lock and the context refcount, to guarantee the device resources are freed via tls_dev_del() by the end of tls_device_down. In the following unfortunate flow, this won't happen: - refcount is decreased to zero in tls_device_sk_destruct. - tls_device_down starts, skips the context as refcount is zero, going all the way until it flushes the gc work, and returns without freeing the device resources. - only then, tls_device_queue_ctx_destruction is called, queues the gc work and frees the context's device resources. Solve it by decreasing the refcount in the socket's destruction flow under the tls_device_lock, for perfect synchronization. This does not slow down the common likely destructor flow, in which both the refcount is decreased and the spinlock is acquired, anyway. Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure") Reviewed-by: Maxim Mikityanskiy Signed-off-by: Tariq Toukan Reviewed-by: Jakub Kicinski Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 573768dede0e2b7de38ecbc11cb3ee47643902dc Author: Junxiao Chang Date: Fri Jul 15 15:47:01 2022 +0800 net: stmmac: fix dma queue left shift overflow issue [ Upstream commit 613b065ca32e90209024ec4a6bb5ca887ee70980 ] When queue number is > 4, left shift overflows due to 32 bits integer variable. Mask calculation is wrong for MTL_RXQ_DMA_MAP1. If CONFIG_UBSAN is enabled, kernel dumps below warning: [ 10.363842] ================================================================== [ 10.363882] UBSAN: shift-out-of-bounds in /build/linux-intel-iotg-5.15-8e6Tf4/ linux-intel-iotg-5.15-5.15.0/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c:224:12 [ 10.363929] shift exponent 40 is too large for 32-bit type 'unsigned int' [ 10.363953] CPU: 1 PID: 599 Comm: NetworkManager Not tainted 5.15.0-1003-intel-iotg [ 10.363956] Hardware name: ADLINK Technology Inc. LEC-EL/LEC-EL, BIOS 0.15.11 12/22/2021 [ 10.363958] Call Trace: [ 10.363960] [ 10.363963] dump_stack_lvl+0x4a/0x5f [ 10.363971] dump_stack+0x10/0x12 [ 10.363974] ubsan_epilogue+0x9/0x45 [ 10.363976] __ubsan_handle_shift_out_of_bounds.cold+0x61/0x10e [ 10.363979] ? wake_up_klogd+0x4a/0x50 [ 10.363983] ? vprintk_emit+0x8f/0x240 [ 10.363986] dwmac4_map_mtl_dma.cold+0x42/0x91 [stmmac] [ 10.364001] stmmac_mtl_configuration+0x1ce/0x7a0 [stmmac] [ 10.364009] ? dwmac410_dma_init_channel+0x70/0x70 [stmmac] [ 10.364020] stmmac_hw_setup.cold+0xf/0xb14 [stmmac] [ 10.364030] ? page_pool_alloc_pages+0x4d/0x70 [ 10.364034] ? stmmac_clear_tx_descriptors+0x6e/0xe0 [stmmac] [ 10.364042] stmmac_open+0x39e/0x920 [stmmac] [ 10.364050] __dev_open+0xf0/0x1a0 [ 10.364054] __dev_change_flags+0x188/0x1f0 [ 10.364057] dev_change_flags+0x26/0x60 [ 10.364059] do_setlink+0x908/0xc40 [ 10.364062] ? do_setlink+0xb10/0xc40 [ 10.364064] ? __nla_validate_parse+0x4c/0x1a0 [ 10.364068] __rtnl_newlink+0x597/0xa10 [ 10.364072] ? __nla_reserve+0x41/0x50 [ 10.364074] ? __kmalloc_node_track_caller+0x1d0/0x4d0 [ 10.364079] ? pskb_expand_head+0x75/0x310 [ 10.364082] ? nla_reserve_64bit+0x21/0x40 [ 10.364086] ? skb_free_head+0x65/0x80 [ 10.364089] ? security_sock_rcv_skb+0x2c/0x50 [ 10.364094] ? __cond_resched+0x19/0x30 [ 10.364097] ? kmem_cache_alloc_trace+0x15a/0x420 [ 10.364100] rtnl_newlink+0x49/0x70 This change fixes MTL_RXQ_DMA_MAP1 mask issue and channel/queue mapping warning. Fixes: d43042f4da3e ("net: stmmac: mapping mtl rx to dma channel") BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=216195 Reported-by: Cedric Wassenaar Signed-off-by: Junxiao Chang Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 911b81fca2d7d17668b4062df771486034e343b1 Author: Robert Hancock Date: Tue Jun 14 17:29:19 2022 -0600 i2c: cadence: Change large transfer count reset logic to be unconditional [ Upstream commit 4ca8ca873d454635c20d508261bfc0081af75cf8 ] Problems were observed on the Xilinx ZynqMP platform with large I2C reads. When a read of 277 bytes was performed, the controller NAKed the transfer after only 252 bytes were transferred and returned an ENXIO error on the transfer. There is some code in cdns_i2c_master_isr to handle this case by resetting the transfer count in the controller before it reaches 0, to allow larger transfers to work, but it was conditional on the CDNS_I2C_BROKEN_HOLD_BIT quirk being set on the controller, and ZynqMP uses the r1p14 version of the core where this quirk is not being set. The requirement to do this to support larger reads seems like an inherently required workaround due to the core only having an 8-bit transfer size register, so it does not appear that this should be conditional on the broken HOLD bit quirk which is used elsewhere in the driver. Remove the dependency on the CDNS_I2C_BROKEN_HOLD_BIT for this transfer size reset logic to fix this problem. Fixes: 63cab195bf49 ("i2c: removed work arounds in i2c driver for Zynq Ultrascale+ MPSoC") Signed-off-by: Robert Hancock Reviewed-by: Shubhrajyoti Datta Acked-by: Michal Simek Signed-off-by: Wolfram Sang Signed-off-by: Sasha Levin commit 73a11588751a2c13f25d9da8117efc9a79b1843f Author: Kuniyuki Iwashima Date: Wed Jul 13 13:52:05 2022 -0700 tcp: Fix a data-race around sysctl_tcp_probe_interval. [ Upstream commit 2a85388f1d94a9f8b5a529118a2c5eaa0520d85c ] While reading sysctl_tcp_probe_interval, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 05cbc0db03e8 ("ipv4: Create probe timer for tcp PMTU as per RFC4821") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit b04817c94fbd285a967d9b830b274fe9998c9c0b Author: Kuniyuki Iwashima Date: Wed Jul 13 13:52:04 2022 -0700 tcp: Fix a data-race around sysctl_tcp_probe_threshold. [ Upstream commit 92c0aa4175474483d6cf373314343d4e624e882a ] While reading sysctl_tcp_probe_threshold, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 6b58e0a5f32d ("ipv4: Use binary search to choose tcp PMTU probe_size") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 033963b220633ed1602d458e7e4ac06afa9fefb2 Author: Kuniyuki Iwashima Date: Wed Jul 13 13:52:03 2022 -0700 tcp: Fix a data-race around sysctl_tcp_mtu_probe_floor. [ Upstream commit 8e92d4423615a5257d0d871fc067aa561f597deb ] While reading sysctl_tcp_mtu_probe_floor, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: c04b79b6cfd7 ("tcp: add new tcp_mtu_probe_floor sysctl") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit fdb96b69f5909ffcdd6f1e0902219fc6d7689ff7 Author: Kuniyuki Iwashima Date: Wed Jul 13 13:52:02 2022 -0700 tcp: Fix data-races around sysctl_tcp_min_snd_mss. [ Upstream commit 78eb166cdefcc3221c8c7c1e2d514e91a2eb5014 ] While reading sysctl_tcp_min_snd_mss, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 5f3e2bf008c2 ("tcp: add tcp_min_snd_mss sysctl") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 30b73edc1d2459ba2c71cb58fbf84a1a6e640fbf Author: Kuniyuki Iwashima Date: Wed Jul 13 13:52:01 2022 -0700 tcp: Fix data-races around sysctl_tcp_base_mss. [ Upstream commit 88d78bc097cd8ebc6541e93316c9d9bf651b13e8 ] While reading sysctl_tcp_base_mss, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 5d424d5a674f ("[TCP]: MTU probing") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f966773e13cdd3f12baa90071b7b660f6c633ccb Author: Kuniyuki Iwashima Date: Wed Jul 13 13:52:00 2022 -0700 tcp: Fix data-races around sysctl_tcp_mtu_probing. [ Upstream commit f47d00e077e7d61baf69e46dde3210c886360207 ] While reading sysctl_tcp_mtu_probing, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 5d424d5a674f ("[TCP]: MTU probing") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit a7386602a2fe2f6192477e8ede291a815da09d81 Author: Kuniyuki Iwashima Date: Wed Jul 13 13:51:58 2022 -0700 tcp/dccp: Fix a data-race around sysctl_tcp_fwmark_accept. [ Upstream commit 1a0008f9df59451d0a17806c1ee1a19857032fa8 ] While reading sysctl_tcp_fwmark_accept, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 84f39b08d786 ("net: support marking accepting TCP sockets") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 25a635a67c830766110410fea88ec4e6ee29684b Author: Kuniyuki Iwashima Date: Wed Jul 13 13:51:57 2022 -0700 ip: Fix a data-race around sysctl_fwmark_reflect. [ Upstream commit 85d0b4dbd74b95cc492b1f4e34497d3f894f5d9a ] While reading sysctl_fwmark_reflect, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: e110861f8609 ("net: add a sysctl to reflect the fwmark on replies") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 281de3719986af659186e2bbef2f169753391a05 Author: Kuniyuki Iwashima Date: Wed Jul 13 13:51:55 2022 -0700 ip: Fix data-races around sysctl_ip_nonlocal_bind. [ Upstream commit 289d3b21fb0bfc94c4e98f10635bba1824e5f83c ] While reading sysctl_ip_nonlocal_bind, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 7828309df0f89419a9349761a37c7d1b0da45697 Author: Kuniyuki Iwashima Date: Wed Jul 13 13:51:53 2022 -0700 ip: Fix data-races around sysctl_ip_fwd_use_pmtu. [ Upstream commit 60c158dc7b1f0558f6cadd5b50d0386da0000d50 ] While reading sysctl_ip_fwd_use_pmtu, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: f87c10a8aa1e ("ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 5af6d9226376fbd0dba6876139385e770342726f Author: Kuniyuki Iwashima Date: Wed Jul 13 13:51:52 2022 -0700 ip: Fix data-races around sysctl_ip_no_pmtu_disc. [ Upstream commit 0968d2a441bf6afb551fd99e60fa65ed67068963 ] While reading sysctl_ip_no_pmtu_disc, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 16cb6717f4f42487ef10583eb8bc98e7d1e33d65 Author: Lennert Buytenhek Date: Thu Jun 2 18:58:11 2022 +0300 igc: Reinstate IGC_REMOVED logic and implement it properly [ Upstream commit 7c1ddcee5311f3315096217881d2dbe47cc683f9 ] The initially merged version of the igc driver code (via commit 146740f9abc4, "igc: Add support for PF") contained the following IGC_REMOVED checks in the igc_rd32/wr32() MMIO accessors: u32 igc_rd32(struct igc_hw *hw, u32 reg) { u8 __iomem *hw_addr = READ_ONCE(hw->hw_addr); u32 value = 0; if (IGC_REMOVED(hw_addr)) return ~value; value = readl(&hw_addr[reg]); /* reads should not return all F's */ if (!(~value) && (!reg || !(~readl(hw_addr)))) hw->hw_addr = NULL; return value; } And: #define wr32(reg, val) \ do { \ u8 __iomem *hw_addr = READ_ONCE((hw)->hw_addr); \ if (!IGC_REMOVED(hw_addr)) \ writel((val), &hw_addr[(reg)]); \ } while (0) E.g. igb has similar checks in its MMIO accessors, and has a similar macro E1000_REMOVED, which is implemented as follows: #define E1000_REMOVED(h) unlikely(!(h)) These checks serve to detect and take note of an 0xffffffff MMIO read return from the device, which can be caused by a PCIe link flap or some other kind of PCI bus error, and to avoid performing MMIO reads and writes from that point onwards. However, the IGC_REMOVED macro was not originally implemented: #ifndef IGC_REMOVED #define IGC_REMOVED(a) (0) #endif /* IGC_REMOVED */ This led to the IGC_REMOVED logic to be removed entirely in a subsequent commit (commit 3c215fb18e70, "igc: remove IGC_REMOVED function"), with the rationale that such checks matter only for virtualization and that igc does not support virtualization -- but a PCIe device can become detached even without virtualization being in use, and without proper checks, a PCIe bus error affecting an igc adapter will lead to various NULL pointer dereferences, as the first access after the error will set hw->hw_addr to NULL, and subsequent accesses will blindly dereference this now-NULL pointer. This patch reinstates the IGC_REMOVED checks in igc_rd32/wr32(), and implements IGC_REMOVED the way it is done for igb, by checking for the unlikely() case of hw_addr being NULL. This change prevents the oopses seen when a PCIe link flap occurs on an igc adapter. Fixes: 146740f9abc4 ("igc: Add support for PF") Signed-off-by: Lennert Buytenhek Tested-by: Naama Meir Acked-by: Sasha Neftin Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit 98c3c8fd0d4c560e0f8335b79c407bbf7fc9462c Author: Peter Zijlstra Date: Tue Jul 5 15:07:26 2022 +0200 perf/core: Fix data race between perf_event_set_output() and perf_mmap_close() [ Upstream commit 68e3c69803dada336893640110cb87221bb01dcf ] Yang Jihing reported a race between perf_event_set_output() and perf_mmap_close(): CPU1 CPU2 perf_mmap_close(e2) if (atomic_dec_and_test(&e2->rb->mmap_count)) // 1 - > 0 detach_rest = true ioctl(e1, IOC_SET_OUTPUT, e2) perf_event_set_output(e1, e2) ... list_for_each_entry_rcu(e, &e2->rb->event_list, rb_entry) ring_buffer_attach(e, NULL); // e1 isn't yet added and // therefore not detached ring_buffer_attach(e1, e2->rb) list_add_rcu(&e1->rb_entry, &e2->rb->event_list) After this; e1 is attached to an unmapped rb and a subsequent perf_mmap() will loop forever more: again: mutex_lock(&e->mmap_mutex); if (event->rb) { ... if (!atomic_inc_not_zero(&e->rb->mmap_count)) { ... mutex_unlock(&e->mmap_mutex); goto again; } } The loop in perf_mmap_close() holds e2->mmap_mutex, while the attach in perf_event_set_output() holds e1->mmap_mutex. As such there is no serialization to avoid this race. Change perf_event_set_output() to take both e1->mmap_mutex and e2->mmap_mutex to alleviate that problem. Additionally, have the loop in perf_mmap() detach the rb directly, this avoids having to wait for the concurrent perf_mmap_close() to get around to doing it to make progress. Fixes: 9bb5d40cd93c ("perf: Fix mmap() accounting hole") Reported-by: Yang Jihong Signed-off-by: Peter Zijlstra (Intel) Tested-by: Yang Jihong Link: https://lkml.kernel.org/r/YsQ3jm2GR38SW7uD@worktop.programming.kicks-ass.net Signed-off-by: Sasha Levin commit 6194c021496addc11763d1ffa89ce5751889fe3c Author: William Dean Date: Sun Jul 10 23:49:22 2022 +0800 pinctrl: ralink: Check for null return of devm_kcalloc [ Upstream commit c3b821e8e406d5650e587b7ac624ac24e9b780a8 ] Because of the possible failure of the allocation, data->domains might be NULL pointer and will cause the dereference of the NULL pointer later. Therefore, it might be better to check it and directly return -ENOMEM without releasing data manually if fails, because the comment of the devm_kmalloc() says "Memory allocated with this function is automatically freed on driver detach.". Fixes: a86854d0c599b ("treewide: devm_kzalloc() -> devm_kcalloc()") Reported-by: Hacash Robot Signed-off-by: William Dean Link: https://lore.kernel.org/r/20220710154922.2610876-1-williamsukatube@163.com Signed-off-by: Linus Walleij Signed-off-by: Sasha Levin commit 78bdf732cf5d74d1c6ecda06830a91f80a4aef6f Author: Miaoqian Lin Date: Mon May 23 18:10:09 2022 +0400 power/reset: arm-versatile: Fix refcount leak in versatile_reboot_probe [ Upstream commit 80192eff64eee9b3bc0594a47381937b94b9d65a ] of_find_matching_node_and_match() returns a node pointer with refcount incremented, we should use of_node_put() on it when not need anymore. Add missing of_node_put() to avoid refcount leak. Fixes: 0e545f57b708 ("power: reset: driver for the Versatile syscon reboot") Signed-off-by: Miaoqian Lin Reviewed-by: Linus Walleij Signed-off-by: Sebastian Reichel Signed-off-by: Sasha Levin commit f4248bdb7d5c1150a2a6f8c3d3b6da0b71f62a20 Author: Hangyu Hua Date: Wed Jun 1 14:46:25 2022 +0800 xfrm: xfrm_policy: fix a possible double xfrm_pols_put() in xfrm_bundle_lookup() [ Upstream commit f85daf0e725358be78dfd208dea5fd665d8cb901 ] xfrm_policy_lookup() will call xfrm_pol_hold_rcu() to get a refcount of pols[0]. This refcount can be dropped in xfrm_expand_policies() when xfrm_expand_policies() return error. pols[0]'s refcount is balanced in here. But xfrm_bundle_lookup() will also call xfrm_pols_put() with num_pols == 1 to drop this refcount when xfrm_expand_policies() return error. This patch also fix an illegal address access. pols[0] will save a error point when xfrm_policy_lookup fails. This lead to xfrm_pols_put to resolve an illegal address in xfrm_bundle_lookup's error path. Fix these by setting num_pols = 0 in xfrm_expand_policies()'s error path. Fixes: 80c802f3073e ("xfrm: cache bundles instead of policies for outgoing flows") Signed-off-by: Hangyu Hua Signed-off-by: Steffen Klassert Signed-off-by: Sasha Levin commit c68f6e2e4fda5f36d403f64fa57a2804ec59a87d Author: Pali Rohár Date: Tue Jun 28 12:09:22 2022 +0200 serial: mvebu-uart: correctly report configured baudrate value commit 4f532c1e25319e42996ec18a1f473fd50c8e575d upstream. Functions tty_termios_encode_baud_rate() and uart_update_timeout() should be called with the baudrate value which was set to hardware. Linux then report exact values via ioctl(TCGETS2) to userspace. Change mvebu_uart_baud_rate_set() function to return baudrate value which was set to hardware and propagate this value to above mentioned functions. With this change userspace would see precise value in termios c_ospeed field. Fixes: 68a0db1d7da2 ("serial: mvebu-uart: add function to change baudrate") Cc: stable Reviewed-by: Ilpo Järvinen Signed-off-by: Pali Rohár Link: https://lore.kernel.org/r/20220628100922.10717-1-pali@kernel.org Signed-off-by: Greg Kroah-Hartman commit 2230428fb866cb2a7a62e35b7aedfd43d3afa797 Author: Jeffrey Hugo Date: Fri Jul 15 21:36:11 2022 +0000 PCI: hv: Fix interrupt mapping for multi-MSI commit a2bad844a67b1c7740bda63e87453baf63c3a7f7 upstream. According to Dexuan, the hypervisor folks beleive that multi-msi allocations are not correct. compose_msi_msg() will allocate multi-msi one by one. However, multi-msi is a block of related MSIs, with alignment requirements. In order for the hypervisor to allocate properly aligned and consecutive entries in the IOMMU Interrupt Remapping Table, there should be a single mapping request that requests all of the multi-msi vectors in one shot. Dexuan suggests detecting the multi-msi case and composing a single request related to the first MSI. Then for the other MSIs in the same block, use the cached information. This appears to be viable, so do it. 5.4 backport - add hv_msi_get_int_vector helper function. Fixed merge conflict due to delivery_mode name change (APIC_DELIVERY_MODE_FIXED is the value given to dest_Fixed). Removed unused variable in hv_compose_msi_msg. Fixed reference to msi_desc->pci to point to the same is_msix variable. Removed changes to compose_msi_req_v3 since it doesn't exist yet. Suggested-by: Dexuan Cui Signed-off-by: Jeffrey Hugo Reviewed-by: Dexuan Cui Tested-by: Michael Kelley Link: https://lore.kernel.org/r/1652282599-21643-1-git-send-email-quic_jhugo@quicinc.com Signed-off-by: Wei Liu Signed-off-by: Carl Vanderlip Signed-off-by: Greg Kroah-Hartman commit 7121d7120fd48115a158720f3424de71d44c77a4 Author: Jeffrey Hugo Date: Fri Jul 15 21:36:10 2022 +0000 PCI: hv: Reuse existing IRTE allocation in compose_msi_msg() commit b4b77778ecc5bfbd4e77de1b2fd5c1dd3c655f1f upstream. Currently if compose_msi_msg() is called multiple times, it will free any previous IRTE allocation, and generate a new allocation. While nothing prevents this from occurring, it is extraneous when Linux could just reuse the existing allocation and avoid a bunch of overhead. However, when future IRTE allocations operate on blocks of MSIs instead of a single line, freeing the allocation will impact all of the lines. This could cause an issue where an allocation of N MSIs occurs, then some of the lines are retargeted, and finally the allocation is freed/reallocated. The freeing of the allocation removes all of the configuration for the entire block, which requires all the lines to be retargeted, which might not happen since some lines might already be unmasked/active. Signed-off-by: Jeffrey Hugo Reviewed-by: Dexuan Cui Tested-by: Dexuan Cui Tested-by: Michael Kelley Link: https://lore.kernel.org/r/1652282582-21595-1-git-send-email-quic_jhugo@quicinc.com Signed-off-by: Wei Liu Signed-off-by: Carl Vanderlip Signed-off-by: Greg Kroah-Hartman commit 584c9d41800ba53e705205213f6d1bafeacc24a8 Author: Jeffrey Hugo Date: Fri Jul 15 21:36:09 2022 +0000 PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI commit 455880dfe292a2bdd3b4ad6a107299fce610e64b upstream. In the multi-MSI case, hv_arch_irq_unmask() will only operate on the first MSI of the N allocated. This is because only the first msi_desc is cached and it is shared by all the MSIs of the multi-MSI block. This means that hv_arch_irq_unmask() gets the correct address, but the wrong data (always 0). This can break MSIs. Lets assume MSI0 is vector 34 on CPU0, and MSI1 is vector 33 on CPU0. hv_arch_irq_unmask() is called on MSI0. It uses a hypercall to configure the MSI address and data (0) to vector 34 of CPU0. This is correct. Then hv_arch_irq_unmask is called on MSI1. It uses another hypercall to configure the MSI address and data (0) to vector 33 of CPU0. This is wrong, and results in both MSI0 and MSI1 being routed to vector 33. Linux will observe extra instances of MSI1 and no instances of MSI0 despite the endpoint device behaving correctly. For the multi-MSI case, we need unique address and data info for each MSI, but the cached msi_desc does not provide that. However, that information can be gotten from the int_desc cached in the chip_data by compose_msi_msg(). Fix the multi-MSI case to use that cached information instead. Since hv_set_msi_entry_from_desc() is no longer applicable, remove it. 5.4 backport - hv_set_msi_entry_from_desc doesn't exist to be removed. msi_desc replaces msi_entry for location int_desc is written to. Signed-off-by: Jeffrey Hugo Reviewed-by: Michael Kelley Link: https://lore.kernel.org/r/1651068453-29588-1-git-send-email-quic_jhugo@quicinc.com Signed-off-by: Wei Liu Signed-off-by: Carl Vanderlip Signed-off-by: Greg Kroah-Hartman commit 8e94cc8830117690801b4fac47aad2643a366c14 Author: Jeffrey Hugo Date: Fri Jul 15 21:36:08 2022 +0000 PCI: hv: Fix multi-MSI to allow more than one MSI vector commit 08e61e861a0e47e5e1a3fb78406afd6b0cea6b6d upstream. If the allocation of multiple MSI vectors for multi-MSI fails in the core PCI framework, the framework will retry the allocation as a single MSI vector, assuming that meets the min_vecs specified by the requesting driver. Hyper-V advertises that multi-MSI is supported, but reuses the VECTOR domain to implement that for x86. The VECTOR domain does not support multi-MSI, so the alloc will always fail and fallback to a single MSI allocation. In short, Hyper-V advertises a capability it does not implement. Hyper-V can support multi-MSI because it coordinates with the hypervisor to map the MSIs in the IOMMU's interrupt remapper, which is something the VECTOR domain does not have. Therefore the fix is simple - copy what the x86 IOMMU drivers (AMD/Intel-IR) do by removing X86_IRQ_ALLOC_CONTIGUOUS_VECTORS after calling the VECTOR domain's pci_msi_prepare(). 5.4 backport - adds the hv_msi_prepare wrapper function. X86_IRQ_ALLOC_TYPE_PCI_MSI changed to X86_IRQ_ALLOC_TYPE_MSI (same value). Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs") Signed-off-by: Jeffrey Hugo Reviewed-by: Dexuan Cui Link: https://lore.kernel.org/r/1649856981-14649-1-git-send-email-quic_jhugo@quicinc.com Signed-off-by: Wei Liu Signed-off-by: Carl Vanderlip Signed-off-by: Greg Kroah-Hartman commit 3048666143be83807b1250045be3bb919a989981 Author: Demi Marie Obenour Date: Sun Jul 10 19:05:22 2022 -0400 xen/gntdev: Ignore failure to unmap INVALID_GRANT_HANDLE commit 166d3863231667c4f64dee72b77d1102cdfad11f upstream. The error paths of gntdev_mmap() can call unmap_grant_pages() even though not all of the pages have been successfully mapped. This will trigger the WARN_ON()s in __unmap_grant_pages_done(). The number of warnings can be very large; I have observed thousands of lines of warnings in the systemd journal. Avoid this problem by only warning on unmapping failure if the handle being unmapped is not INVALID_GRANT_HANDLE. The handle field of any page that was not successfully mapped will be INVALID_GRANT_HANDLE, so this catches all cases where unmapping can legitimately fail. Fixes: dbe97cff7dd9 ("xen/gntdev: Avoid blocking in unmap_grant_pages()") Cc: stable@vger.kernel.org Suggested-by: Juergen Gross Signed-off-by: Demi Marie Obenour Reviewed-by: Oleksandr Tyshchenko Reviewed-by: Juergen Gross Link: https://lore.kernel.org/r/20220710230522.1563-1-demi@invisiblethingslab.com Signed-off-by: Juergen Gross Signed-off-by: Greg Kroah-Hartman commit ed3fea55066b4e054c4d212e54f9965abcac9685 Author: Eric Snowberg Date: Wed Jul 20 12:40:27 2022 -0400 lockdown: Fix kexec lockdown bypass with ima policy commit 543ce63b664e2c2f9533d089a4664b559c3e6b5b upstream. The lockdown LSM is primarily used in conjunction with UEFI Secure Boot. This LSM may also be used on machines without UEFI. It can also be enabled when UEFI Secure Boot is disabled. One of lockdown's features is to prevent kexec from loading untrusted kernels. Lockdown can be enabled through a bootparam or after the kernel has booted through securityfs. If IMA appraisal is used with the "ima_appraise=log" boot param, lockdown can be defeated with kexec on any machine when Secure Boot is disabled or unavailable. IMA prevents setting "ima_appraise=log" from the boot param when Secure Boot is enabled, but this does not cover cases where lockdown is used without Secure Boot. To defeat lockdown, boot without Secure Boot and add ima_appraise=log to the kernel command line; then: $ echo "integrity" > /sys/kernel/security/lockdown $ echo "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig" > \ /sys/kernel/security/ima/policy $ kexec -ls unsigned-kernel Add a call to verify ima appraisal is set to "enforce" whenever lockdown is enabled. This fixes CVE-2022-21505. Cc: stable@vger.kernel.org Fixes: 29d3c1c8dfe7 ("kexec: Allow kexec_file() with appropriate IMA policy when locked down") Signed-off-by: Eric Snowberg Acked-by: Mimi Zohar Reviewed-by: John Haxby Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit c3856fe718ad870f2640acf1cecc71941c3f337d Author: Ido Schimmel Date: Tue Jul 19 15:26:26 2022 +0300 mlxsw: spectrum_router: Fix IPv4 nexthop gateway indication commit e5ec6a2513383fe2ecc2ee3b5f51d97acbbcd4d8 upstream. mlxsw needs to distinguish nexthops with a gateway from connected nexthops in order to write the former to the adjacency table of the device. The check used to rely on the fact that nexthops with a gateway have a 'link' scope whereas connected nexthops have a 'host' scope. This is no longer correct after commit 747c14307214 ("ip: fix dflt addr selection for connected nexthop"). Fix that by instead checking the address family of the gateway IP. This is a more direct way and also consistent with the IPv6 counterpart in mlxsw_sp_rt6_is_gateway(). Cc: stable@vger.kernel.org Fixes: 747c14307214 ("ip: fix dflt addr selection for connected nexthop") Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops") Signed-off-by: Ido Schimmel Reviewed-by: Amit Cohen Reviewed-by: Nicolas Dichtel Reviewed-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c3dc751184454e4a2f21b7e9d3df4e7f526f20aa Author: Ben Dooks Date: Sun May 29 16:22:00 2022 +0100 riscv: add as-options for modules with assembly compontents commit c1f6eff304e4dfa4558b6a8c6b2d26a91db6c998 upstream. When trying to load modules built for RISC-V which include assembly files the kernel loader errors with "unexpected relocation type 'R_RISCV_ALIGN'" due to R_RISCV_ALIGN relocations being generated by the assembler. The R_RISCV_ALIGN relocations can be removed at the expense of code space by adding -mno-relax to gcc and as. In commit 7a8e7da42250138 ("RISC-V: Fixes to module loading") -mno-relax is added to the build variable KBUILD_CFLAGS_MODULE. See [1] for more info. The issue is that when kbuild builds a .S file, it invokes gcc with the -mno-relax flag, but this is not being passed through to the assembler. Adding -Wa,-mno-relax to KBUILD_AFLAGS_MODULE ensures that the assembler is invoked correctly. This may have now been fixed in gcc[2] and this addition should not stop newer gcc and as from working. [1] https://github.com/riscv/riscv-elf-psabi-doc/issues/183 [2] https://github.com/gcc-mirror/gcc/commit/3b0a7d624e64eeb81e4d5e8c62c46d86ef521857 Signed-off-by: Ben Dooks Reviewed-by: Bin Meng Link: https://lore.kernel.org/r/20220529152200.609809-1-ben.dooks@codethink.co.uk Fixes: ab1ef68e5401 ("RISC-V: Add sections of PLT and GOT for kernel module") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt Signed-off-by: Greg Kroah-Hartman commit e5a6b05d0c68435b7ef6aa51a400a3592770fdc4 Author: Fabien Dessenne Date: Mon Jun 27 16:23:50 2022 +0200 pinctrl: stm32: fix optional IRQ support to gpios commit a1d4ef1adf8bbd302067534ead671a94759687ed upstream. To act as an interrupt controller, a gpio bank relies on the "interrupt-parent" of the pin controller. When this optional "interrupt-parent" misses, do not create any IRQ domain. This fixes a "NULL pointer in stm32_gpio_domain_alloc()" kernel crash when the interrupt-parent = property is not declared in the Device Tree. Fixes: 0eb9f683336d ("pinctrl: Add IRQ support to STM32 gpios") Signed-off-by: Fabien Dessenne Link: https://lore.kernel.org/r/20220627142350.742973-1-fabien.dessenne@foss.st.com Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman