forked from torvalds/linux
-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update 5.4-1.0.0-imx to v5.4.47 from stable #86
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
[ Upstream commit 79a1f0c ] Socket option IPV6_ADDRFORM supports UDP/UDPLITE and TCP at present. Previously the checking logic looks like: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; else if (sk->sk_protocol != IPPROTO_TCP) break; After commit b6f6118 ("ipv6: restrict IPV6_ADDRFORM operation"), TCP was blocked as the logic changed to: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; else if (sk->sk_protocol == IPPROTO_TCP) do_some_check; break; else break; Then after commit 82c9ae4 ("ipv6: fix restrict IPV6_ADDRFORM operation") UDP/UDPLITE were blocked as the logic changed to: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; if (sk->sk_protocol == IPPROTO_TCP) do_some_check; if (sk->sk_protocol != IPPROTO_TCP) break; Fix it by using Eric's code and simply remove the break in TCP check, which looks like: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; else if (sk->sk_protocol == IPPROTO_TCP) do_some_check; else break; Fixes: 82c9ae4 ("ipv6: fix restrict IPV6_ADDRFORM operation") Signed-off-by: Hangbin Liu <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
…l zones [ Upstream commit 2dc2f76 ] The driver registers three different types of thermal zones: For the ASIC itself, for port modules and for gearboxes. Currently, all three types use the same get_trend() callback which does not work correctly for the ASIC thermal zone. The callback assumes that the device data is of type 'struct mlxsw_thermal_module', whereas for the ASIC thermal zone 'struct mlxsw_thermal' is passed as device data. Fix this by using one get_trend() callback for the ASIC thermal zone and another for the other two types. Fixes: 6f73862 ("mlxsw: core: Add the hottest thermal zone detection") Signed-off-by: Vadim Pasternak <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
[ Upstream commit e8224bf ] found by smatch: drivers/net/net_failover.c:65 net_failover_open() error: we previously assumed 'primary_dev' could be null (see line 43) Fixes: cfc80d9 ("net: Introduce net_failover driver") Signed-off-by: Vasily Averin <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
[ Upstream commit 96aa1b2 ] Tun in IFF_NAPI_FRAGS mode calls napi_gro_frags. Unlike netif_rx and netif_gro_receive, this expects skb->data to point to the mac layer. But skb_probe_transport_header, __skb_get_hash_symmetric, and xdp_do_generic in tun_get_user need skb->data to point to the network header. Flow dissection also needs skb->protocol set, so eth_type_trans has to be called. Ensure the link layer header lies in linear as eth_type_trans pulls ETH_HLEN. Then take the same code paths for frags as for not frags. Push the link layer header back just before calling napi_gro_frags. By pulling up to ETH_HLEN from frag0 into linear, this disables the frag0 optimization in the special case when IFF_NAPI_FRAGS is used with zero length iov[0] (and thus empty skb->linear). Fixes: 90e33d4 ("tun: enable napi_gro_frags() for TUN/TAP driver") Signed-off-by: Willem de Bruijn <[email protected]> Acked-by: Petar Penkov <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
… options [ Upstream commit 53fc685 ] When neighbor suppression is enabled the bridge device might reply to Neighbor Solicitation (NS) messages on behalf of remote hosts. In case the NS message includes the "Source link-layer address" option [1], the bridge device will use the specified address as the link-layer destination address in its reply. To avoid an infinite loop, break out of the options parsing loop when encountering an option with length zero and disregard the NS message. This is consistent with the IPv6 ndisc code and RFC 4886 which states that "Nodes MUST silently discard an ND packet that contains an option with length zero" [2]. [1] https://tools.ietf.org/html/rfc4861#section-4.3 [2] https://tools.ietf.org/html/rfc4861#section-4.6 Fixes: ed842fa ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports") Signed-off-by: Ido Schimmel <[email protected]> Reported-by: Alla Segal <[email protected]> Tested-by: Alla Segal <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
…options [ Upstream commit 8066e6b ] When proxy mode is enabled the vxlan device might reply to Neighbor Solicitation (NS) messages on behalf of remote hosts. In case the NS message includes the "Source link-layer address" option [1], the vxlan device will use the specified address as the link-layer destination address in its reply. To avoid an infinite loop, break out of the options parsing loop when encountering an option with length zero and disregard the NS message. This is consistent with the IPv6 ndisc code and RFC 4886 which states that "Nodes MUST silently discard an ND packet that contains an option with length zero" [2]. [1] https://tools.ietf.org/html/rfc4861#section-4.3 [2] https://tools.ietf.org/html/rfc4861#section-4.6 Fixes: 4b29dba ("vxlan: fix nonfunctional neigh_reduce()") Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 90ceddc upstream. Simplify gen_btf logic to make it work with llvm-objcopy. The existing 'file format' and 'architecture' parsing logic is brittle and does not work with llvm-objcopy/llvm-objdump. 'file format' output of llvm-objdump>=11 will match GNU objdump, but 'architecture' (bfdarch) may not. .BTF in .tmp_vmlinux.btf is non-SHF_ALLOC. Add the SHF_ALLOC flag because it is part of vmlinux image used for introspection. C code can reference the section via linker script defined __start_BTF and __stop_BTF. This fixes a small problem that previous .BTF had the SHF_WRITE flag (objcopy -I binary -O elf* synthesized .data). Additionally, `objcopy -I binary` synthesized symbols _binary__btf_vmlinux_bin_start and _binary__btf_vmlinux_bin_stop (not used elsewhere) are replaced with more commonplace __start_BTF and __stop_BTF. Add 2>/dev/null because GNU objcopy (but not llvm-objcopy) warns "empty loadable segment detected at vaddr=0xffffffff81000000, is this intentional?" We use a dd command to change the e_type field in the ELF header from ET_EXEC to ET_REL so that lld will accept .btf.vmlinux.bin.o. Accepting ET_EXEC as an input file is an extremely rare GNU ld feature that lld does not intend to support, because this is error-prone. The output section description .BTF in include/asm-generic/vmlinux.lds.h avoids potential subtle orphan section placement issues and suppresses --orphan-handling=warn warnings. Fixes: df786c9 ("bpf: Force .BTF section start to zero when dumping from vmlinux") Fixes: cb0cc63 ("powerpc: Include .BTF section") Reported-by: Nathan Chancellor <[email protected]> Signed-off-by: Fangrui Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Tested-by: Stanislav Fomichev <[email protected]> Tested-by: Andrii Nakryiko <[email protected]> Reviewed-by: Stanislav Fomichev <[email protected]> Reviewed-by: Kees Cook <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Acked-by: Michael Ellerman <[email protected]> (powerpc) Link: ClangBuiltLinux#871 Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Maria Teguiani <[email protected]> Tested-by: Matthias Maennich <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 51da9df upstream. ELFNOTE_START allows callers to specify flags for .pushsection assembler directives. All callsites but ELF_NOTE use "a" for SHF_ALLOC. For vdso's that explicitly use ELF_NOTE_START and BUILD_SALT, the same section is specified twice after preprocessing, once with "a" flag, once without. Example: .pushsection .note.Linux, "a", @note ; .pushsection .note.Linux, "", @note ; While GNU as allows this ordering, it warns for the opposite ordering, making these directives position dependent. We'd prefer not to precisely match this behavior in Clang's integrated assembler. Instead, the non __ASSEMBLY__ definition of ELF_NOTE uses __attribute__((section(".note.Linux"))) which is created with SHF_ALLOC, so let's make the __ASSEMBLY__ definition of ELF_NOTE consistent with C and just always use "a" flag. This allows Clang to assemble a working mainline (5.6) kernel via: $ make CC=clang AS=clang Signed-off-by: Nick Desaulniers <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Reviewed-by: Fangrui Song <[email protected]> Cc: Jeremy Fitzhardinge <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vincenzo Frascino <[email protected]> Link: ClangBuiltLinux#913 Link: http://lkml.kernel.org/r/[email protected] Debugged-by: Ilie Halip <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Cc: Jian Cai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
[ Upstream commit 3f8f770 ] MMS345L is another first generation touch screen from Melfas, which uses the same registers as MMS152. However, using I2C_M_NOSTART for it causes errors when reading: i2c i2c-0: sendbytes: NAK bailout. mms114 0-0048: __mms114_read_reg: i2c transfer failed (-5) The driver works fine as soon as I2C_M_NOSTART is removed. Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Stephan Gerhold <[email protected]> Link: https://lore.kernel.org/r/[email protected] [dtor: removed separate mms345l handling, made everyone use standard transfer mode, propagated the 10bit addressing flag to the read part of the transfer as well.] Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 3866f21 ] call_undef_hook() in traps.c applies the same instr_mask for both 16-bit and 32-bit thumb instructions. If instr_mask then is only 16 bits wide (0xffff as opposed to 0xffffffff), the first half-word of 32-bit thumb instructions will be masked out. This makes the function match 32-bit thumb instructions where the second half-word is equal to instr_val, regardless of the first half-word. The result in this case is that all undefined 32-bit thumb instructions with the second half-word equal to 0xde01 (udf Freescale#1) work as breakpoints and will raise a SIGTRAP instead of a SIGILL, instead of just the one intended 16-bit instruction. An example of such an instruction is 0xeaa0de01, which is unallocated according to Arm ARM and should raise a SIGILL, but instead raises a SIGTRAP. This patch fixes the issue by setting all the bits in instr_mask, which will still match the intended 16-bit thumb instruction (where the upper half is always 0), but not any 32-bit thumb instructions. Cc: Oleg Nesterov <[email protected]> Signed-off-by: Fredrik Strupe <[email protected]> Signed-off-by: Russell King <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 18f855e ] Stefano reported a crash with using SQPOLL with io_uring: BUG: kernel NULL pointer dereference, address: 00000000000003b0 CPU: 2 PID: 1307 Comm: io_uring-sq Not tainted 5.7.0-rc7 Freescale#11 RIP: 0010:task_numa_work+0x4f/0x2c0 Call Trace: task_work_run+0x68/0xa0 io_sq_thread+0x252/0x3d0 kthread+0xf9/0x130 ret_from_fork+0x35/0x40 which is task_numa_work() oopsing on current->mm being NULL. The task work is queued by task_tick_numa(), which checks if current->mm is NULL at the time of the call. But this state isn't necessarily persistent, if the kthread is using use_mm() to temporarily adopt the mm of a task. Change the task_tick_numa() check to exclude kernel threads in general, as it doesn't make sense to attempt ot balance for kthreads anyway. Reported-by: Stefano Garzarella <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 642aa86 ] The Lenovo Thinkpad T470s I own has a different touchpad with "LEN007a" instead of the already included PNP ID "LEN006c". However, my touchpad seems to work well without any problems using RMI. So this patch adds the other PNP ID. Signed-off-by: Dennis Kadioglu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit e0bbb53 ] Current implementation could destory a4 & a5 when strace, so we need to get them from pt_regs by SAVE_ALL. Signed-off-by: Guo Ren <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 20be493 ] Fix several issues in the previous gfs2_find_jhead fix: * When updating @blocks_submitted, @block refers to the first block block not submitted yet, not the last block submitted, so fix an off-by-one error. * We want to ensure that @blocks_submitted is far enough ahead of @blocks_read to guarantee that there is in-flight I/O. Otherwise, we'll eventually end up waiting for pages that haven't been submitted, yet. * It's much easier to compare the number of blocks added with the number of blocks submitted to limit the maximum bio size. * Even with bio chaining, we can keep adding blocks until we reach the maximum bio size, as long as we stop at a page boundary. This simplifies the logic. Signed-off-by: Andreas Gruenbacher <[email protected]> Reviewed-by: Bob Peterson <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 7846889 ] VNIC protocol version is reported in big-endian format, but it is not byteswapped before logging. Fix that, and remove version comparison as only one protocol version exists at this time. Signed-off-by: Thomas Falcon <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit a101950 ] Commit 1ca3dec ("powerpc/xive: Prevent page fault issues in the machine crash handler") fixed an issue in the FW assisted dump of machines using hash MMU and the XIVE interrupt mode under the POWER hypervisor. It forced the mapping of the ESB page of interrupts being mapped in the Linux IRQ number space to make sure the 'crash kexec' sequence worked during such an event. But it didn't handle the un-mapping. This mapping is now blocking the removal of a passthrough IO adapter under the POWER hypervisor because it expects the guest OS to have cleared all page table entries related to the adapter. If some are still present, the RTAS call which isolates the PCI slot returns error 9001 "valid outstanding translations". Remove these mapping in the IRQ data cleanup routine. Under KVM, this cleanup is not required because the ESB pages for the adapter interrupts are un-mapped from the guest by the hypervisor in the KVM XIVE native device. This is now redundant but it's harmless. Fixes: 1ca3dec ("powerpc/xive: Prevent page fault issues in the machine crash handler") Cc: [email protected] # v5.5+ Signed-off-by: Cédric Le Goater <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 9aea644 ] Commit 6e0a32d ("spi: dw: Fix default polarity of native chipselect") attempted to fix the problem when GPIO active-high chip-select is utilized to communicate with some SPI slave. It fixed the problem, but broke the normal native CS support. At the same time the reversion commit ada9e3f ("spi: dw: Correct handling of native chipselect") didn't solve the problem either, since it just inverted the set_cs() polarity perception without taking into account that CS-high might be applicable. Here is what is done to finally fix the problem. DW SPI controller demands any native CS being set in order to proceed with data transfer. So in order to activate the SPI communications we must set any bit in the Slave Select DW SPI controller register no matter whether the platform requests the GPIO- or native CS. Preferably it should be the bit corresponding to the SPI slave CS number. But currently the dw_spi_set_cs() method activates the chip-select only if the second argument is false. Since the second argument of the set_cs callback is expected to be a boolean with "is-high" semantics (actual chip-select pin state value), the bit in the DW SPI Slave Select register will be set only if SPI core requests the driver to set the CS in the low state. So this will work for active-low GPIO-based CS case, and won't work for active-high CS setting the bit when SPI core actually needs to deactivate the CS. This commit fixes the problem for all described cases. So no matter whether an SPI slave needs GPIO- or native-based CS with active-high or low signal the corresponding bit will be set in SER. Signed-off-by: Serge Semin <[email protected]> Fixes: ada9e3f ("spi: dw: Correct handling of native chipselect") Fixes: 6e0a32d ("spi: dw: Fix default polarity of native chipselect") Reviewed-by: Charles Keepax <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Acked-by: Linus Walleij <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 450edd2 ] Some devices like TP-Link TL-WN722N produces this kind of messages frequently. kernel: ath: phy0: Short RX data len, dropping (dlen: 4) This warning is useful for developers to recognize that the device (Wi-Fi dongle or USB hub etc) is noisy but not for general users. So this patch make this warning to debug message. Reported-By: Denis <[email protected]> Ref: https://bugzilla.kernel.org/show_bug.cgi?id=207539 Fixes: cd486e6 ("ath9k_htc: Discard undersized packets") Signed-off-by: Masashi Honma <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 00720f0 ] The mix of IS_ENABLED() and #ifdef checks has left a combination that causes a warning about an unused variable: security/smack/smack_lsm.c: In function 'smack_socket_connect': security/smack/smack_lsm.c:2838:24: error: unused variable 'sip' [-Werror=unused-variable] 2838 | struct sockaddr_in6 *sip = (struct sockaddr_in6 *)sap; Change the code to use C-style checks consistently so the compiler can handle it correctly. Fixes: 87fbfff ("broken ping to ipv6 linklocal addresses on debian buster") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Casey Schaufler <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit eb356e6 ] If is_closed is set, and the event list is empty, then read() will return -EIO without blocking. After setting is_closed in ib_uverbs_free_event_queue(), we do trigger a wake_up on the poll_wait, but the fops->poll() function does not check it, so poll will continue to sleep on an empty list. Fixes: 14e23bd ("RDMA/core: Fix locking in ib_uverbs_event_read") Link: https://lore.kernel.org/r/0-v1-ace813388969+48859-uverbs_poll_fix%[email protected] Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 3c2214b ] Removing the pcrypt module triggers this: general protection fault, probably for non-canonical address 0xdead000000000122 CPU: 5 PID: 264 Comm: modprobe Not tainted 5.6.0+ Freescale#2 Hardware name: QEMU Standard PC RIP: 0010:__cpuhp_state_remove_instance+0xcc/0x120 Call Trace: padata_sysfs_release+0x74/0xce kobject_put+0x81/0xd0 padata_free+0x12/0x20 pcrypt_exit+0x43/0x8ee [pcrypt] padata instances wrongly use the same hlist node for the online and dead states, so __padata_free()'s second cpuhp remove call chokes on the node that the first poisoned. cpuhp multi-instance callbacks only walk forward in cpuhp_step->list and the same node is linked in both the online and dead lists, so the list corruption that results from padata_alloc() adding the node to a second list without removing it from the first doesn't cause problems as long as no instances are freed. Avoid the issue by giving each state its own node. Fixes: 894c9ef ("padata: validate cpumask without removed CPU during offline") Signed-off-by: Daniel Jordan <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Steffen Klassert <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] # v5.4+ Signed-off-by: Herbert Xu <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit e1750a3 ] After disabling a function, the original handle is logged instead of the disabled handle. Link: https://lkml.kernel.org/r/[email protected] Fixes: 17cdec9 ("s390/pci: Recover handle in clp_set_pci_fn()") Reviewed-by: Pierre Morel <[email protected]> Signed-off-by: Petr Tesarik <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit e2abfc0 ] Commit 21b5ee5 ("x86/cpu/amd: Enable the fixed Instructions Retired counter IRPERF") mistakenly added erratum torvalds#1054 as an OS Visible Workaround (OSVW) ID 0. Erratum torvalds#1054 is not OSVW ID 0 [1], so make it a legacy erratum. There would never have been a false positive on older hardware that has OSVW bit 0 set, since the IRPERF feature was not available. However, save a couple of RDMSR executions per thread, on modern system configurations that correctly set non-zero values in their OSVW_ID_Length MSRs. [1] Revision Guide for AMD Family 17h Models 00h-0Fh Processors. The revision guide is available from the bugzilla link below. Fixes: 21b5ee5 ("x86/cpu/amd: Enable the fixed Instructions Retired counter IRPERF") Reported-by: Andrew Cooper <[email protected]> Signed-off-by: Kim Phillips <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit d43e267 ] KVM stores the gfn in MMIO SPTEs as a caching optimization. These are split in two parts, as in "[high 11111 low]", to thwart any attempt to use these bits in an L1TF attack. This works as long as there are 5 free bits between MAXPHYADDR and bit 50 (inclusive), leaving bit 51 free so that the MMIO access triggers a reserved-bit-set page fault. The bit positions however were computed wrongly for AMD processors that have encryption support. In this case, x86_phys_bits is reduced (for example from 48 to 43, to account for the C bit at position 47 and four bits used internally to store the SEV ASID and other stuff) while x86_cache_bits in would remain set to 48, and _all_ bits between the reduced MAXPHYADDR and bit 51 are set. Then low_phys_bits would also cover some of the bits that are set in the shadow_mmio_value, terribly confusing the gfn caching mechanism. To fix this, avoid splitting gfns as long as the processor does not have the L1TF bug (which includes all AMD processors). When there is no splitting, low_phys_bits can be set to the reduced MAXPHYADDR removing the overlap. This fixes "npt=0" operation on EPYC processors. Thanks to Maxim Levitsky for bisecting this bug. Cc: [email protected] Fixes: 52918ed ("KVM: SVM: Override default MMIO mask if memory encryption is enabled") Signed-off-by: Paolo Bonzini <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f044baa ] The caller of pcie_wait_for_link_delay() specifies the time to wait after the link becomes active. When the downstream port doesn't support link active reporting, obviously we can't tell when the link becomes active, so we waited the worst-case time (1000 ms) plus 100 ms, ignoring the delay from the caller. Instead, wait for 1000 ms + the delay from the caller. Fixes: 4827d63 ("PCI/PM: Add pcie_wait_for_link_delay()") Signed-off-by: Bjorn Helgaas <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit c6aab66 ] Since the commit 6a13a0d ("ftrace/kprobe: Show the maxactive number on kprobe_events") introduced to show the instance number of kretprobe events, the length of the 1st format of the kprobe event will not 1, but it can be longer. This caused a parser error in perf-probe. Skip the length check the 1st format of the kprobe event to accept this instance number. Without this fix: # perf probe -a vfs_read%return Added new event: probe:vfs_read__return (on vfs_read%return) You can now use it in all perf tools, such as: perf record -e probe:vfs_read__return -aR sleep 1 # perf probe -l Semantic error :Failed to parse event name: r16:probe/vfs_read__return Error: Failed to show event list. And with this fixes: # perf probe -a vfs_read%return ... # perf probe -l probe:vfs_read__return (on vfs_read%return) Fixes: 6a13a0d ("ftrace/kprobe: Show the maxactive number on kprobe_events") Reported-by: Yuxuan Shui <[email protected]> Signed-off-by: Masami Hiramatsu <[email protected]> Tested-by: Yuxuan Shui <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207587 Link: http://lore.kernel.org/lkml/158877535215.26469.1113127926699134067.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit d4eaa28 ] For kvmalloc'ed data object that contains sensitive information like cryptographic keys, we need to make sure that the buffer is always cleared before freeing it. Using memset() alone for buffer clearing may not provide certainty as the compiler may compile it away. To be sure, the special memzero_explicit() has to be used. This patch introduces a new kvfree_sensitive() for freeing those sensitive data objects allocated by kvmalloc(). The relevant places where kvfree_sensitive() can be used are modified to use it. Fixes: 4f08824 ("KEYS: Avoid false positive ENOMEM error on key read") Suggested-by: Linus Torvalds <[email protected]> Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Eric Biggers <[email protected]> Acked-by: David Howells <[email protected]> Cc: Jarkko Sakkinen <[email protected]> Cc: James Morris <[email protected]> Cc: "Serge E. Hallyn" <[email protected]> Cc: Joe Perches <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: David Rientjes <[email protected]> Cc: Uladzislau Rezki <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 0531b03 ] Flower tests used to create ingress filter with specified parent qdisc "parent ffff:" but dump them on "ingress". With recent commit that fixed tcm_parent handling in dump those are not considered same parent anymore, which causes iproute2 tc to emit additional "parent ffff:" in first line of filter dump output. The change in output causes filter match in tests to fail. Prevent parent qdisc output when dumping filters in flower tests by always correctly specifying "ingress" parent both when creating and dumping filters. Fixes: a7df487 ("net_sched: fix tcm_parent in tc filter dump") Signed-off-by: Vlad Buslov <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 2f02fd3 ] The comments in fanotify_group_event_mask() say: "If the event is on dir/child and this mark doesn't care about events on dir/child, don't send it!" Specifically, mount and filesystem marks do not care about events on child, but they can still specify an ignore mask for those events. For example, a group that has: - A mount mark with mask 0 and ignore_mask FAN_OPEN - An inode mark on a directory with mask FAN_OPEN | FAN_OPEN_EXEC with flag FAN_EVENT_ON_CHILD A child file open for exec would be reported to group with the FAN_OPEN event despite the fact that FAN_OPEN is in ignore mask of mount mark, because the mark iteration loop skips over non-inode marks for events on child when calculating the ignore mask. Move ignore mask calculation to the top of the iteration loop block before excluding marks for events on dir/child. Link: https://lore.kernel.org/r/[email protected] Reported-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/linux-fsdevel/[email protected]/ Fixes: 55bf882 "fanotify: fix merging marks masks with FAN_ONDIR" Fixes: b469e7e "fanotify: fix handling of events on child..." Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Jan Kara <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
commit 530f32f upstream. Avi Kivity reports that on fuse filesystems running in a user namespace asyncronous fsync fails with EOVERFLOW. The reason is that f_ops->fsync() is called with the creds of the kthread performing aio work instead of the creds of the process originally submitting IOCB_CMD_FSYNC. Fuse sends the creds of the caller in the request header and it needs to translate the uid and gid into the server's user namespace. Since the kthread is running in init_user_ns, the translation will fail and the operation returns an error. It can be argued that fsync doesn't actually need any creds, but just zeroing out those fields in the header (as with requests that currently don't take creds) is a backward compatibility risk. Instead of working around this issue in fuse, solve the core of the problem by calling the filesystem with the proper creds. Reported-by: Avi Kivity <[email protected]> Tested-by: Giuseppe Scrivano <[email protected]> Fixes: c9582eb ("fuse: Fail all requests with invalid uids or gids") Cc: [email protected] # 4.18+ Signed-off-by: Miklos Szeredi <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit e4ff08a upstream. Write out of slab bounds. We should check epid. The case reported by syzbot: https://lore.kernel.org/linux-usb/[email protected] BUG: KASAN: use-after-free in htc_process_conn_rsp drivers/net/wireless/ath/ath9k/htc_hst.c:131 [inline] BUG: KASAN: use-after-free in ath9k_htc_rx_msg+0xa25/0xaf0 drivers/net/wireless/ath/ath9k/htc_hst.c:443 Write of size 2 at addr ffff8881cea291f0 by task swapper/1/0 Call Trace: htc_process_conn_rsp drivers/net/wireless/ath/ath9k/htc_hst.c:131 [inline] ath9k_htc_rx_msg+0xa25/0xaf0 drivers/net/wireless/ath/ath9k/htc_hst.c:443 ath9k_hif_usb_reg_in_cb+0x1ba/0x630 drivers/net/wireless/ath/ath9k/hif_usb.c:718 __usb_hcd_giveback_urb+0x29a/0x550 drivers/usb/core/hcd.c:1650 usb_hcd_giveback_urb+0x368/0x420 drivers/usb/core/hcd.c:1716 dummy_timer+0x1258/0x32ae drivers/usb/gadget/udc/dummy_hcd.c:1966 call_timer_fn+0x195/0x6f0 kernel/time/timer.c:1404 expire_timers kernel/time/timer.c:1449 [inline] __run_timers kernel/time/timer.c:1773 [inline] __run_timers kernel/time/timer.c:1740 [inline] run_timer_softirq+0x5f9/0x1500 kernel/time/timer.c:1786 Reported-and-tested-by: [email protected] Signed-off-by: Qiujun Huang <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 19d6c37 upstream. Add barrier to accessing the stack array skb_pool. The case reported by syzbot: https://lore.kernel.org/linux-usb/[email protected] BUG: KASAN: stack-out-of-bounds in ath9k_hif_usb_rx_stream drivers/net/wireless/ath/ath9k/hif_usb.c:626 [inline] BUG: KASAN: stack-out-of-bounds in ath9k_hif_usb_rx_cb+0xdf6/0xf70 drivers/net/wireless/ath/ath9k/hif_usb.c:666 Write of size 8 at addr ffff8881db309a28 by task swapper/1/0 Call Trace: ath9k_hif_usb_rx_stream drivers/net/wireless/ath/ath9k/hif_usb.c:626 [inline] ath9k_hif_usb_rx_cb+0xdf6/0xf70 drivers/net/wireless/ath/ath9k/hif_usb.c:666 __usb_hcd_giveback_urb+0x1f2/0x470 drivers/usb/core/hcd.c:1648 usb_hcd_giveback_urb+0x368/0x420 drivers/usb/core/hcd.c:1713 dummy_timer+0x1258/0x32ae drivers/usb/gadget/udc/dummy_hcd.c:1966 call_timer_fn+0x195/0x6f0 kernel/time/timer.c:1404 expire_timers kernel/time/timer.c:1449 [inline] __run_timers kernel/time/timer.c:1773 [inline] __run_timers kernel/time/timer.c:1740 [inline] run_timer_softirq+0x5f9/0x1500 kernel/time/timer.c:1786 Reported-and-tested-by: [email protected] Signed-off-by: Qiujun Huang <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 2bbcaae upstream. In ath9k_hif_usb_rx_cb interface number is assumed to be 0. usb_ifnum_to_if(urb->dev, 0) But it isn't always true. The case reported by syzbot: https://lore.kernel.org/linux-usb/[email protected] usb 2-1: new high-speed USB device number 2 using dummy_hcd usb 2-1: config 1 has an invalid interface number: 2 but max is 0 usb 2-1: config 1 has no interface number 0 usb 2-1: New USB device found, idVendor=0cf3, idProduct=9271, bcdDevice= 1.08 usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3 general protection fault, probably for non-canonical address 0xdffffc0000000015: 0000 [Freescale#1] SMP KASAN KASAN: null-ptr-deref in range [0x00000000000000a8-0x00000000000000af] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.0-rc5-syzkaller #0 Call Trace __usb_hcd_giveback_urb+0x29a/0x550 drivers/usb/core/hcd.c:1650 usb_hcd_giveback_urb+0x368/0x420 drivers/usb/core/hcd.c:1716 dummy_timer+0x1258/0x32ae drivers/usb/gadget/udc/dummy_hcd.c:1966 call_timer_fn+0x195/0x6f0 kernel/time/timer.c:1404 expire_timers kernel/time/timer.c:1449 [inline] __run_timers kernel/time/timer.c:1773 [inline] __run_timers kernel/time/timer.c:1740 [inline] run_timer_softirq+0x5f9/0x1500 kernel/time/timer.c:1786 __do_softirq+0x21e/0x950 kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0x178/0x1a0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:546 [inline] smp_apic_timer_interrupt+0x141/0x540 arch/x86/kernel/apic/apic.c:1146 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829 Reported-and-tested-by: [email protected] Signed-off-by: Qiujun Huang <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 84e99e5 upstream. Add barrier to soob. Return -EOVERFLOW if the buffer is exceeded. Suggested-by: Hillf Danton <[email protected]> Reported-by: [email protected] Signed-off-by: Casey Schaufler <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 0ea2ea4 upstream. We need to keep the reference to the drm_gem_object until the last access by vkms_dumb_create. Therefore, the put the object after it is used. This fixes a use-after-free issue reported by syzbot. While here, change vkms_gem_create() symbol to static. Reported-and-tested-by: [email protected] Signed-off-by: Ezequiel Garcia <[email protected]> Reviewed-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Rodrigo Siqueira <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit dde3c6b upstream. syzkaller reports for memory leak when kobject_init_and_add() returns an error in the function sysfs_slab_add() [1] When this happened, the function kobject_put() is not called for the corresponding kobject, which potentially leads to memory leak. This patch fixes the issue by calling kobject_put() even if kobject_init_and_add() fails. [1] BUG: memory leak unreferenced object 0xffff8880a6d4be88 (size 8): comm "syz-executor.3", pid 946, jiffies 4295772514 (age 18.396s) hex dump (first 8 bytes): 70 69 64 5f 33 00 ff ff pid_3... backtrace: kstrdup+0x35/0x70 mm/util.c:60 kstrdup_const+0x3d/0x50 mm/util.c:82 kvasprintf_const+0x112/0x170 lib/kasprintf.c:48 kobject_set_name_vargs+0x55/0x130 lib/kobject.c:289 kobject_add_varg lib/kobject.c:384 [inline] kobject_init_and_add+0xd8/0x170 lib/kobject.c:473 sysfs_slab_add+0x1d8/0x290 mm/slub.c:5811 __kmem_cache_create+0x50a/0x570 mm/slub.c:4384 create_cache+0x113/0x1e0 mm/slab_common.c:407 kmem_cache_create_usercopy+0x1a1/0x260 mm/slab_common.c:505 kmem_cache_create+0xd/0x10 mm/slab_common.c:564 create_pid_cachep kernel/pid_namespace.c:54 [inline] create_pid_namespace kernel/pid_namespace.c:96 [inline] copy_pid_ns+0x77c/0x8f0 kernel/pid_namespace.c:148 create_new_namespaces+0x26b/0xa30 kernel/nsproxy.c:95 unshare_nsproxy_namespaces+0xa7/0x1e0 kernel/nsproxy.c:229 ksys_unshare+0x3d2/0x770 kernel/fork.c:2969 __do_sys_unshare kernel/fork.c:3037 [inline] __se_sys_unshare kernel/fork.c:3035 [inline] __x64_sys_unshare+0x2d/0x40 kernel/fork.c:3035 do_syscall_64+0xa1/0x530 arch/x86/entry/common.c:295 Fixes: 80da026 ("mm/slub: fix slab double-free in case of duplicate sysfs filename") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wang Hai <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit b1b6575 upstream. If FAT length == 0, the image doesn't have any data. And it can be the cause of overlapping the root dir and FAT entries. Also Windows treats it as invalid format. Reported-by: [email protected] Signed-off-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Marco Elver <[email protected]> Cc: Dmitry Vyukov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 2ed6edd upstream. Under rare circumstances, task_function_call() can repeatedly fail and cause a soft lockup. There is a slight race where the process is no longer running on the cpu we targeted by the time remote_function() runs. The code will simply try again. If we are very unlucky, this will continue to fail, until a watchdog fires. This can happen in a heavily loaded, multi-core virtual machine. Reported-by: [email protected] Signed-off-by: Barret Rhoden <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit f30d3ce upstream. After changing the timing between GTT updates and execution on the GPU, we started seeing sporadic failures on Ironlake. These were narrowed down to being an insufficiently strong enough barrier/delay after updating the GTT and scheduling execution on the GPU. By forcing the uncached read, and adding the missing barrier for the singular insert_page (relocation paths), the sporadic failures go away. Fixes: 983d308 ("agp/intel: Serialise after GTT updates") Fixes: 3497971 ("agp/intel: Flush chipset writes after updating a single PTE") Signed-off-by: Chris Wilson <[email protected]> Acked-by: Andi Shyti <[email protected]> Cc: [email protected] # v4.0+ Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 9253d71 upstream. Clear tuning_done flag while executing tuning to ensure vendor specific HS400 settings are applied properly when the controller is re-initialized in HS400 mode. Without this, re-initialization of the qcom SDHC in HS400 mode fails while resuming the driver from runtime-suspend or system-suspend. Fixes: ff06ce4 ("mmc: sdhci-msm: Add HS400 platform support") Cc: [email protected] Signed-off-by: Veerabhadrarao Badiganti <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit fe8d33b upstream. Turning on CONFIG_DMA_API_DEBUG_SG results in the following warning: WARNING: CPU: 1 PID: 20 at kernel/dma/debug.c:500 add_dma_entry+0x16c/0x17c DMA-API: exceeded 7 overlapping mappings of cacheline 0x031d2645 Modules linked in: CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted 5.5.0-rc2-00021-gdeda30999c2b-dirty Freescale#49 Hardware name: STM32 (Device Tree Support) Workqueue: events_freezable mmc_rescan [<c03138c0>] (unwind_backtrace) from [<c030d760>] (show_stack+0x10/0x14) [<c030d760>] (show_stack) from [<c0f2eb28>] (dump_stack+0xc0/0xd4) [<c0f2eb28>] (dump_stack) from [<c034a14c>] (__warn+0xd0/0xf8) [<c034a14c>] (__warn) from [<c034a530>] (warn_slowpath_fmt+0x94/0xb8) [<c034a530>] (warn_slowpath_fmt) from [<c03bca0c>] (add_dma_entry+0x16c/0x17c) [<c03bca0c>] (add_dma_entry) from [<c03bdf54>] (debug_dma_map_sg+0xe4/0x3d4) [<c03bdf54>] (debug_dma_map_sg) from [<c0d09244>] (sdmmc_idma_prep_data+0x94/0xf8) [<c0d09244>] (sdmmc_idma_prep_data) from [<c0d05a2c>] (mmci_prep_data+0x2c/0xb0) [<c0d05a2c>] (mmci_prep_data) from [<c0d073ec>] (mmci_start_data+0x134/0x2f0) [<c0d073ec>] (mmci_start_data) from [<c0d078d0>] (mmci_request+0xe8/0x154) [<c0d078d0>] (mmci_request) from [<c0cecb44>] (mmc_start_request+0x94/0xbc) DMA api debug brings to light leaking dma-mappings, dma_map_sg and dma_unmap_sg are not correctly balanced. If a request is prepared, the dma_map/unmap are done in asynchronous call pre_req (prep_data) and post_req (unprep_data). In this case the dma-mapping is right balanced. But if the request was not prepared, the data->host_cookie is define to zero and the dma_map/unmap must be done in the request. The dma_map is called by mmci_dma_start (prep_data), but there is no dma_unmap in this case. This patch adds dma_unmap_sg when the dma is finalized and the data cookie is zero (request not prepared). Signed-off-by: Ludovic Barre <[email protected]> Link: https://lore.kernel.org/r/[email protected] Fixes: 46b723d ("mmc: mmci: add stm32 sdmmc variant") Cc: [email protected] Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 4bd7844 upstream. Before calling tmio_mmc_host_probe(), the caller is required to enable clocks for its device, as to make it accessible when reading/writing registers during probe. Therefore, the responsibility to disable these clocks, in the error path of ->probe() and during ->remove(), is better managed outside tmio_mmc_host_remove(). As a matter of fact, callers of tmio_mmc_host_remove() already expects this to be the behaviour. However, there's a problem with tmio_mmc_host_remove() when the Kconfig option, CONFIG_PM, is set. More precisely, tmio_mmc_host_remove() may then disable the clock via runtime PM, which leads to clock enable/disable imbalance problems, when the caller of tmio_mmc_host_remove() also tries to disable the same clocks. To solve the problem, let's make sure tmio_mmc_host_remove() leaves the device with clocks enabled, but also make sure to disable the IRQs, as we normally do at ->runtime_suspend(). Reported-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Wolfram Sang <[email protected]> Tested-by: Wolfram Sang <[email protected]> Signed-off-by: Ulf Hansson <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Tested-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 5d1f42e upstream. Currently, tmio_mmc_irq() handler is registered before the host is fully initialized by tmio_mmc_host_probe(). I did not previously notice this problem. The boot ROM of a new Socionext SoC unmasks interrupts (CTL_IRQ_MASK) somehow. The handler is invoked before tmio_mmc_host_probe(), then emits noisy call trace. Move devm_request_irq() below tmio_mmc_host_probe(). Fixes: 3fd784f ("mmc: uniphier-sd: add UniPhier SD/eMMC controller driver") Signed-off-by: Masahiro Yamada <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit a1af7f3 upstream. Remove non-removable and mmc-ddr-1_8v properties from the sdmmc0 node which come probably from an unchecked copy/paste. Signed-off-by: Ludovic Desroches <[email protected]> Fixes:42ed535595ec "ARM: dts: at91: introduce the sama5d2 ptc ek board" Cc: [email protected] # 4.19 and later Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexandre Belloni <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit f04086c upstream. During some scenarios mmc_sdio_init_card() runs a retry path for the UHS-I specific initialization, which leads to removal of the previously allocated card. A new card is then re-allocated while retrying. However, in one of the corresponding error paths we may end up to remove an already removed card, which likely leads to a NULL pointer exception. So, let's fix this. Fixes: 5fc3d80 ("mmc: sdio: don't use rocr to check if the card could support UHS mode") Cc: <[email protected]> Signed-off-by: Ulf Hansson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit a94a59f upstream. Over the years, the code in mmc_sdio_init_card() has grown to become quite messy. Unfortunate this has also lead to that several paths are leaking memory in form of an allocated struct mmc_card, which includes additional data, such as initialized struct device for example. Unfortunate, it's a too complex task find each offending commit. Therefore, this change fixes all memory leaks at once. Cc: <[email protected]> Signed-off-by: Ulf Hansson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 263c615 upstream. Since the switch of floppy driver to blk-mq, the contended (fdc_busy) case in floppy_queue_rq() is not handled correctly. In case we reach floppy_queue_rq() with fdc_busy set (i.e. with the floppy locked due to another request still being in-flight), we put the request on the list of requests and return BLK_STS_OK to the block core, without actually scheduling delayed work / doing further processing of the request. This means that processing of this request is postponed until another request comes and passess uncontended. Which in some cases might actually never happen and we keep waiting indefinitely. The simple testcase is for i in `seq 1 2000`; do echo -en $i '\r'; blkid --info /dev/fd0 2> /dev/null; done run in quemu. That reliably causes blkid eventually indefinitely hanging in __floppy_read_block_0() waiting for completion, as the BIO callback never happens, and no further IO is ever submitted on the (non-existent) floppy device. This was observed reliably on qemu-emulated device. Fix that by not queuing the request in the contended case, and return BLK_STS_RESOURCE instead, so that blk core handles the request rescheduling and let it pass properly non-contended later. Fixes: a9f38e1 ("floppy: convert to blk-mq") Cc: [email protected] Tested-by: Libor Pechacek <[email protected]> Signed-off-by: Jiri Kosina <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit c8d70a2 upstream. backend_connect() can fail, so switch the device to connected only if no error occurred. Fixes: 0a9c75c ("xen/pvcalls: xenbus state handling") Cc: [email protected] Signed-off-by: Juergen Gross <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Stefano Stabellini <[email protected]> Signed-off-by: Boris Ostrovsky <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 0370964 upstream. On a VHE system, the EL1 state is left in the CPU most of the time, and only syncronized back to memory when vcpu_put() is called (most of the time on preemption). Which means that when injecting an exception, we'd better have a way to either: (1) write directly to the EL1 sysregs (2) synchronize the state back to memory, and do the changes there For an AArch64, we already do (1), so we are safe. Unfortunately, doing the same thing for AArch32 would be pretty invasive. Instead, we can easily implement (2) by calling the put/load architectural backends, and keep preemption disabled. We can then reload the state back into EL1. Cc: [email protected] Reported-by: James Morse <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit ef3e40a upstream. When using the PtrAuth feature in a guest, we need to save the host's keys before allowing the guest to program them. For that, we dump them in a per-CPU data structure (the so called host context). But both call sites that do this are in preemptible context, which may end up in disaster should the vcpu thread get preempted before reentering the guest. Instead, save the keys eagerly on each vcpu_load(). This has an increased overhead, but is at least safe. Cc: [email protected] Reviewed-by: Mark Rutland <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
This is the 5.4.47 stable release Conflicts (manual resolved, upstream revision taken): drivers/firmware/imx/imx-scu.c Upstream commit 0070e73 is merged and taken to resolve conflict, as it provides additional check for size bound condition which solved the issue with corruption of header in imx_scu_ipc_write. Signed-off-by: Andrey Zhizhikin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Aug 16, 2021
[ Upstream commit 6206b79 ] Liajian reported a bug_on hit on a ThunderX2 arm64 server with FastLinQ QL41000 ethernet controller: BUG: scheduling while atomic: kworker/0:4/531/0x00000200 [qed_probe:488()]hw prepare failed kernel BUG at mm/vmalloc.c:2355! Internal error: Oops - BUG: 0 [Freescale#1] SMP CPU: 0 PID: 531 Comm: kworker/0:4 Tainted: G W 5.4.0-77-generic Freescale#86-Ubuntu pstate: 00400009 (nzcv daif +PAN -UAO) Call trace: vunmap+0x4c/0x50 iounmap+0x48/0x58 qed_free_pci+0x60/0x80 [qed] qed_probe+0x35c/0x688 [qed] __qede_probe+0x88/0x5c8 [qede] qede_probe+0x60/0xe0 [qede] local_pci_probe+0x48/0xa0 work_for_cpu_fn+0x24/0x38 process_one_work+0x1d0/0x468 worker_thread+0x238/0x4e0 kthread+0xf0/0x118 ret_from_fork+0x10/0x18 In this case, qed_hw_prepare() returns error due to hw/fw error, but in theory work queue should be in process context instead of interrupt. The root cause might be the unpaired spin_{un}lock_bh() in _qed_mcp_cmd_and_union(), which causes botton half is disabled incorrectly. Reported-by: Lijian Zhang <[email protected]> Signed-off-by: Jia He <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Aug 16, 2021
[ Upstream commit a17ad09 ] In some cases skb head could be locked and entire header data is pulled from skb. When skb_zerocopy() called in such cases, following BUG is triggered. This patch fixes it by copying entire skb in such cases. This could be optimized incase this is performance bottleneck. ---8<--- kernel BUG at net/core/skbuff.c:2961! invalid opcode: 0000 [Freescale#1] SMP PTI CPU: 2 PID: 0 Comm: swapper/2 Tainted: G OE 5.4.0-77-generic Freescale#86-Ubuntu Hardware name: OpenStack Foundation OpenStack Nova, BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:skb_zerocopy+0x37a/0x3a0 RSP: 0018:ffffbcc70013ca38 EFLAGS: 00010246 Call Trace: <IRQ> queue_userspace_packet+0x2af/0x5e0 [openvswitch] ovs_dp_upcall+0x3d/0x60 [openvswitch] ovs_dp_process_packet+0x125/0x150 [openvswitch] ovs_vport_receive+0x77/0xd0 [openvswitch] netdev_port_receive+0x87/0x130 [openvswitch] netdev_frame_hook+0x4b/0x60 [openvswitch] __netif_receive_skb_core+0x2b4/0xc90 __netif_receive_skb_one_core+0x3f/0xa0 __netif_receive_skb+0x18/0x60 process_backlog+0xa9/0x160 net_rx_action+0x142/0x390 __do_softirq+0xe1/0x2d6 irq_exit+0xae/0xb0 do_IRQ+0x5a/0xf0 common_interrupt+0xf/0xf Code that triggered BUG: int skb_zerocopy(struct sk_buff *to, struct sk_buff *from, int len, int hlen) { int i, j = 0; int plen = 0; /* length of skb->head fragment */ int ret; struct page *page; unsigned int offset; BUG_ON(!from->head_frag && !hlen); Signed-off-by: Pravin B Shelar <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
LeBlue
pushed a commit
to LeBlue/linux-fslc
that referenced
this pull request
Jan 20, 2022
[ Upstream commit a17ad09 ] In some cases skb head could be locked and entire header data is pulled from skb. When skb_zerocopy() called in such cases, following BUG is triggered. This patch fixes it by copying entire skb in such cases. This could be optimized incase this is performance bottleneck. ---8<--- kernel BUG at net/core/skbuff.c:2961! invalid opcode: 0000 [Freescale#1] SMP PTI CPU: 2 PID: 0 Comm: swapper/2 Tainted: G OE 5.4.0-77-generic Freescale#86-Ubuntu Hardware name: OpenStack Foundation OpenStack Nova, BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:skb_zerocopy+0x37a/0x3a0 RSP: 0018:ffffbcc70013ca38 EFLAGS: 00010246 Call Trace: <IRQ> queue_userspace_packet+0x2af/0x5e0 [openvswitch] ovs_dp_upcall+0x3d/0x60 [openvswitch] ovs_dp_process_packet+0x125/0x150 [openvswitch] ovs_vport_receive+0x77/0xd0 [openvswitch] netdev_port_receive+0x87/0x130 [openvswitch] netdev_frame_hook+0x4b/0x60 [openvswitch] __netif_receive_skb_core+0x2b4/0xc90 __netif_receive_skb_one_core+0x3f/0xa0 __netif_receive_skb+0x18/0x60 process_backlog+0xa9/0x160 net_rx_action+0x142/0x390 __do_softirq+0xe1/0x2d6 irq_exit+0xae/0xb0 do_IRQ+0x5a/0xf0 common_interrupt+0xf/0xf Code that triggered BUG: int skb_zerocopy(struct sk_buff *to, struct sk_buff *from, int len, int hlen) { int i, j = 0; int plen = 0; /* length of skb->head fragment */ int ret; struct page *page; unsigned int offset; BUG_ON(!from->head_frag && !hlen); Signed-off-by: Pravin B Shelar <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Nov 17, 2022
…ging commit 1fdbed6 upstream. The following bug is reported to be triggered when starting X on x86-32 system with i915: [ 225.777375] kernel BUG at mm/memory.c:2664! [ 225.777391] invalid opcode: 0000 [Freescale#1] PREEMPT SMP [ 225.777405] CPU: 0 PID: 2402 Comm: Xorg Not tainted 6.1.0-rc3-bdg+ Freescale#86 [ 225.777415] Hardware name: /8I865G775-G, BIOS F1 08/29/2006 [ 225.777421] EIP: __apply_to_page_range+0x24d/0x31c [ 225.777437] Code: ff ff 8b 55 e8 8b 45 cc e8 0a 11 ec ff 89 d8 83 c4 28 5b 5e 5f 5d c3 81 7d e0 a0 ef 96 c1 74 ad 8b 45 d0 e8 2d 83 49 00 eb a3 <0f> 0b 25 00 f0 ff ff 81 eb 00 00 00 40 01 c3 8b 45 ec 8b 00 e8 76 [ 225.777446] EAX: 00000001 EBX: c53a3b58 ECX: b5c00000 EDX: c258aa00 [ 225.777454] ESI: b5c00000 EDI: b5900000 EBP: c4b0fdb4 ESP: c4b0fd80 [ 225.777462] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010202 [ 225.777470] CR0: 80050033 CR2: b5900000 CR3: 053a3000 CR4: 000006d0 [ 225.777479] Call Trace: [ 225.777486] ? i915_memcpy_init_early+0x63/0x63 [i915] [ 225.777684] apply_to_page_range+0x21/0x27 [ 225.777694] ? i915_memcpy_init_early+0x63/0x63 [i915] [ 225.777870] remap_io_mapping+0x49/0x75 [i915] [ 225.778046] ? i915_memcpy_init_early+0x63/0x63 [i915] [ 225.778220] ? mutex_unlock+0xb/0xd [ 225.778231] ? i915_vma_pin_fence+0x6d/0xf7 [i915] [ 225.778420] vm_fault_gtt+0x2a9/0x8f1 [i915] [ 225.778644] ? lock_is_held_type+0x56/0xe7 [ 225.778655] ? lock_is_held_type+0x7a/0xe7 [ 225.778663] ? 0xc1000000 [ 225.778670] __do_fault+0x21/0x6a [ 225.778679] handle_mm_fault+0x708/0xb21 [ 225.778686] ? mt_find+0x21e/0x5ae [ 225.778696] exc_page_fault+0x185/0x705 [ 225.778704] ? doublefault_shim+0x127/0x127 [ 225.778715] handle_exception+0x130/0x130 [ 225.778723] EIP: 0xb700468a Recently pud_huge() got aware of non-present entry by commit 3a194f3 ("mm/hugetlb: make pud_huge() and follow_huge_pud() aware of non-present pud entry") to handle some special states of gigantic page. However, it's overlooked that pud_none() always returns false when running with 2-level paging, and as a result pud_huge() can return true pointlessly. Introduce "#if CONFIG_PGTABLE_LEVELS > 2" to pud_huge() to deal with this. Link: https://lkml.kernel.org/r/[email protected] Fixes: 3a194f3 ("mm/hugetlb: make pud_huge() and follow_huge_pud() aware of non-present pud entry") Signed-off-by: Naoya Horiguchi <[email protected]> Reported-by: Ville Syrjälä <[email protected]> Tested-by: Ville Syrjälä <[email protected]> Reviewed-by: Miaohe Lin <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Liu Shixin <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Muchun Song <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Yang Shi <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Jan 20, 2023
commit 031af50 upstream. The inline assembly for arm64's cmpxchg_double*() implementations use a +Q constraint to hazard against other accesses to the memory location being exchanged. However, the pointer passed to the constraint is a pointer to unsigned long, and thus the hazard only applies to the first 8 bytes of the location. GCC can take advantage of this, assuming that other portions of the location are unchanged, leading to a number of potential problems. This is similar to what we fixed back in commit: fee960b ("arm64: xchg: hazard against entire exchange variable") ... but we forgot to adjust cmpxchg_double*() similarly at the same time. The same problem applies, as demonstrated with the following test: | struct big { | u64 lo, hi; | } __aligned(128); | | unsigned long foo(struct big *b) | { | u64 hi_old, hi_new; | | hi_old = b->hi; | cmpxchg_double_local(&b->lo, &b->hi, 0x12, 0x34, 0x56, 0x78); | hi_new = b->hi; | | return hi_old ^ hi_new; | } ... which GCC 12.1.0 compiles as: | 0000000000000000 <foo>: | 0: d503233f paciasp | 4: aa0003e4 mov x4, x0 | 8: 1400000e b 40 <foo+0x40> | c: d2800240 mov x0, #0x12 // Freescale#18 | 10: d2800681 mov x1, #0x34 // Freescale#52 | 14: aa0003e5 mov x5, x0 | 18: aa0103e6 mov x6, x1 | 1c: d2800ac2 mov x2, #0x56 // Freescale#86 | 20: d2800f03 mov x3, #0x78 // Freescale#120 | 24: 48207c82 casp x0, x1, x2, x3, [x4] | 28: ca050000 eor x0, x0, x5 | 2c: ca060021 eor x1, x1, x6 | 30: aa010000 orr x0, x0, x1 | 34: d2800000 mov x0, #0x0 // #0 <--- BANG | 38: d50323bf autiasp | 3c: d65f03c0 ret | 40: d2800240 mov x0, #0x12 // Freescale#18 | 44: d2800681 mov x1, #0x34 // Freescale#52 | 48: d2800ac2 mov x2, #0x56 // Freescale#86 | 4c: d2800f03 mov x3, #0x78 // Freescale#120 | 50: f9800091 prfm pstl1strm, [x4] | 54: c87f1885 ldxp x5, x6, [x4] | 58: ca0000a5 eor x5, x5, x0 | 5c: ca0100c6 eor x6, x6, x1 | 60: aa0600a6 orr x6, x5, x6 | 64: b5000066 cbnz x6, 70 <foo+0x70> | 68: c8250c82 stxp w5, x2, x3, [x4] | 6c: 35ffff45 cbnz w5, 54 <foo+0x54> | 70: d2800000 mov x0, #0x0 // #0 <--- BANG | 74: d50323bf autiasp | 78: d65f03c0 ret Notice that at the lines with "BANG" comments, GCC has assumed that the higher 8 bytes are unchanged by the cmpxchg_double() call, and that `hi_old ^ hi_new` can be reduced to a constant zero, for both LSE and LL/SC versions of cmpxchg_double(). This patch fixes the issue by passing a pointer to __uint128_t into the +Q constraint, ensuring that the compiler hazards against the entire 16 bytes being modified. With this change, GCC 12.1.0 compiles the above test as: | 0000000000000000 <foo>: | 0: f9400407 ldr x7, [x0, Freescale#8] | 4: d503233f paciasp | 8: aa0003e4 mov x4, x0 | c: 1400000f b 48 <foo+0x48> | 10: d2800240 mov x0, #0x12 // Freescale#18 | 14: d2800681 mov x1, #0x34 // Freescale#52 | 18: aa0003e5 mov x5, x0 | 1c: aa0103e6 mov x6, x1 | 20: d2800ac2 mov x2, #0x56 // Freescale#86 | 24: d2800f03 mov x3, #0x78 // Freescale#120 | 28: 48207c82 casp x0, x1, x2, x3, [x4] | 2c: ca050000 eor x0, x0, x5 | 30: ca060021 eor x1, x1, x6 | 34: aa010000 orr x0, x0, x1 | 38: f9400480 ldr x0, [x4, Freescale#8] | 3c: d50323bf autiasp | 40: ca0000e0 eor x0, x7, x0 | 44: d65f03c0 ret | 48: d2800240 mov x0, #0x12 // Freescale#18 | 4c: d2800681 mov x1, #0x34 // Freescale#52 | 50: d2800ac2 mov x2, #0x56 // Freescale#86 | 54: d2800f03 mov x3, #0x78 // Freescale#120 | 58: f9800091 prfm pstl1strm, [x4] | 5c: c87f1885 ldxp x5, x6, [x4] | 60: ca0000a5 eor x5, x5, x0 | 64: ca0100c6 eor x6, x6, x1 | 68: aa0600a6 orr x6, x5, x6 | 6c: b5000066 cbnz x6, 78 <foo+0x78> | 70: c8250c82 stxp w5, x2, x3, [x4] | 74: 35ffff45 cbnz w5, 5c <foo+0x5c> | 78: f9400480 ldr x0, [x4, Freescale#8] | 7c: d50323bf autiasp | 80: ca0000e0 eor x0, x7, x0 | 84: d65f03c0 ret ... sampling the high 8 bytes before and after the cmpxchg, and performing an EOR, as we'd expect. For backporting, I've tested this atop linux-4.9.y with GCC 5.5.0. Note that linux-4.9.y is oldest currently supported stable release, and mandates GCC 5.1+. Unfortunately I couldn't get a GCC 5.1 binary to run on my machines due to library incompatibilities. I've also used a standalone test to check that we can use a __uint128_t pointer in a +Q constraint at least as far back as GCC 4.8.5 and LLVM 3.9.1. Fixes: 5284e1b ("arm64: xchg: Implement cmpxchg_double") Fixes: e9a4b79 ("arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU") Reported-by: Boqun Feng <[email protected]> Link: https://lore.kernel.org/lkml/Y6DEfQXymYVgL3oJ@boqun-archlinux/ Reported-by: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Mark Rutland <[email protected]> Cc: [email protected] Cc: Arnd Bergmann <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Steve Capper <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Jan 20, 2023
[ Upstream commit 031af50 ] The inline assembly for arm64's cmpxchg_double*() implementations use a +Q constraint to hazard against other accesses to the memory location being exchanged. However, the pointer passed to the constraint is a pointer to unsigned long, and thus the hazard only applies to the first 8 bytes of the location. GCC can take advantage of this, assuming that other portions of the location are unchanged, leading to a number of potential problems. This is similar to what we fixed back in commit: fee960b ("arm64: xchg: hazard against entire exchange variable") ... but we forgot to adjust cmpxchg_double*() similarly at the same time. The same problem applies, as demonstrated with the following test: | struct big { | u64 lo, hi; | } __aligned(128); | | unsigned long foo(struct big *b) | { | u64 hi_old, hi_new; | | hi_old = b->hi; | cmpxchg_double_local(&b->lo, &b->hi, 0x12, 0x34, 0x56, 0x78); | hi_new = b->hi; | | return hi_old ^ hi_new; | } ... which GCC 12.1.0 compiles as: | 0000000000000000 <foo>: | 0: d503233f paciasp | 4: aa0003e4 mov x4, x0 | 8: 1400000e b 40 <foo+0x40> | c: d2800240 mov x0, #0x12 // Freescale#18 | 10: d2800681 mov x1, #0x34 // Freescale#52 | 14: aa0003e5 mov x5, x0 | 18: aa0103e6 mov x6, x1 | 1c: d2800ac2 mov x2, #0x56 // Freescale#86 | 20: d2800f03 mov x3, #0x78 // Freescale#120 | 24: 48207c82 casp x0, x1, x2, x3, [x4] | 28: ca050000 eor x0, x0, x5 | 2c: ca060021 eor x1, x1, x6 | 30: aa010000 orr x0, x0, x1 | 34: d2800000 mov x0, #0x0 // #0 <--- BANG | 38: d50323bf autiasp | 3c: d65f03c0 ret | 40: d2800240 mov x0, #0x12 // Freescale#18 | 44: d2800681 mov x1, #0x34 // Freescale#52 | 48: d2800ac2 mov x2, #0x56 // Freescale#86 | 4c: d2800f03 mov x3, #0x78 // Freescale#120 | 50: f9800091 prfm pstl1strm, [x4] | 54: c87f1885 ldxp x5, x6, [x4] | 58: ca0000a5 eor x5, x5, x0 | 5c: ca0100c6 eor x6, x6, x1 | 60: aa0600a6 orr x6, x5, x6 | 64: b5000066 cbnz x6, 70 <foo+0x70> | 68: c8250c82 stxp w5, x2, x3, [x4] | 6c: 35ffff45 cbnz w5, 54 <foo+0x54> | 70: d2800000 mov x0, #0x0 // #0 <--- BANG | 74: d50323bf autiasp | 78: d65f03c0 ret Notice that at the lines with "BANG" comments, GCC has assumed that the higher 8 bytes are unchanged by the cmpxchg_double() call, and that `hi_old ^ hi_new` can be reduced to a constant zero, for both LSE and LL/SC versions of cmpxchg_double(). This patch fixes the issue by passing a pointer to __uint128_t into the +Q constraint, ensuring that the compiler hazards against the entire 16 bytes being modified. With this change, GCC 12.1.0 compiles the above test as: | 0000000000000000 <foo>: | 0: f9400407 ldr x7, [x0, Freescale#8] | 4: d503233f paciasp | 8: aa0003e4 mov x4, x0 | c: 1400000f b 48 <foo+0x48> | 10: d2800240 mov x0, #0x12 // Freescale#18 | 14: d2800681 mov x1, #0x34 // Freescale#52 | 18: aa0003e5 mov x5, x0 | 1c: aa0103e6 mov x6, x1 | 20: d2800ac2 mov x2, #0x56 // Freescale#86 | 24: d2800f03 mov x3, #0x78 // Freescale#120 | 28: 48207c82 casp x0, x1, x2, x3, [x4] | 2c: ca050000 eor x0, x0, x5 | 30: ca060021 eor x1, x1, x6 | 34: aa010000 orr x0, x0, x1 | 38: f9400480 ldr x0, [x4, Freescale#8] | 3c: d50323bf autiasp | 40: ca0000e0 eor x0, x7, x0 | 44: d65f03c0 ret | 48: d2800240 mov x0, #0x12 // Freescale#18 | 4c: d2800681 mov x1, #0x34 // Freescale#52 | 50: d2800ac2 mov x2, #0x56 // Freescale#86 | 54: d2800f03 mov x3, #0x78 // Freescale#120 | 58: f9800091 prfm pstl1strm, [x4] | 5c: c87f1885 ldxp x5, x6, [x4] | 60: ca0000a5 eor x5, x5, x0 | 64: ca0100c6 eor x6, x6, x1 | 68: aa0600a6 orr x6, x5, x6 | 6c: b5000066 cbnz x6, 78 <foo+0x78> | 70: c8250c82 stxp w5, x2, x3, [x4] | 74: 35ffff45 cbnz w5, 5c <foo+0x5c> | 78: f9400480 ldr x0, [x4, Freescale#8] | 7c: d50323bf autiasp | 80: ca0000e0 eor x0, x7, x0 | 84: d65f03c0 ret ... sampling the high 8 bytes before and after the cmpxchg, and performing an EOR, as we'd expect. For backporting, I've tested this atop linux-4.9.y with GCC 5.5.0. Note that linux-4.9.y is oldest currently supported stable release, and mandates GCC 5.1+. Unfortunately I couldn't get a GCC 5.1 binary to run on my machines due to library incompatibilities. I've also used a standalone test to check that we can use a __uint128_t pointer in a +Q constraint at least as far back as GCC 4.8.5 and LLVM 3.9.1. Fixes: 5284e1b ("arm64: xchg: Implement cmpxchg_double") Fixes: e9a4b79 ("arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU") Reported-by: Boqun Feng <[email protected]> Link: https://lore.kernel.org/lkml/Y6DEfQXymYVgL3oJ@boqun-archlinux/ Reported-by: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Mark Rutland <[email protected]> Cc: [email protected] Cc: Arnd Bergmann <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Steve Capper <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Jan 20, 2023
[ Upstream commit 031af50 ] The inline assembly for arm64's cmpxchg_double*() implementations use a +Q constraint to hazard against other accesses to the memory location being exchanged. However, the pointer passed to the constraint is a pointer to unsigned long, and thus the hazard only applies to the first 8 bytes of the location. GCC can take advantage of this, assuming that other portions of the location are unchanged, leading to a number of potential problems. This is similar to what we fixed back in commit: fee960b ("arm64: xchg: hazard against entire exchange variable") ... but we forgot to adjust cmpxchg_double*() similarly at the same time. The same problem applies, as demonstrated with the following test: | struct big { | u64 lo, hi; | } __aligned(128); | | unsigned long foo(struct big *b) | { | u64 hi_old, hi_new; | | hi_old = b->hi; | cmpxchg_double_local(&b->lo, &b->hi, 0x12, 0x34, 0x56, 0x78); | hi_new = b->hi; | | return hi_old ^ hi_new; | } ... which GCC 12.1.0 compiles as: | 0000000000000000 <foo>: | 0: d503233f paciasp | 4: aa0003e4 mov x4, x0 | 8: 1400000e b 40 <foo+0x40> | c: d2800240 mov x0, #0x12 // Freescale#18 | 10: d2800681 mov x1, #0x34 // Freescale#52 | 14: aa0003e5 mov x5, x0 | 18: aa0103e6 mov x6, x1 | 1c: d2800ac2 mov x2, #0x56 // Freescale#86 | 20: d2800f03 mov x3, #0x78 // Freescale#120 | 24: 48207c82 casp x0, x1, x2, x3, [x4] | 28: ca050000 eor x0, x0, x5 | 2c: ca060021 eor x1, x1, x6 | 30: aa010000 orr x0, x0, x1 | 34: d2800000 mov x0, #0x0 // #0 <--- BANG | 38: d50323bf autiasp | 3c: d65f03c0 ret | 40: d2800240 mov x0, #0x12 // Freescale#18 | 44: d2800681 mov x1, #0x34 // Freescale#52 | 48: d2800ac2 mov x2, #0x56 // Freescale#86 | 4c: d2800f03 mov x3, #0x78 // Freescale#120 | 50: f9800091 prfm pstl1strm, [x4] | 54: c87f1885 ldxp x5, x6, [x4] | 58: ca0000a5 eor x5, x5, x0 | 5c: ca0100c6 eor x6, x6, x1 | 60: aa0600a6 orr x6, x5, x6 | 64: b5000066 cbnz x6, 70 <foo+0x70> | 68: c8250c82 stxp w5, x2, x3, [x4] | 6c: 35ffff45 cbnz w5, 54 <foo+0x54> | 70: d2800000 mov x0, #0x0 // #0 <--- BANG | 74: d50323bf autiasp | 78: d65f03c0 ret Notice that at the lines with "BANG" comments, GCC has assumed that the higher 8 bytes are unchanged by the cmpxchg_double() call, and that `hi_old ^ hi_new` can be reduced to a constant zero, for both LSE and LL/SC versions of cmpxchg_double(). This patch fixes the issue by passing a pointer to __uint128_t into the +Q constraint, ensuring that the compiler hazards against the entire 16 bytes being modified. With this change, GCC 12.1.0 compiles the above test as: | 0000000000000000 <foo>: | 0: f9400407 ldr x7, [x0, Freescale#8] | 4: d503233f paciasp | 8: aa0003e4 mov x4, x0 | c: 1400000f b 48 <foo+0x48> | 10: d2800240 mov x0, #0x12 // Freescale#18 | 14: d2800681 mov x1, #0x34 // Freescale#52 | 18: aa0003e5 mov x5, x0 | 1c: aa0103e6 mov x6, x1 | 20: d2800ac2 mov x2, #0x56 // Freescale#86 | 24: d2800f03 mov x3, #0x78 // Freescale#120 | 28: 48207c82 casp x0, x1, x2, x3, [x4] | 2c: ca050000 eor x0, x0, x5 | 30: ca060021 eor x1, x1, x6 | 34: aa010000 orr x0, x0, x1 | 38: f9400480 ldr x0, [x4, Freescale#8] | 3c: d50323bf autiasp | 40: ca0000e0 eor x0, x7, x0 | 44: d65f03c0 ret | 48: d2800240 mov x0, #0x12 // Freescale#18 | 4c: d2800681 mov x1, #0x34 // Freescale#52 | 50: d2800ac2 mov x2, #0x56 // Freescale#86 | 54: d2800f03 mov x3, #0x78 // Freescale#120 | 58: f9800091 prfm pstl1strm, [x4] | 5c: c87f1885 ldxp x5, x6, [x4] | 60: ca0000a5 eor x5, x5, x0 | 64: ca0100c6 eor x6, x6, x1 | 68: aa0600a6 orr x6, x5, x6 | 6c: b5000066 cbnz x6, 78 <foo+0x78> | 70: c8250c82 stxp w5, x2, x3, [x4] | 74: 35ffff45 cbnz w5, 5c <foo+0x5c> | 78: f9400480 ldr x0, [x4, Freescale#8] | 7c: d50323bf autiasp | 80: ca0000e0 eor x0, x7, x0 | 84: d65f03c0 ret ... sampling the high 8 bytes before and after the cmpxchg, and performing an EOR, as we'd expect. For backporting, I've tested this atop linux-4.9.y with GCC 5.5.0. Note that linux-4.9.y is oldest currently supported stable release, and mandates GCC 5.1+. Unfortunately I couldn't get a GCC 5.1 binary to run on my machines due to library incompatibilities. I've also used a standalone test to check that we can use a __uint128_t pointer in a +Q constraint at least as far back as GCC 4.8.5 and LLVM 3.9.1. Fixes: 5284e1b ("arm64: xchg: Implement cmpxchg_double") Fixes: e9a4b79 ("arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU") Reported-by: Boqun Feng <[email protected]> Link: https://lore.kernel.org/lkml/Y6DEfQXymYVgL3oJ@boqun-archlinux/ Reported-by: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Mark Rutland <[email protected]> Cc: [email protected] Cc: Arnd Bergmann <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Steve Capper <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Notes to v5.4.47 tag upgrade:
Conflicts (manual resolved, upstream revision taken):
Upstream commit [0070e73] is merged and taken to resolve conflict, as it provides additional check for size bound condition which solved the issue with corruption of header in
imx_scu_ipc_write
.Build and boot tested on imx8mmevk, result - Pass.
-- andrey