forked from torvalds/linux
-
Notifications
You must be signed in to change notification settings - Fork 246
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update 5.4.x+fslc to v5.4.39 #69
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
commit a5bff92 upstream. The uapi is the same on 32 and 64 bit, but the number isn't. Everyone who botched this please re-read: https://www.kernel.org/doc/html/v5.4-preprc-cpu/ioctl/botching-up-ioctls.html Also, the type argument for the ioctl macros is for the type the void __user *arg pointer points at, which in this case would be the variable-sized char[] of a 0 terminated string. So this was botched in more than just the usual ways. Cc: Sumit Semwal <[email protected]> Cc: Chenbo Feng <[email protected]> Cc: Greg Hackmann <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Martin Liu <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Tested-by: Martin Liu <[email protected]> Reviewed-by: Martin Liu <[email protected]> Signed-off-by: Sumit Semwal <[email protected]> [sumits: updated some checkpatch fixes, corrected author email] Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 6292b8e upstream. The DispID DTD pixel clock is documented as: "00 00 00 h → FF FF FF h | Pixel clock ÷ 10,000 0.01 → 167,772.16 Mega Pixels per Sec" Which seems to imply that we to add one to the raw value. Reality seems to agree as there are tiled displays in the wild which currently show a 10kHz difference in the pixel clock between the tiles (one tile gets its mode from the base EDID, the other from the DispID block). Cc: [email protected] References: https://gitlab.freedesktop.org/drm/intel/-/issues/27 Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Manasi Navare <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 87b7ebc upstream. [why] We have seen a green screen after resume from suspend in a Raven system connected with two displays (HDMI and DP) on X based system. We noticed that this issue is related to bad DCC metadata from user space which may generate hangs and consequently an underflow on HUBP. After taking a deep look at the code path we realized that after resume we try to restore the commit with the DCC enabled framebuffer but the framebuffer is no longer valid. [how] This problem was only reported on Raven based system and after suspend, for this reason, this commit adds a new parameter on fill_plane_dcc_attributes() to give the option of disabling DCC programmatically. In summary, for disabling DCC we first verify if is a Raven system and if it is in suspend; if both conditions are true we disable DCC temporarily, otherwise, it is enabled. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1099 Co-developed-by: Nicholas Kazlauskas <[email protected]> Signed-off-by: Nicholas Kazlauskas <[email protected]> Signed-off-by: Rodrigo Siqueira <[email protected]> Reviewed-by: Nicholas Kazlauskas <[email protected]> Acked-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 85e9b88 upstream. ret should be changed to release allocated struct qxl_release Cc: [email protected] Fixes: 8002db6 ("qxl: convert qxl driver to proper use for reservations") Signed-off-by: Vasily Averin <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Gerd Hoffmann <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit a65aa9c upstream. Cc: [email protected] Fixes: 8002db6 ("qxl: convert qxl driver to proper use for reservations") Signed-off-by: Vasily Averin <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Gerd Hoffmann <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 933db73 upstream. qxl_release should not be accesses after qxl_push_*_ring_release() calls: userspace driver can process submitted command quickly, move qxl_release into release_ring, generate interrupt and trigger garbage collector. It can lead to crashes in qxl driver or trigger memory corruption in some kmalloc-192 slab object Gerd Hoffmann proposes to swap the qxl_release_fence_buffer_objects() + qxl_push_{cursor,command}_ring_release() calls to close that race window. cc: [email protected] Fixes: f64122c ("drm: add new QXL driver. (v1.4)") Signed-off-by: Vasily Averin <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Gerd Hoffmann <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit dff5853 upstream. Currently, if the client sends BIND_CONN_TO_SESSION with NFS4_CDFC4_FORE_OR_BOTH but only gets NFS4_CDFS4_FORE back it ignores that it wasn't able to enable a backchannel. To make sure, the client sends BIND_CONN_TO_SESSION as the first operation on the connections (ie., no other session compounds haven't been sent before), and if the client's request to bind the backchannel is not satisfied, then reset the connection and retry. Cc: [email protected] Signed-off-by: Olga Kornievskaia <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 1402d17 upstream. btrfs_recover_relocation() invokes btrfs_join_transaction(), which joins a btrfs_trans_handle object into transactions and returns a reference of it with increased refcount to "trans". When btrfs_recover_relocation() returns, "trans" becomes invalid, so the refcount should be decreased to keep refcount balanced. The reference counting issue happens in one exception handling path of btrfs_recover_relocation(). When read_fs_root() failed, the refcnt increased by btrfs_join_transaction() is not decreased, causing a refcnt leak. Fix this issue by calling btrfs_end_transaction() on this error path when read_fs_root() failed. Fixes: 79787ea ("btrfs: replace many BUG_ONs with proper error handling") CC: [email protected] # 4.4+ Reviewed-by: Filipe Manana <[email protected]> Signed-off-by: Xiyu Yang <[email protected]> Signed-off-by: Xin Tan <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit f6033c5 upstream. btrfs_remove_block_group() invokes btrfs_lookup_block_group(), which returns a local reference of the block group that contains the given bytenr to "block_group" with increased refcount. When btrfs_remove_block_group() returns, "block_group" becomes invalid, so the refcount should be decreased to keep refcount balanced. The reference counting issue happens in several exception handling paths of btrfs_remove_block_group(). When those error scenarios occur such as btrfs_alloc_path() returns NULL, the function forgets to decrease its refcnt increased by btrfs_lookup_block_group() and will cause a refcnt leak. Fix this issue by jumping to "out_put_group" label and calling btrfs_put_block_group() when those error scenarios occur. CC: [email protected] # 4.4+ Signed-off-by: Xiyu Yang <[email protected]> Signed-off-by: Xin Tan <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit f135cea upstream. When we have an inode with a prealloc extent that starts at an offset lower than the i_size and there is another prealloc extent that starts at an offset beyond i_size, we can end up losing part of the first prealloc extent (the part that starts at i_size) and have an implicit hole if we fsync the file and then have a power failure. Consider the following example with comments explaining how and why it happens. $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt # Create our test file with 2 consecutive prealloc extents, each with a # size of 128Kb, and covering the range from 0 to 256Kb, with a file # size of 0. $ xfs_io -f -c "falloc -k 0 128K" /mnt/foo $ xfs_io -c "falloc -k 128K 128K" /mnt/foo # Fsync the file to record both extents in the log tree. $ xfs_io -c "fsync" /mnt/foo # Now do a redudant extent allocation for the range from 0 to 64Kb. # This will merely increase the file size from 0 to 64Kb. Instead we # could also do a truncate to set the file size to 64Kb. $ xfs_io -c "falloc 0 64K" /mnt/foo # Fsync the file, so we update the inode item in the log tree with the # new file size (64Kb). This also ends up setting the number of bytes # for the first prealloc extent to 64Kb. This is done by the truncation # at btrfs_log_prealloc_extents(). # This means that if a power failure happens after this, a write into # the file range 64Kb to 128Kb will not use the prealloc extent and # will result in allocation of a new extent. $ xfs_io -c "fsync" /mnt/foo # Now set the file size to 256K with a truncate and then fsync the file. # Since no changes happened to the extents, the fsync only updates the # i_size in the inode item at the log tree. This results in an implicit # hole for the file range from 64Kb to 128Kb, something which fsck will # complain when not using the NO_HOLES feature if we replay the log # after a power failure. $ xfs_io -c "truncate 256K" -c "fsync" /mnt/foo So instead of always truncating the log to the inode's current i_size at btrfs_log_prealloc_extents(), check first if there's a prealloc extent that starts at an offset lower than the i_size and with a length that crosses the i_size - if there is one, just make sure we truncate to a size that corresponds to the end offset of that prealloc extent, so that we don't lose the part of that extent that starts at i_size if a power failure happens. A test case for fstests follows soon. Fixes: 31d11b8 ("Btrfs: fix duplicate extents after fsync of file with prealloc extents") CC: [email protected] # 4.14+ Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
…f fs_info::journal_info commit fcc9973 upstream. [BUG] One run of btrfs/063 triggered the following lockdep warning: ============================================ WARNING: possible recursive locking detected 5.6.0-rc7-custom+ Freescale#48 Not tainted -------------------------------------------- kworker/u24:0/7 is trying to acquire lock: ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs] but task is already holding lock: ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(sb_internal#2); lock(sb_internal#2); *** DEADLOCK *** May be due to missing lock nesting notation 4 locks held by kworker/u24:0/7: #0: ffff88817b495948 ((wq_completion)btrfs-endio-write){+.+.}, at: process_one_work+0x557/0xb80 Freescale#1: ffff888189ea7db8 ((work_completion)(&work->normal_work)){+.+.}, at: process_one_work+0x557/0xb80 Freescale#2: ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs] Freescale#3: ffff888174ca4da8 (&fs_info->reloc_mutex){+.+.}, at: btrfs_record_root_in_trans+0x83/0xd0 [btrfs] stack backtrace: CPU: 0 PID: 7 Comm: kworker/u24:0 Not tainted 5.6.0-rc7-custom+ Freescale#48 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Workqueue: btrfs-endio-write btrfs_work_helper [btrfs] Call Trace: dump_stack+0xc2/0x11a __lock_acquire.cold+0xce/0x214 lock_acquire+0xe6/0x210 __sb_start_write+0x14e/0x290 start_transaction+0x66c/0x890 [btrfs] btrfs_join_transaction+0x1d/0x20 [btrfs] find_free_extent+0x1504/0x1a50 [btrfs] btrfs_reserve_extent+0xd5/0x1f0 [btrfs] btrfs_alloc_tree_block+0x1ac/0x570 [btrfs] btrfs_copy_root+0x213/0x580 [btrfs] create_reloc_root+0x3bd/0x470 [btrfs] btrfs_init_reloc_root+0x2d2/0x310 [btrfs] record_root_in_trans+0x191/0x1d0 [btrfs] btrfs_record_root_in_trans+0x90/0xd0 [btrfs] start_transaction+0x16e/0x890 [btrfs] btrfs_join_transaction+0x1d/0x20 [btrfs] btrfs_finish_ordered_io+0x55d/0xcd0 [btrfs] finish_ordered_fn+0x15/0x20 [btrfs] btrfs_work_helper+0x116/0x9a0 [btrfs] process_one_work+0x632/0xb80 worker_thread+0x80/0x690 kthread+0x1a3/0x1f0 ret_from_fork+0x27/0x50 It's pretty hard to reproduce, only one hit so far. [CAUSE] This is because we're calling btrfs_join_transaction() without re-using the current running one: btrfs_finish_ordered_io() |- btrfs_join_transaction() <<< Call Freescale#1 |- btrfs_record_root_in_trans() |- btrfs_reserve_extent() |- btrfs_join_transaction() <<< Call Freescale#2 Normally such btrfs_join_transaction() call should re-use the existing one, without trying to re-start a transaction. But the problem is, in btrfs_join_transaction() call Freescale#1, we call btrfs_record_root_in_trans() before initializing current::journal_info. And in btrfs_join_transaction() call Freescale#2, we're relying on current::journal_info to avoid such deadlock. [FIX] Call btrfs_record_root_in_trans() after we have initialized current::journal_info. CC: [email protected] # 4.4+ Signed-off-by: Qu Wenruo <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
…out loop commit b1ac62a upstream. Open-coding a timeout loop invariably leads to errors with handling the timeout properly in one corner case or another. In the case of cqhci we might report "CQE stuck on" even if it wasn't stuck on. You'd just need this sequence of events to happen in cqhci_off(): 1. Call ktime_get(). 2. Something happens to interrupt the CPU for > 100 us (context switch or interrupt). 3. Check time and; set "timed_out" to true since > 100 us. 4. Read CQHCI_CTL. 5. Both "reg & CQHCI_HALT" and "timed_out" are true, so break. 6. Since "timed_out" is true, falsely print the error message. Rather than fixing the polling loop, use readx_poll_timeout() like many people do. This has been time tested to handle the corner cases. Fixes: a408022 ("mmc: cqhci: support for command queue enabled host") Signed-off-by: Douglas Anderson <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/20200413162717.1.Idece266f5c8793193b57a1ddb1066d030c6af8e0@changeid Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit bb32e19 upstream. For some reason the Host Control2 register of the Xenon SDHCI controller sometimes reports the bit representing 1.8V signaling as 0 when read after it was written as 1. Subsequent read reports 1. This causes the sdhci_start_signal_voltage_switch function to report 1.8V regulator output did not become stable When CONFIG_PM is enabled, the host is suspended and resumend many times, and in each resume the switch to 1.8V is called, and so the kernel log reports this message annoyingly often. Do an empty read of the Host Control2 register in Xenon's .voltage_switch method to circumvent this. This patch fixes this particular problem on Turris MOX. Signed-off-by: Marek Behún <[email protected]> Fixes: 8d876bf ("mmc: sdhci-xenon: wait 5ms after set 1.8V...") Cc: [email protected] # v4.16+ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 1a8eb6b upstream. BIOS writers have begun the practice of setting 40 ohm eMMC driver strength even though the eMMC may not support it, on the assumption that the kernel will validate the value against the eMMC (Extended CSD DRIVER_STRENGTH [offset 197]) and revert to the default 50 ohm value if 40 ohm is invalid. This is done to avoid changing the value for different boards. Putting aside the merits of this approach, it is clear the eMMC's mask of supported driver strengths is more reliable than the value provided by BIOS. Add validation accordingly. Signed-off-by: Adrian Hunter <[email protected]> Fixes: 51ced59 ("mmc: sdhci-pci: Use ACPI DSM to get driver strength for some Intel devices") Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 9d8cb58 upstream. MSM sd host controller is capable of HW busy detection of device busy signaling over DAT0 line. And it requires the R1B response for commands that have this response associated with them. So set the below two host capabilities for qcom SDHC. - MMC_CAP_WAIT_WHILE_BUSY - MMC_CAP_NEED_RSP_BUSY Recent development of the mmc core in regards to this, revealed this as being a potential bug, hence the stable tag. Cc: <[email protected]> # v4.19+ Signed-off-by: Veerabhadrarao Badiganti <[email protected]> Acked-by: Adrian Hunter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit e53b868 upstream. The Meson SDIO controller uses the DAT0 lane for hardware busy detection. Set MMC_CAP_WAIT_WHILE_BUSY accordingly. This fixes the following error observed with Linux 5.7 (pre-rc-1): mmc1: Card stuck being busy! __mmc_poll_for_busy blk_update_request: I/O error, dev mmcblk1, sector 17111080 op 0x3:(DISCARD) flags 0x0 phys_seg 1 prio class 0 Fixes: ed80a13 ("mmc: meson-mx-sdio: Add a driver for the Amlogic Meson8 and Meson8b SoCs") Signed-off-by: Martin Blumenstingl <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit ddca109 upstream. The recent commit 0d84c3e ("mmc: core: Convert to mmc_poll_for_busy() for erase/trim/discard") makes use of the ->card_busy() op for SD cards. This uncovered that the ->card_busy() op in the Meson SDIO driver was never working right: while polling the busy status with ->card_busy() meson_mx_mmc_card_busy() reads only one of the two MESON_MX_SDIO_IRQC register values 0x1f001f10 or 0x1f003f10. This translates to "three out of four DAT lines are HIGH" and "all four DAT lines are HIGH", which is interpreted as "the card is busy". It turns out that no situation can be observed where all four DAT lines are LOW, meaning the card is not busy anymore. Upon further research the 3.10 vendor driver for this controller does not implement the ->card_busy() op. Remove the ->card_busy() op from the meson-mx-sdio driver since it is not working. At the time of writing this patch it is not clear what's needed to make the ->card_busy() implementation work with this specific controller hardware. For all use-cases which have previously worked the MMC_CAP_WAIT_WHILE_BUSY flag is now taking over, even if we don't have a ->card_busy() op anymore. Fixes: ed80a13 ("mmc: meson-mx-sdio: Add a driver for the Amlogic Meson8 and Meson8b SoCs") Signed-off-by: Martin Blumenstingl <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 55b3209 upstream. For skcipher algorithms, the input, output HW S/G tables look like this: [IV, src][dst, IV] Now, we can have 2 conditions here: - there is no IV; - src and dst are equal (in-place encryption) and scattered and the error is an "off-by-one" in the HW S/G table. This issue was seen with KASAN: BUG: KASAN: slab-out-of-bounds in skcipher_edesc_alloc+0x95c/0x1018 Read of size 4 at addr ffff000022a02958 by task cryptomgr_test/321 CPU: 2 PID: 321 Comm: cryptomgr_test Not tainted 5.6.0-rc1-00165-ge4ef8383-dirty Freescale#4 Hardware name: LS1046A RDB Board (DT) Call trace: dump_backtrace+0x0/0x260 show_stack+0x14/0x20 dump_stack+0xe8/0x144 print_address_description.isra.11+0x64/0x348 __kasan_report+0x11c/0x230 kasan_report+0xc/0x18 __asan_load4+0x90/0xb0 skcipher_edesc_alloc+0x95c/0x1018 skcipher_encrypt+0x84/0x150 crypto_skcipher_encrypt+0x50/0x68 test_skcipher_vec_cfg+0x4d4/0xc10 test_skcipher_vec+0x178/0x1d8 alg_test_skcipher+0xec/0x230 alg_test.part.44+0x114/0x4a0 alg_test+0x1c/0x60 cryptomgr_test+0x34/0x58 kthread+0x1b8/0x1c0 ret_from_fork+0x10/0x18 Allocated by task 321: save_stack+0x24/0xb0 __kasan_kmalloc.isra.10+0xc4/0xe0 kasan_kmalloc+0xc/0x18 __kmalloc+0x178/0x2b8 skcipher_edesc_alloc+0x21c/0x1018 skcipher_encrypt+0x84/0x150 crypto_skcipher_encrypt+0x50/0x68 test_skcipher_vec_cfg+0x4d4/0xc10 test_skcipher_vec+0x178/0x1d8 alg_test_skcipher+0xec/0x230 alg_test.part.44+0x114/0x4a0 alg_test+0x1c/0x60 cryptomgr_test+0x34/0x58 kthread+0x1b8/0x1c0 ret_from_fork+0x10/0x18 Freed by task 0: (stack is not available) The buggy address belongs to the object at ffff000022a02800 which belongs to the cache dma-kmalloc-512 of size 512 The buggy address is located 344 bytes inside of 512-byte region [ffff000022a02800, ffff000022a02a00) The buggy address belongs to the page: page:fffffe00006a8000 refcount:1 mapcount:0 mapping:ffff00093200c400 index:0x0 compound_mapcount: 0 flags: 0xffff00000010200(slab|head) raw: 0ffff00000010200 dead000000000100 dead000000000122 ffff00093200c400 raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff000022a02800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff000022a02880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff000022a02900: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc ^ ffff000022a02980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff000022a02a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Fixes: 334d37c ("crypto: caam - update IV using HW support") Cc: <[email protected]> # v5.3+ Signed-off-by: Iuliana Prodan <[email protected]> Reviewed-by: Horia Geantă <[email protected]> Signed-off-by: Herbert Xu <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit ef0b320 upstream. This new Lenovo ThinkCenter has two front mics which can't be handled by PA so far, so apply the fixup ALC283_FIXUP_HEADSET_MIC to change the location for one of the mics. Cc: <[email protected]> Signed-off-by: Hui Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 547d2c9 upstream. The USB vendor ID of NuPrime DAC-10 is not 16b0 but 16d0. Fixes: f656891 ("ALSA: usb-audio: add more quirks for DSD interfaces") Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit a2f6472 upstream. Fix the following coccicheck warning: sound/pci/hda/patch_hdmi.c:1852:2-8: preceding lock on line 1846 After add sanity check to pass klockwork check, The spdif_mutex should be unlock before return true in check_non_pcm_per_cvt(). Fixes: 960a581 ("ALSA: hda: fix some klockwork scan warnings") Signed-off-by: Wu Bo <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit cc18b2f upstream. Apparently interface 1 is control interface akin to HD500X, setting LINE6_CAP_CONTROL and choosing it as ctrl_if fixes audio playback on POD HD500. Signed-off-by: Vasily Khoruzhick <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 4285de0 upstream. The checks of the plugin buffer overflow in the previous fix by commit f2ecf90 ("ALSA: pcm: oss: Avoid plugin buffer overflow") are put in the wrong places mistakenly, which leads to the expected (repeated) sound when the rate plugin is involved. Fix in the right places. Also, at those right places, the zero check is needed for the termination node, so added there as well, and let's get it done, finally. Fixes: f2ecf90 ("ALSA: pcm: oss: Avoid plugin buffer overflow") Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit ac2b081 upstream. The problem is that we dereference "privdata->pci_dev" when we print the error messages in amd_mp2_pci_init(): dev_err(ndev_dev(privdata), "Failed to enable MP2 PCI device\n"); ^^^^^^^^^^^^^^^^^ Fixes: 529766e ("i2c: Add drivers for the AMD PCIe MP2 I2C controller") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Wolfram Sang <[email protected]> Cc: [email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 1a06d01 upstream. Before the hibernation patchset (e.g. f53335e), in a Generation-2 Linux VM on Hyper-V, the user can run "echo freeze > /sys/power/state" to freeze the system, i.e. Suspend-to-Idle. The user can press the keyboard or move the mouse to wake up the VM. With the hibernation patchset, Linux VM on Hyper-V can hibernate to disk, but Suspend-to-Idle is broken: when the synthetic keyboard/mouse are suspended, there is no way to wake up the VM. Fix the issue by not suspending and resuming the vmbus devices upon Suspend-to-Idle. Fixes: f53335e ("Drivers: hv: vmbus: Suspend/resume the vmbus itself for hibernation") Cc: [email protected] Reviewed-by: Michael Kelley <[email protected]> Signed-off-by: Dexuan Cui <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Wei Liu <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 3815f1b upstream. 'count' is how much you want written, not the final position. Moreover, it can legitimately be less than the current position... Cc: [email protected] Signed-off-by: Al Viro <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 47c370c upstream. The commit below modified rvt_create_mmap_info() to return ERR_PTR's but didn't update the callers to handle them. Modify rvt_create_mmap_info() to only return ERR_PTR and fix all error checking after rvt_create_mmap_info() was called. Fixes: ff23dfa ("IB: Pass only ib_udata in function prototypes") Link: https://lore.kernel.org/r/[email protected] Cc: [email protected] [5.4+] Tested-by: Mike Marciniszyn <[email protected]> Acked-by: Mike Marciniszyn <[email protected]> Signed-off-by: Sudip Mukherjee <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit a9b760b upstream. Transitioned power state logged at the end of setting ACPI power. However, D3cold won't be in the message because state can only be D3hot at most. Use target_state to corretly report when power state is D3cold. Cc: All applicable <[email protected]> Signed-off-by: Kai-Heng Feng <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 2351f8d upstream. Currently the kernel threads are not frozen in software_resume(), so between dpm_suspend_start(PMSG_QUIESCE) and resume_target_kernel(), system_freezable_power_efficient_wq can still try to submit SCSI commands and this can cause a panic since the low level SCSI driver (e.g. hv_storvsc) has quiesced the SCSI adapter and can not accept any SCSI commands: https://lkml.org/lkml/2020/4/10/47 At first I posted a fix (https://lkml.org/lkml/2020/4/21/1318) trying to resolve the issue from hv_storvsc, but with the help of Bart Van Assche, I realized it's better to fix software_resume(), since this looks like a generic issue, not only pertaining to SCSI. Cc: All applicable <[email protected]> Signed-off-by: Dexuan Cui <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit ad4e80a upstream. The error correction data is computed as if data and hash blocks were concatenated. But hash block number starts from v->hash_start. So, we have to calculate hash block number based on that. Fixes: a739ff3 ("dm verity: add support for forward error correction") Cc: [email protected] Signed-off-by: Sunwook Eom <[email protected]> Reviewed-by: Sami Tolvanen <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit b74aa02 upstream. Currently, system fails to boot because the legacy interrupt remapping mode does not enable 128-bit IRTE (GA), which is required for x2APIC support. Fix by using AMD_IOMMU_GUEST_IR_LEGACY_GA mode when booting with kernel option amd_iommu_intr=legacy instead. The initialization logic will check GASup and automatically fallback to using AMD_IOMMU_GUEST_IR_LEGACY if GA mode is not supported. Fixes: 3928aa3 ("iommu/amd: Detect and enable guest vAPIC support") Signed-off-by: Suravee Suthikulpanit <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit c926c87 upstream. In AST2600 there have a slow peripheral bus between CPU and i2c controller. Therefore GIC i2c interrupt status clear have delay timing, when CPU issue write clear i2c controller interrupt status. To avoid this issue, the driver need have read after write clear at i2c ISR. Fixes: f327c68 ("i2c: aspeed: added driver for Aspeed I2C") Signed-off-by: ryan_chen <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> [wsa: added Fixes tag] Signed-off-by: Wolfram Sang <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 5ce0076 upstream. gcc-10 points out a few instances of suspicious integer arithmetic leading to value truncation: sound/isa/opti9xx/opti92x-ad1848.c: In function 'snd_opti9xx_configure': sound/isa/opti9xx/opti92x-ad1848.c:322:43: error: overflow in conversion from 'int' to 'unsigned char' changes value from '(int)snd_opti9xx_read(chip, 3) & -256 | 240' to '240' [-Werror=overflow] 322 | (snd_opti9xx_read(chip, reg) & ~(mask)) | ((value) & (mask))) | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~ sound/isa/opti9xx/opti92x-ad1848.c:351:3: note: in expansion of macro 'snd_opti9xx_write_mask' 351 | snd_opti9xx_write_mask(chip, OPTi9XX_MC_REG(3), 0xf0, 0xff); | ^~~~~~~~~~~~~~~~~~~~~~ sound/isa/opti9xx/miro.c: In function 'snd_miro_configure': sound/isa/opti9xx/miro.c:873:40: error: overflow in conversion from 'int' to 'unsigned char' changes value from '(int)snd_miro_read(chip, 3) & -256 | 240' to '240' [-Werror=overflow] 873 | (snd_miro_read(chip, reg) & ~(mask)) | ((value) & (mask))) | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~ sound/isa/opti9xx/miro.c:1010:3: note: in expansion of macro 'snd_miro_write_mask' 1010 | snd_miro_write_mask(chip, OPTi9XX_MC_REG(3), 0xf0, 0xff); | ^~~~~~~~~~~~~~~~~~~ These are all harmless here as only the low 8 bit are passed down anyway. Change the macros to inline functions to make the code more readable and also avoid the warning. Strictly speaking those functions also need locking to make the read/write pair atomic, but it seems unlikely that anyone would still run into that issue. Fixes: 1841f61 ("[ALSA] Add snd-miro driver") Signed-off-by: Arnd Bergmann <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit dd7bc81 upstream. Commit 6fcf0c7, a fix to get_tree_bdev() put a missing blkdev_put() in the wrong place, before a warnf() that displays the bdev under consideration rather after it. This results in a silent lockup in printk("%pg") called via warnf() from get_tree_bdev() under some circumstances when there's a race with the blockdev being frozen. This can be caused by xfstests/tests/generic/085 in combination with Lukas Czerner's ext4 mount API conversion patchset. It looks like it ought to occur with other users of get_tree_bdev() such as XFS, but apparently doesn't. Fix this by switching the order of the lines. Fixes: 6fcf0c7 ("vfs: add missing blkdev_put() in get_tree_bdev()") Reported-by: Lukas Czerner <[email protected]> Signed-off-by: David Howells <[email protected]> cc: Ian Kent <[email protected]> cc: Al Viro <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 132be62 upstream. When jumping to the out_put_disk label, we will call put_disk(), which will trigger a call to disk_release(), which calls blk_put_queue(). Later in the cleanup code, we do blk_cleanup_queue(), which will also call blk_put_queue(). Putting the queue twice is incorrect, and will generate a KASAN splat. Set the disk->queue pointer to NULL, before calling put_disk(), so that the first call to blk_put_queue() will not free the queue. The second call to blk_put_queue() uses another pointer to the same queue, so this call will still free the queue. Fixes: 85136c0 ("lightnvm: simplify geometry enumeration") Signed-off-by: Niklas Cassel <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 7648f93 upstream. nfs3_set_acl keeps track of the acl it allocated locally to determine if an acl needs to be released at the end. This results in a memory leak when the function allocates an acl as well as a default acl. Fix by releasing acls that differ from the acl originally passed into nfs3_set_acl. Fixes: b7fa055 ("[PATCH] NFS: Add support for NFSv3 ACLs") Reported-by: Xiyu Yang <[email protected]> Signed-off-by: Andreas Gruenbacher <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit b9f9602 upstream. Under some circumstances, i.e. when test is still running and about to time out and user runs, for example, grep -H . /sys/module/dmatest/parameters/* the iterations parameter is not respected and test is going on and on until user gives echo 0 > /sys/module/dmatest/parameters/run This is not what expected. The history of this bug is interesting. I though that the commit 2d88ce7 ("dmatest: add a 'wait' parameter") is a culprit, but looking closer to the code I think it simple revealed the broken logic from the day one, i.e. in the commit 0a2ff57 ("dmaengine: dmatest: add a maximum number of test iterations") which adds iterations parameter. So, to the point, the conditional of checking the thread to be stopped being first part of conjunction logic prevents to check iterations. Thus, we have to always check both conditions to be able to stop after given iterations. Since it wasn't visible before second commit appeared, I add a respective Fixes tag. Fixes: 2d88ce7 ("dmatest: add a 'wait' parameter") Cc: Dan Williams <[email protected]> Cc: Nicolas Ferre <[email protected]> Signed-off-by: Andy Shevchenko <[email protected]> Acked-by: Nicolas Ferre <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Vinod Koul <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit aa72f1d upstream. If we do % echo 1 > /sys/module/dmatest/parameters/run [ 115.851124] dmatest: Could not start test, no channels configured % echo dma8chan7 > /sys/module/dmatest/parameters/channel [ 127.563872] dmatest: Added 1 threads using dma8chan7 % cat /sys/module/dmatest/parameters/wait ... !!! HANG !!! ... The culprit is the commit 6138f96 ("dmaengine: dmatest: Use fixed point div to calculate iops") which makes threads not to run, but pending and being kicked off by writing to the 'run' node. However, it forgot to consider 'wait' routine to avoid above mentioned case. In order to fix this, check for really running threads, i.e. with pending and done flags unset. It's pity the culprit commit hadn't updated documentation and tested all possible scenarios. Fixes: 6138f96 ("dmaengine: dmatest: Use fixed point div to calculate iops") Cc: Seraj Alijan <[email protected]> Signed-off-by: Andy Shevchenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Vinod Koul <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 1578e5d upstream. On arm64 linux gcc uses -fasynchronous-unwind-tables -funwind-tables by default since gcc-8, so now the de facto platform ABI is to allow unwinding from async signal handlers. However on bare metal targets (aarch64-none-elf), and on old gcc, async and sync unwind tables are not enabled by default to avoid runtime memory costs. This means if linux is built with a baremetal toolchain the vdso.so may not have unwind tables which breaks the gcc platform ABI guarantee in userspace. Add -fasynchronous-unwind-tables explicitly to the vgettimeofday.o cflags to address the ABI change. Fixes: 28b1a82 ("arm64: vdso: Substitute gettimeofday() with C implementation") Cc: Will Deacon <[email protected]> Reported-by: Szabolcs Nagy <[email protected]> Signed-off-by: Vincenzo Frascino <[email protected]> Signed-off-by: Catalin Marinas <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit fb73974 upstream. Fix the SELinux netlink_send hook to properly handle multiple netlink messages in a single sk_buff; each message is parsed and subject to SELinux access control. Prior to this patch, SELinux only inspected the first message in the sk_buff. Cc: [email protected] Reported-by: Dmitry Vyukov <[email protected]> Reviewed-by: Stephen Smalley <[email protected]> Signed-off-by: Paul Moore <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
This is the 5.4.39 stable release Signed-off-by: Andrey Zhizhikin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Oct 29, 2020
[ Upstream commit bad60b8 ] The idx in __ath10k_htt_rx_ring_fill_n function lives in consistent dma region writable by the device. Malfunctional or malicious device could manipulate such idx to have a OOB write. Either by htt->rx_ring.netbufs_ring[idx] = skb; or by ath10k_htt_set_paddrs_ring(htt, paddr, idx); The idx can also be negative as it's signed, giving a large memory space to write to. It's possibly exploitable by corruptting a legit pointer with a skb pointer. And then fill skb with payload as rougue object. Part of the log here. Sometimes it appears as UAF when writing to a freed memory by chance. [ 15.594376] BUG: unable to handle page fault for address: ffff887f5c1804f0 [ 15.595483] #PF: supervisor write access in kernel mode [ 15.596250] #PF: error_code(0x0002) - not-present page [ 15.597013] PGD 0 P4D 0 [ 15.597395] Oops: 0002 [Freescale#1] SMP KASAN PTI [ 15.597967] CPU: 0 PID: 82 Comm: kworker/u2:2 Not tainted 5.6.0 Freescale#69 [ 15.598843] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 [ 15.600438] Workqueue: ath10k_wq ath10k_core_register_work [ath10k_core] [ 15.601389] RIP: 0010:__ath10k_htt_rx_ring_fill_n (linux/drivers/net/wireless/ath/ath10k/htt_rx.c:173) ath10k_core Signed-off-by: Zekun Shen <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
LeBlue
pushed a commit
to LeBlue/linux-fslc
that referenced
this pull request
May 1, 2021
[ Upstream commit bad60b8 ] The idx in __ath10k_htt_rx_ring_fill_n function lives in consistent dma region writable by the device. Malfunctional or malicious device could manipulate such idx to have a OOB write. Either by htt->rx_ring.netbufs_ring[idx] = skb; or by ath10k_htt_set_paddrs_ring(htt, paddr, idx); The idx can also be negative as it's signed, giving a large memory space to write to. It's possibly exploitable by corruptting a legit pointer with a skb pointer. And then fill skb with payload as rougue object. Part of the log here. Sometimes it appears as UAF when writing to a freed memory by chance. [ 15.594376] BUG: unable to handle page fault for address: ffff887f5c1804f0 [ 15.595483] #PF: supervisor write access in kernel mode [ 15.596250] #PF: error_code(0x0002) - not-present page [ 15.597013] PGD 0 P4D 0 [ 15.597395] Oops: 0002 [Freescale#1] SMP KASAN PTI [ 15.597967] CPU: 0 PID: 82 Comm: kworker/u2:2 Not tainted 5.6.0 Freescale#69 [ 15.598843] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 [ 15.600438] Workqueue: ath10k_wq ath10k_core_register_work [ath10k_core] [ 15.601389] RIP: 0010:__ath10k_htt_rx_ring_fill_n (linux/drivers/net/wireless/ath/ath10k/htt_rx.c:173) ath10k_core Signed-off-by: Zekun Shen <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
May 14, 2021
[ Upstream commit b34ea31 ] In mtk_iommu_runtime_resume always enable the clk, even if m4u_dom is null. Otherwise the 'suspend' cb might disable the clk which is already disabled causing the warning: [ 1.586104] infra_m4u already disabled [ 1.586133] WARNING: CPU: 0 PID: 121 at drivers/clk/clk.c:952 clk_core_disable+0xb0/0xb8 [ 1.594391] mtk-iommu 10205000.iommu: bound 18001000.larb (ops mtk_smi_larb_component_ops) [ 1.598108] Modules linked in: [ 1.598114] CPU: 0 PID: 121 Comm: kworker/0:2 Not tainted 5.12.0-rc5 Freescale#69 [ 1.609246] mtk-iommu 10205000.iommu: bound 14027000.larb (ops mtk_smi_larb_component_ops) [ 1.617487] Hardware name: Google Elm (DT) [ 1.617491] Workqueue: pm pm_runtime_work [ 1.620545] mtk-iommu 10205000.iommu: bound 19001000.larb (ops mtk_smi_larb_component_ops) [ 1.627229] pstate: 60000085 (nZCv daIf -PAN -UAO -TCO BTYPE=--) [ 1.659297] pc : clk_core_disable+0xb0/0xb8 [ 1.663475] lr : clk_core_disable+0xb0/0xb8 [ 1.667652] sp : ffff800011b9bbe0 [ 1.670959] x29: ffff800011b9bbe0 x28: 0000000000000000 [ 1.676267] x27: ffff800011448000 x26: ffff8000100cfd98 [ 1.681574] x25: ffff800011b9bd48 x24: 0000000000000000 [ 1.686882] x23: 0000000000000000 x22: ffff8000106fad90 [ 1.692189] x21: 000000000000000a x20: ffff0000c0048500 [ 1.697496] x19: ffff0000c0048500 x18: ffffffffffffffff [ 1.702804] x17: 0000000000000000 x16: 0000000000000000 [ 1.708112] x15: ffff800011460300 x14: fffffffffffe0000 [ 1.713420] x13: ffff8000114602d8 x12: 0720072007200720 [ 1.718727] x11: 0720072007200720 x10: 0720072007200720 [ 1.724035] x9 : ffff800011b9bbe0 x8 : ffff800011b9bbe0 [ 1.729342] x7 : 0000000000000009 x6 : ffff8000114b8328 [ 1.734649] x5 : 0000000000000000 x4 : 0000000000000000 [ 1.739956] x3 : 00000000ffffffff x2 : ffff800011460298 [ 1.745263] x1 : 1af1d7de276f4500 x0 : 0000000000000000 [ 1.750572] Call trace: [ 1.753010] clk_core_disable+0xb0/0xb8 [ 1.756840] clk_core_disable_lock+0x24/0x40 [ 1.761105] clk_disable+0x20/0x30 [ 1.764501] mtk_iommu_runtime_suspend+0x88/0xa8 [ 1.769114] pm_generic_runtime_suspend+0x2c/0x48 [ 1.773815] __rpm_callback+0xe0/0x178 [ 1.777559] rpm_callback+0x24/0x88 [ 1.781041] rpm_suspend+0xdc/0x470 [ 1.784523] rpm_idle+0x12c/0x170 [ 1.787831] pm_runtime_work+0xa8/0xc0 [ 1.791573] process_one_work+0x1e8/0x360 [ 1.795580] worker_thread+0x44/0x478 [ 1.799237] kthread+0x150/0x158 [ 1.802460] ret_from_fork+0x10/0x30 [ 1.806034] ---[ end trace 82402920ef64573b ]--- [ 1.810728] ------------[ cut here ]------------ In addition, we now don't need to enable the clock from the function mtk_iommu_hw_init since it is already enabled by the resume. Fixes: c0b5758 ("iommu/mediatek: Add power-domain operation") Signed-off-by: Dafna Hirschfeld <[email protected]> Reviewed-by: Yong Wu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Sep 16, 2021
[ Upstream commit fe4d557 ] lockdep complains that in omap-aes, the list_lock is taken both with softirqs enabled at probe time, and also in softirq context, which could lead to a deadlock: ================================ WARNING: inconsistent lock state 5.14.0-rc1-00035-gc836005b01c5-dirty Freescale#69 Not tainted -------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. ksoftirqd/0/7 [HC0[0]:SC1[3]:HE1:SE0] takes: bf00e014 (list_lock){+.?.}-{2:2}, at: omap_aes_find_dev+0x18/0x54 [omap_aes_driver] {SOFTIRQ-ON-W} state was registered at: _raw_spin_lock+0x40/0x50 omap_aes_probe+0x1d4/0x664 [omap_aes_driver] platform_probe+0x58/0xb8 really_probe+0xbc/0x314 __driver_probe_device+0x80/0xe4 driver_probe_device+0x30/0xc8 __driver_attach+0x70/0xf4 bus_for_each_dev+0x70/0xb4 bus_add_driver+0xf0/0x1d4 driver_register+0x74/0x108 do_one_initcall+0x84/0x2e4 do_init_module+0x5c/0x240 load_module+0x221c/0x2584 sys_finit_module+0xb0/0xec ret_fast_syscall+0x0/0x2c 0xbed90b30 irq event stamp: 111800 hardirqs last enabled at (111800): [<c02a21e4>] __kmalloc+0x484/0x5ec hardirqs last disabled at (111799): [<c02a21f0>] __kmalloc+0x490/0x5ec softirqs last enabled at (111776): [<c01015f0>] __do_softirq+0x2b8/0x4d0 softirqs last disabled at (111781): [<c0135948>] run_ksoftirqd+0x34/0x50 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(list_lock); <Interrupt> lock(list_lock); *** DEADLOCK *** 2 locks held by ksoftirqd/0/7: #0: c0f5e8c8 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0x6c/0x260 Freescale#1: c0f5e8c8 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x2c/0xdc stack backtrace: CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 5.14.0-rc1-00035-gc836005b01c5-dirty Freescale#69 Hardware name: Generic AM43 (Flattened Device Tree) [<c010e6e0>] (unwind_backtrace) from [<c010b9d0>] (show_stack+0x10/0x14) [<c010b9d0>] (show_stack) from [<c017c640>] (mark_lock.part.17+0x5bc/0xd04) [<c017c640>] (mark_lock.part.17) from [<c017d9e4>] (__lock_acquire+0x960/0x2fa4) [<c017d9e4>] (__lock_acquire) from [<c0180980>] (lock_acquire+0x10c/0x358) [<c0180980>] (lock_acquire) from [<c093d324>] (_raw_spin_lock_bh+0x44/0x58) [<c093d324>] (_raw_spin_lock_bh) from [<bf00b258>] (omap_aes_find_dev+0x18/0x54 [omap_aes_driver]) [<bf00b258>] (omap_aes_find_dev [omap_aes_driver]) from [<bf00b328>] (omap_aes_crypt+0x94/0xd4 [omap_aes_driver]) [<bf00b328>] (omap_aes_crypt [omap_aes_driver]) from [<c08ac6d0>] (esp_input+0x1b0/0x2c8) [<c08ac6d0>] (esp_input) from [<c08c9e90>] (xfrm_input+0x410/0x1290) [<c08c9e90>] (xfrm_input) from [<c08b6374>] (xfrm4_esp_rcv+0x54/0x11c) [<c08b6374>] (xfrm4_esp_rcv) from [<c0838840>] (ip_protocol_deliver_rcu+0x48/0x3bc) [<c0838840>] (ip_protocol_deliver_rcu) from [<c0838c50>] (ip_local_deliver_finish+0x9c/0xdc) [<c0838c50>] (ip_local_deliver_finish) from [<c0838dd8>] (ip_local_deliver+0x148/0x1b0) [<c0838dd8>] (ip_local_deliver) from [<c0838f5c>] (ip_rcv+0x11c/0x180) [<c0838f5c>] (ip_rcv) from [<c077e3a4>] (__netif_receive_skb_one_core+0x54/0x74) [<c077e3a4>] (__netif_receive_skb_one_core) from [<c077e588>] (netif_receive_skb+0xa8/0x260) [<c077e588>] (netif_receive_skb) from [<c068d6d4>] (cpsw_rx_handler+0x224/0x2fc) [<c068d6d4>] (cpsw_rx_handler) from [<c0688ccc>] (__cpdma_chan_process+0xf4/0x188) [<c0688ccc>] (__cpdma_chan_process) from [<c068a0c0>] (cpdma_chan_process+0x3c/0x5c) [<c068a0c0>] (cpdma_chan_process) from [<c0690e14>] (cpsw_rx_mq_poll+0x44/0x98) [<c0690e14>] (cpsw_rx_mq_poll) from [<c0780810>] (__napi_poll+0x28/0x268) [<c0780810>] (__napi_poll) from [<c0780c64>] (net_rx_action+0xcc/0x204) [<c0780c64>] (net_rx_action) from [<c0101478>] (__do_softirq+0x140/0x4d0) [<c0101478>] (__do_softirq) from [<c0135948>] (run_ksoftirqd+0x34/0x50) [<c0135948>] (run_ksoftirqd) from [<c01583b8>] (smpboot_thread_fn+0xf4/0x1d8) [<c01583b8>] (smpboot_thread_fn) from [<c01546dc>] (kthread+0x14c/0x174) [<c01546dc>] (kthread) from [<c010013c>] (ret_from_fork+0x14/0x38) ... The omap-des and omap-sham drivers appear to have a similar issue. Fix this by using spin_{,un}lock_bh() around device list access in all the probe and remove functions. Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: Herbert Xu <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Sep 16, 2021
[ Upstream commit fe4d557 ] lockdep complains that in omap-aes, the list_lock is taken both with softirqs enabled at probe time, and also in softirq context, which could lead to a deadlock: ================================ WARNING: inconsistent lock state 5.14.0-rc1-00035-gc836005b01c5-dirty Freescale#69 Not tainted -------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. ksoftirqd/0/7 [HC0[0]:SC1[3]:HE1:SE0] takes: bf00e014 (list_lock){+.?.}-{2:2}, at: omap_aes_find_dev+0x18/0x54 [omap_aes_driver] {SOFTIRQ-ON-W} state was registered at: _raw_spin_lock+0x40/0x50 omap_aes_probe+0x1d4/0x664 [omap_aes_driver] platform_probe+0x58/0xb8 really_probe+0xbc/0x314 __driver_probe_device+0x80/0xe4 driver_probe_device+0x30/0xc8 __driver_attach+0x70/0xf4 bus_for_each_dev+0x70/0xb4 bus_add_driver+0xf0/0x1d4 driver_register+0x74/0x108 do_one_initcall+0x84/0x2e4 do_init_module+0x5c/0x240 load_module+0x221c/0x2584 sys_finit_module+0xb0/0xec ret_fast_syscall+0x0/0x2c 0xbed90b30 irq event stamp: 111800 hardirqs last enabled at (111800): [<c02a21e4>] __kmalloc+0x484/0x5ec hardirqs last disabled at (111799): [<c02a21f0>] __kmalloc+0x490/0x5ec softirqs last enabled at (111776): [<c01015f0>] __do_softirq+0x2b8/0x4d0 softirqs last disabled at (111781): [<c0135948>] run_ksoftirqd+0x34/0x50 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(list_lock); <Interrupt> lock(list_lock); *** DEADLOCK *** 2 locks held by ksoftirqd/0/7: #0: c0f5e8c8 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0x6c/0x260 Freescale#1: c0f5e8c8 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x2c/0xdc stack backtrace: CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 5.14.0-rc1-00035-gc836005b01c5-dirty Freescale#69 Hardware name: Generic AM43 (Flattened Device Tree) [<c010e6e0>] (unwind_backtrace) from [<c010b9d0>] (show_stack+0x10/0x14) [<c010b9d0>] (show_stack) from [<c017c640>] (mark_lock.part.17+0x5bc/0xd04) [<c017c640>] (mark_lock.part.17) from [<c017d9e4>] (__lock_acquire+0x960/0x2fa4) [<c017d9e4>] (__lock_acquire) from [<c0180980>] (lock_acquire+0x10c/0x358) [<c0180980>] (lock_acquire) from [<c093d324>] (_raw_spin_lock_bh+0x44/0x58) [<c093d324>] (_raw_spin_lock_bh) from [<bf00b258>] (omap_aes_find_dev+0x18/0x54 [omap_aes_driver]) [<bf00b258>] (omap_aes_find_dev [omap_aes_driver]) from [<bf00b328>] (omap_aes_crypt+0x94/0xd4 [omap_aes_driver]) [<bf00b328>] (omap_aes_crypt [omap_aes_driver]) from [<c08ac6d0>] (esp_input+0x1b0/0x2c8) [<c08ac6d0>] (esp_input) from [<c08c9e90>] (xfrm_input+0x410/0x1290) [<c08c9e90>] (xfrm_input) from [<c08b6374>] (xfrm4_esp_rcv+0x54/0x11c) [<c08b6374>] (xfrm4_esp_rcv) from [<c0838840>] (ip_protocol_deliver_rcu+0x48/0x3bc) [<c0838840>] (ip_protocol_deliver_rcu) from [<c0838c50>] (ip_local_deliver_finish+0x9c/0xdc) [<c0838c50>] (ip_local_deliver_finish) from [<c0838dd8>] (ip_local_deliver+0x148/0x1b0) [<c0838dd8>] (ip_local_deliver) from [<c0838f5c>] (ip_rcv+0x11c/0x180) [<c0838f5c>] (ip_rcv) from [<c077e3a4>] (__netif_receive_skb_one_core+0x54/0x74) [<c077e3a4>] (__netif_receive_skb_one_core) from [<c077e588>] (netif_receive_skb+0xa8/0x260) [<c077e588>] (netif_receive_skb) from [<c068d6d4>] (cpsw_rx_handler+0x224/0x2fc) [<c068d6d4>] (cpsw_rx_handler) from [<c0688ccc>] (__cpdma_chan_process+0xf4/0x188) [<c0688ccc>] (__cpdma_chan_process) from [<c068a0c0>] (cpdma_chan_process+0x3c/0x5c) [<c068a0c0>] (cpdma_chan_process) from [<c0690e14>] (cpsw_rx_mq_poll+0x44/0x98) [<c0690e14>] (cpsw_rx_mq_poll) from [<c0780810>] (__napi_poll+0x28/0x268) [<c0780810>] (__napi_poll) from [<c0780c64>] (net_rx_action+0xcc/0x204) [<c0780c64>] (net_rx_action) from [<c0101478>] (__do_softirq+0x140/0x4d0) [<c0101478>] (__do_softirq) from [<c0135948>] (run_ksoftirqd+0x34/0x50) [<c0135948>] (run_ksoftirqd) from [<c01583b8>] (smpboot_thread_fn+0xf4/0x1d8) [<c01583b8>] (smpboot_thread_fn) from [<c01546dc>] (kthread+0x14c/0x174) [<c01546dc>] (kthread) from [<c010013c>] (ret_from_fork+0x14/0x38) ... The omap-des and omap-sham drivers appear to have a similar issue. Fix this by using spin_{,un}lock_bh() around device list access in all the probe and remove functions. Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: Herbert Xu <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Sep 20, 2021
[ Upstream commit 9de71ed ] xfstest generic/587 reports a deadlock issue as below: ====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc1 Freescale#69 Not tainted ------------------------------------------------------ repquota/8606 is trying to acquire lock: ffff888022ac9320 (&sb->s_type->i_mutex_key#18){+.+.}-{3:3}, at: f2fs_quota_sync+0x207/0x300 [f2fs] but task is already holding lock: ffff8880084bcde8 (&sbi->quota_sem){.+.+}-{3:3}, at: f2fs_quota_sync+0x59/0x300 [f2fs] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> Freescale#2 (&sbi->quota_sem){.+.+}-{3:3}: __lock_acquire+0x648/0x10b0 lock_acquire+0x128/0x470 down_read+0x3b/0x2a0 f2fs_quota_sync+0x59/0x300 [f2fs] f2fs_quota_on+0x48/0x100 [f2fs] do_quotactl+0x5e3/0xb30 __x64_sys_quotactl+0x23a/0x4e0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae -> Freescale#1 (&sbi->cp_rwsem){++++}-{3:3}: __lock_acquire+0x648/0x10b0 lock_acquire+0x128/0x470 down_read+0x3b/0x2a0 f2fs_unlink+0x353/0x670 [f2fs] vfs_unlink+0x1c7/0x380 do_unlinkat+0x413/0x4b0 __x64_sys_unlinkat+0x50/0xb0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #0 (&sb->s_type->i_mutex_key#18){+.+.}-{3:3}: check_prev_add+0xdc/0xb30 validate_chain+0xa67/0xb20 __lock_acquire+0x648/0x10b0 lock_acquire+0x128/0x470 down_write+0x39/0xc0 f2fs_quota_sync+0x207/0x300 [f2fs] do_quotactl+0xaff/0xb30 __x64_sys_quotactl+0x23a/0x4e0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae other info that might help us debug this: Chain exists of: &sb->s_type->i_mutex_key#18 --> &sbi->cp_rwsem --> &sbi->quota_sem Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sbi->quota_sem); lock(&sbi->cp_rwsem); lock(&sbi->quota_sem); lock(&sb->s_type->i_mutex_key#18); *** DEADLOCK *** 3 locks held by repquota/8606: #0: ffff88801efac0e0 (&type->s_umount_key#53){++++}-{3:3}, at: user_get_super+0xd9/0x190 Freescale#1: ffff8880084bc380 (&sbi->cp_rwsem){++++}-{3:3}, at: f2fs_quota_sync+0x3e/0x300 [f2fs] Freescale#2: ffff8880084bcde8 (&sbi->quota_sem){.+.+}-{3:3}, at: f2fs_quota_sync+0x59/0x300 [f2fs] stack backtrace: CPU: 6 PID: 8606 Comm: repquota Not tainted 5.14.0-rc1 Freescale#69 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 Call Trace: dump_stack_lvl+0xce/0x134 dump_stack+0x17/0x20 print_circular_bug.isra.0.cold+0x239/0x253 check_noncircular+0x1be/0x1f0 check_prev_add+0xdc/0xb30 validate_chain+0xa67/0xb20 __lock_acquire+0x648/0x10b0 lock_acquire+0x128/0x470 down_write+0x39/0xc0 f2fs_quota_sync+0x207/0x300 [f2fs] do_quotactl+0xaff/0xb30 __x64_sys_quotactl+0x23a/0x4e0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f883b0b4efe The root cause is ABBA deadlock of inode lock and cp_rwsem, reorder locks in f2fs_quota_sync() as below to fix this issue: - lock inode - lock cp_rwsem - lock quota_sem Fixes: db6ec53 ("f2fs: add a rw_sem to cover quota flag changes") Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Sep 20, 2021
[ Upstream commit 9de71ed ] xfstest generic/587 reports a deadlock issue as below: ====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc1 Freescale#69 Not tainted ------------------------------------------------------ repquota/8606 is trying to acquire lock: ffff888022ac9320 (&sb->s_type->i_mutex_key#18){+.+.}-{3:3}, at: f2fs_quota_sync+0x207/0x300 [f2fs] but task is already holding lock: ffff8880084bcde8 (&sbi->quota_sem){.+.+}-{3:3}, at: f2fs_quota_sync+0x59/0x300 [f2fs] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> Freescale#2 (&sbi->quota_sem){.+.+}-{3:3}: __lock_acquire+0x648/0x10b0 lock_acquire+0x128/0x470 down_read+0x3b/0x2a0 f2fs_quota_sync+0x59/0x300 [f2fs] f2fs_quota_on+0x48/0x100 [f2fs] do_quotactl+0x5e3/0xb30 __x64_sys_quotactl+0x23a/0x4e0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae -> Freescale#1 (&sbi->cp_rwsem){++++}-{3:3}: __lock_acquire+0x648/0x10b0 lock_acquire+0x128/0x470 down_read+0x3b/0x2a0 f2fs_unlink+0x353/0x670 [f2fs] vfs_unlink+0x1c7/0x380 do_unlinkat+0x413/0x4b0 __x64_sys_unlinkat+0x50/0xb0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #0 (&sb->s_type->i_mutex_key#18){+.+.}-{3:3}: check_prev_add+0xdc/0xb30 validate_chain+0xa67/0xb20 __lock_acquire+0x648/0x10b0 lock_acquire+0x128/0x470 down_write+0x39/0xc0 f2fs_quota_sync+0x207/0x300 [f2fs] do_quotactl+0xaff/0xb30 __x64_sys_quotactl+0x23a/0x4e0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae other info that might help us debug this: Chain exists of: &sb->s_type->i_mutex_key#18 --> &sbi->cp_rwsem --> &sbi->quota_sem Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sbi->quota_sem); lock(&sbi->cp_rwsem); lock(&sbi->quota_sem); lock(&sb->s_type->i_mutex_key#18); *** DEADLOCK *** 3 locks held by repquota/8606: #0: ffff88801efac0e0 (&type->s_umount_key#53){++++}-{3:3}, at: user_get_super+0xd9/0x190 Freescale#1: ffff8880084bc380 (&sbi->cp_rwsem){++++}-{3:3}, at: f2fs_quota_sync+0x3e/0x300 [f2fs] Freescale#2: ffff8880084bcde8 (&sbi->quota_sem){.+.+}-{3:3}, at: f2fs_quota_sync+0x59/0x300 [f2fs] stack backtrace: CPU: 6 PID: 8606 Comm: repquota Not tainted 5.14.0-rc1 Freescale#69 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 Call Trace: dump_stack_lvl+0xce/0x134 dump_stack+0x17/0x20 print_circular_bug.isra.0.cold+0x239/0x253 check_noncircular+0x1be/0x1f0 check_prev_add+0xdc/0xb30 validate_chain+0xa67/0xb20 __lock_acquire+0x648/0x10b0 lock_acquire+0x128/0x470 down_write+0x39/0xc0 f2fs_quota_sync+0x207/0x300 [f2fs] do_quotactl+0xaff/0xb30 __x64_sys_quotactl+0x23a/0x4e0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f883b0b4efe The root cause is ABBA deadlock of inode lock and cp_rwsem, reorder locks in f2fs_quota_sync() as below to fix this issue: - lock inode - lock cp_rwsem - lock quota_sem Fixes: db6ec53 ("f2fs: add a rw_sem to cover quota flag changes") Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Sep 22, 2021
[ Upstream commit 9de71ed ] xfstest generic/587 reports a deadlock issue as below: ====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc1 Freescale#69 Not tainted ------------------------------------------------------ repquota/8606 is trying to acquire lock: ffff888022ac9320 (&sb->s_type->i_mutex_key#18){+.+.}-{3:3}, at: f2fs_quota_sync+0x207/0x300 [f2fs] but task is already holding lock: ffff8880084bcde8 (&sbi->quota_sem){.+.+}-{3:3}, at: f2fs_quota_sync+0x59/0x300 [f2fs] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> Freescale#2 (&sbi->quota_sem){.+.+}-{3:3}: __lock_acquire+0x648/0x10b0 lock_acquire+0x128/0x470 down_read+0x3b/0x2a0 f2fs_quota_sync+0x59/0x300 [f2fs] f2fs_quota_on+0x48/0x100 [f2fs] do_quotactl+0x5e3/0xb30 __x64_sys_quotactl+0x23a/0x4e0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae -> Freescale#1 (&sbi->cp_rwsem){++++}-{3:3}: __lock_acquire+0x648/0x10b0 lock_acquire+0x128/0x470 down_read+0x3b/0x2a0 f2fs_unlink+0x353/0x670 [f2fs] vfs_unlink+0x1c7/0x380 do_unlinkat+0x413/0x4b0 __x64_sys_unlinkat+0x50/0xb0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #0 (&sb->s_type->i_mutex_key#18){+.+.}-{3:3}: check_prev_add+0xdc/0xb30 validate_chain+0xa67/0xb20 __lock_acquire+0x648/0x10b0 lock_acquire+0x128/0x470 down_write+0x39/0xc0 f2fs_quota_sync+0x207/0x300 [f2fs] do_quotactl+0xaff/0xb30 __x64_sys_quotactl+0x23a/0x4e0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae other info that might help us debug this: Chain exists of: &sb->s_type->i_mutex_key#18 --> &sbi->cp_rwsem --> &sbi->quota_sem Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sbi->quota_sem); lock(&sbi->cp_rwsem); lock(&sbi->quota_sem); lock(&sb->s_type->i_mutex_key#18); *** DEADLOCK *** 3 locks held by repquota/8606: #0: ffff88801efac0e0 (&type->s_umount_key#53){++++}-{3:3}, at: user_get_super+0xd9/0x190 Freescale#1: ffff8880084bc380 (&sbi->cp_rwsem){++++}-{3:3}, at: f2fs_quota_sync+0x3e/0x300 [f2fs] Freescale#2: ffff8880084bcde8 (&sbi->quota_sem){.+.+}-{3:3}, at: f2fs_quota_sync+0x59/0x300 [f2fs] stack backtrace: CPU: 6 PID: 8606 Comm: repquota Not tainted 5.14.0-rc1 Freescale#69 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 Call Trace: dump_stack_lvl+0xce/0x134 dump_stack+0x17/0x20 print_circular_bug.isra.0.cold+0x239/0x253 check_noncircular+0x1be/0x1f0 check_prev_add+0xdc/0xb30 validate_chain+0xa67/0xb20 __lock_acquire+0x648/0x10b0 lock_acquire+0x128/0x470 down_write+0x39/0xc0 f2fs_quota_sync+0x207/0x300 [f2fs] do_quotactl+0xaff/0xb30 __x64_sys_quotactl+0x23a/0x4e0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f883b0b4efe The root cause is ABBA deadlock of inode lock and cp_rwsem, reorder locks in f2fs_quota_sync() as below to fix this issue: - lock inode - lock cp_rwsem - lock quota_sem Fixes: db6ec53 ("f2fs: add a rw_sem to cover quota flag changes") Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Oct 21, 2022
…te() [ Upstream commit f020d95 ] When peer delete failed in a disconnect operation, use-after-free detected by KFENCE in below log. It is because for each vdev_id and address, it has only one struct ath10k_peer, it is allocated in ath10k_peer_map_event(). When connected to an AP, it has more than one HTT_T2H_MSG_TYPE_PEER_MAP reported from firmware, then the array peer_map of struct ath10k will be set muti-elements to the same ath10k_peer in ath10k_peer_map_event(). When peer delete failed in ath10k_sta_state(), the ath10k_peer will be free for the 1st peer id in array peer_map of struct ath10k, and then use-after-free happened for the 2nd peer id because they map to the same ath10k_peer. And clean up all peers in array peer_map for the ath10k_peer, then user-after-free disappeared peer map event log: [ 306.911021] wlan0: authenticate with b0:2a:43:e6:75:0e [ 306.957187] ath10k_pci 0000:01:00.0: mac vdev 0 peer create b0:2a:43:e6:75:0e (new sta) sta 1 / 32 peer 1 / 33 [ 306.957395] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 306.957404] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 306.986924] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 166 peer unmap event log: [ 435.715691] wlan0: deauthenticating from b0:2a:43:e6:75:0e by local choice (Reason: 3=DEAUTH_LEAVING) [ 435.716802] ath10k_pci 0000:01:00.0: mac vdev 0 peer delete b0:2a:43:e6:75:0e sta ffff990e0e9c2b50 (sta gone) [ 435.717177] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 435.717186] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 435.717193] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 166 use-after-free log: [21705.888627] wlan0: deauthenticating from d0:76:8f:82:be:75 by local choice (Reason: 3=DEAUTH_LEAVING) [21713.799910] ath10k_pci 0000:01:00.0: failed to delete peer d0:76:8f:82:be:75 for vdev 0: -110 [21713.799925] ath10k_pci 0000:01:00.0: found sta peer d0:76:8f:82:be:75 (ptr 0000000000000000 id 102) entry on vdev 0 after it was supposedly removed [21713.799968] ================================================================== [21713.799991] BUG: KFENCE: use-after-free read in ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.799991] [21713.799997] Use-after-free read at 0x00000000abe1c75e (in kfence-Freescale#69): [21713.800010] ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.800041] drv_sta_state+0x115/0x677 [mac80211] [21713.800059] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800076] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800093] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800110] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800137] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800153] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800161] genl_rcv_msg+0x38e/0x3be [21713.800166] netlink_rcv_skb+0x89/0xf7 [21713.800171] genl_rcv+0x28/0x36 [21713.800176] netlink_unicast+0x179/0x24b [21713.800181] netlink_sendmsg+0x3a0/0x40e [21713.800187] sock_sendmsg+0x72/0x76 [21713.800192] ____sys_sendmsg+0x16d/0x1e3 [21713.800196] ___sys_sendmsg+0x95/0xd1 [21713.800200] __sys_sendmsg+0x85/0xbf [21713.800205] do_syscall_64+0x43/0x55 [21713.800210] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [21713.800213] [21713.800219] kfence-Freescale#69: 0x000000009149b0d5-0x000000004c0697fb, size=1064, cache=kmalloc-2k [21713.800219] [21713.800224] allocated by task 13 on cpu 0 at 21705.501373s: [21713.800241] ath10k_peer_map_event+0x7e/0x154 [ath10k_core] [21713.800254] ath10k_htt_t2h_msg_handler+0x586/0x1039 [ath10k_core] [21713.800265] ath10k_htt_htc_t2h_msg_handler+0x12/0x28 [ath10k_core] [21713.800277] ath10k_htc_rx_completion_handler+0x14c/0x1b5 [ath10k_core] [21713.800283] ath10k_pci_process_rx_cb+0x195/0x1df [ath10k_pci] [21713.800294] ath10k_ce_per_engine_service+0x55/0x74 [ath10k_core] [21713.800305] ath10k_ce_per_engine_service_any+0x76/0x84 [ath10k_core] [21713.800310] ath10k_pci_napi_poll+0x49/0x144 [ath10k_pci] [21713.800316] net_rx_action+0xdc/0x361 [21713.800320] __do_softirq+0x163/0x29a [21713.800325] asm_call_irq_on_stack+0x12/0x20 [21713.800331] do_softirq_own_stack+0x3c/0x48 [21713.800337] __irq_exit_rcu+0x9b/0x9d [21713.800342] common_interrupt+0xc9/0x14d [21713.800346] asm_common_interrupt+0x1e/0x40 [21713.800351] ksoftirqd_should_run+0x5/0x16 [21713.800357] smpboot_thread_fn+0x148/0x211 [21713.800362] kthread+0x150/0x15f [21713.800367] ret_from_fork+0x22/0x30 [21713.800370] [21713.800374] freed by task 708 on cpu 1 at 21713.799953s: [21713.800498] ath10k_sta_state+0x2c6/0xb8a [ath10k_core] [21713.800515] drv_sta_state+0x115/0x677 [mac80211] [21713.800532] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800548] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800565] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800581] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800598] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800614] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800619] genl_rcv_msg+0x38e/0x3be [21713.800623] netlink_rcv_skb+0x89/0xf7 [21713.800628] genl_rcv+0x28/0x36 [21713.800632] netlink_unicast+0x179/0x24b [21713.800637] netlink_sendmsg+0x3a0/0x40e [21713.800642] sock_sendmsg+0x72/0x76 [21713.800646] ____sys_sendmsg+0x16d/0x1e3 [21713.800651] ___sys_sendmsg+0x95/0xd1 [21713.800655] __sys_sendmsg+0x85/0xbf [21713.800659] do_syscall_64+0x43/0x55 [21713.800663] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1 Fixes: d0eeafa ("ath10k: Clean up peer when sta goes away.") Signed-off-by: Wen Gong <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
angolini
pushed a commit
to angolini/linux-fslc
that referenced
this pull request
Nov 8, 2022
…te() [ Upstream commit f020d95 ] When peer delete failed in a disconnect operation, use-after-free detected by KFENCE in below log. It is because for each vdev_id and address, it has only one struct ath10k_peer, it is allocated in ath10k_peer_map_event(). When connected to an AP, it has more than one HTT_T2H_MSG_TYPE_PEER_MAP reported from firmware, then the array peer_map of struct ath10k will be set muti-elements to the same ath10k_peer in ath10k_peer_map_event(). When peer delete failed in ath10k_sta_state(), the ath10k_peer will be free for the 1st peer id in array peer_map of struct ath10k, and then use-after-free happened for the 2nd peer id because they map to the same ath10k_peer. And clean up all peers in array peer_map for the ath10k_peer, then user-after-free disappeared peer map event log: [ 306.911021] wlan0: authenticate with b0:2a:43:e6:75:0e [ 306.957187] ath10k_pci 0000:01:00.0: mac vdev 0 peer create b0:2a:43:e6:75:0e (new sta) sta 1 / 32 peer 1 / 33 [ 306.957395] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 306.957404] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 306.986924] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 166 peer unmap event log: [ 435.715691] wlan0: deauthenticating from b0:2a:43:e6:75:0e by local choice (Reason: 3=DEAUTH_LEAVING) [ 435.716802] ath10k_pci 0000:01:00.0: mac vdev 0 peer delete b0:2a:43:e6:75:0e sta ffff990e0e9c2b50 (sta gone) [ 435.717177] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 435.717186] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 435.717193] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 166 use-after-free log: [21705.888627] wlan0: deauthenticating from d0:76:8f:82:be:75 by local choice (Reason: 3=DEAUTH_LEAVING) [21713.799910] ath10k_pci 0000:01:00.0: failed to delete peer d0:76:8f:82:be:75 for vdev 0: -110 [21713.799925] ath10k_pci 0000:01:00.0: found sta peer d0:76:8f:82:be:75 (ptr 0000000000000000 id 102) entry on vdev 0 after it was supposedly removed [21713.799968] ================================================================== [21713.799991] BUG: KFENCE: use-after-free read in ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.799991] [21713.799997] Use-after-free read at 0x00000000abe1c75e (in kfence-Freescale#69): [21713.800010] ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.800041] drv_sta_state+0x115/0x677 [mac80211] [21713.800059] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800076] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800093] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800110] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800137] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800153] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800161] genl_rcv_msg+0x38e/0x3be [21713.800166] netlink_rcv_skb+0x89/0xf7 [21713.800171] genl_rcv+0x28/0x36 [21713.800176] netlink_unicast+0x179/0x24b [21713.800181] netlink_sendmsg+0x3a0/0x40e [21713.800187] sock_sendmsg+0x72/0x76 [21713.800192] ____sys_sendmsg+0x16d/0x1e3 [21713.800196] ___sys_sendmsg+0x95/0xd1 [21713.800200] __sys_sendmsg+0x85/0xbf [21713.800205] do_syscall_64+0x43/0x55 [21713.800210] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [21713.800213] [21713.800219] kfence-Freescale#69: 0x000000009149b0d5-0x000000004c0697fb, size=1064, cache=kmalloc-2k [21713.800219] [21713.800224] allocated by task 13 on cpu 0 at 21705.501373s: [21713.800241] ath10k_peer_map_event+0x7e/0x154 [ath10k_core] [21713.800254] ath10k_htt_t2h_msg_handler+0x586/0x1039 [ath10k_core] [21713.800265] ath10k_htt_htc_t2h_msg_handler+0x12/0x28 [ath10k_core] [21713.800277] ath10k_htc_rx_completion_handler+0x14c/0x1b5 [ath10k_core] [21713.800283] ath10k_pci_process_rx_cb+0x195/0x1df [ath10k_pci] [21713.800294] ath10k_ce_per_engine_service+0x55/0x74 [ath10k_core] [21713.800305] ath10k_ce_per_engine_service_any+0x76/0x84 [ath10k_core] [21713.800310] ath10k_pci_napi_poll+0x49/0x144 [ath10k_pci] [21713.800316] net_rx_action+0xdc/0x361 [21713.800320] __do_softirq+0x163/0x29a [21713.800325] asm_call_irq_on_stack+0x12/0x20 [21713.800331] do_softirq_own_stack+0x3c/0x48 [21713.800337] __irq_exit_rcu+0x9b/0x9d [21713.800342] common_interrupt+0xc9/0x14d [21713.800346] asm_common_interrupt+0x1e/0x40 [21713.800351] ksoftirqd_should_run+0x5/0x16 [21713.800357] smpboot_thread_fn+0x148/0x211 [21713.800362] kthread+0x150/0x15f [21713.800367] ret_from_fork+0x22/0x30 [21713.800370] [21713.800374] freed by task 708 on cpu 1 at 21713.799953s: [21713.800498] ath10k_sta_state+0x2c6/0xb8a [ath10k_core] [21713.800515] drv_sta_state+0x115/0x677 [mac80211] [21713.800532] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800548] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800565] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800581] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800598] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800614] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800619] genl_rcv_msg+0x38e/0x3be [21713.800623] netlink_rcv_skb+0x89/0xf7 [21713.800628] genl_rcv+0x28/0x36 [21713.800632] netlink_unicast+0x179/0x24b [21713.800637] netlink_sendmsg+0x3a0/0x40e [21713.800642] sock_sendmsg+0x72/0x76 [21713.800646] ____sys_sendmsg+0x16d/0x1e3 [21713.800651] ___sys_sendmsg+0x95/0xd1 [21713.800655] __sys_sendmsg+0x85/0xbf [21713.800659] do_syscall_64+0x43/0x55 [21713.800663] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1 Fixes: d0eeafa ("ath10k: Clean up peer when sta goes away.") Signed-off-by: Wen Gong <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Nov 10, 2022
…te() [ Upstream commit f020d95 ] When peer delete failed in a disconnect operation, use-after-free detected by KFENCE in below log. It is because for each vdev_id and address, it has only one struct ath10k_peer, it is allocated in ath10k_peer_map_event(). When connected to an AP, it has more than one HTT_T2H_MSG_TYPE_PEER_MAP reported from firmware, then the array peer_map of struct ath10k will be set muti-elements to the same ath10k_peer in ath10k_peer_map_event(). When peer delete failed in ath10k_sta_state(), the ath10k_peer will be free for the 1st peer id in array peer_map of struct ath10k, and then use-after-free happened for the 2nd peer id because they map to the same ath10k_peer. And clean up all peers in array peer_map for the ath10k_peer, then user-after-free disappeared peer map event log: [ 306.911021] wlan0: authenticate with b0:2a:43:e6:75:0e [ 306.957187] ath10k_pci 0000:01:00.0: mac vdev 0 peer create b0:2a:43:e6:75:0e (new sta) sta 1 / 32 peer 1 / 33 [ 306.957395] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 306.957404] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 306.986924] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 166 peer unmap event log: [ 435.715691] wlan0: deauthenticating from b0:2a:43:e6:75:0e by local choice (Reason: 3=DEAUTH_LEAVING) [ 435.716802] ath10k_pci 0000:01:00.0: mac vdev 0 peer delete b0:2a:43:e6:75:0e sta ffff990e0e9c2b50 (sta gone) [ 435.717177] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 435.717186] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 435.717193] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 166 use-after-free log: [21705.888627] wlan0: deauthenticating from d0:76:8f:82:be:75 by local choice (Reason: 3=DEAUTH_LEAVING) [21713.799910] ath10k_pci 0000:01:00.0: failed to delete peer d0:76:8f:82:be:75 for vdev 0: -110 [21713.799925] ath10k_pci 0000:01:00.0: found sta peer d0:76:8f:82:be:75 (ptr 0000000000000000 id 102) entry on vdev 0 after it was supposedly removed [21713.799968] ================================================================== [21713.799991] BUG: KFENCE: use-after-free read in ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.799991] [21713.799997] Use-after-free read at 0x00000000abe1c75e (in kfence-Freescale#69): [21713.800010] ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.800041] drv_sta_state+0x115/0x677 [mac80211] [21713.800059] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800076] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800093] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800110] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800137] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800153] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800161] genl_rcv_msg+0x38e/0x3be [21713.800166] netlink_rcv_skb+0x89/0xf7 [21713.800171] genl_rcv+0x28/0x36 [21713.800176] netlink_unicast+0x179/0x24b [21713.800181] netlink_sendmsg+0x3a0/0x40e [21713.800187] sock_sendmsg+0x72/0x76 [21713.800192] ____sys_sendmsg+0x16d/0x1e3 [21713.800196] ___sys_sendmsg+0x95/0xd1 [21713.800200] __sys_sendmsg+0x85/0xbf [21713.800205] do_syscall_64+0x43/0x55 [21713.800210] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [21713.800213] [21713.800219] kfence-Freescale#69: 0x000000009149b0d5-0x000000004c0697fb, size=1064, cache=kmalloc-2k [21713.800219] [21713.800224] allocated by task 13 on cpu 0 at 21705.501373s: [21713.800241] ath10k_peer_map_event+0x7e/0x154 [ath10k_core] [21713.800254] ath10k_htt_t2h_msg_handler+0x586/0x1039 [ath10k_core] [21713.800265] ath10k_htt_htc_t2h_msg_handler+0x12/0x28 [ath10k_core] [21713.800277] ath10k_htc_rx_completion_handler+0x14c/0x1b5 [ath10k_core] [21713.800283] ath10k_pci_process_rx_cb+0x195/0x1df [ath10k_pci] [21713.800294] ath10k_ce_per_engine_service+0x55/0x74 [ath10k_core] [21713.800305] ath10k_ce_per_engine_service_any+0x76/0x84 [ath10k_core] [21713.800310] ath10k_pci_napi_poll+0x49/0x144 [ath10k_pci] [21713.800316] net_rx_action+0xdc/0x361 [21713.800320] __do_softirq+0x163/0x29a [21713.800325] asm_call_irq_on_stack+0x12/0x20 [21713.800331] do_softirq_own_stack+0x3c/0x48 [21713.800337] __irq_exit_rcu+0x9b/0x9d [21713.800342] common_interrupt+0xc9/0x14d [21713.800346] asm_common_interrupt+0x1e/0x40 [21713.800351] ksoftirqd_should_run+0x5/0x16 [21713.800357] smpboot_thread_fn+0x148/0x211 [21713.800362] kthread+0x150/0x15f [21713.800367] ret_from_fork+0x22/0x30 [21713.800370] [21713.800374] freed by task 708 on cpu 1 at 21713.799953s: [21713.800498] ath10k_sta_state+0x2c6/0xb8a [ath10k_core] [21713.800515] drv_sta_state+0x115/0x677 [mac80211] [21713.800532] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800548] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800565] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800581] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800598] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800614] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800619] genl_rcv_msg+0x38e/0x3be [21713.800623] netlink_rcv_skb+0x89/0xf7 [21713.800628] genl_rcv+0x28/0x36 [21713.800632] netlink_unicast+0x179/0x24b [21713.800637] netlink_sendmsg+0x3a0/0x40e [21713.800642] sock_sendmsg+0x72/0x76 [21713.800646] ____sys_sendmsg+0x16d/0x1e3 [21713.800651] ___sys_sendmsg+0x95/0xd1 [21713.800655] __sys_sendmsg+0x85/0xbf [21713.800659] do_syscall_64+0x43/0x55 [21713.800663] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1 Fixes: d0eeafa ("ath10k: Clean up peer when sta goes away.") Signed-off-by: Wen Gong <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
flobz
pushed a commit
to flobz/linux-fslc
that referenced
this pull request
Aug 23, 2024
[ Upstream commit 7015843 ] Alexei reported that send_signal test may fail with nested CONFIG_PARAVIRT configs. In this particular case, the base VM is AMD with 166 cpus, and I run selftests with regular qemu on top of that and indeed send_signal test failed. I also tried with an Intel box with 80 cpus and there is no issue. The main qemu command line includes: -enable-kvm -smp 16 -cpu host The failure log looks like: $ ./test_progs -t send_signal [ 48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225] [ 48.503622] Modules linked in: bpf_testmod(O) [ 48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G O 6.9.0-08561-g2c1713a8f1c9-dirty Freescale#69 [ 48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 [ 48.511635] RIP: 0010:handle_softirqs+0x71/0x290 [ 48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb [ 48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246 [ 48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0 [ 48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000 [ 48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000 [ 48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280 [ 48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 48.528525] FS: 00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000 [ 48.531600] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0 [ 48.537538] Call Trace: [ 48.537538] <IRQ> [ 48.537538] ? watchdog_timer_fn+0x1cd/0x250 [ 48.539590] ? lockup_detector_update_enable+0x50/0x50 [ 48.539590] ? __hrtimer_run_queues+0xff/0x280 [ 48.542520] ? hrtimer_interrupt+0x103/0x230 [ 48.544524] ? __sysvec_apic_timer_interrupt+0x4f/0x140 [ 48.545522] ? sysvec_apic_timer_interrupt+0x3a/0x90 [ 48.547612] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.547612] ? handle_softirqs+0x71/0x290 [ 48.547612] irq_exit_rcu+0x63/0x80 [ 48.551585] sysvec_apic_timer_interrupt+0x75/0x90 [ 48.552521] </IRQ> [ 48.553529] <TASK> [ 48.553529] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260 [ 48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74 [ 48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282 [ 48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620 [ 48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00 [ 48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000 [ 48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000 [ 48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00 [ 48.575614] ? finish_task_switch.isra.0+0x8d/0x260 [ 48.576523] __schedule+0x364/0xac0 [ 48.577535] schedule+0x2e/0x110 [ 48.578555] pipe_read+0x301/0x400 [ 48.579589] ? destroy_sched_domains_rcu+0x30/0x30 [ 48.579589] vfs_read+0x2b3/0x2f0 [ 48.579589] ksys_read+0x8b/0xc0 [ 48.583590] do_syscall_64+0x3d/0xc0 [ 48.583590] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 48.586525] RIP: 0033:0x7f2f28703fa1 [ 48.587592] Code: [...] 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 [ 48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1 [ 48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006 [ 48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000 [ 48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0 [ 48.605527] </TASK> In the test, two processes are communicating through pipe. Further debugging with strace found that the above splat is triggered as read() syscall could not receive the data even if the corresponding write() syscall in another process successfully wrote data into the pipe. The failed subtest is "send_signal_perf". The corresponding perf event has sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every overflow event will trigger a call to the BPF program. So I suspect this may overwhelm the system. So I increased the sample_period to 100,000 and the test passed. The sample_period 10,000 still has the test failed. In other parts of selftest, e.g., [1], sample_freq is used instead. So I decided to use sample_freq = 1,000 since the test can pass as well. [1] https://lore.kernel.org/bpf/[email protected]/ Reported-by: Alexei Starovoitov <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>
flobz
pushed a commit
to flobz/linux-fslc
that referenced
this pull request
Aug 23, 2024
[ Upstream commit 7015843 ] Alexei reported that send_signal test may fail with nested CONFIG_PARAVIRT configs. In this particular case, the base VM is AMD with 166 cpus, and I run selftests with regular qemu on top of that and indeed send_signal test failed. I also tried with an Intel box with 80 cpus and there is no issue. The main qemu command line includes: -enable-kvm -smp 16 -cpu host The failure log looks like: $ ./test_progs -t send_signal [ 48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225] [ 48.503622] Modules linked in: bpf_testmod(O) [ 48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G O 6.9.0-08561-g2c1713a8f1c9-dirty Freescale#69 [ 48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 [ 48.511635] RIP: 0010:handle_softirqs+0x71/0x290 [ 48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb [ 48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246 [ 48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0 [ 48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000 [ 48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000 [ 48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280 [ 48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 48.528525] FS: 00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000 [ 48.531600] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0 [ 48.537538] Call Trace: [ 48.537538] <IRQ> [ 48.537538] ? watchdog_timer_fn+0x1cd/0x250 [ 48.539590] ? lockup_detector_update_enable+0x50/0x50 [ 48.539590] ? __hrtimer_run_queues+0xff/0x280 [ 48.542520] ? hrtimer_interrupt+0x103/0x230 [ 48.544524] ? __sysvec_apic_timer_interrupt+0x4f/0x140 [ 48.545522] ? sysvec_apic_timer_interrupt+0x3a/0x90 [ 48.547612] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.547612] ? handle_softirqs+0x71/0x290 [ 48.547612] irq_exit_rcu+0x63/0x80 [ 48.551585] sysvec_apic_timer_interrupt+0x75/0x90 [ 48.552521] </IRQ> [ 48.553529] <TASK> [ 48.553529] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260 [ 48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74 [ 48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282 [ 48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620 [ 48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00 [ 48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000 [ 48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000 [ 48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00 [ 48.575614] ? finish_task_switch.isra.0+0x8d/0x260 [ 48.576523] __schedule+0x364/0xac0 [ 48.577535] schedule+0x2e/0x110 [ 48.578555] pipe_read+0x301/0x400 [ 48.579589] ? destroy_sched_domains_rcu+0x30/0x30 [ 48.579589] vfs_read+0x2b3/0x2f0 [ 48.579589] ksys_read+0x8b/0xc0 [ 48.583590] do_syscall_64+0x3d/0xc0 [ 48.583590] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 48.586525] RIP: 0033:0x7f2f28703fa1 [ 48.587592] Code: [...] 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 [ 48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1 [ 48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006 [ 48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000 [ 48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0 [ 48.605527] </TASK> In the test, two processes are communicating through pipe. Further debugging with strace found that the above splat is triggered as read() syscall could not receive the data even if the corresponding write() syscall in another process successfully wrote data into the pipe. The failed subtest is "send_signal_perf". The corresponding perf event has sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every overflow event will trigger a call to the BPF program. So I suspect this may overwhelm the system. So I increased the sample_period to 100,000 and the test passed. The sample_period 10,000 still has the test failed. In other parts of selftest, e.g., [1], sample_freq is used instead. So I decided to use sample_freq = 1,000 since the test can pass as well. [1] https://lore.kernel.org/bpf/[email protected]/ Reported-by: Alexei Starovoitov <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>
angolini
pushed a commit
to angolini/linux-fslc
that referenced
this pull request
Sep 18, 2024
[ Upstream commit 7015843 ] Alexei reported that send_signal test may fail with nested CONFIG_PARAVIRT configs. In this particular case, the base VM is AMD with 166 cpus, and I run selftests with regular qemu on top of that and indeed send_signal test failed. I also tried with an Intel box with 80 cpus and there is no issue. The main qemu command line includes: -enable-kvm -smp 16 -cpu host The failure log looks like: $ ./test_progs -t send_signal [ 48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225] [ 48.503622] Modules linked in: bpf_testmod(O) [ 48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G O 6.9.0-08561-g2c1713a8f1c9-dirty Freescale#69 [ 48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 [ 48.511635] RIP: 0010:handle_softirqs+0x71/0x290 [ 48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb [ 48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246 [ 48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0 [ 48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000 [ 48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000 [ 48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280 [ 48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 48.528525] FS: 00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000 [ 48.531600] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0 [ 48.537538] Call Trace: [ 48.537538] <IRQ> [ 48.537538] ? watchdog_timer_fn+0x1cd/0x250 [ 48.539590] ? lockup_detector_update_enable+0x50/0x50 [ 48.539590] ? __hrtimer_run_queues+0xff/0x280 [ 48.542520] ? hrtimer_interrupt+0x103/0x230 [ 48.544524] ? __sysvec_apic_timer_interrupt+0x4f/0x140 [ 48.545522] ? sysvec_apic_timer_interrupt+0x3a/0x90 [ 48.547612] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.547612] ? handle_softirqs+0x71/0x290 [ 48.547612] irq_exit_rcu+0x63/0x80 [ 48.551585] sysvec_apic_timer_interrupt+0x75/0x90 [ 48.552521] </IRQ> [ 48.553529] <TASK> [ 48.553529] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260 [ 48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74 [ 48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282 [ 48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620 [ 48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00 [ 48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000 [ 48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000 [ 48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00 [ 48.575614] ? finish_task_switch.isra.0+0x8d/0x260 [ 48.576523] __schedule+0x364/0xac0 [ 48.577535] schedule+0x2e/0x110 [ 48.578555] pipe_read+0x301/0x400 [ 48.579589] ? destroy_sched_domains_rcu+0x30/0x30 [ 48.579589] vfs_read+0x2b3/0x2f0 [ 48.579589] ksys_read+0x8b/0xc0 [ 48.583590] do_syscall_64+0x3d/0xc0 [ 48.583590] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 48.586525] RIP: 0033:0x7f2f28703fa1 [ 48.587592] Code: [...] 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 [ 48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1 [ 48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006 [ 48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000 [ 48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0 [ 48.605527] </TASK> In the test, two processes are communicating through pipe. Further debugging with strace found that the above splat is triggered as read() syscall could not receive the data even if the corresponding write() syscall in another process successfully wrote data into the pipe. The failed subtest is "send_signal_perf". The corresponding perf event has sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every overflow event will trigger a call to the BPF program. So I suspect this may overwhelm the system. So I increased the sample_period to 100,000 and the test passed. The sample_period 10,000 still has the test failed. In other parts of selftest, e.g., [1], sample_freq is used instead. So I decided to use sample_freq = 1,000 since the test can pass as well. [1] https://lore.kernel.org/bpf/[email protected]/ Reported-by: Alexei Starovoitov <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>
otavio
pushed a commit
that referenced
this pull request
Nov 11, 2024
commit 409dc51 upstream. The flexspi on imx8ulp only has 16 LUTs, and imx8mm flexspi has 32 LUTs, so correct the compatible string here, otherwise will meet below error: [ 1.119072] ------------[ cut here ]------------ [ 1.123926] WARNING: CPU: 0 PID: 1 at drivers/spi/spi-nxp-fspi.c:855 nxp_fspi_exec_op+0xb04/0xb64 [ 1.133239] Modules linked in: [ 1.136448] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc6-next-20240902-00001-g131bf9439dd9 #69 [ 1.146821] Hardware name: NXP i.MX8ULP EVK (DT) [ 1.151647] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 1.158931] pc : nxp_fspi_exec_op+0xb04/0xb64 [ 1.163496] lr : nxp_fspi_exec_op+0xa34/0xb64 [ 1.168060] sp : ffff80008002b2a0 [ 1.171526] x29: ffff80008002b2d0 x28: 0000000000000000 x27: 0000000000000000 [ 1.179002] x26: ffff2eb645542580 x25: ffff800080610014 x24: ffff800080610000 [ 1.186480] x23: ffff2eb645548080 x22: 0000000000000006 x21: ffff2eb6455425e0 [ 1.193956] x20: 0000000000000000 x19: ffff80008002b5e0 x18: ffffffffffffffff [ 1.201432] x17: ffff2eb644467508 x16: 0000000000000138 x15: 0000000000000002 [ 1.208907] x14: 0000000000000000 x13: ffff2eb6400d8080 x12: 00000000ffffff00 [ 1.216378] x11: 0000000000000000 x10: ffff2eb6400d8080 x9 : ffff2eb697adca80 [ 1.223850] x8 : ffff2eb697ad3cc0 x7 : 0000000100000000 x6 : 0000000000000001 [ 1.231324] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000007a6 [ 1.238795] x2 : 0000000000000000 x1 : 00000000000001ce x0 : 00000000ffffff92 [ 1.246267] Call trace: [ 1.248824] nxp_fspi_exec_op+0xb04/0xb64 [ 1.253031] spi_mem_exec_op+0x3a0/0x430 [ 1.257139] spi_nor_read_id+0x80/0xcc [ 1.261065] spi_nor_scan+0x1ec/0xf10 [ 1.264901] spi_nor_probe+0x108/0x2fc [ 1.268828] spi_mem_probe+0x6c/0xbc [ 1.272574] spi_probe+0x84/0xe4 [ 1.275958] really_probe+0xbc/0x29c [ 1.279713] __driver_probe_device+0x78/0x12c [ 1.284277] driver_probe_device+0xd8/0x15c [ 1.288660] __device_attach_driver+0xb8/0x134 [ 1.293316] bus_for_each_drv+0x88/0xe8 [ 1.297337] __device_attach+0xa0/0x190 [ 1.301353] device_initial_probe+0x14/0x20 [ 1.305734] bus_probe_device+0xac/0xb0 [ 1.309752] device_add+0x5d0/0x790 [ 1.313408] __spi_add_device+0x134/0x204 [ 1.317606] of_register_spi_device+0x3b4/0x590 [ 1.322348] spi_register_controller+0x47c/0x754 [ 1.327181] devm_spi_register_controller+0x4c/0xa4 [ 1.332289] nxp_fspi_probe+0x1cc/0x2b0 [ 1.336307] platform_probe+0x68/0xc4 [ 1.340145] really_probe+0xbc/0x29c [ 1.343893] __driver_probe_device+0x78/0x12c [ 1.348457] driver_probe_device+0xd8/0x15c [ 1.352838] __driver_attach+0x90/0x19c [ 1.356857] bus_for_each_dev+0x7c/0xdc [ 1.360877] driver_attach+0x24/0x30 [ 1.364624] bus_add_driver+0xe4/0x208 [ 1.368552] driver_register+0x5c/0x124 [ 1.372573] __platform_driver_register+0x28/0x34 [ 1.377497] nxp_fspi_driver_init+0x1c/0x28 [ 1.381888] do_one_initcall+0x80/0x1c8 [ 1.385908] kernel_init_freeable+0x1c4/0x28c [ 1.390472] kernel_init+0x20/0x1d8 [ 1.394138] ret_from_fork+0x10/0x20 [ 1.397885] ---[ end trace 0000000000000000 ]--- [ 1.407908] ------------[ cut here ]------------ Fixes: ef89fd5 ("arm64: dts: imx8ulp: add flexspi node") Cc: [email protected] Signed-off-by: Haibo Chen <[email protected]> Signed-off-by: Shawn Guo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
ernestvh
pushed a commit
to ernestvh/linux-fslc
that referenced
this pull request
Jan 30, 2025
commit 9860370 upstream. irq_chip functions may be called in raw spinlock context. Therefore, we must also use a raw spinlock for our own internal locking. This fixes the following lockdep splat: [ 5.349336] ============================= [ 5.353349] [ BUG: Invalid wait context ] [ 5.357361] 6.13.0-rc5+ Freescale#69 Tainted: G W [ 5.363031] ----------------------------- [ 5.367045] kworker/u17:1/44 is trying to lock: [ 5.371587] ffffff88018b02c0 (&chip->gpio_lock){....}-{3:3}, at: xgpio_irq_unmask (drivers/gpio/gpio-xilinx.c:433 (discriminator 8)) [ 5.380079] other info that might help us debug this: [ 5.385138] context-{5:5} [ 5.387762] 5 locks held by kworker/u17:1/44: [ 5.392123] #0: ffffff8800014958 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work (kernel/workqueue.c:3204) [ 5.402260] Freescale#1: ffffffc082fcbdd8 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work (kernel/workqueue.c:3205) [ 5.411528] Freescale#2: ffffff880172c900 (&dev->mutex){....}-{4:4}, at: __device_attach (drivers/base/dd.c:1006) [ 5.419929] Freescale#3: ffffff88039c8268 (request_class#2){+.+.}-{4:4}, at: __setup_irq (kernel/irq/internals.h:156 kernel/irq/manage.c:1596) [ 5.428331] Freescale#4: ffffff88039c80c8 (lock_class#2){....}-{2:2}, at: __setup_irq (kernel/irq/manage.c:1614) [ 5.436472] stack backtrace: [ 5.439359] CPU: 2 UID: 0 PID: 44 Comm: kworker/u17:1 Tainted: G W 6.13.0-rc5+ Freescale#69 [ 5.448690] Tainted: [W]=WARN [ 5.451656] Hardware name: xlnx,zynqmp (DT) [ 5.455845] Workqueue: events_unbound deferred_probe_work_func [ 5.461699] Call trace: [ 5.464147] show_stack+0x18/0x24 C [ 5.467821] dump_stack_lvl (lib/dump_stack.c:123) [ 5.471501] dump_stack (lib/dump_stack.c:130) [ 5.474824] __lock_acquire (kernel/locking/lockdep.c:4828 kernel/locking/lockdep.c:4898 kernel/locking/lockdep.c:5176) [ 5.478758] lock_acquire (arch/arm64/include/asm/percpu.h:40 kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5851 kernel/locking/lockdep.c:5814) [ 5.482429] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162) [ 5.486797] xgpio_irq_unmask (drivers/gpio/gpio-xilinx.c:433 (discriminator 8)) [ 5.490737] irq_enable (kernel/irq/internals.h:236 kernel/irq/chip.c:170 kernel/irq/chip.c:439 kernel/irq/chip.c:432 kernel/irq/chip.c:345) [ 5.494060] __irq_startup (kernel/irq/internals.h:241 kernel/irq/chip.c:180 kernel/irq/chip.c:250) [ 5.497645] irq_startup (kernel/irq/chip.c:270) [ 5.501143] __setup_irq (kernel/irq/manage.c:1807) [ 5.504728] request_threaded_irq (kernel/irq/manage.c:2208) Fixes: a32c7ca ("gpio: gpio-xilinx: Add interrupt support") Signed-off-by: Sean Anderson <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bartosz Golaszewski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Feb 8, 2025
commit 9860370 upstream. irq_chip functions may be called in raw spinlock context. Therefore, we must also use a raw spinlock for our own internal locking. This fixes the following lockdep splat: [ 5.349336] ============================= [ 5.353349] [ BUG: Invalid wait context ] [ 5.357361] 6.13.0-rc5+ Freescale#69 Tainted: G W [ 5.363031] ----------------------------- [ 5.367045] kworker/u17:1/44 is trying to lock: [ 5.371587] ffffff88018b02c0 (&chip->gpio_lock){....}-{3:3}, at: xgpio_irq_unmask (drivers/gpio/gpio-xilinx.c:433 (discriminator 8)) [ 5.380079] other info that might help us debug this: [ 5.385138] context-{5:5} [ 5.387762] 5 locks held by kworker/u17:1/44: [ 5.392123] #0: ffffff8800014958 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work (kernel/workqueue.c:3204) [ 5.402260] Freescale#1: ffffffc082fcbdd8 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work (kernel/workqueue.c:3205) [ 5.411528] Freescale#2: ffffff880172c900 (&dev->mutex){....}-{4:4}, at: __device_attach (drivers/base/dd.c:1006) [ 5.419929] Freescale#3: ffffff88039c8268 (request_class#2){+.+.}-{4:4}, at: __setup_irq (kernel/irq/internals.h:156 kernel/irq/manage.c:1596) [ 5.428331] Freescale#4: ffffff88039c80c8 (lock_class#2){....}-{2:2}, at: __setup_irq (kernel/irq/manage.c:1614) [ 5.436472] stack backtrace: [ 5.439359] CPU: 2 UID: 0 PID: 44 Comm: kworker/u17:1 Tainted: G W 6.13.0-rc5+ Freescale#69 [ 5.448690] Tainted: [W]=WARN [ 5.451656] Hardware name: xlnx,zynqmp (DT) [ 5.455845] Workqueue: events_unbound deferred_probe_work_func [ 5.461699] Call trace: [ 5.464147] show_stack+0x18/0x24 C [ 5.467821] dump_stack_lvl (lib/dump_stack.c:123) [ 5.471501] dump_stack (lib/dump_stack.c:130) [ 5.474824] __lock_acquire (kernel/locking/lockdep.c:4828 kernel/locking/lockdep.c:4898 kernel/locking/lockdep.c:5176) [ 5.478758] lock_acquire (arch/arm64/include/asm/percpu.h:40 kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5851 kernel/locking/lockdep.c:5814) [ 5.482429] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162) [ 5.486797] xgpio_irq_unmask (drivers/gpio/gpio-xilinx.c:433 (discriminator 8)) [ 5.490737] irq_enable (kernel/irq/internals.h:236 kernel/irq/chip.c:170 kernel/irq/chip.c:439 kernel/irq/chip.c:432 kernel/irq/chip.c:345) [ 5.494060] __irq_startup (kernel/irq/internals.h:241 kernel/irq/chip.c:180 kernel/irq/chip.c:250) [ 5.497645] irq_startup (kernel/irq/chip.c:270) [ 5.501143] __setup_irq (kernel/irq/manage.c:1807) [ 5.504728] request_threaded_irq (kernel/irq/manage.c:2208) Fixes: a32c7ca ("gpio: gpio-xilinx: Add interrupt support") Signed-off-by: Sean Anderson <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bartosz Golaszewski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Automatic merge performed, no conflicts reported.
-- andrey