forked from torvalds/linux
-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sync up with Linus #62
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
When fsync is done through checkpoint, previous f2fs missed to clear append and update flag. This patch fixes to clear them. This was originally catched by Changman Lee before. Signed-off-by: Changman Lee <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
If inode has inline_data, it should report -ENOENT when accessing out-of-bound region. This is used by f2fs_fiemap which treats -ENOENT with no error. Signed-off-by: Jaegeuk Kim <[email protected]>
The f2fs has been shipped on many smartphone devices during a couple of years. So, it is worth to relocate Kconfig into main page from misc filesystems for developers to choose it more easily. Signed-off-by: Jaegeuk Kim <[email protected]>
extent tree/node slab cache is created during f2fs insmod, how, it isn't destroyed during f2fs rmmod, this patch fix it by destroy extent tree/node slab cache once rmmod f2fs. Signed-off-by: Wanpeng Li <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
In __set_free we will check whether all segment are free in one section when free one segment, in order to set section to free status. But the searching region of segmap is from start segno to last segno of main area, it's not necessary. So let's just only check all segment bitmap of target section. Signed-off-by: Wanpeng Li <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
The function 'find_in_inline_dir()' contain 'res_page' as an argument. So, we should initiaize 'res_page' before this function. Signed-off-by: Yuan Zhong <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
Remove the unnecessary condition judgment, because 'max_slots' has been initialized to '0' at the beginging of the function, as following: if (max_slots) *max_slots = 0; Signed-off-by: Yuan Zhong <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
Through each macro, we can read the meaning easily. Signed-off-by: Changman Lee <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
nm_i->nat_tree_lock is used to sync both the operations of nat entry cache tree and nat set cache tree, however, it isn't held when flush nat entries during checkpoint which lead to potential race, this patch fix it by holding the lock when gang lookup nat set cache and delete item from nat set cache. Signed-off-by: Wanpeng Li <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
When lookuping for creating, we will try to record the level of current dentry hash table if current dentry has enough contiguous slots for storing name of new file which will be created later, this can save our lookup time when add a link into parent dir. But currently in find_target_dentry, our current length of contiguous free slots is not calculated correctly. This make us leaving some holes in dentry block occasionally, it wastes our space of dentry block. Let's refactor the lookup flow for max slots as following to fix this issue: a) increase max_len if current slot is free; b) update max_slots with max_len if max_len is larger than max_slots; c) reset max_len to zero if current slot is not free. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
Our f2fs_acl_create is copied and modified from posix_acl_create to avoid deadlock bug when inline_dentry feature is enabled. Now, we got reference leaks in posix_acl_create, and this has been fixed in commit fed0b58 ("posix_acl: fix reference leaks in posix_acl_create") by Omar Sandoval. https://lkml.org/lkml/2015/2/9/5 Let's fix this issue in f2fs_acl_create too. Signed-off-by: Chao Yu <[email protected]> Reviewed-by: Changman Lee <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
Previously if inode is with inline data, we will try to invalid partial inline data in page #0 when we truncate size of inode in truncate_partial_data_page(). And then we set page #0 to dirty, after this we can synchronize inode page with page #0 at ->writepage(). But sometimes we will fail to operate page #0 in truncate_partial_data_page() due to below reason: a) if offset is zero, we will skip setting page #0 to dirty. b) if page #0 is not uptodate, we will fail to update it as it has no mapping data. So with following operations, we will meet recent data which should be truncated. 1.write inline data to file 2.sync first data page to inode page 3.truncate file size to 0 4.truncate file size to max_inline_size 5.echo 1 > /proc/sys/vm/drop_caches 6.read file --> meet original inline data which is remained in inode page. This patch renames truncate_inline_data() to truncate_inline_inode() for code readability, then use truncate_inline_inode() to truncate inline data in inode page in truncate_blocks() and truncate page #0 in truncate_partial_data_page() for fixing. v2: o truncate partially #0 page in truncate_partial_data_page to avoid keeping old data in #0 page. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
In __allocate_data_blocks, we should check current blkaddr which is located at ofs_in_node of dnode page instead of checking first blkaddr all the time. Otherwise we can only allocate one blkaddr in each dnode page. Fix it. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
In the following call stack, f2fs changes the bitmap for dirty segments and # of dirty sentries without grabbing sit_i->sentry_lock. This can result in mismatch on bitmap and # of dirty sentries, since if there are some direct_io operations. In allocate_data_block, - __allocate_new_segments - mutex_lock(&curseg->curseg_mutex); - s_ops->allocate_segment - new_curseg/change_curseg - reset_curseg - __set_sit_entry_type - __mark_sit_entry_dirty - set_bit(dirty_sentries_bitmap) - dirty_sentries++; Signed-off-by: Jaegeuk Kim <[email protected]>
This patch tries to set SBI_NEED_FSCK flag into sbi only when we fail to recover in fill_super, so we could skip fscking image when we fail to fill super for other reason. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
This patch modifies to call set_buffer_new, if new blocks are allocated. Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
Previously, f2fs_write_data_pages has a mutex, sbi->writepages, to serialize data writes to maximize write bandwidth, while sacrificing multi-threads performance. Practically, however, multi-threads environment is much more important for users. So this patch tries to remove the mutex. Signed-off-by: Jaegeuk Kim <[email protected]>
This patch removes wrong f2fs_bug_on in truncate_inline_inode. When there is no space, it can happen a corner case where i_isze is over MAX_INLINE_SIZE while its inode is still inline_data. The scenario is 1. write small data into file #A. 2. fill the whole partition to 100%. 3. truncate 4096 on file #A. 4. write data at 8192 offset. --> f2fs_write_begin -> -ENOSPC = f2fs_convert_inline_page -> f2fs_write_failed -> truncate_blocks -> truncate_inline_inode BUG_ON, since i_size is 4096. Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
This patch is to avoid some punch_hole overhead when releasing volatile data. If volatile data was not written yet, we just can make the first page as zero. Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
Fast symlink can utilize inline data flow to avoid using any i_addr region, since we need to handle many cases such as truncation, roll-forward recovery, and fsck/dump tools. Signed-off-by: Wanpeng Li <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
Split __set_data_blkaddr from f2fs_update_extent_cache for readability. Additionally rename __set_data_blkaddr to set_data_blkaddr for exporting. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
This patch introduces __{find,grab}_extent_tree for reusing by following patches. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
With normal extent info cache, we records largest extent mapping between logical block and physical block into extent info, and we persist extent info in on-disk inode. When we enable extent tree cache, if extent info of on-disk inode is exist, and the extent is not a small fragmented mapping extent. We'd better to load the extent info into extent tree cache when inode is loaded. By this way we can have more chance to hit extent tree cache rather than taking more time to read dnode page for block address. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
This patch tries to preserve last extent info in extent tree cache into on-disk inode, so this can help us to reuse the last extent info next time for performance. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
Enable inline_data feature by default since it brings us better performance and space utilization and now has already stable. Add another option noinline_data to disable it during mount. Suggested-by: Jaegeuk Kim <[email protected]> Suggested-by: Chao Yu <[email protected]> Signed-off-by: Wanpeng Li <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
Normally, due to DIO_SKIP_HOLES flag is set by default, blockdev_direct_IO in f2fs_direct_IO tries to skip DIO in holes when writing inside i_size, this makes us falling back to buffered IO which shows lower performance. So in commit 59b802e ("f2fs: allocate data blocks in advance for f2fs_direct_IO"), we improve perfromance by allocating data blocks in advance if we meet holes no matter in i_size or not, since with it we can avoid falling back to buffered IO. But we forget to consider for unwritten fallocated block in this commit. This patch tries to fix it for fallocate case, this helps to improve performance. Test result: Storage info: sandisk ultra 64G micro sd card. touch /mnt/f2fs/file truncate -s 67108864 /mnt/f2fs/file fallocate -o 0 -l 67108864 /mnt/f2fs/file time dd if=/dev/zero of=/mnt/f2fs/file bs=1M count=64 conv=notrunc oflag=direct Time before applying the patch: 67108864 bytes (67 MB) copied, 36.16 s, 1.9 MB/s real 0m36.162s user 0m0.000s sys 0m0.180s Time after applying the patch: 67108864 bytes (67 MB) copied, 27.7776 s, 2.4 MB/s real 0m27.780s user 0m0.000s sys 0m0.036s Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
We will encounter oops by executing below command. getfattr -n system.advise /mnt/f2fs/file Killed message log: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<f8b54d69>] f2fs_xattr_advise_get+0x29/0x40 [f2fs] *pdpt = 00000000319b7001 *pde = 0000000000000000 Oops: 0002 [#1] SMP Modules linked in: f2fs(O) snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq joydev snd_seq_device snd_timer bnep snd rfcomm microcode bluetooth soundcore i2c_piix4 mac_hid serio_raw parport_pc ppdev lp parport binfmt_misc hid_generic psmouse usbhid hid e1000 [last unloaded: f2fs] CPU: 3 PID: 3134 Comm: getfattr Tainted: G O 4.0.0-rc1 #6 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 task: f3a71b60 ti: f19a6000 task.ti: f19a6000 EIP: 0060:[<f8b54d69>] EFLAGS: 00010246 CPU: 3 EIP is at f2fs_xattr_advise_get+0x29/0x40 [f2fs] EAX: 00000000 EBX: f19a7e71 ECX: 00000000 EDX: f8b5b467 ESI: 00000000 EDI: f2008570 EBP: f19a7e14 ESP: f19a7e08 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 CR0: 80050033 CR2: 00000000 CR3: 319b8000 CR4: 000007f0 Stack: f8b5a634 c0cbb580 00000000 f19a7e34 c1193850 00000000 00000007 f19a7e71 f19a7e64 c0cbb580 c1193810 f19a7e50 c1193c00 00000000 00000000 00000000 c0cbb580 00000000 f19a7f70 c1194097 00000000 00000000 00000000 74737973 Call Trace: [<c1193850>] generic_getxattr+0x40/0x50 [<c1193810>] ? xattr_resolve_name+0x80/0x80 [<c1193c00>] vfs_getxattr+0x70/0xa0 [<c1194097>] getxattr+0x87/0x190 [<c11801d7>] ? path_lookupat+0x57/0x5f0 [<c11819d2>] ? putname+0x32/0x50 [<c116653a>] ? kmem_cache_alloc+0x2a/0x130 [<c11819d2>] ? putname+0x32/0x50 [<c11819d2>] ? putname+0x32/0x50 [<c11819d2>] ? putname+0x32/0x50 [<c11827f9>] ? user_path_at_empty+0x49/0x70 [<c118283f>] ? user_path_at+0x1f/0x30 [<c11941e7>] path_getxattr+0x47/0x80 [<c11948e7>] SyS_getxattr+0x27/0x30 [<c163f748>] sysenter_do_call+0x12/0x12 Code: 66 90 55 89 e5 57 56 53 66 66 66 66 90 8b 78 20 89 d3 ba 67 b4 b5 f8 89 d8 89 ce e8 42 7c 7b c8 85 c0 75 16 0f b6 87 44 01 00 00 <88> 06 b8 01 00 00 00 5b 5e 5f 5d c3 8d 76 00 b8 ea ff ff ff eb EIP: [<f8b54d69>] f2fs_xattr_advise_get+0x29/0x40 [f2fs] SS:ESP 0068:f19a7e08 CR2: 0000000000000000 ---[ end trace 860260654f1f416a ]--- The reason is that in getfattr there are two steps which is indicated by strace info: 1) try to lookup and get size of specified xattr. 2) get value of the extented attribute. strace info: getxattr("/mnt/f2fs/file", "system.advise", 0x0, 0) = 1 getxattr("/mnt/f2fs/file", "system.advise", "\x00", 256) = 1 For the first step, getfattr may pass a NULL pointer in @value and zero in @SiZe as parameters for ->getxattr, but we access this @value pointer directly without checking whether the pointer is valid or not in f2fs_xattr_advise_get, so the oops occurs. This patch fixes this issue by verifying @value pointer before using. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
This patch fixes to dirty inode for persisting i_advise of f2fs inode info into on-disk inode if user sets system.advise through setxattr. Otherwise the new value will be lost. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
Map bh over max size which caller defined is not needed, limit it in f2fs_map_bh. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
…t for O= option in Makefile Since commit ee0778a ("tools/power: turbostat: make Makefile a bit more capable") turbostat's Makefile is using [...] BUILD_OUTPUT := $(PWD) [...] which obviously causes trouble when building "turbostat" with make -C /usr/src/linux/tools/power/x86/turbostat ARCH=x86 turbostat because GNU make does not update nor guarantee that $PWD is set. This patch changes the Makefile to use $CURDIR instead, which GNU make guarantees to set and update (i.e. when using "make -C ...") and also adds support for the O= option (see "make help" in your root of your kernel source tree for more details). Link: https://bugs.gentoo.org/show_bug.cgi?id=533918 Fixes: ee0778a ("tools/power: turbostat: make Makefile a bit more capable") Signed-off-by: Thomas D. <[email protected]> Cc: Mark Asselstine <[email protected]> Signed-off-by: Len Brown <[email protected]>
Skylake adds some additional residency counters. Skylake supports a different mix of RAPL registers from any previous product. In most other ways, Skylake is like Broadwell. Signed-off-by: Len Brown <[email protected]>
While not yet documented in the Software Developer's Manual, the data-sheet for modern Xeon states that DRAM RAPL ENERGY units are fixed at 15.3 uJ, rather than being discovered via MSR. Before this patch, DRAM energy on these products is over-stated by turbostat because the RAPL units are 4x larger. ref: "Xeon E5-2600 v3/E5-1600 v3 Datasheet Volume 2" http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdf Signed-off-by: Andrey Semin <[email protected]> Signed-off-by: Len Brown <[email protected]>
turbostat --debug ... CPUID(0x15): eax_crystal: 2 ebx_tsc: 100 ecx_crystal_hz: 0 TSC: 1200 MHz (24000000 Hz * 100 / 2 / 1000000) Signed-off-by: Len Brown <[email protected]>
HSW expanded MSR_PKG_CST_CONFIG_CONTROL.Package-C-State-Limit, from bits[2:0] used by previous implementations, to [3:0]. The value 1000b is unlimited, and is used by BDW and SKL too. Signed-off-by: Len Brown <[email protected]>
I applied the wrong version of this patch series, V4 instead of V10, due to a patchwork bundling snafu. Signed-off-by: David S. Miller <[email protected]>
…lock Investigation of multithreaded iperf experiments on an ethernet interface show the iommu->lock as the hottest lock identified by lockstat, with something of the order of 21M contentions out of 27M acquisitions, and an average wait time of 26 us for the lock. This is not efficient. A more scalable design is to follow the ppc model, where the iommu_map_table has multiple pools, each stretching over a segment of the map, and with a separate lock for each pool. This model allows for better parallelization of the iommu map search. This patch adds the iommu range alloc/free function infrastructure. Signed-off-by: Sowmini Varadhan <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
In iperf experiments running linux as the Tx side (TCP client) with 10 threads results in a severe performance drop when TSO is disabled, indicating a weakness in the software that can be avoided by using the scalable IOMMU arena DMA allocation. Baseline numbers before this patch: with default settings (TSO enabled) : 9-9.5 Gbps Disable TSO using ethtool- drops badly: 2-3 Gbps. After this patch, iperf client with 10 threads, can give a throughput of at least 8.5 Gbps, even when TSO is disabled. Signed-off-by: Sowmini Varadhan <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Note that this conversion is only being done to consolidate the code and ensure that the common code provides the sufficient abstraction. It is not expected to result in any noticeable performance improvement, as there is typically one ldc_iommu per vnet_port, and each one has 8k entries, with a typical request for 1-4 pages. Thus LDC uses npools == 1. Signed-off-by: Sowmini Varadhan <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Fixes warnings due to - no DMA_ERROR_CODE on PARISC, - sizeof (unsigned long) == 4 bytes on PARISC. Signed-off-by: Sowmini Varadhan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Sowmini Varadhan says: ==================== Generic IOMMU pooled allocator Investigation of network performance on Sparc shows a high degree of locking contention in the IOMMU allocator, and it was noticed that the PowerPC code has a better locking model. This patch series tries to extract the generic parts of the PowerPC code so that it can be shared across multiple PCI devices and architectures. v10: resend patchv9 without RFC tag, and a new mail Message-Id, (previous non-RFC attempt did not show up on the patchwork queue?) Full revision history below: v2 changes: - incorporate David Miller editorial comments: sparc specific fields moved from iommu-common into sparc's iommu_64.h - make the npools value an input parameter, for the case when the iommu map size is not very large - cookie_to_index mapping, and optimizations for span-boundary check, for use case such as LDC. v3: eliminate iommu_sparc, rearrange the ->demap indirection to be invoked under the pool lock. v4: David Miller review changes: - s/IOMMU_ERROR_CODE/DMA_ERROR_CODE - page_table_map_base and page_table_shift are unsigned long, not u32. v5: removed ->cookie_to_index and ->demap indirection from the iommu_tbl_ops The caller needs to call these functions as needed, before invoking the generic arena allocator functions. Added the "skip_span_boundary" argument to iommu_tbl_pool_init() for those callers like LDC which do no care about span boundary checks. v6: removed iommu_tbl_ops, and instead pass the ->flush_all as an indirection to iommu_tbl_pool_init(); only invoke ->flush_all when there is no large_pool, based on the assumption that large-pool usage is infrequently encountered v7: moved pool_hash initialization to lib/iommu-common.c and cleaned up code duplication from sun4v/sun4u/ldc. v8: Addresses BenH comments with one exception: I've left the IOMMU_POOL_HASH as is, so that powerpc can tailor it to their convenience. Discard trylock for simple spin_lock to acquire pool v9: Addresses latest BenH comments: need_flush checks, add support for dma mask and align_order. v10: resend without RFC tag, and new mail Message-Id. ==================== Signed-off-by: David S. Miller <[email protected]>
…m/linux/kernel/git/ericvh/v9fs Pull 9pfs updates from Eric Van Hensbergen: "Some accumulated cleanup patches for kerneldoc and unused variables as well as some lock bug fixes and adding privateport option for RDMA" * tag 'for-linus-4.1-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: net/9p: add a privport option for RDMA transport. fs/9p: Initialize status in v9fs_file_do_lock. net/9p: Initialize opts->privport as it should be. net/9p: use memcpy() instead of snprintf() in p9_mount_tag_show() 9p: use unsigned integers for nwqid/count 9p: do not crash on unknown lock status code 9p: fix error handling in v9fs_file_do_lock 9p: remove unused variable in p9_fd_create() 9p: kerneldoc warning fixes
Pull sparc fixes from David Miller "Unfortunately, I brown paper bagged the generic iommu pool allocator by applying the wrong revision of the patch series. This reverts the bad one, and puts the right one in" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: iommu-common: Fix PARISC compile-time warnings sparc: Make LDC use common iommu poll management functions sparc: Make sparc64 use scalable lib/iommu-common.c functions Break up monolithic iommu table/lock into finer graularity pools and lock sparc: Revert generic IOMMU allocator.
Commit 8053871 ("smp: Fix smp_call_function_single_async() locking") fixed the locking for the asynchronous smp-call case, but in the process of moving the lock handling around, one of the error cases ended up not unlocking the call data at all. This went unnoticed on x86, because this is a "caller is buggy" case, where the caller is trying to call a non-existent CPU. But apparently ARM does that (at least under qemu-arm). Bindly doing cross-cpu calls to random CPU's that aren't even online seems a bit fishy, but the error handling was clearly not correct. Simply add the missing "csd_unlock()" to the error path. Reported-and-tested-by: Guenter Roeck <[email protected]> Analyzed-by: Rabin Vincent <[email protected]> Acked-by: Ingo Molnar <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
This prevents a race between chown() and execve(), where chowning a setuid-user binary to root would momentarily make the binary setuid root. This patch was mostly written by Linus Torvalds. Signed-off-by: Jann Horn <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
The test_data_1_le[] array is a const array of const char *. To avoid dropping any const information, we need to use "const char * const *", not just "const char **". I'm not sure why the different test arrays end up having different const'ness, but let's make the pointer we use to traverse them as const as possible, since we modify neither the array of pointers _or_ the pointers we find in the array. Signed-off-by: Linus Torvalds <[email protected]>
…el/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "A few bug fixes and add support for file-system level encryption in ext4" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (31 commits) ext4 crypto: enable encryption feature flag ext4 crypto: add symlink encryption ext4 crypto: enable filename encryption ext4 crypto: filename encryption modifications ext4 crypto: partial update to namei.c for fname crypto ext4 crypto: insert encrypted filenames into a leaf directory block ext4 crypto: teach ext4_htree_store_dirent() to store decrypted filenames ext4 crypto: filename encryption facilities ext4 crypto: implement the ext4 decryption read path ext4 crypto: implement the ext4 encryption write path ext4 crypto: inherit encryption policies on inode and directory create ext4 crypto: enforce context consistency ext4 crypto: add encryption key management facilities ext4 crypto: add ext4 encryption facilities ext4 crypto: add encryption policy and password salt support ext4 crypto: add encryption xattr support ext4 crypto: export ext4_empty_dir() ext4 crypto: add ext4 encryption Kconfig ext4 crypto: reserve codepoints used by the ext4 encryption feature ext4 crypto: add ext4_mpage_readpages() ...
…/git/lenb/linux Pull turbostat update from Len Brown: "Updates to the turbostat utility. Just one kernel dependency in this batch -- added a #define to msr-index.h" * 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools/power turbostat: correct dumped pkg-cstate-limit value tools/power turbostat: calculate TSC frequency from CPUID(0x15) on SKL tools/power turbostat: correct DRAM RAPL units on recent Xeon processors tools/power turbostat: Initial Skylake support tools/power turbostat: Use $(CURDIR) instead of $(PWD) and add support for O= option in Makefile tools/power turbostat: modprobe msr, if needed tools/power turbostat: dump MSR_TURBO_RATIO_LIMIT2 tools/power turbostat: use new MSR_TURBO_RATIO_LIMIT names x86 msr-index: define MSR_TURBO_RATIO_LIMIT,1,2 tools/power turbostat: label base frequency tools/power turbostat: update PERF_LIMIT_REASONS decoding tools/power turbostat: simplify default output
dabrace
pushed a commit
that referenced
this pull request
Jan 11, 2016
This reverts commit 87b5ed8 ("ASoC: Intel: Skylake: fix memory leak") as it causes regression on Skylake devices The SKL drivers can be deferred probe. The topology file based widgets can have references to topology file so this can't be freed until card is fully created, so revert this patch for now [ 66.682767] BUG: unable to handle kernel paging request at ffffc900001363fc [ 66.690735] IP: [<ffffffff806c94dd>] strnlen+0xd/0x40 [ 66.696509] PGD 16e035067 PUD 16e036067 PMD 16e038067 PTE 0 [ 66.702925] Oops: 0000 [#1] PREEMPT SMP [ 66.768390] CPU: 3 PID: 57 Comm: kworker/u16:3 Tainted: G O 4.4.0-rc7-skl #62 [ 66.778869] Hardware name: Intel Corporation Skylake Client platform [ 66.793201] Workqueue: deferwq deferred_probe_work_func [ 66.799173] task: ffff88008b700f40 ti: ffff88008b704000 task.ti: ffff88008b704000 [ 66.807692] RIP: 0010:[<ffffffff806c94dd>] [<ffffffff806c94dd>] strnlen+0xd/0x40 [ 66.816243] RSP: 0018:ffff88008b707878 EFLAGS: 00010286 [ 66.822293] RAX: ffffffff80e60a82 RBX: 000000000000000e RCX: fffffffffffffffe [ 66.830406] RDX: ffffc900001363fc RSI: ffffffffffffffff RDI: ffffc900001363fc [ 66.838520] RBP: ffff88008b707878 R08: 000000000000ffff R09: 000000000000ffff [ 66.846649] R10: 0000000000000001 R11: ffffffffa01c6368 R12: ffffc900001363fc [ 66.854765] R13: 0000000000000000 R14: 00000000ffffffff R15: 0000000000000000 [ 66.862910] FS: 0000000000000000(0000) GS:ffff88016ecc0000(0000) knlGS:0000000000000000 [ 66.872150] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 66.878696] CR2: ffffc900001363fc CR3: 0000000002c09000 CR4: 00000000003406e0 [ 66.886820] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 66.894938] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 66.903052] Stack: [ 66.905346] ffff88008b7078b0 ffffffff806cb1db 000000000000000e 0000000000000000 [ 66.913854] ffff88008b707928 ffffffffa00d1050 ffffffffa00d104e ffff88008b707918 [ 66.922353] ffffffff806ccbd6 ffff88008b707948 0000000000000046 ffff88008b707940 [ 66.930855] Call Trace: [ 66.933646] [<ffffffff806cb1db>] string.isra.4+0x3b/0xd0 [ 66.939793] [<ffffffff806ccbd6>] vsnprintf+0x116/0x540 [ 66.945742] [<ffffffff806d02f0>] kvasprintf+0x40/0x80 [ 66.951591] [<ffffffff806d0370>] kasprintf+0x40/0x50 [ 66.957359] [<ffffffffa00c085f>] dapm_create_or_share_kcontrol+0x1cf/0x300 [snd_soc_core] [ 66.966771] [<ffffffff8057dd1e>] ? __kmalloc+0x16e/0x2a0 [ 66.972931] [<ffffffffa00c0dab>] snd_soc_dapm_new_widgets+0x41b/0x4b0 [snd_soc_core] [ 66.981857] [<ffffffffa00be8c0>] ? snd_soc_dapm_add_routes+0xb0/0xd0 [snd_soc_core] [ 67.007828] [<ffffffffa00b92ed>] soc_probe_component+0x23d/0x360 [snd_soc_core] [ 67.016244] [<ffffffff80b14e69>] ? mutex_unlock+0x9/0x10 [ 67.022405] [<ffffffffa00ba02f>] snd_soc_instantiate_card+0x47f/0xd10 [snd_soc_core] [ 67.031329] [<ffffffff8049eeb2>] ? debug_mutex_init+0x32/0x40 [ 67.037973] [<ffffffffa00baa92>] snd_soc_register_card+0x1d2/0x2b0 [snd_soc_core] [ 67.046619] [<ffffffffa00c8b54>] devm_snd_soc_register_card+0x44/0x80 [snd_soc_core] [ 67.055539] [<ffffffffa01c303b>] skylake_audio_probe+0x1b/0x20 [snd_soc_skl_rt286] [ 67.064292] [<ffffffff808aa887>] platform_drv_probe+0x37/0x90 Signed-off-by: Vinod Koul <[email protected]> Signed-off-by: Mark Brown <[email protected]>
dabrace
pushed a commit
that referenced
this pull request
Aug 8, 2016
I got a KASAN report of use-after-free: ================================================================== BUG: KASAN: use-after-free in klist_iter_exit+0x61/0x70 at addr ffff8800b6581508 Read of size 8 by task trinity-c1/315 ============================================================================= BUG kmalloc-32 (Not tainted): kasan: bad access detected ----------------------------------------------------------------------------- Disabling lock debugging due to kernel taint INFO: Allocated in disk_seqf_start+0x66/0x110 age=144 cpu=1 pid=315 ___slab_alloc+0x4f1/0x520 __slab_alloc.isra.58+0x56/0x80 kmem_cache_alloc_trace+0x260/0x2a0 disk_seqf_start+0x66/0x110 traverse+0x176/0x860 seq_read+0x7e3/0x11a0 proc_reg_read+0xbc/0x180 do_loop_readv_writev+0x134/0x210 do_readv_writev+0x565/0x660 vfs_readv+0x67/0xa0 do_preadv+0x126/0x170 SyS_preadv+0xc/0x10 do_syscall_64+0x1a1/0x460 return_from_SYSCALL_64+0x0/0x6a INFO: Freed in disk_seqf_stop+0x42/0x50 age=160 cpu=1 pid=315 __slab_free+0x17a/0x2c0 kfree+0x20a/0x220 disk_seqf_stop+0x42/0x50 traverse+0x3b5/0x860 seq_read+0x7e3/0x11a0 proc_reg_read+0xbc/0x180 do_loop_readv_writev+0x134/0x210 do_readv_writev+0x565/0x660 vfs_readv+0x67/0xa0 do_preadv+0x126/0x170 SyS_preadv+0xc/0x10 do_syscall_64+0x1a1/0x460 return_from_SYSCALL_64+0x0/0x6a CPU: 1 PID: 315 Comm: trinity-c1 Tainted: G B 4.7.0+ #62 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 ffffea0002d96000 ffff880119b9f918 ffffffff81d6ce81 ffff88011a804480 ffff8800b6581500 ffff880119b9f948 ffffffff8146c7bd ffff88011a804480 ffffea0002d96000 ffff8800b6581500 fffffffffffffff4 ffff880119b9f970 Call Trace: [<ffffffff81d6ce81>] dump_stack+0x65/0x84 [<ffffffff8146c7bd>] print_trailer+0x10d/0x1a0 [<ffffffff814704ff>] object_err+0x2f/0x40 [<ffffffff814754d1>] kasan_report_error+0x221/0x520 [<ffffffff8147590e>] __asan_report_load8_noabort+0x3e/0x40 [<ffffffff83888161>] klist_iter_exit+0x61/0x70 [<ffffffff82404389>] class_dev_iter_exit+0x9/0x10 [<ffffffff81d2e8ea>] disk_seqf_stop+0x3a/0x50 [<ffffffff8151f812>] seq_read+0x4b2/0x11a0 [<ffffffff815f8fdc>] proc_reg_read+0xbc/0x180 [<ffffffff814b24e4>] do_loop_readv_writev+0x134/0x210 [<ffffffff814b4c45>] do_readv_writev+0x565/0x660 [<ffffffff814b8a17>] vfs_readv+0x67/0xa0 [<ffffffff814b8de6>] do_preadv+0x126/0x170 [<ffffffff814b92ec>] SyS_preadv+0xc/0x10 This problem can occur in the following situation: open() - pread() - .seq_start() - iter = kmalloc() // succeeds - seqf->private = iter - .seq_stop() - kfree(seqf->private) - pread() - .seq_start() - iter = kmalloc() // fails - .seq_stop() - class_dev_iter_exit(seqf->private) // boom! old pointer As the comment in disk_seqf_stop() says, stop is called even if start failed, so we need to reinitialise the private pointer to NULL when seq iteration stops. An alternative would be to set the private pointer to NULL when the kmalloc() in disk_seqf_start() fails. Cc: [email protected] Signed-off-by: Vegard Nossum <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
dabrace
pushed a commit
that referenced
this pull request
Dec 14, 2016
If there is an error reported in mballoc via ext4_grp_locked_error(), the code is holding a spinlock, so ext4_commit_super() must not try to lock the buffer head, or else it will trigger a BUG: BUG: sleeping function called from invalid context at ./include/linux/buffer_head.h:358 in_atomic(): 1, irqs_disabled(): 0, pid: 993, name: mount CPU: 0 PID: 993 Comm: mount Not tainted 4.9.0-rc1-clouder1 #62 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014 ffff880006423548 ffffffff81318c89 ffffffff819ecdd0 0000000000000166 ffff880006423558 ffffffff810810b0 ffff880006423580 ffffffff81081153 ffff880006e5a1a0 ffff88000690e400 0000000000000000 ffff8800064235c0 Call Trace: [<ffffffff81318c89>] dump_stack+0x67/0x9e [<ffffffff810810b0>] ___might_sleep+0xf0/0x140 [<ffffffff81081153>] __might_sleep+0x53/0xb0 [<ffffffff8126c1dc>] ext4_commit_super+0x19c/0x290 [<ffffffff8126e61a>] __ext4_grp_locked_error+0x14a/0x230 [<ffffffff81081153>] ? __might_sleep+0x53/0xb0 [<ffffffff812822be>] ext4_mb_generate_buddy+0x1de/0x320 Since ext4_grp_locked_error() calls ext4_commit_super with sync == 0 (and it is the only caller which does so), avoid locking and unlocking the buffer in this case. This can result in races with ext4_commit_super() if there are other problems (which is what commit 4743f83 was trying to address), but a Warning is better than BUG. Fixes: 4743f83 Cc: [email protected] # 4.9 Reported-by: Nikolay Borisov <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]> Reviewed-by: Jan Kara <[email protected]>
dabrace
pushed a commit
that referenced
this pull request
Sep 5, 2017
netem can fail in ->init due to missing options (either not supplied by user-space or used as a default qdisc) causing a timer->base null pointer deref in its ->destroy() and ->reset() callbacks. Reproduce: $ sysctl net.core.default_qdisc=netem $ ip l set ethX up Crash log: [ 1814.846943] BUG: unable to handle kernel NULL pointer dereference at (null) [ 1814.847181] IP: hrtimer_active+0x17/0x8a [ 1814.847270] PGD 59c34067 [ 1814.847271] P4D 59c34067 [ 1814.847337] PUD 37374067 [ 1814.847403] PMD 0 [ 1814.847468] [ 1814.847582] Oops: 0000 [#1] SMP [ 1814.847655] Modules linked in: sch_netem(O) sch_fq_codel(O) [ 1814.847761] CPU: 3 PID: 1573 Comm: ip Tainted: G O 4.13.0-rc6+ #62 [ 1814.847884] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 1814.848043] task: ffff88003723a700 task.stack: ffff88005adc8000 [ 1814.848235] RIP: 0010:hrtimer_active+0x17/0x8a [ 1814.848407] RSP: 0018:ffff88005adcb590 EFLAGS: 00010246 [ 1814.848590] RAX: 0000000000000000 RBX: ffff880058e359d8 RCX: 0000000000000000 [ 1814.848793] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880058e359d8 [ 1814.848998] RBP: ffff88005adcb5b0 R08: 00000000014080c0 R09: 00000000ffffffff [ 1814.849204] R10: ffff88005adcb660 R11: 0000000000000020 R12: 0000000000000000 [ 1814.849410] R13: ffff880058e359d8 R14: 00000000ffffffff R15: 0000000000000001 [ 1814.849616] FS: 00007f733bbca740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000 [ 1814.849919] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1814.850107] CR2: 0000000000000000 CR3: 0000000059f0d000 CR4: 00000000000406e0 [ 1814.850313] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1814.850518] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1814.850723] Call Trace: [ 1814.850875] hrtimer_try_to_cancel+0x1a/0x93 [ 1814.851047] hrtimer_cancel+0x15/0x20 [ 1814.851211] qdisc_watchdog_cancel+0x12/0x14 [ 1814.851383] netem_reset+0xe6/0xed [sch_netem] [ 1814.851561] qdisc_destroy+0x8b/0xe5 [ 1814.851723] qdisc_create_dflt+0x86/0x94 [ 1814.851890] ? dev_activate+0x129/0x129 [ 1814.852057] attach_one_default_qdisc+0x36/0x63 [ 1814.852232] netdev_for_each_tx_queue+0x3d/0x48 [ 1814.852406] dev_activate+0x4b/0x129 [ 1814.852569] __dev_open+0xe7/0x104 [ 1814.852730] __dev_change_flags+0xc6/0x15c [ 1814.852899] dev_change_flags+0x25/0x59 [ 1814.853064] do_setlink+0x30c/0xb3f [ 1814.853228] ? check_chain_key+0xb0/0xfd [ 1814.853396] ? check_chain_key+0xb0/0xfd [ 1814.853565] rtnl_newlink+0x3a4/0x729 [ 1814.853728] ? rtnl_newlink+0x117/0x729 [ 1814.853905] ? ns_capable_common+0xd/0xb1 [ 1814.854072] ? ns_capable+0x13/0x15 [ 1814.854234] rtnetlink_rcv_msg+0x188/0x197 [ 1814.854404] ? rcu_read_unlock+0x3e/0x5f [ 1814.854572] ? rtnl_newlink+0x729/0x729 [ 1814.854737] netlink_rcv_skb+0x6c/0xce [ 1814.854902] rtnetlink_rcv+0x23/0x2a [ 1814.855064] netlink_unicast+0x103/0x181 [ 1814.855230] netlink_sendmsg+0x326/0x337 [ 1814.855398] sock_sendmsg_nosec+0x14/0x3f [ 1814.855584] sock_sendmsg+0x29/0x2e [ 1814.855747] ___sys_sendmsg+0x209/0x28b [ 1814.855912] ? do_raw_spin_unlock+0xcd/0xf8 [ 1814.856082] ? _raw_spin_unlock+0x27/0x31 [ 1814.856251] ? __handle_mm_fault+0x651/0xdb1 [ 1814.856421] ? check_chain_key+0xb0/0xfd [ 1814.856592] __sys_sendmsg+0x45/0x63 [ 1814.856755] ? __sys_sendmsg+0x45/0x63 [ 1814.856923] SyS_sendmsg+0x19/0x1b [ 1814.857083] entry_SYSCALL_64_fastpath+0x23/0xc2 [ 1814.857256] RIP: 0033:0x7f733b2dd690 [ 1814.857419] RSP: 002b:00007ffe1d3387d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 1814.858238] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f733b2dd690 [ 1814.858445] RDX: 0000000000000000 RSI: 00007ffe1d338820 RDI: 0000000000000003 [ 1814.858651] RBP: ffff88005adcbf98 R08: 0000000000000001 R09: 0000000000000003 [ 1814.858856] R10: 00007ffe1d3385a0 R11: 0000000000000246 R12: 0000000000000002 [ 1814.859060] R13: 000000000066f1a0 R14: 00007ffe1d3408d0 R15: 0000000000000000 [ 1814.859267] ? trace_hardirqs_off_caller+0xa7/0xcf [ 1814.859446] Code: 10 55 48 89 c7 48 89 e5 e8 45 a1 fb ff 31 c0 5d c3 31 c0 c3 66 66 66 66 90 55 48 89 e5 41 56 41 55 41 54 53 49 89 fd 49 8b 45 30 <4c> 8b 20 41 8b 5c 24 38 31 c9 31 d2 48 c7 c7 50 8e 1d 82 41 89 [ 1814.860022] RIP: hrtimer_active+0x17/0x8a RSP: ffff88005adcb590 [ 1814.860214] CR2: 0000000000000000 Fixes: 87b60cf ("net_sched: fix error recovery at qdisc creation") Fixes: 0fbbeb1 ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()") Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
dabrace
pushed a commit
that referenced
this pull request
Sep 5, 2017
sch_tbf calls qdisc_watchdog_cancel() in both its ->reset and ->destroy callbacks but it may fail before the timer is initialized due to missing options (either not supplied by user-space or set as a default qdisc), also q->qdisc is used by ->reset and ->destroy so we need it initialized. Reproduce: $ sysctl net.core.default_qdisc=tbf $ ip l set ethX up Crash log: [ 959.160172] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 [ 959.160323] IP: qdisc_reset+0xa/0x5c [ 959.160400] PGD 59cdb067 [ 959.160401] P4D 59cdb067 [ 959.160466] PUD 59ccb067 [ 959.160532] PMD 0 [ 959.160597] [ 959.160706] Oops: 0000 [#1] SMP [ 959.160778] Modules linked in: sch_tbf sch_sfb sch_prio sch_netem [ 959.160891] CPU: 2 PID: 1562 Comm: ip Not tainted 4.13.0-rc6+ #62 [ 959.160998] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 959.161157] task: ffff880059c9a700 task.stack: ffff8800376d0000 [ 959.161263] RIP: 0010:qdisc_reset+0xa/0x5c [ 959.161347] RSP: 0018:ffff8800376d3610 EFLAGS: 00010286 [ 959.161531] RAX: ffffffffa001b1dd RBX: ffff8800373a2800 RCX: 0000000000000000 [ 959.161733] RDX: ffffffff8215f160 RSI: ffffffff8215f160 RDI: 0000000000000000 [ 959.161939] RBP: ffff8800376d3618 R08: 00000000014080c0 R09: 00000000ffffffff [ 959.162141] R10: ffff8800376d3578 R11: 0000000000000020 R12: ffffffffa001d2c0 [ 959.162343] R13: ffff880037538000 R14: 00000000ffffffff R15: 0000000000000001 [ 959.162546] FS: 00007fcc5126b740(0000) GS:ffff88005d900000(0000) knlGS:0000000000000000 [ 959.162844] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 959.163030] CR2: 0000000000000018 CR3: 000000005abc4000 CR4: 00000000000406e0 [ 959.163233] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 959.163436] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 959.163638] Call Trace: [ 959.163788] tbf_reset+0x19/0x64 [sch_tbf] [ 959.163957] qdisc_destroy+0x8b/0xe5 [ 959.164119] qdisc_create_dflt+0x86/0x94 [ 959.164284] ? dev_activate+0x129/0x129 [ 959.164449] attach_one_default_qdisc+0x36/0x63 [ 959.164623] netdev_for_each_tx_queue+0x3d/0x48 [ 959.164795] dev_activate+0x4b/0x129 [ 959.164957] __dev_open+0xe7/0x104 [ 959.165118] __dev_change_flags+0xc6/0x15c [ 959.165287] dev_change_flags+0x25/0x59 [ 959.165451] do_setlink+0x30c/0xb3f [ 959.165613] ? check_chain_key+0xb0/0xfd [ 959.165782] rtnl_newlink+0x3a4/0x729 [ 959.165947] ? rtnl_newlink+0x117/0x729 [ 959.166121] ? ns_capable_common+0xd/0xb1 [ 959.166288] ? ns_capable+0x13/0x15 [ 959.166450] rtnetlink_rcv_msg+0x188/0x197 [ 959.166617] ? rcu_read_unlock+0x3e/0x5f [ 959.166783] ? rtnl_newlink+0x729/0x729 [ 959.166948] netlink_rcv_skb+0x6c/0xce [ 959.167113] rtnetlink_rcv+0x23/0x2a [ 959.167273] netlink_unicast+0x103/0x181 [ 959.167439] netlink_sendmsg+0x326/0x337 [ 959.167607] sock_sendmsg_nosec+0x14/0x3f [ 959.167772] sock_sendmsg+0x29/0x2e [ 959.167932] ___sys_sendmsg+0x209/0x28b [ 959.168098] ? do_raw_spin_unlock+0xcd/0xf8 [ 959.168267] ? _raw_spin_unlock+0x27/0x31 [ 959.168432] ? __handle_mm_fault+0x651/0xdb1 [ 959.168602] ? check_chain_key+0xb0/0xfd [ 959.168773] __sys_sendmsg+0x45/0x63 [ 959.168934] ? __sys_sendmsg+0x45/0x63 [ 959.169100] SyS_sendmsg+0x19/0x1b [ 959.169260] entry_SYSCALL_64_fastpath+0x23/0xc2 [ 959.169432] RIP: 0033:0x7fcc5097e690 [ 959.169592] RSP: 002b:00007ffd0d5c7b48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 959.169887] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007fcc5097e690 [ 959.170089] RDX: 0000000000000000 RSI: 00007ffd0d5c7b90 RDI: 0000000000000003 [ 959.170292] RBP: ffff8800376d3f98 R08: 0000000000000001 R09: 0000000000000003 [ 959.170494] R10: 00007ffd0d5c7910 R11: 0000000000000246 R12: 0000000000000006 [ 959.170697] R13: 000000000066f1a0 R14: 00007ffd0d5cfc40 R15: 0000000000000000 [ 959.170900] ? trace_hardirqs_off_caller+0xa7/0xcf [ 959.171076] Code: 00 41 c7 84 24 14 01 00 00 00 00 00 00 41 c7 84 24 98 00 00 00 00 00 00 00 41 5c 41 5d 41 5e 5d c3 66 66 66 66 90 55 48 89 e5 53 <48> 8b 47 18 48 89 fb 48 8b 40 48 48 85 c0 74 02 ff d0 48 8b bb [ 959.171637] RIP: qdisc_reset+0xa/0x5c RSP: ffff8800376d3610 [ 959.171821] CR2: 0000000000000018 Fixes: 87b60cf ("net_sched: fix error recovery at qdisc creation") Fixes: 0fbbeb1 ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()") Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
dabrace
pushed a commit
that referenced
this pull request
Jun 4, 2018
Recent patch forgot to remove nla_data(), upsetting syzkaller a bit. BUG: KASAN: slab-out-of-bounds in nla_strlcpy+0x13d/0x150 lib/nlattr.c:314 Read of size 1 at addr ffff8801ad1f4fdd by task syz-executor189/4509 CPU: 1 PID: 4509 Comm: syz-executor189 Not tainted 4.17.0-rc6+ #62 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 print_address_description+0x6c/0x20b mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412 __asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:430 nla_strlcpy+0x13d/0x150 lib/nlattr.c:314 nfnl_acct_new+0x574/0xc50 net/netfilter/nfnetlink_acct.c:118 nfnetlink_rcv_msg+0xdb5/0xff0 net/netfilter/nfnetlink.c:212 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2448 nfnetlink_rcv+0x1fe/0x1ba0 net/netfilter/nfnetlink.c:513 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0x58b/0x740 net/netlink/af_netlink.c:1336 netlink_sendmsg+0x9f0/0xfa0 net/netlink/af_netlink.c:1901 sock_sendmsg_nosec net/socket.c:629 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:639 sock_write_iter+0x35a/0x5a0 net/socket.c:908 call_write_iter include/linux/fs.h:1784 [inline] new_sync_write fs/read_write.c:474 [inline] __vfs_write+0x64d/0x960 fs/read_write.c:487 vfs_write+0x1f8/0x560 fs/read_write.c:549 ksys_write+0xf9/0x250 fs/read_write.c:598 __do_sys_write fs/read_write.c:610 [inline] __se_sys_write fs/read_write.c:607 [inline] __x64_sys_write+0x73/0xb0 fs/read_write.c:607 Fixes: 4e09fc8 ("netfilter: prefer nla_strlcpy for dealing with NLA_STRING attributes") Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Florian Westphal <[email protected]> Reported-by: syzbot <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
dabrace
pushed a commit
that referenced
this pull request
Jul 1, 2018
This patch avoids that lockdep reports the following: ====================================================== WARNING: possible circular locking dependency detected 4.18.0-rc1 #62 Not tainted ------------------------------------------------------ kswapd0/84 is trying to acquire lock: 00000000c313516d (&xfs_nondir_ilock_class){++++}, at: xfs_free_eofblocks+0xa2/0x1e0 but task is already holding lock: 00000000591c83ae (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (fs_reclaim){+.+.}: kmem_cache_alloc+0x2c/0x2b0 radix_tree_node_alloc.constprop.19+0x3d/0xc0 __radix_tree_create+0x161/0x1c0 __radix_tree_insert+0x45/0x210 dmz_map+0x245/0x2d0 [dm_zoned] __map_bio+0x40/0x260 __split_and_process_non_flush+0x116/0x220 __split_and_process_bio+0x81/0x180 __dm_make_request.isra.32+0x5a/0x100 generic_make_request+0x36e/0x690 submit_bio+0x6c/0x140 mpage_readpages+0x19e/0x1f0 read_pages+0x6d/0x1b0 __do_page_cache_readahead+0x21b/0x2d0 force_page_cache_readahead+0xc4/0x100 generic_file_read_iter+0x7c6/0xd20 __vfs_read+0x102/0x180 vfs_read+0x9b/0x140 ksys_read+0x55/0xc0 do_syscall_64+0x5a/0x1f0 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #1 (&dmz->chunk_lock){+.+.}: dmz_map+0x133/0x2d0 [dm_zoned] __map_bio+0x40/0x260 __split_and_process_non_flush+0x116/0x220 __split_and_process_bio+0x81/0x180 __dm_make_request.isra.32+0x5a/0x100 generic_make_request+0x36e/0x690 submit_bio+0x6c/0x140 _xfs_buf_ioapply+0x31c/0x590 xfs_buf_submit_wait+0x73/0x520 xfs_buf_read_map+0x134/0x2f0 xfs_trans_read_buf_map+0xc3/0x580 xfs_read_agf+0xa5/0x1e0 xfs_alloc_read_agf+0x59/0x2b0 xfs_alloc_pagf_init+0x27/0x60 xfs_bmap_longest_free_extent+0x43/0xb0 xfs_bmap_btalloc_nullfb+0x7f/0xf0 xfs_bmap_btalloc+0x428/0x7c0 xfs_bmapi_write+0x598/0xcc0 xfs_iomap_write_allocate+0x15a/0x330 xfs_map_blocks+0x1cf/0x3f0 xfs_do_writepage+0x15f/0x7b0 write_cache_pages+0x1ca/0x540 xfs_vm_writepages+0x65/0xa0 do_writepages+0x48/0xf0 __writeback_single_inode+0x58/0x730 writeback_sb_inodes+0x249/0x5c0 wb_writeback+0x11e/0x550 wb_workfn+0xa3/0x670 process_one_work+0x228/0x670 worker_thread+0x3c/0x390 kthread+0x11c/0x140 ret_from_fork+0x3a/0x50 -> #0 (&xfs_nondir_ilock_class){++++}: down_read_nested+0x43/0x70 xfs_free_eofblocks+0xa2/0x1e0 xfs_fs_destroy_inode+0xac/0x270 dispose_list+0x51/0x80 prune_icache_sb+0x52/0x70 super_cache_scan+0x127/0x1a0 shrink_slab.part.47+0x1bd/0x590 shrink_node+0x3b5/0x470 balance_pgdat+0x158/0x3b0 kswapd+0x1ba/0x600 kthread+0x11c/0x140 ret_from_fork+0x3a/0x50 other info that might help us debug this: Chain exists of: &xfs_nondir_ilock_class --> &dmz->chunk_lock --> fs_reclaim Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(fs_reclaim); lock(&dmz->chunk_lock); lock(fs_reclaim); lock(&xfs_nondir_ilock_class); *** DEADLOCK *** 3 locks held by kswapd0/84: #0: 00000000591c83ae (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30 #1: 000000000f8208f5 (shrinker_rwsem){++++}, at: shrink_slab.part.47+0x3f/0x590 #2: 00000000cacefa54 (&type->s_umount_key#43){.+.+}, at: trylock_super+0x16/0x50 stack backtrace: CPU: 7 PID: 84 Comm: kswapd0 Not tainted 4.18.0-rc1 #62 Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0 12/17/2015 Call Trace: dump_stack+0x85/0xcb print_circular_bug.isra.36+0x1ce/0x1db __lock_acquire+0x124e/0x1310 lock_acquire+0x9f/0x1f0 down_read_nested+0x43/0x70 xfs_free_eofblocks+0xa2/0x1e0 xfs_fs_destroy_inode+0xac/0x270 dispose_list+0x51/0x80 prune_icache_sb+0x52/0x70 super_cache_scan+0x127/0x1a0 shrink_slab.part.47+0x1bd/0x590 shrink_node+0x3b5/0x470 balance_pgdat+0x158/0x3b0 kswapd+0x1ba/0x600 kthread+0x11c/0x140 ret_from_fork+0x3a/0x50 Reported-by: Masato Suzuki <[email protected]> Fixes: 4218a95 ("dm zoned: use GFP_NOIO in I/O path") Cc: <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
dabrace
pushed a commit
that referenced
this pull request
Nov 5, 2018
Commit 1404d6f ("arm64: dump: Add checking for writable and exectuable pages") has successfully identified code that leaves a page with W+X permissions. [ 3.245140] arm64/mm: Found insecure W+X mapping at address (____ptrval____)/0xffff000000d90000 [ 3.245771] WARNING: CPU: 0 PID: 1 at ../arch/arm64/mm/dump.c:232 note_page+0x410/0x420 [ 3.246141] Modules linked in: [ 3.246653] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc5-next-20180928-00001-ge70ae259b853-dirty #62 [ 3.247008] Hardware name: linux,dummy-virt (DT) [ 3.247347] pstate: 80000005 (Nzcv daif -PAN -UAO) [ 3.247623] pc : note_page+0x410/0x420 [ 3.247898] lr : note_page+0x410/0x420 [ 3.248071] sp : ffff00000804bcd0 [ 3.248254] x29: ffff00000804bcd0 x28: ffff000009274000 [ 3.248578] x27: ffff00000921a000 x26: ffff80007dfff000 [ 3.248845] x25: ffff0000093f5000 x24: ffff000009526f6a [ 3.249109] x23: 0000000000000004 x22: ffff000000d91000 [ 3.249396] x21: ffff000000d90000 x20: 0000000000000000 [ 3.249661] x19: ffff00000804bde8 x18: 0000000000000400 [ 3.249924] x17: 0000000000000000 x16: 0000000000000000 [ 3.250271] x15: ffffffffffffffff x14: 295f5f5f5f6c6176 [ 3.250594] x13: 7274705f5f5f5f28 x12: 2073736572646461 [ 3.250941] x11: 20746120676e6970 x10: 70616d20582b5720 [ 3.251252] x9 : 6572756365736e69 x8 : 3039643030303030 [ 3.251519] x7 : 306666666678302f x6 : ffff0000095467b2 [ 3.251802] x5 : 0000000000000000 x4 : 0000000000000000 [ 3.252060] x3 : 0000000000000000 x2 : ffffffffffffffff [ 3.252323] x1 : 4d151327adc50b00 x0 : 0000000000000000 [ 3.252664] Call trace: [ 3.252953] note_page+0x410/0x420 [ 3.253186] walk_pgd+0x12c/0x238 [ 3.253417] ptdump_check_wx+0x68/0xf8 [ 3.253637] mark_rodata_ro+0x68/0x98 [ 3.253847] kernel_init+0x38/0x160 [ 3.254103] ret_from_fork+0x10/0x18 kprobes allocates a writable executable page with module_alloc() in order to store executable code. Reworked to that when allocate a page it sets mode RO. Inspired by commit 63fef14 ("kprobes/x86: Make insn buffer always ROX and use text_poke()"). Suggested-by: Arnd Bergmann <[email protected]> Suggested-by: Ard Biesheuvel <[email protected]> Acked-by: Will Deacon <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Reviewed-by: Laura Abbott <[email protected]> Signed-off-by: Anders Roxell <[email protected]> [[email protected]: removed unnecessary casts] Signed-off-by: Catalin Marinas <[email protected]>
dabrace
pushed a commit
that referenced
this pull request
Mar 25, 2019
This patch fixes the following KASAN report: [ 779.044746] BUG: KASAN: slab-out-of-bounds in string+0xab/0x180 [ 779.044750] Read of size 1 at addr ffff88814f327968 by task trace-cmd/2812 [ 779.044756] CPU: 1 PID: 2812 Comm: trace-cmd Not tainted 5.1.0-rc1+ #62 [ 779.044760] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-0-ga698c89-prebuilt.qemu.org 04/01/2014 [ 779.044761] Call Trace: [ 779.044769] dump_stack+0x5b/0x90 [ 779.044775] ? string+0xab/0x180 [ 779.044781] print_address_description+0x6c/0x23c [ 779.044787] ? string+0xab/0x180 [ 779.044792] ? string+0xab/0x180 [ 779.044797] kasan_report.cold.3+0x1a/0x32 [ 779.044803] ? string+0xab/0x180 [ 779.044809] string+0xab/0x180 [ 779.044816] ? widen_string+0x160/0x160 [ 779.044822] ? vsnprintf+0x5bf/0x7f0 [ 779.044829] vsnprintf+0x4e7/0x7f0 [ 779.044836] ? pointer+0x4a0/0x4a0 [ 779.044841] ? seq_buf_vprintf+0x79/0xc0 [ 779.044848] seq_buf_vprintf+0x62/0xc0 [ 779.044855] trace_seq_printf+0x113/0x210 [ 779.044861] ? trace_seq_puts+0x110/0x110 [ 779.044867] ? trace_raw_output_prep+0xd8/0x110 [ 779.044876] trace_raw_output_smb3_tcon_class+0x9f/0xc0 [ 779.044882] print_trace_line+0x377/0x890 [ 779.044888] ? tracing_buffers_read+0x300/0x300 [ 779.044893] ? ring_buffer_read+0x58/0x70 [ 779.044899] s_show+0x6e/0x140 [ 779.044906] seq_read+0x505/0x6a0 [ 779.044913] vfs_read+0xaf/0x1b0 [ 779.044919] ksys_read+0xa1/0x130 [ 779.044925] ? kernel_write+0xa0/0xa0 [ 779.044931] ? __do_page_fault+0x3d5/0x620 [ 779.044938] do_syscall_64+0x63/0x150 [ 779.044944] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 779.044949] RIP: 0033:0x7f62c2c2db31 [ 779.044955] Code: fe ff ff 48 8d 3d 17 9e 09 00 48 83 ec 08 e8 96 02 02 00 66 0f 1f 44 00 00 8b 05 fa fc 2c 00 48 63 ff 85 c0 75 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 55 53 48 89 d5 48 89 [ 779.044958] RSP: 002b:00007ffd6e116678 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 779.044964] RAX: ffffffffffffffda RBX: 0000560a38be9260 RCX: 00007f62c2c2db31 [ 779.044966] RDX: 0000000000002000 RSI: 00007ffd6e116710 RDI: 0000000000000003 [ 779.044966] RDX: 0000000000002000 RSI: 00007ffd6e116710 RDI: 0000000000000003 [ 779.044969] RBP: 00007f62c2ef5420 R08: 0000000000000000 R09: 0000000000000003 [ 779.044972] R10: ffffffffffffffa8 R11: 0000000000000246 R12: 00007ffd6e116710 [ 779.044975] R13: 0000000000002000 R14: 0000000000000d68 R15: 0000000000002000 [ 779.044981] Allocated by task 1257: [ 779.044987] __kasan_kmalloc.constprop.5+0xc1/0xd0 [ 779.044992] kmem_cache_alloc+0xad/0x1a0 [ 779.044997] getname_flags+0x6c/0x2a0 [ 779.045003] user_path_at_empty+0x1d/0x40 [ 779.045008] do_faccessat+0x12a/0x330 [ 779.045012] do_syscall_64+0x63/0x150 [ 779.045017] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 779.045019] Freed by task 1257: [ 779.045023] __kasan_slab_free+0x12e/0x180 [ 779.045029] kmem_cache_free+0x85/0x1b0 [ 779.045034] filename_lookup.part.70+0x176/0x250 [ 779.045039] do_faccessat+0x12a/0x330 [ 779.045043] do_syscall_64+0x63/0x150 [ 779.045048] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 779.045052] The buggy address belongs to the object at ffff88814f326600 which belongs to the cache names_cache of size 4096 [ 779.045057] The buggy address is located 872 bytes to the right of 4096-byte region [ffff88814f326600, ffff88814f327600) [ 779.045058] The buggy address belongs to the page: [ 779.045062] page:ffffea00053cc800 count:1 mapcount:0 mapping:ffff88815b191b40 index:0x0 compound_mapcount: 0 [ 779.045067] flags: 0x200000000010200(slab|head) [ 779.045075] raw: 0200000000010200 dead000000000100 dead000000000200 ffff88815b191b40 [ 779.045081] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000 [ 779.045083] page dumped because: kasan: bad access detected [ 779.045085] Memory state around the buggy address: [ 779.045089] ffff88814f327800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 779.045093] ffff88814f327880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 779.045097] >ffff88814f327900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 779.045099] ^ [ 779.045103] ffff88814f327980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 779.045107] ffff88814f327a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 779.045109] ================================================================== [ 779.045110] Disabling lock debugging due to kernel taint Correctly assign tree name str for smb3_tcon event. Signed-off-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Steve French <[email protected]>
dabrace
pushed a commit
that referenced
this pull request
Aug 9, 2019
We have set the mmc_host.max_seg_size to 8M, but the dma max segment size of PCI device is set to 64K by default in function pci_device_add(). The mmc_host.max_seg_size is used to set the max segment size of the blk queue. Then this mismatch will trigger a calltrace like below when a bigger than 64K segment request arrives at mmc dev. So we should consider the limitation of the cvm_mmc_host when setting the mmc_host.max_seg_size. DMA-API: thunderx_mmc 0000:01:01.4: mapping sg segment longer than device claims to support [len=131072] [max=65536] WARNING: CPU: 6 PID: 238 at kernel/dma/debug.c:1221 debug_dma_map_sg+0x2b8/0x350 Modules linked in: CPU: 6 PID: 238 Comm: kworker/6:1H Not tainted 5.3.0-rc1-next-20190724-yocto-standard+ #62 Hardware name: Marvell OcteonTX CN96XX board (DT) Workqueue: kblockd blk_mq_run_work_fn pstate: 80c00009 (Nzcv daif +PAN +UAO) pc : debug_dma_map_sg+0x2b8/0x350 lr : debug_dma_map_sg+0x2b8/0x350 sp : ffff00001770f9e0 x29: ffff00001770f9e0 x28: ffffffff00000000 x27: 00000000ffffffff x26: ffff800bc2c73180 x25: ffff000010e83700 x24: 0000000000000002 x23: 0000000000000001 x22: 0000000000000001 x21: 0000000000000000 x20: ffff800bc48ba0b0 x19: ffff800bc97e8c00 x18: ffffffffffffffff x17: 0000000000000000 x16: 0000000000000000 x15: ffff000010e835c8 x14: 6874207265676e6f x13: 6c20746e656d6765 x12: 7320677320676e69 x11: 7070616d203a342e x10: 31303a31303a3030 x9 : 303020636d6d5f78 x8 : 35363d78616d5b20 x7 : 00000000000002fd x6 : ffff000010fd57dc x5 : 0000000000000000 x4 : ffff0000106c61f0 x3 : 00000000ffffffff x2 : 0000800bee060000 x1 : 7010678df3041a00 x0 : 0000000000000000 Call trace: debug_dma_map_sg+0x2b8/0x350 cvm_mmc_request+0x3c4/0x988 __mmc_start_request+0x9c/0x1f8 mmc_start_request+0x7c/0xb0 mmc_blk_mq_issue_rq+0x5c4/0x7b8 mmc_mq_queue_rq+0x11c/0x278 blk_mq_dispatch_rq_list+0xb0/0x568 blk_mq_do_dispatch_sched+0x6c/0x108 blk_mq_sched_dispatch_requests+0x110/0x1b8 __blk_mq_run_hw_queue+0xb0/0x118 blk_mq_run_work_fn+0x28/0x38 process_one_work+0x210/0x490 worker_thread+0x48/0x458 kthread+0x130/0x138 ret_from_fork+0x10/0x1c Signed-off-by: Kevin Hao <[email protected]> Fixes: ba3869f ("mmc: cavium: Add core MMC driver for Cavium SOCs") Cc: [email protected] Signed-off-by: Ulf Hansson <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.