forked from torvalds/linux
-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update from base repo #6
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Chip datasheet mentions that word addresses other than the actual start position of the MAC delivers undefined results. So fix this. Current implementation doesn't work due to this wrong offset. Cc: [email protected] Fixes: 0b81365 ("eeprom: at24: add support for at24mac series") Signed-off-by: Heiner Kallweit <[email protected]> Signed-off-by: Bartosz Golaszewski <[email protected]>
So far we completely rely on the caller to provide valid arguments. To be on the safe side perform an own sanity check. Cc: [email protected] Signed-off-by: Heiner Kallweit <[email protected]> Signed-off-by: Bartosz Golaszewski <[email protected]>
Uniformize STMicroelectronics copyrights header Signed-off-by: Benjamin Gaignard <[email protected]> CC: Alexandre Torgue <[email protected]> Acked-by: Alexandre TORGUE <[email protected]> Signed-off-by: David S. Miller <[email protected]>
register_shrinker() might return -ENOMEM error since Linux 3.12. Call panic() as with other failure checks in this function if register_shrinker() failed. Fixes: 1d3d443 ("vmscan: per-node deferred work") Signed-off-by: Tetsuo Handa <[email protected]> Cc: Jan Kara <[email protected]> Cc: Michal Hocko <[email protected]> Reviewed-by: Michal Hocko <[email protected]> Signed-off-by: Jan Kara <[email protected]>
Pull NVMe fixes from Christoph: "A few more nvme updates for 4.15. A single small PCIe fix, and a number of patches for RDMA that are a little larger than what I'd like to see for -rc2, but they fix important issues seen in the wild."
This reverts commit 152e93a. It was a nice cleanup in theory, but as Nicolai Stange points out, we do need to make the page dirty for the copy-on-write case even when we didn't end up making it writable, since the dirty bit is what we use to check that we've gone through a COW cycle. Reported-by: Michal Hocko <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
…g/linux Pull printk pointer hashing update from Tobin Harding: "Here is the patch set that implements hashing of printk specifier %p. First we have two clean up patches then we do the hashing. Hashing is done via the SipHash algorithm. The next patch adds printk specifier %px for printing pointers when we _really_ want to see the address i.e %px is functionally equivalent to %lx. Final patch in the set fixes KASAN since we break it by hashing %p. For the record here is the justification for the series: Currently there exist approximately 14 000 places in the Kernel where addresses are being printed using an unadorned %p. This potentially leaks sensitive information about the Kernel layout in memory. Many of these calls are stale, instead of fixing every call we hash the address by default before printing. We then add %px to provide a way to print the actual address. Although this is achievable using %lx, using %px will assist us if we ever want to change pointer printing behaviour. %px is more uniquely grep'able (there are already >50 000 uses of %lx). The added advantage of hashing %p is that security is now opt-out, if you _really_ want the address you have to work a little harder and use %px. This will of course break some users, forcing code printing needed addresses to be updated" [ I do expect this to be an annoyance, and a number of %px users to be added for debuggability. But nobody is willing to audit existing %p users for information leaks, and a number of places really only use the pointer as an object identifier rather than really 'I need the address'. IOW - sorry for the inconvenience, but it's the least inconvenient of the options. - Linus ] * tag 'printk-hash-pointer-4.15-rc2' of git://github.com/tcharding/linux: kasan: use %px to print addresses instead of %p vsprintf: add printk specifier %px printk: hash addresses printed with %p vsprintf: refactor %pK code out of pointer() docs: correct documentation for %pK
The conditional kallsym hex printing used a special fixed-width '%lx' output (KALLSYM_FMT) in preparation for the hashing of %p, but that series ended up adding a %px specifier to help with the conversions. Use it, and avoid the "print pointer as an unsigned long" code. Signed-off-by: Linus Torvalds <[email protected]>
gcc 4.4.4 is too old to have full C11 anonymous union support, so the current initialiser fails to compile. Reported-by: Boris Ostrovsky <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> (compile-)Tested-by: Boris Ostrovsky <[email protected]> Reviewed-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
Reported-by: Dmitry Vyukov <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Tested-by: Dmitry Vyukov <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
Instead, just fall back on the new '%p' behavior which hashes the pointer. Otherwise, '%pK' - that was intended to mark a pointer as restricted - just ends up leaking pointers that a normal '%p' wouldn't leak. Which just make the whole thing pointless. I suspect we should actually get rid of '%pK' entirely, and make it just work as '%p' regardless, but this is the minimal obvious fix. People who actually use 'kptr_restrict' should weigh in on which behavior they want. Cc: Tobin Harding <[email protected]> Cc: Kees Cook <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
This reverts "drm/ttm: Fix configuration error around populate_and_map() functions". This fix has gone into the wrong direction. Those helpers should be available even when neither CONFIG_INTEL_IOMMU nor CONFIG_SWIOTLB are set. Signed-off-by: Christian König <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
VMIDs 8-16 in Kaveri were reserved for use by the amdkfd driver. Because we removed amdkfd support from radeon, those VMIDs are now used by radeon and are initialized by radeon. This patch removes the function that initialized those VMIDs for amdkfd use. This initialization overridden the radeon initialization and caused GPU faults and GUI crashed. Fixes: f4fa88a ("drm/radeon: deprecate and remove KFD interface") Rported-by: Michel Dänzer <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-and-Tested-by: Michel Dänzer <[email protected]> Signed-off-by: Oded Gabbay <[email protected]> Signed-off-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
If we don't put the NG4fls.o object into the same part of the link as the generic sparc64 objects for fls() and __fls() then the relocation in the branch we use for patching will not fit. Move NG4fls.o into lib-y to fix this problem. Fixes: 46ad8d2 ("sparc64: Use sparc optimized fls and __fls for T4 and above") Signed-off-by: David S. Miller <[email protected]> Reported-by: Anatoly Pugachev <[email protected]> Tested-by: Anatoly Pugachev <[email protected]>
Pull networking fixes from David Miller: 1) The forcedeth conversion from pci_*() DMA interfaces to dma_*() ones missed one spot. From Zhu Yanjun. 2) Missing CRYPTO_SHA256 Kconfig dep in cfg80211, from Johannes Berg. 3) Fix checksum offloading in thunderx driver, from Sunil Goutham. 4) Add SPDX to vm_sockets_diag.h, from Stephen Hemminger. 5) Fix use after free of packet headers in TIPC, from Jon Maloy. 6) "sizeof(ptr)" vs "sizeof(*ptr)" bug in i40e, from Gustavo A R Silva. 7) Tunneling fixes in mlxsw driver, from Petr Machata. 8) Fix crash in fanout_demux_rollover() of AF_PACKET, from Mike Maloney. 9) Fix race in AF_PACKET bind() vs. NETDEV_UP notifier, from Eric Dumazet. 10) Fix regression in sch_sfq.c due to one of the timer_setup() conversions. From Paolo Abeni. 11) SCTP does list_for_each_entry() using wrong struct member, fix from Xin Long. 12) Don't use big endian netlink attribute read for IFLA_BOND_AD_ACTOR_SYSTEM, it is in cpu endianness. Also from Xin Long. 13) Fix mis-initialization of q->link.clock in CBQ scheduler, preventing adding filters there. From Jiri Pirko. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (67 commits) ethernet: dwmac-stm32: Fix copyright net: via: via-rhine: use %p to format void * address instead of %x net: ethernet: xilinx: Mark XILINX_LL_TEMAC broken on 64-bit myri10ge: Update MAINTAINERS net: sched: cbq: create block for q->link.block atm: suni: remove extraneous space to fix indentation atm: lanai: use %p to format kernel addresses instead of %x VSOCK: Don't set sk_state to TCP_CLOSE before testing it atm: fore200e: use %pK to format kernel addresses instead of %x ambassador: fix incorrect indentation of assignment statement vxlan: use __be32 type for the param vni in __vxlan_fdb_delete bonding: use nla_get_u64 to extract the value for IFLA_BOND_AD_ACTOR_SYSTEM sctp: use right member as the param of list_for_each_entry sch_sfq: fix null pointer dereference at timer expiration cls_bpf: don't decrement net's refcount when offload fails net/packet: fix a race in packet_bind() and packet_notifier() packet: fix crash in fanout_demux_rollover() sctp: remove extern from stream sched sctp: force the params with right types for sctp csum apis sctp: force SCTP_ERROR_INV_STRM with __u32 when calling sctp_chunk_fail ...
Pull sparc fix from David Miller: "Sparc T4 and later cpu bootup regression fix" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: Fix boot on T4 and later.
…oblaze Pull Microblaze fix from Michal Simek: "Add missing header to mmu_context_mm.h" * tag 'microblaze-4.15-rc2' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: add missing include to mmu_context_mm.h
…rnel/git/kdave/linux Pull btrfs fixes from David Sterba: "We've collected some fixes in since the pre-merge window freeze. There's technically only one regression fix for 4.15, but the rest seems important and candidates for stable. - fix missing flush bio puts in error cases (is serious, but rarely happens) - fix reporting stat::st_blocks for buffered append writes - fix space cache invalidation - fix out of bound memory access when setting zlib level - fix potential memory corruption when fsync fails in the middle - fix crash in integrity checker - incremetnal send fix, path mixup for certain unlink/rename combination - pass flags to writeback so compressed writes can be throttled properly - error handling fixes" * tag 'for-4.15-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Btrfs: incremental send, fix wrong unlink path after renaming file btrfs: tree-checker: Fix false panic for sanity test Btrfs: fix list_add corruption and soft lockups in fsync btrfs: Fix wild memory access in compression level parser btrfs: fix deadlock when writing out space cache btrfs: clear space cache inode generation always Btrfs: fix reported number of inode blocks after buffered append writes Btrfs: move definition of the function btrfs_find_new_delalloc_bytes Btrfs: bail out gracefully rather than BUG_ON btrfs: dev_alloc_list is not protected by RCU, use normal list_del btrfs: add missing device::flush_bio puts btrfs: Fix transaction abort during failure in btrfs_rm_dev_item Btrfs: add write_flags for compression bio
Pull nfsd fixes from Bruce Fields: "I screwed up my merge window pull request; I only sent half of what I meant to. There were no new features, just bugfixes of various importance and some very minor cleanup, so I think it's all still appropriate for -rc2. Highlights: - Fixes from Trond for some races in the NFSv4 state code. - Fix from Naofumi Honda for a typo in the blocked lock notificiation code - Fixes from Vasily Averin for some problems starting and stopping lockd especially in network namespaces" * tag 'nfsd-4.15-1' of git://linux-nfs.org/~bfields/linux: (23 commits) lockd: fix "list_add double add" caused by legacy signal interface nlm_shutdown_hosts_net() cleanup race of nfsd inetaddr notifiers vs nn->nfsd_serv change race of lockd inetaddr notifiers vs nlmsvc_rqst change SUNRPC: make cache_detail structures const NFSD: make cache_detail structures const sunrpc: make the function arg as const nfsd: check for use of the closed special stateid nfsd: fix panic in posix_unblock_lock called from nfs4_laundromat lockd: lost rollback of set_grace_period() in lockd_down_net() lockd: added cleanup checks in exit_net hook grace: replace BUG_ON by WARN_ONCE in exit_net hook nfsd: fix locking validator warning on nfs4_ol_stateid->st_mutex class lockd: remove net pointer from messages nfsd: remove net pointer from debug messages nfsd: Fix races with check_stateid_generation() nfsd: Ensure we check stateid validity in the seqid operation checks nfsd: Fix race in lock stateid creation nfsd4: move find_lock_stateid nfsd: Ensure we don't recognise lock stateids after freeing them ...
If bad or unrecognised parameters are specified after JSON output is requested, `usage()` will try to output null JSON object before the writer is created. To prevent this, create the writer as soon as the `--json` option is parsed. Fixes: 004b45c ("tools: bpftool: provide JSON output for all possible commands") Reported-by: Jakub Kicinski <[email protected]> Signed-off-by: Quentin Monnet <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
The writer is cleaned at the end of the main function, but not if the program exits sooner in usage(). Let's keep it clean and destroy the writer before exiting. Destruction and actual call to exit() are moved to another function so that clean exit can also be performed without printing usage() hints. Fixes: d35efba ("tools: bpftool: introduce --json and --pretty options") Signed-off-by: Quentin Monnet <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
If `getopt_long()` meets an unknown option, it prints its own error message to standard error output. While this does not strictly break JSON output, it is the only case bpftool prints something to standard error output if JSON output is required. All other errors are printed on standard output as JSON objects, so that an external program does not have to parse stderr. This is changed by setting the global variable `opterr` to 0. Furthermore, p_err() is used to reproduce the error message in a more JSON-friendly way, so that users still get to know what the erroneous option is. Signed-off-by: Quentin Monnet <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
The end-of-line character inside the string would break JSON compliance. Remove it, `p_err()` already adds a '\n' character for plain output anyway. Fixes: 9a5ab8b ("tools: bpftool: turn err() and info() macros into functions") Reported-by: Jakub Kicinski <[email protected]> Signed-off-by: Quentin Monnet <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
Programs and documentation not managed by package manager are generally installed under /usr/local/, instead of the user's home directory. In particular, `man` is generally able to find manual pages under `/usr/local/share/man`. bpftool generally follows perf's example, and perf installs to home directory. However bpftool requires root credentials, so it seems sensible to follow the more common convention of installing files under /usr/local instead. So, make /usr/local the default prefix for installing the binary with `make install`, and the documentation with `make doc-install`. Also, create /usr/local/sbin if it does not exist. Note that the bash-completion file, however, is still installed under /usr/share/bash-completion/completions, as the default setup for bash does not attempt to load completion files under /usr/local/. Reported-by: David Beckett <[email protected]> Signed-off-by: Quentin Monnet <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
In the Makefile, targets install, doc and doc-install should be added to .PHONY. Let's fix this. Fixes: 71bb428 ("tools: bpf: add bpftool") Signed-off-by: Quentin Monnet <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
Quentin Monnet says: ==================== First commit in this series fixes a crash that occurs when incorrect arguments are passed to bpftool after the `--json` option. It comes from the usage() function trying to use the JSON writer, although the latter has not been created yet at that point. Other patches add destruction of the writer in case the program exits in usage(), fix error messages handling when an unrecognized option is encountered, remove a spurious new-line character in an error message. Last patches are related to the Makefiles. They fix the installation directory prefix and .PHONY targets. ==================== Signed-off-by: Daniel Borkmann <[email protected]>
…meter list We meet this compile warning, which caused by missing bpf.h in xdp.h. In file included from ./include/trace/events/xdp.h:10:0, from ./include/linux/bpf_trace.h:6, from drivers/net/ethernet/intel/i40e/i40e_txrx.c:29: ./include/trace/events/xdp.h:93:17: warning: ‘struct bpf_map’ declared inside parameter list will not be visible outside of this definition or declaration const struct bpf_map *map, u32 map_index), ^ ./include/linux/tracepoint.h:187:34: note: in definition of macro ‘__DECLARE_TRACE’ static inline void trace_##name(proto) \ ^~~~~ ./include/linux/tracepoint.h:352:24: note: in expansion of macro ‘PARAMS’ __DECLARE_TRACE(name, PARAMS(proto), PARAMS(args), \ ^~~~~~ ./include/linux/tracepoint.h:477:2: note: in expansion of macro ‘DECLARE_TRACE’ DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) ^~~~~~~~~~~~~ ./include/linux/tracepoint.h:477:22: note: in expansion of macro ‘PARAMS’ DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) ^~~~~~ ./include/trace/events/xdp.h:89:1: note: in expansion of macro ‘DEFINE_EVENT’ DEFINE_EVENT(xdp_redirect_template, xdp_redirect, ^~~~~~~~~~~~ ./include/trace/events/xdp.h:90:2: note: in expansion of macro ‘TP_PROTO’ TP_PROTO(const struct net_device *dev, ^~~~~~~~ ./include/trace/events/xdp.h:93:17: warning: ‘struct bpf_map’ declared inside parameter list will not be visible outside of this definition or declaration const struct bpf_map *map, u32 map_index), ^ ./include/linux/tracepoint.h:203:38: note: in definition of macro ‘__DECLARE_TRACE’ register_trace_##name(void (*probe)(data_proto), void *data) \ ^~~~~~~~~~ ./include/linux/tracepoint.h:354:4: note: in expansion of macro ‘PARAMS’ PARAMS(void *__data, proto), \ ^~~~~~ Reported-by: Huang Daode <[email protected]> Cc: Hanjun Guo <[email protected]> Fixes: 8d3b778 ("xdp: tracepoint xdp_redirect also need a map argument") Signed-off-by: Xie XiuQi <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Acked-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
…ker context drain_all_pages backs off when called from a kworker context since commit 0ccce3b ("mm, page_alloc: drain per-cpu pages from workqueue context") because the original IPI based pcp draining has been replaced by a WQ based one and the check wanted to prevent from recursion and inter workers dependencies. This has made some sense at the time because the system WQ has been used and one worker holding the lock could be blocked while waiting for new workers to emerge which can be a problem under OOM conditions. Since then commit ce61287 ("mm: move pcp and lru-pcp draining into single wq") has moved draining to a dedicated (mm_percpu_wq) WQ with a rescuer so we shouldn't depend on any other WQ activity to make a forward progress so calling drain_all_pages from a worker context is safe as long as this doesn't happen from mm_percpu_wq itself which is not the case because all workers are required to _not_ depend on any MM locks. Why is this a problem in the first place? ACPI driven memory hot-remove (acpi_device_hotplug) is executed from the worker context. We end up calling __offline_pages to free all the pages and that requires both lru_add_drain_all_cpuslocked and drain_all_pages to do their job otherwise we can have dangling pages on pcp lists and fail the offline operation (__test_page_isolated_in_pageblock would see a page with 0 ref count but without PageBuddy set). Fix the issue by removing the worker check in drain_all_pages. lru_add_drain_all_cpuslocked doesn't have this restriction so it works as expected. Link: http://lkml.kernel.org/r/[email protected] Fixes: 0ccce3b ("mm, page_alloc: drain per-cpu pages from workqueue context") Signed-off-by: Michal Hocko <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Tejun Heo <[email protected]> Cc: <[email protected]> [4.11+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
tlb_gather_mmu(&tlb, mm, 0, -1) means gathering the whole virtual memory space. In this case, tlb->fullmm is true. Some archs like arm64 doesn't flush TLB when tlb->fullmm is true: commit 5a7862e ("arm64: tlbflush: avoid flushing when fullmm == 1"). Which causes leaking of tlb entries. Will clarifies his patch: "Basically, we tag each address space with an ASID (PCID on x86) which is resident in the TLB. This means we can elide TLB invalidation when pulling down a full mm because we won't ever assign that ASID to another mm without doing TLB invalidation elsewhere (which actually just nukes the whole TLB). I think that means that we could potentially not fault on a kernel uaccess, because we could hit in the TLB" There could be a window between complete_signal() sending IPI to other cores and all threads sharing this mm are really kicked off from cores. In this window, the oom reaper may calls tlb_flush_mmu_tlbonly() to flush TLB then frees pages. However, due to the above problem, the TLB entries are not really flushed on arm64. Other threads are possible to access these pages through TLB entries. Moreover, a copy_to_user() can also write to these pages without generating page fault, causes use-after-free bugs. This patch gathers each vma instead of gathering full vm space. In this case tlb->fullmm is not true. The behavior of oom reaper become similar to munmapping before do_exit, which should be safe for all archs. Link: http://lkml.kernel.org/r/[email protected] Fixes: aac4536 ("mm, oom: introduce oom reaper") Signed-off-by: Wang Nan <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Will Deacon <[email protected]> Cc: Bob Liu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
If the call __alloc_contig_migrate_range() in alloc_contig_range returns -EBUSY, processing continues so that test_pages_isolated() is called where there is a tracepoint to identify the busy pages. However, it is possible for busy pages to become available between the calls to these two routines. In this case, the range of pages may be allocated. Unfortunately, the original return code (ret == -EBUSY) is still set and returned to the caller. Therefore, the caller believes the pages were not allocated and they are leaked. Update the comment to indicate that allocation is still possible even if __alloc_contig_migrate_range returns -EBUSY. Also, clear return code in this case so that it is not accidentally used or returned to caller. Link: http://lkml.kernel.org/r/[email protected] Fixes: 8ef5849 ("mm/cma: always check which page caused allocation failure") Signed-off-by: Mike Kravetz <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: Johannes Weiner <[email protected]> Acked-by: Joonsoo Kim <[email protected]> Cc: Michal Nazarewicz <[email protected]> Cc: Laura Abbott <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mel Gorman <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
Pull ARM fix from Russell King: "Just one fix this time around, for the late commit in the merge window that triggered a problem with qemu. Qemu is apparently also going to receive a fix for the discovered issue" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: avoid faulting on qemu
James Morris reported kernel stack corruption bug [1] while running the SELinux testsuite, and bisected to a recent commit bffa72c ("net: sk_buff rbnode reorg") We believe this commit is fine, but exposes an older bug. SELinux code runs from tcp_filter() and might send an ICMP, expecting IP options to be found in skb->cb[] using regular IPCB placement. We need to defer TCP mangling of skb->cb[] after tcp_filter() calls. This patch adds tcp_v4_fill_cb()/tcp_v4_restore_cb() in a very similar way we added them for IPv6. [1] [ 339.806024] SELinux: failure in selinux_parse_skb(), unable to parse packet [ 339.822505] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81745af5 [ 339.822505] [ 339.852250] CPU: 4 PID: 3642 Comm: client Not tainted 4.15.0-rc1-test #15 [ 339.868498] Hardware name: LENOVO 10FGS0VA1L/30BC, BIOS FWKT68A 01/19/2017 [ 339.885060] Call Trace: [ 339.896875] <IRQ> [ 339.908103] dump_stack+0x63/0x87 [ 339.920645] panic+0xe8/0x248 [ 339.932668] ? ip_push_pending_frames+0x33/0x40 [ 339.946328] ? icmp_send+0x525/0x530 [ 339.958861] ? kfree_skbmem+0x60/0x70 [ 339.971431] __stack_chk_fail+0x1b/0x20 [ 339.984049] icmp_send+0x525/0x530 [ 339.996205] ? netlbl_skbuff_err+0x36/0x40 [ 340.008997] ? selinux_netlbl_err+0x11/0x20 [ 340.021816] ? selinux_socket_sock_rcv_skb+0x211/0x230 [ 340.035529] ? security_sock_rcv_skb+0x3b/0x50 [ 340.048471] ? sk_filter_trim_cap+0x44/0x1c0 [ 340.061246] ? tcp_v4_inbound_md5_hash+0x69/0x1b0 [ 340.074562] ? tcp_filter+0x2c/0x40 [ 340.086400] ? tcp_v4_rcv+0x820/0xa20 [ 340.098329] ? ip_local_deliver_finish+0x71/0x1a0 [ 340.111279] ? ip_local_deliver+0x6f/0xe0 [ 340.123535] ? ip_rcv_finish+0x3a0/0x3a0 [ 340.135523] ? ip_rcv_finish+0xdb/0x3a0 [ 340.147442] ? ip_rcv+0x27c/0x3c0 [ 340.158668] ? inet_del_offload+0x40/0x40 [ 340.170580] ? __netif_receive_skb_core+0x4ac/0x900 [ 340.183285] ? rcu_accelerate_cbs+0x5b/0x80 [ 340.195282] ? __netif_receive_skb+0x18/0x60 [ 340.207288] ? process_backlog+0x95/0x140 [ 340.218948] ? net_rx_action+0x26c/0x3b0 [ 340.230416] ? __do_softirq+0xc9/0x26a [ 340.241625] ? do_softirq_own_stack+0x2a/0x40 [ 340.253368] </IRQ> [ 340.262673] ? do_softirq+0x50/0x60 [ 340.273450] ? __local_bh_enable_ip+0x57/0x60 [ 340.285045] ? ip_finish_output2+0x175/0x350 [ 340.296403] ? ip_finish_output+0x127/0x1d0 [ 340.307665] ? nf_hook_slow+0x3c/0xb0 [ 340.318230] ? ip_output+0x72/0xe0 [ 340.328524] ? ip_fragment.constprop.54+0x80/0x80 [ 340.340070] ? ip_local_out+0x35/0x40 [ 340.350497] ? ip_queue_xmit+0x15c/0x3f0 [ 340.361060] ? __kmalloc_reserve.isra.40+0x31/0x90 [ 340.372484] ? __skb_clone+0x2e/0x130 [ 340.382633] ? tcp_transmit_skb+0x558/0xa10 [ 340.393262] ? tcp_connect+0x938/0xad0 [ 340.403370] ? ktime_get_with_offset+0x4c/0xb0 [ 340.414206] ? tcp_v4_connect+0x457/0x4e0 [ 340.424471] ? __inet_stream_connect+0xb3/0x300 [ 340.435195] ? inet_stream_connect+0x3b/0x60 [ 340.445607] ? SYSC_connect+0xd9/0x110 [ 340.455455] ? __audit_syscall_entry+0xaf/0x100 [ 340.466112] ? syscall_trace_enter+0x1d0/0x2b0 [ 340.476636] ? __audit_syscall_exit+0x209/0x290 [ 340.487151] ? SyS_connect+0xe/0x10 [ 340.496453] ? do_syscall_64+0x67/0x1b0 [ 340.506078] ? entry_SYSCALL64_slow_path+0x25/0x25 Fixes: 971f10e ("tcp: better TCP_SKB_CB layout to reduce cache line misses") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: James Morris <[email protected]> Tested-by: James Morris <[email protected]> Tested-by: Casey Schaufler <[email protected]> Signed-off-by: David S. Miller <[email protected]>
After this fix : ("tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb()"), socket lookups happen while skb->cb[] has not been mangled yet by TCP. Fixes: a04a480 ("net: Require exact match for TCP socket lookups if dif is l3mdev") Signed-off-by: David Ahern <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Eric Dumazet says: ==================== tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb() James Morris reported kernel stack corruption bug that we tracked back to commit 971f10e ("tcp: better TCP_SKB_CB layout to reduce cache line misses") First patch needs to be backported to kernels >= 3.18, while second patch needs to be backported to kernels >= 4.9, since this was the time when inet_exact_dif_match appeared. ==================== Signed-off-by: David S. Miller <[email protected]>
Daniel Borkmann says: ==================== pull-request: bpf 2017-12-02 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix a compilation warning in xdp redirect tracepoint due to missing bpf.h include that pulls in struct bpf_map, from Xie. 2) Limit the maximum number of attachable BPF progs for a given perf event as long as uabi is not frozen yet. The hard upper limit is now 64 and therefore the same as with BPF multi-prog for cgroups. Also add related error checking for the sample BPF loader when enabling and attaching to the perf event, from Yonghong. 3) Specifically set the RLIMIT_MEMLOCK for the test_verifier_log case, so that the test case can always pass and not fail in some environments due to too low default limit, also from Yonghong. 4) Fix up a missing license header comment for kernel/bpf/offload.c, from Jakub. 5) Several fixes for bpftool, among others a crash on incorrect arguments when json output is used, error message handling fixes on unknown options and proper destruction of json writer for some exit cases, all from Quentin. ==================== Signed-off-by: David S. Miller <[email protected]>
The pci/htirq.c file was removed so remove it from the documentation file also. Error: Cannot open file ../drivers/pci/htirq.c WARNING: kernel-doc '../scripts/kernel-doc -rst -enable-lineno -export ../drivers/pci/htirq.c' failed with return code 2 Fixes: fd2fa6c ("x86/PCI: Remove unused HyperTransport interrupt support") Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Jonathan Corbet <[email protected]>
The chip family of TILEPro and TILE-Gx was developed by Tilera, which was eventually acquired by Mellanox. The tile architecture was added to the kernel in 2010 and first appeared in 2.6.36. Now at Mellanox we are developing new chips based on the ARM64 architecture; our last TILE-Gx chip (the Gx72) was released in 2013, and our customers using tile architecture products are not, as far as we know, looking to upgrade to newer kernel releases. In the absence of someone in the community stepping up to take over maintainership, this commit marks the architecture as orphaned. Cc: Chris Metcalf <[email protected]> Signed-off-by: Chris Metcalf <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
Pull networking fixes from David Miller: 1) Various TCP control block fixes, including one that crashes with SELinux, from David Ahern and Eric Dumazet. 2) Fix ACK generation in rxrpc, from David Howells. 3) ipvlan doesn't set the mark properly in the ipv4 route lookup key, from Gao Feng. 4) SIT configuration doesn't take on the frag_off ipv4 field configuration properly, fix from Hangbin Liu. 5) TSO can fail after device down/up on stmmac, fix from Lars Persson. 6) Various bpftool fixes (mostly in JSON handling) from Quentin Monnet. 7) Various SKB leak fixes in vhost/tun/tap (mostly observed as performance problems). From Wei Xu. 8) mvpps's TX descriptors were not zero initialized, from Yan Markman. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (57 commits) tcp: use IPCB instead of TCP_SKB_CB in inet_exact_dif_match() tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb() rxrpc: Fix the MAINTAINERS record rxrpc: Use correct netns source in rxrpc_release_sock() liquidio: fix incorrect indentation of assignment statement stmmac: reset last TSO segment size after device open ipvlan: Add the skb->mark as flow4's member to lookup route s390/qeth: build max size GSO skbs on L2 devices s390/qeth: fix GSO throughput regression s390/qeth: fix thinko in IPv4 multicast address tracking tap: free skb if flags error tun: free skb in early errors vhost: fix skb leak in handle_rx() bnxt_en: Fix a variable scoping in bnxt_hwrm_do_send_msg() bnxt_en: fix dst/src fid for vxlan encap/decap actions bnxt_en: wildcard smac while creating tunnel decap filter bnxt_en: Need to unconditionally shut down RoCE in bnxt_shutdown phylink: ensure we take the link down when phylink_stop() is called sfp: warn about modules requiring address change sequence sfp: improve RX_LOS handling ...
…t/mst/vhost Pull virtio fixes from Michael Tsirkin: "virtio and qemu bugfixes A couple of bugfixes that just became ready" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_balloon: fix increment of vb->num_pfns in fill_balloon() virtio: release virtio index when fail to device_register fw_cfg: fix driver remove
Pull documentation fixes from Jonathan Corbet: "A handful of documentation fixes. The most significant of these addresses a problem with the new warning mode: it can break the build when confronted with a source file containing malformed kerneldoc comments" * tag 'docs-4.15-fixes' of git://git.lwn.net/linux: Documentation: fix docs build error after source file removed scsi: documentation: Fix case of 'scsi_device' struct mention(s) genericirq.rst: Remove :c:func:`...` in code blocks dmaengine: doc : Fix warning "Title underline too short" while make xmldocs scripts/kernel-doc: Don't fail with status != 0 if error encountered with -none
Add Todd Kjos and myself, remove Riley (who no longer works at Google). Signed-off-by: Martijn Coenen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Geert Uytterhoeven reported a NFS oops, and pointed out that some of the numbers were hashed and useless. We could just turn them from '%p' into '%px', but those numbers are really just legacy, and useless even when not hashed. So just remove them entirely. Reported-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
…/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "Three small fixes for GPIO. Not much, I'm surprised by the silence in my subsystems. All driver fixes: - fix a crash in the 74x164 driver - fix IRQ banks in the DaVinci driver - fix the vendor prefix in the PCA953x driver" * tag 'gpio-v4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: pca953x: fix vendor prefix for PCA9654 gpio: davinci: Assign first bank regs for unbanked case gpio: 74x164: Fix crash during .remove()
…nel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "As with GPIO not much action in pin control. All are driver fixes: - fix the UART2 RTS pin mode on Intel Denverton - fix the direction_output() behaviour on the Armada 37xx - fix the groups selection per-SoC on the Gemini - fix the interrupt pin bank on the Sunxi A80 - fix the UART mux on the Sunxi A64 - disable the strict mode on the Sunxi H5 driver" * tag 'pinctrl-v4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: sunxi: Disable strict mode for H5 driver pinctrl: sunxi: Fix A64 UART mux value pinctrl: sunxi: Fix A80 interrupt pin bank pinctrl: gemini: Fix usage of 3512 groups pinctrl: armada-37xx: Fix direction_output() callback behavior pinctrl: denverton: Fix UART2 RTS pin mode
…/git/gregkh/usb Pull USB fixes from Greg KH: "Here are a few minor USB fixes for 4.15-rc3. The largest here is the Kconfig text and configuration changes for the USB TypeC build options that you reported during the -rc1 merge window. The others are all just small fixes for reported issues, as well as some new device ids. The most "interesting" of anything here is the usbip fixes as it seems lots of people are starting to pay attention to that driver at the moment. These fixes should resolve all of the reported problems as of now. Of course there are the usual xhci and gadget fixes as well, can't go a pull request without those... All of these have been in linux-next for a while with no reported issues" * tag 'usb-4.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (22 commits) usb: xhci: fix panic in xhci_free_virt_devices_depth_first xhci: Don't show incorrect WARN message about events for empty rings usbip: fix usbip attach to find a port that matches the requested speed usbip: Fix USB device hang due to wrong enabling of scatter-gather uas: Always apply US_FL_NO_ATA_1X quirk to Seagate devices usb: quirks: Add no-lpm quirk for KY-688 USB 3.1 Type-C Hub usb: build drivers/usb/common/ when USB_SUPPORT is set usb: hub: Cycle HUB power when initialization fails USB: core: Add type-specific length check of BOS descriptors usb: host: fix incorrect updating of offset USB: ulpi: fix bus-node lookup USB: usbfs: Filter flags passed in from user space usb: add user selectable option for the whole USB Type-C Support usb: f_fs: Force Reserved1=1 in OS_DESC_EXT_COMPAT usb: gadget: core: Fix ->udc_set_speed() speed handling usb: gadget: allow to enable legacy drivers without USB_ETH usb: gadget: udc: renesas_usb3: fix number of the pipes usb: gadget: don't dereference g until after it has been null checked USB: serial: usb_debug: add new USB device id usb: bdc: fix platform_no_drv_owner.cocci warnings ...
…/git/gregkh/tty Pull tty/serial driver fixes from Greg KH: "Here are some small serdev and serial fixes for 4.15-rc3. They resolve some reported problems: - a number of serdev fixes to resolve crashes - MIPS build fixes for their serial port - a new 8250 device id All of these have been in linux-next for a while with no reported issues" * tag 'tty-4.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: MIPS: Add custom serial.h with BASE_BAUD override for generic kernel serdev: ttyport: fix tty locking in close serdev: ttyport: fix NULL-deref on hangup serdev: fix receive_buf return value when no callback serdev: ttyport: add missing receive_buf sanity checks serial: 8250_early: Only set divisor if valid clk & baud serial: 8250_pci: Add Amazon PCI serial device ID
…rnel/git/gregkh/staging Pull staging and iio driver fixes from Greg KH: "Here are a number of small staging and iio driver fixes for reported issues for 4.15-rc3. Nothing major here, the majority is IIO issues, like normal, but there are also some small bugfixes for a few staging drivers as well. Full details are in the shortlog. All of these have been in linux-next for a while with no reported issues" * tag 'staging-4.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: iio: stm32: fix adc/trigger link error iio: health: max30102: Temperature should be in milli Celsius iio: fix kernel-doc build errors iio: adc: meson-saradc: Meson8 and Meson8b do not have REG11 and REG13 iio: adc: meson-saradc: initialize the bandgap correctly on older SoCs iio: adc: meson-saradc: fix the bit_idx of the adc_en clock iio: proximity: sx9500: Assign interrupt from GpioIo() iio: adc: cpcap: fix incorrect validation staging: octeon-usb: use __delay() instead of cvmx_wait() staging: rtl8188eu: Fix incorrect response to SIOCGIWESSID staging: ccree: fix leak of import() after init() staging: comedi: ni_atmio: fix license warning.
…x/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are 3 small fixes for some reported issues: - a debugfs build error that lots of people have reported - a Kconfig help text cleanup now that the firmware is not in the kernel tree - an ISA bus bug fix for a reported issue that has been there since 2.6.18. All of these have been in linux-next with no reported issues" * tag 'driver-core-4.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: firmware: cleanup FIRMWARE_IN_KERNEL message isa: Prevent NULL dereference in isa_bus driver callbacks debugfs: fix debugfs_real_fops() build error
…kernel/git/gregkh/char-misc Pull char/misc fixes from Greg KH: "Here are some small misc driver fixes for 4.15-rc3 to resolve reported issues. Specifically these are: - binder fix for a memory leak - vpd driver fixes for a number of reported problems - hyperv driver fix for memory accesses where it shouldn't be. All of these have been in linux-next for a while. There's also one more MAINTAINERS file update that came in today to get the Android developer's emails correct, which is also in this pull request, that was not in linux-next, but should not be an issue" * tag 'char-misc-4.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: MAINTAINERS: update Android driver maintainers. firmware: vpd: Fix platform driver and device registration/unregistration firmware: vpd: Tie firmware kobject to device lifetime firmware: vpd: Destroy vpd sections in remove function hv: kvp: Avoid reading past allocated blocks from KVP file Drivers: hv: vmbus: Fix a rescind issue ANDROID: binder: fix transaction leak.
…t/rdma/rdma Pull rdma fixes from Jason Gunthorpe: "Here is the first rc pull request for RDMA. This includes an important core fix for a regression in iWarp if SELinux is enabled, a fix for a compilation regression introduced in this merge window, and one obscure kconfig combination that oops's the kernel. For drivers, we have hns fixes needed to make their devices work on certain ARM IOMMU configurations, a stack data leak for hfi1, and various testing discovered -rc bug fixes for i40iw. This cycle we pushed back on the driver maintainers to have better commit messages for -rc material" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/core: Only enforce security for InfiniBand RDMA/hns: Get rid of page operation after dma_alloc_coherent RDMA/hns: Get rid of virt_to_page and vmap calls after dma_alloc_coherent RDMA/hns: Fix the issue of IOVA not page continuous in hip08 IB/core: Init subsys if compiled to vmlinuz-core RDMA/cma: Make sure that PSN is not over max allowed i40iw: Notify user of established connection after QP in RTS i40iw: Move MPA request event for loopback after connect i40iw: Correct ARP index mask i40iw: Do not free sqbuf when event is I40IW_TIMER_TYPE_CLOSE i40iw: Allocate a sdbuf per CQP WQE IB: INFINIBAND should depend on HAS_DMA IB/hfi1: Initialize bth1 in 16B rc ack builder
…it/jejb/scsi Pull SCSI fixes from James Bottomley: "A bunch of fixes for aacraid, a set of coherency fixes that only affect non-coherent platforms and one coccinelle detected null check after use" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: libsas: align sata_device's rps_resp on a cacheline scsi: use dma_get_cache_alignment() as minimum DMA alignment scsi: dma-mapping: always provide dma_get_cache_alignment scsi: ufs: ufshcd: fix potential NULL pointer dereference in ufshcd_config_vreg scsi: aacraid: Prevent crash in case of free interrupt during scsi EH path scsi: aacraid: Perform initialization reset only once scsi: aacraid: Check for PCI state of device in a generic way
The refcount_t protection on x86 was not intended to use the stricter GPL export. This adjusts the linkage again to avoid a regression in the availability of the refcount API. Reported-by: Dave Airlie <[email protected]> Fixes: 7a46ec0 ("locking/refcounts, x86/asm: Implement fast refcount overflow protection") Cc: [email protected] Signed-off-by: Kees Cook <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
Things like this will probably keep showing up for other architectures and other special cases. I actually thought we already used %lx for this, and that is indeed _historically_ the case, but we moved to %p when merging the 32-bit and 64-bit cases as a convenient way to get the formatting right (ie automatically picking "%08lx" vs "%016lx" based on register size). So just turn this %p into %px. Reported-by: Sergey Senozhatsky <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
Esteban-Rocha
pushed a commit
that referenced
this pull request
Feb 8, 2018
Scenario: 1. Port down and do fail over 2. Ap do rds_bind syscall PID: 47039 TASK: ffff89887e2fe640 CPU: 47 COMMAND: "kworker/u:6" #0 [ffff898e35f159f0] machine_kexec at ffffffff8103abf9 #1 [ffff898e35f15a60] crash_kexec at ffffffff810b96e3 #2 [ffff898e35f15b30] oops_end at ffffffff8150f518 #3 [ffff898e35f15b60] no_context at ffffffff8104854c #4 [ffff898e35f15ba0] __bad_area_nosemaphore at ffffffff81048675 #5 [ffff898e35f15bf0] bad_area_nosemaphore at ffffffff810487d3 #6 [ffff898e35f15c00] do_page_fault at ffffffff815120b8 #7 [ffff898e35f15d10] page_fault at ffffffff8150ea95 [exception RIP: unknown or invalid address] RIP: 0000000000000000 RSP: ffff898e35f15dc8 RFLAGS: 00010282 RAX: 00000000fffffffe RBX: ffff889b77f6fc00 RCX:ffffffff81c99d88 RDX: 0000000000000000 RSI: ffff896019ee08e8 RDI:ffff889b77f6fc00 RBP: ffff898e35f15df0 R8: ffff896019ee08c8 R9:0000000000000000 R10: 0000000000000400 R11: 0000000000000000 R12:ffff896019ee08c0 R13: ffff889b77f6fe68 R14: ffffffff81c99d80 R15: ffffffffa022a1e0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffff898e35f15dc8] cma_ndev_work_handler at ffffffffa022a228 [rdma_cm] #9 [ffff898e35f15df8] process_one_work at ffffffff8108a7c6 #10 [ffff898e35f15e58] worker_thread at ffffffff8108bda0 #11 [ffff898e35f15ee8] kthread at ffffffff81090fe6 PID: 45659 TASK: ffff880d313d2500 CPU: 31 COMMAND: "oracle_45659_ap" #0 [ffff881024ccfc98] __schedule at ffffffff8150bac4 #1 [ffff881024ccfd40] schedule at ffffffff8150c2cf #2 [ffff881024ccfd50] __mutex_lock_slowpath at ffffffff8150cee7 #3 [ffff881024ccfdc0] mutex_lock at ffffffff8150cdeb #4 [ffff881024ccfde0] rdma_destroy_id at ffffffffa022a027 [rdma_cm] #5 [ffff881024ccfe10] rds_ib_laddr_check at ffffffffa0357857 [rds_rdma] #6 [ffff881024ccfe50] rds_trans_get_preferred at ffffffffa0324c2a [rds] #7 [ffff881024ccfe80] rds_bind at ffffffffa031d690 [rds] #8 [ffff881024ccfeb0] sys_bind at ffffffff8142a670 PID: 45659 PID: 47039 rds_ib_laddr_check /* create id_priv with a null event_handler */ rdma_create_id rdma_bind_addr cma_acquire_dev /* add id_priv to cma_dev->id_list */ cma_attach_to_dev cma_ndev_work_handler /* event_hanlder is null */ id_priv->id.event_handler Signed-off-by: Guanglei Li <[email protected]> Signed-off-by: Honglei Wang <[email protected]> Reviewed-by: Junxiao Bi <[email protected]> Reviewed-by: Yanjun Zhu <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Acked-by: Santosh Shilimkar <[email protected]> Acked-by: Doug Ledford <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Esteban-Rocha
pushed a commit
that referenced
this pull request
Mar 3, 2018
It was reported by Sergey Senozhatsky that if THP (Transparent Huge Page) and frontswap (via zswap) are both enabled, when memory goes low so that swap is triggered, segfault and memory corruption will occur in random user space applications as follow, kernel: urxvt[338]: segfault at 20 ip 00007fc08889ae0d sp 00007ffc73a7fc40 error 6 in libc-2.26.so[7fc08881a000+1ae000] #0 0x00007fc08889ae0d _int_malloc (libc.so.6) #1 0x00007fc08889c2f3 malloc (libc.so.6) #2 0x0000560e6004bff7 _Z14rxvt_wcstoutf8PKwi (urxvt) #3 0x0000560e6005e75c n/a (urxvt) #4 0x0000560e6007d9f1 _ZN16rxvt_perl_interp6invokeEP9rxvt_term9hook_typez (urxvt) #5 0x0000560e6003d988 _ZN9rxvt_term9cmd_parseEv (urxvt) #6 0x0000560e60042804 _ZN9rxvt_term6pty_cbERN2ev2ioEi (urxvt) #7 0x0000560e6005c10f _Z17ev_invoke_pendingv (urxvt) #8 0x0000560e6005cb55 ev_run (urxvt) #9 0x0000560e6003b9b9 main (urxvt) #10 0x00007fc08883af4a __libc_start_main (libc.so.6) #11 0x0000560e6003f9da _start (urxvt) After bisection, it was found the first bad commit is bd4c82c ("mm, THP, swap: delay splitting THP after swapped out"). The root cause is as follows: When the pages are written to swap device during swapping out in swap_writepage(), zswap (fontswap) is tried to compress the pages to improve performance. But zswap (frontswap) will treat THP as a normal page, so only the head page is saved. After swapping in, tail pages will not be restored to their original contents, causing memory corruption in the applications. This is fixed by refusing to save page in the frontswap store functions if the page is a THP. So that the THP will be swapped out to swap device. Another choice is to split THP if frontswap is enabled. But it is found that the frontswap enabling isn't flexible. For example, if CONFIG_ZSWAP=y (cannot be module), frontswap will be enabled even if zswap itself isn't enabled. Frontswap has multiple backends, to make it easy for one backend to enable THP support, the THP checking is put in backend frontswap store functions instead of the general interfaces. Link: http://lkml.kernel.org/r/[email protected] Fixes: bd4c82c ("mm, THP, swap: delay splitting THP after swapped out") Signed-off-by: "Huang, Ying" <[email protected]> Reported-by: Sergey Senozhatsky <[email protected]> Tested-by: Sergey Senozhatsky <[email protected]> Suggested-by: Minchan Kim <[email protected]> [put THP checking in backend] Cc: Konrad Rzeszutek Wilk <[email protected]> Cc: Dan Streetman <[email protected]> Cc: Seth Jennings <[email protected]> Cc: Tetsuo Handa <[email protected]> Cc: Shaohua Li <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Juergen Gross <[email protected]> Cc: <[email protected]> [4.14] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
Esteban-Rocha
pushed a commit
that referenced
this pull request
Mar 18, 2018
Currently we can crash perf record when running in pipe mode, like: $ perf record ls | perf report # To display the perf.data header info, please use --header/--header-only options. # perf: Segmentation fault Error: The - file has no samples! The callstack of the crash is: 0x0000000000515242 in perf_event__synthesize_event_update_name 3513 ev = event_update_event__new(len + 1, PERF_EVENT_UPDATE__NAME, evsel->id[0]); (gdb) bt #0 0x0000000000515242 in perf_event__synthesize_event_update_name #1 0x00000000005158a4 in perf_event__synthesize_extra_attr #2 0x0000000000443347 in record__synthesize #3 0x00000000004438e3 in __cmd_record #4 0x000000000044514e in cmd_record #5 0x00000000004cbc95 in run_builtin #6 0x00000000004cbf02 in handle_internal_command #7 0x00000000004cc054 in run_argv #8 0x00000000004cc422 in main The reason of the crash is that the evsel does not have ids array allocated and the pipe's synthesize code tries to access it. We don't force evsel ids allocation when we have single event, because it's not needed. However we need it when we are in pipe mode even for single event as a key for evsel update event. Fixing this by forcing evsel ids allocation event for single event, when we are in pipe mode. Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Esteban-Rocha
pushed a commit
that referenced
this pull request
Apr 22, 2018
Patch series "kexec_file, x86, powerpc: refactoring for other architecutres", v2. This is a preparatory patchset for adding kexec_file support on arm64. It was originally included in a arm64 patch set[1], but Philipp is also working on their kexec_file support on s390[2] and some changes are now conflicting. So these common parts were extracted and put into a separate patch set for better integration. What's more, my original patch#4 was split into a few small chunks for easier review after Dave's comment. As such, the resulting code is basically identical with my original, and the only *visible* differences are: - renaming of _kexec_kernel_image_probe() and _kimage_file_post_load_cleanup() - change one of types of arguments at prepare_elf64_headers() Those, unfortunately, require a couple of trivial changes on the rest (#1, #6 to #13) of my arm64 kexec_file patch set[1]. Patch #1 allows making a use of purgatory optional, particularly useful for arm64. Patch #2 commonalizes arch_kexec_kernel_{image_probe, image_load, verify_sig}() and arch_kimage_file_post_load_cleanup() across architectures. Patches #3-#7 are also intended to generalize parse_elf64_headers(), along with exclude_mem_range(), to be made best re-use of. [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-February/561182.html [2] http://lkml.iu.edu//hypermail/linux/kernel/1802.1/02596.html This patch (of 7): On arm64, crash dump kernel's usable memory is protected by *unmapping* it from kernel virtual space unlike other architectures where the region is just made read-only. It is highly unlikely that the region is accidentally corrupted and this observation rationalizes that digest check code can also be dropped from purgatory. The resulting code is so simple as it doesn't require a bit ugly re-linking/relocation stuff, i.e. arch_kexec_apply_relocations_add(). Please see: http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html All that the purgatory does is to shuffle arguments and jump into a new kernel, while we still need to have some space for a hash value (purgatory_sha256_digest) which is never checked against. As such, it doesn't make sense to have trampline code between old kernel and new kernel on arm64. This patch introduces a new configuration, ARCH_HAS_KEXEC_PURGATORY, and allows related code to be compiled in only if necessary. [[email protected]: fix trivial screwup] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: AKASHI Takahiro <[email protected]> Acked-by: Dave Young <[email protected]> Tested-by: Dave Young <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Baoquan He <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
Esteban-Rocha
pushed a commit
that referenced
this pull request
May 6, 2018
When the module is removed the led workqueue is destroyed in the remove callback, before the led device is unregistered from the led subsystem. This leads to a NULL pointer derefence when the led device is unregistered automatically later as part of the module removal cleanup. Bellow is the backtrace showing the problem. BUG: unable to handle kernel NULL pointer dereference at (null) IP: __queue_work+0x8c/0x410 PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI Modules linked in: ccm edac_mce_amd kvm_amd kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 joydev crypto_simd asus_nb_wmi glue_helper uvcvideo snd_hda_codec_conexant snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel asus_wmi snd_hda_codec cryptd snd_hda_core sparse_keymap videobuf2_vmalloc arc4 videobuf2_memops snd_hwdep input_leds videobuf2_v4l2 ath9k psmouse videobuf2_core videodev ath9k_common snd_pcm ath9k_hw media fam15h_power ath k10temp snd_timer mac80211 i2c_piix4 r8169 mii mac_hid cfg80211 asus_wireless(-) snd soundcore wmi shpchp 8250_dw ip_tables x_tables amdkfd amd_iommu_v2 amdgpu radeon chash i2c_algo_bit drm_kms_helper syscopyarea serio_raw sysfillrect sysimgblt fb_sys_fops ahci ttm libahci drm video CPU: 3 PID: 2177 Comm: rmmod Not tainted 4.15.0-5-generic #6+dev94.b4287e5bem1-Endless Hardware name: ASUSTeK COMPUTER INC. X555DG/X555DG, BIOS 5.011 05/05/2015 RIP: 0010:__queue_work+0x8c/0x410 RSP: 0018:ffffbe8cc249fcd8 EFLAGS: 00010086 RAX: ffff992ac6810800 RBX: 0000000000000000 RCX: 0000000000000008 RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff992ac6400e18 RBP: ffffbe8cc249fd18 R08: ffff992ac6400db0 R09: 0000000000000000 R10: 0000000000000040 R11: ffff992ac6400dd8 R12: 0000000000002000 R13: ffff992abd762e00 R14: ffff992abd763e38 R15: 000000000001ebe0 FS: 00007f318203e700(0000) GS:ffff992aced80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000001c720e000 CR4: 00000000001406e0 Call Trace: queue_work_on+0x38/0x40 led_state_set+0x2c/0x40 [asus_wireless] led_set_brightness_nopm+0x14/0x40 led_set_brightness+0x37/0x60 led_trigger_set+0xfc/0x1d0 led_classdev_unregister+0x32/0xd0 devm_led_classdev_release+0x11/0x20 release_nodes+0x109/0x1f0 devres_release_all+0x3c/0x50 device_release_driver_internal+0x16d/0x220 driver_detach+0x3f/0x80 bus_remove_driver+0x55/0xd0 driver_unregister+0x2c/0x40 acpi_bus_unregister_driver+0x15/0x20 asus_wireless_driver_exit+0x10/0xb7c [asus_wireless] SyS_delete_module+0x1da/0x2b0 entry_SYSCALL_64_fastpath+0x24/0x87 RIP: 0033:0x7f3181b65fd7 RSP: 002b:00007ffe74bcbe18 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3181b65fd7 RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000555ea2559258 RBP: 0000555ea25591f0 R08: 00007ffe74bcad91 R09: 000000000000000a R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000003 R13: 00007ffe74bcae00 R14: 0000000000000000 R15: 0000555ea25591f0 Code: 01 00 00 02 0f 85 7d 01 00 00 48 63 45 d4 48 c7 c6 00 f4 fa 87 49 8b 9d 08 01 00 00 48 03 1c c6 4c 89 f7 e8 87 fb ff ff 48 85 c0 <48> 8b 3b 0f 84 c5 01 00 00 48 39 f8 0f 84 bc 01 00 00 48 89 c7 RIP: __queue_work+0x8c/0x410 RSP: ffffbe8cc249fcd8 CR2: 0000000000000000 ---[ end trace 7aa4f4a232e9c39c ]--- Unregistering the led device on the remove callback before destroying the workqueue avoids this problem. https://bugzilla.kernel.org/show_bug.cgi?id=196097 Reported-by: Dun Hum <[email protected]> Cc: [email protected] Signed-off-by: João Paulo Rechi Vita <[email protected]> Signed-off-by: Darren Hart (VMware) <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.