Problems in CentOS 8 with netflow. CPU stack #145

Closed
jamezbond13 opened this issue Aug 6, 2020 · 7 comments

@jamezbond13

For a couple of days netflow worked with no problems. But today the server got stuck and I couldn't even log in, so I had to hard-reset it.
In the log I saw a message about a stuck CPU:
Aug 6 10:24:42 uz kernel: watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [kworker/6:2:42513]
Aug 6 10:24:42 uz kernel: Modules linked in: nft_chain_nat_ipv4 ipt_MASQUERADE nf_nat_ipv4 xt_nat nf_nat nf_log_ipv4 nf_log_common xt_LOG xt_pkttype xt_state xt_conntrack ipt_NETFLOW(OE) nft_chain_route_ipv4 nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 nft_counter xt_CT nf_conntrack nft_compat nf_tables nfnetlink nct6775 hwmon_vid jc42 sunrpc ext4 mbcache jbd2 intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass iTCO_wdt crct10dif_pclmul iTCO_vendor_support crc32_pclmul ppdev ghash_clmulni_intel intel_cstate intel_uncore ipmi_ssif parport_pc pcspkr intel_rapl_perf parport i2c_i801 ipmi_si ie31200_edac lpc_ich ipmi_devintf ipmi_msghandler ip_tables xfs libcrc32c sr_mod sd_mod cdrom sg ata_generic ast i2c_algo_bit drm_vram_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm crc32c_intel ata_piix libata e1000e serio_raw
Aug 6 10:24:42 uz kernel: CPU: 6 PID: 42513 Comm: kworker/6:2 Kdump: loaded Tainted: G OE --------- - - 4.18.0-193.14.2.el8_2.x86_64 #1
Aug 6 10:24:42 uz kernel: Hardware name: ASUSTek Computer INC. RS300-E7-PS4/P8B-E Series, BIOS 1101 08/25/2011
Aug 6 10:24:42 uz kernel: Workqueue: events netflow_work_fn [ipt_NETFLOW]
Aug 6 10:24:42 uz kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x5b/0x1d0
Aug 6 10:24:42 uz kernel: Code: 6d f0 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 47 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 b8 01 00 00 00 66 89 07 c3 8b 37 81 fe 00 01 00
Aug 6 10:24:42 uz kernel: RSP: 0018:ffff8c65d7b838c0 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
Aug 6 10:24:42 uz kernel: RAX: 0000000000200101 RBX: 000000000002f22a RCX: ffff8c65d7b83968
Aug 6 10:24:42 uz kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffc0b73540
Aug 6 10:24:42 uz kernel: RBP: ffff8c65d7b839d8 R08: 0000000000000000 R09: ffff8c65d2ad60e0
Aug 6 10:24:42 uz kernel: R10: 000000000002c240 R11: 0000000000000001 R12: 000000000000007e
Aug 6 10:24:42 uz kernel: R13: ffff8c65d7b83940 R14: 000000000000002a R15: ffff8c64c2573800
Aug 6 10:24:42 uz kernel: FS: 0000000000000000(0000) GS:ffff8c65d7b80000(0000) knlGS:0000000000000000
Aug 6 10:24:42 uz kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 6 10:24:42 uz kernel: CR2: 00007fda1291f000 CR3: 000000006f80a003 CR4: 00000000000606e0
Aug 6 10:24:42 uz kernel: Call Trace:
Aug 6 10:24:42 uz kernel: <IRQ>
Aug 6 10:24:42 uz kernel: _raw_spin_lock+0x1c/0x20
Aug 6 10:24:42 uz kernel: netflow_target+0x246/0x1040 [ipt_NETFLOW]
Aug 6 10:24:42 uz kernel: ? ___slab_alloc+0x26a/0x4e0
Aug 6 10:24:42 uz kernel: nft_target_eval_xt+0x35/0x50 [nft_compat]
Aug 6 10:24:42 uz kernel: nft_do_chain+0xce/0x3f0 [nf_tables]
Aug 6 10:24:42 uz kernel: ? nft_do_chain+0x3d7/0x3f0 [nf_tables]
Aug 6 10:24:42 uz kernel: ? fib_validate_source+0x9b/0xf0
Aug 6 10:24:42 uz kernel: ? ip_route_input_slow+0x403/0xb10
Aug 6 10:24:42 uz kernel: ? selinux_peerlbl_enabled+0x19/0x30
Aug 6 10:24:42 uz kernel: ? selinux_ip_forward+0x9c/0x1d0
Aug 6 10:24:42 uz kernel: nft_do_chain_ipv4+0x66/0x80 [nf_tables]
Aug 6 10:24:42 uz kernel: nf_hook_slow+0x44/0xc0
Aug 6 10:24:42 uz kernel: ip_forward+0x437/0x460
Aug 6 10:24:42 uz kernel: ? ip_defrag.cold.13+0x33/0x33
Aug 6 10:24:42 uz kernel: ip_rcv+0x273/0x362
Aug 6 10:24:42 uz kernel: ? inet_add_protocol.cold.1+0x1e/0x1e
Aug 6 10:24:42 uz kernel: __netif_receive_skb_core+0xb35/0xc30
Aug 6 10:24:42 uz kernel: ? inet_gro_receive+0x2b3/0x2d0
Aug 6 10:24:42 uz kernel: ? recalibrate_cpu_khz+0x10/0x10
Aug 6 10:24:42 uz kernel: netif_receive_skb_internal+0x3d/0xb0
Aug 6 10:24:42 uz kernel: napi_gro_receive+0xba/0xe0
Aug 6 10:24:42 uz kernel: e1000_clean_rx_irq+0x199/0x440 [e1000e]
Aug 6 10:24:42 uz kernel: e1000e_poll+0xb9/0x290 [e1000e]
Aug 6 10:24:42 uz kernel: net_rx_action+0x149/0x3b0
Aug 6 10:24:42 uz kernel: __do_softirq+0xe3/0x30a
Aug 6 10:24:42 uz kernel: irq_exit+0x100/0x110
Aug 6 10:24:42 uz kernel: do_IRQ+0x7f/0xe0
Aug 6 10:24:42 uz kernel: common_interrupt+0xf/0xf
Aug 6 10:24:42 uz kernel: </IRQ>
Aug 6 10:24:42 uz kernel: RIP: 0010:memcmp+0xb/0x40
Aug 6 10:24:42 uz kernel: Code: 07 0f b6 0f 39 f1 74 0a 48 83 c7 01 48 39 f8 75 f0 c3 48 89 f8 c3 66 0f 1f 84 00 00 00 00 00 48 85 d2 74 2e 0f b6 07 0f b6 0e <29> c8 75 23 b9 01 00 00 00 eb 13 44 0f b6 04 0f 44 0f b6 0c 0e 48
Aug 6 10:24:42 uz kernel: RSP: 0018:ffff9e2481277728 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffda
Aug 6 10:24:42 uz kernel: RAX: 000000000000007f RBX: 000000000002c240 RCX: 000000000000007f
Aug 6 10:24:42 uz kernel: RDX: 0000000000000029 RSI: ffff8c659a982c70 RDI: ffff9e24812777a0
Aug 6 10:24:42 uz kernel: RBP: ffff9e2481277838 R08: ffff8c65a6465800 R09: ffff8c65b194b402
Aug 6 10:24:42 uz kernel: R10: 000000000002c240 R11: 000000000000002c R12: ffff8c659a982c60
Aug 6 10:24:42 uz kernel: R13: ffff9e24812777a0 R14: 000000000000002a R15: ffff8c65a6465800
Aug 6 10:24:42 uz kernel: netflow_target+0x27f/0x1040 [ipt_NETFLOW]
Aug 6 10:24:42 uz kernel: ? __alloc_skb+0x82/0x1c0
Aug 6 10:24:42 uz kernel: nft_target_eval_xt+0x35/0x50 [nft_compat]
Aug 6 10:24:42 uz kernel: nft_do_chain+0xce/0x3f0 [nf_tables]
Aug 6 10:24:42 uz kernel: ? __kmalloc_node_track_caller+0x1c3/0x290
Aug 6 10:24:42 uz kernel: ? __alloc_skb+0x82/0x1c0
Aug 6 10:24:42 uz kernel: ? __kmalloc_reserve.isra.54+0x2e/0x80
Aug 6 10:24:42 uz kernel: ? __alloc_skb+0x96/0x1c0
Aug 6 10:24:42 uz kernel: ? finish_wait+0x80/0x80
Aug 6 10:24:42 uz kernel: ? alloc_skb_with_frags+0x50/0x1b0
Aug 6 10:24:42 uz kernel: ? nf_ct_get_tuple+0x61/0xa0 [nf_conntrack]
Aug 6 10:24:42 uz kernel: nft_do_chain_ipv4+0x66/0x80 [nf_tables]
Aug 6 10:24:42 uz kernel: nf_hook_slow+0x44/0xc0
Aug 6 10:24:42 uz kernel: ? __ip_select_ident+0x3f/0x70
Aug 6 10:24:42 uz kernel: __ip_local_out+0xf2/0x150
Aug 6 10:24:42 uz kernel: ? ip_forward_options.cold.7+0x27/0x27
Aug 6 10:24:42 uz kernel: ip_local_out+0x17/0x40
Aug 6 10:24:42 uz kernel: ip_send_skb+0x15/0x40
Aug 6 10:24:42 uz kernel: udp_send_skb.isra.45+0x155/0x360
Aug 6 10:24:42 uz kernel: udp_sendmsg+0xac2/0xd30
Aug 6 10:24:42 uz kernel: ? sock_sendmsg+0x3e/0x50
Aug 6 10:24:42 uz kernel: sock_sendmsg+0x3e/0x50
Aug 6 10:24:42 uz kernel: netflow_sendmsg+0xa3/0x2b0 [ipt_NETFLOW]
Aug 6 10:24:42 uz kernel: netflow_export_pdu_v5+0xb5/0x110 [ipt_NETFLOW]
Aug 6 10:24:42 uz kernel: netflow_scan_and_export+0x2bd/0x6a0 [ipt_NETFLOW]
Aug 6 10:24:42 uz kernel: netflow_work_fn+0x45/0x100 [ipt_NETFLOW]
Aug 6 10:24:42 uz kernel: process_one_work+0x1a7/0x3b0
Aug 6 10:24:42 uz kernel: worker_thread+0x30/0x390
Aug 6 10:24:42 uz kernel: ? create_worker+0x1a0/0x1a0
Aug 6 10:24:42 uz kernel: kthread+0x112/0x130
Aug 6 10:24:42 uz kernel: ? kthread_flush_work_fn+0x10/0x10
Aug 6 10:24:42 uz kernel: ret_from_fork+0x35/0x40
Aug 6 10:24:42 uz kernel: watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [swapper/7:0]
Aug 6 10:24:42 uz kernel: Modules linked in: nft_chain_nat_ipv4 ipt_MASQUERADE nf_nat_ipv4 xt_nat nf_nat nf_log_ipv4 nf_log_common xt_LOG xt_pkttype xt_state xt_conntrack ipt_NETFLOW(OE) nft_chain_route_ipv4 nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 nft_counter xt_CT nf_conntrack nft_compat nf_tables nfnetlink nct6775 hwmon_vid jc42 sunrpc ext4 mbcache jbd2 intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass iTCO_wdt crct10dif_pclmul iTCO_vendor_support crc32_pclmul ppdev ghash_clmulni_intel intel_cstate intel_uncore ipmi_ssif parport_pc pcspkr intel_rapl_perf parport i2c_i801 ipmi_si ie31200_edac lpc_ich ipmi_devintf ipmi_msghandler ip_tables xfs libcrc32c sr_mod sd_mod cdrom sg ata_generic ast i2c_algo_bit drm_vram_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm crc32c_intel ata_piix libata e1000e serio_raw
Aug 6 10:24:42 uz kernel: CPU: 7 PID: 0 Comm: swapper/7 Kdump: loaded Tainted: G OEL --------- - - 4.18.0-193.14.2.el8_2.x86_64 #1
Aug 6 10:24:42 uz kernel: Hardware name: ASUSTek Computer INC. RS300-E7-PS4/P8B-E Series, BIOS 1101 08/25/2011
Aug 6 10:24:42 uz kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x114/0x1d0
Aug 6 10:24:42 uz kernel: Code: 00 00 f0 44 0f b1 07 85 c0 74 47 83 c6 01 c1 e1 10 c1 e6 12 09 ce 89 f0 c1 e8 10 66 87 47 02 89 c1 c1 e1 10 75 55 31 c9 eb 02 90 8b 07 66 85 c0 75 f7 41 89 c0 66 45 31 c0 44 39 c6 0f 84 80
Aug 6 10:24:42 uz kernel: RSP: 0018:ffff8c65d7bc38d0 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
Aug 6 10:24:42 uz kernel: RAX: 0000000000200101 RBX: 000000000004ca2a RCX: 0000000000000000
Aug 6 10:24:42 uz kernel: RDX: ffff8c65d7bea980 RSI: 0000000000200000 RDI: ffffffffc0b73540
Aug 6 10:24:42 uz kernel: RBP: ffff8c65d7bc39e8 R08: 0000000000000000 R09: ffff8c65c34c0498
Aug 6 10:24:42 uz kernel: R10: 000000000002c240 R11: 0000000000034650 R12: 000000000000007e
Aug 6 10:24:42 uz kernel: R13: ffff8c65d7bc3950 R14: 000000000000002a R15: ffff8c64c9973d00
Aug 6 10:24:42 uz kernel: FS: 0000000000000000(0000) GS:ffff8c65d7bc0000(0000) knlGS:0000000000000000
Aug 6 10:24:42 uz kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 6 10:24:42 uz kernel: CR2: 00007fda12920000 CR3: 000000006f80a003 CR4: 00000000000606e0
Aug 6 10:24:42 uz kernel: Call Trace:
Aug 6 10:24:42 uz kernel: <IRQ>
Aug 6 10:24:42 uz kernel: _raw_spin_lock+0x1c/0x20
Aug 6 10:24:42 uz kernel: netflow_target+0x246/0x1040 [ipt_NETFLOW]
Aug 6 10:24:42 uz kernel: ? check_preempt_curr+0x10/0x90
Aug 6 10:24:42 uz kernel: ? try_to_wake_up+0x54/0x570
Aug 6 10:24:42 uz kernel: ? __wake_up_common+0x7a/0x190
Aug 6 10:24:42 uz kernel: ? nft_meta_get_eval+0x360/0x6a0 [nf_tables]
Aug 6 10:24:42 uz kernel: nft_target_eval_xt+0x35/0x50 [nft_compat]
Aug 6 10:24:42 uz kernel: nft_do_chain+0xce/0x3f0 [nf_tables]
Aug 6 10:24:42 uz kernel: ? __fib_validate_source+0x1e1/0x410
Aug 6 10:24:42 uz kernel: nft_do_chain_ipv4+0x66/0x80 [nf_tables]
Aug 6 10:24:42 uz kernel: ? xt_check_match+0x181/0x1b0
Aug 6 10:24:42 uz kernel: nf_hook_slow+0x44/0xc0
Aug 6 10:24:42 uz kernel: ip_local_deliver+0xcc/0xe0
Aug 6 10:24:42 uz kernel: ? ip_rcv_finish+0x410/0x410
Aug 6 10:24:42 uz kernel: ip_rcv+0x273/0x362
Aug 6 10:24:42 uz kernel: ? inet_add_protocol.cold.1+0x1e/0x1e
Aug 6 10:24:42 uz kernel: __netif_receive_skb_core+0xb35/0xc30
Aug 6 10:24:42 uz kernel: ? inet_gro_receive+0x2b3/0x2d0
Aug 6 10:24:42 uz kernel: ? recalibrate_cpu_khz+0x10/0x10
Aug 6 10:24:42 uz kernel: netif_receive_skb_internal+0x3d/0xb0
Aug 6 10:24:42 uz kernel: napi_gro_receive+0xba/0xe0
Aug 6 10:24:42 uz kernel: e1000_clean_rx_irq+0x199/0x440 [e1000e]
Aug 6 10:24:42 uz kernel: e1000e_poll+0xb9/0x290 [e1000e]
Aug 6 10:24:42 uz kernel: net_rx_action+0x149/0x3b0
Aug 6 10:24:42 uz kernel: __do_softirq+0xe3/0x30a
Aug 6 10:24:42 uz kernel: irq_exit+0x100/0x110
Aug 6 10:24:42 uz kernel: do_IRQ+0x7f/0xe0
Aug 6 10:24:42 uz kernel: common_interrupt+0xf/0xf
Aug 6 10:24:42 uz kernel: </IRQ>
Aug 6 10:24:42 uz kernel: RIP: 0010:cpuidle_enter_state+0xb9/0x420
Aug 6 10:24:42 uz kernel: Code: 90 31 ff e8 89 ef a3 ff 80 7c 24 13 00 74 17 9c 58 66 66 90 66 90 f6 c4 02 0f 85 37 03 00 00 31 ff e8 2b fd a9 ff fb 66 66 90 <66> 66 90 45 85 e4 0f 88 6c 02 00 00 49 63 cc 4c 8b 3c 24 4c 2b 7c
and so on...

Here is my cat output:
[root@uz var]# cat /proc/net/stat/ipt_netflow
ipt_NETFLOW 2.5, srcversion 8590C3A2CCE3E3ADD23B5FA; llist
Protocol version 5 (netflow)
Timeouts: active 1800s, inactive 15s. Maxflows 2000000
Flows: active 3079 (peak 3741 reached 0d4h11m ago), mem 5552K, worker delay 100/1000 [1..100] (73 ms, 0 us, 281:0 0 [cpu7]).
Hash: size 655360 (mem 5120K), metric 1.00 [1.00, 1.00, 1.00]. InHash: 2221566 pkt, 2091822 K, InPDU 0, 0.
Rate: 49911826 bits/sec, 6567 packets/sec; Avg 1 min: 43999582 bps, 6084 pps; 5 min: 44104310 bps, 5992 pps
cpu# pps; <search found new [metric], trunc frag alloc maxflows>, traffic: <pkt, bytes>, drop: <pkt, bytes>
Total 6566; 130250 59812244 2208679 [1.00], 0 0 0 0, traffic: 62020923, 51859 MB, drop: 0, 0 K
cpu0 0; 18 9477 4040 [1.00], 0 0 0 0, traffic: 13517, 8 MB, drop: 0, 0 K
cpu1 1; 11 9560 3329 [1.00], 0 0 0 0, traffic: 12889, 7 MB, drop: 0, 0 K
cpu2 3511; 8846 32398852 138259 [1.00], 0 0 0 0, traffic: 32537111, 41822 MB, drop: 0, 0 K
cpu3 0; 29 6254 5237 [1.00], 0 0 0 0, traffic: 11491, 5 MB, drop: 0, 0 K
cpu4 0; 10 7182 4228 [1.00], 0 0 0 0, traffic: 11410, 10 MB, drop: 0, 0 K
cpu5 1; 16 4038 3528 [1.00], 0 0 0 0, traffic: 7566, 3 MB, drop: 0, 0 K
cpu6 0; 20 11489 5685 [1.00], 0 0 0 0, traffic: 17174, 40 MB, drop: 0, 0 K
cpu7 3053; 121300 27365392 2044373 [1.00], 0 0 0 0, traffic: 29409765, 9961 MB, drop: 0, 0 K
Export: Rate 8052 bytes/s; Total 73520 pkts, 102 MB, 2205600 flows; Errors 0 pkts; Traffic lost 0 pkts, 0 Kbytes, 0 flows.
sock0: 127.0.0.1:2055, sndbuf 212992, filled 1, peak 1; err: sndbuf reached 0, connect 0, cberr 1, other 0

[root@uz var]# sysctl net.netflow
net.netflow.active_timeout = 1800
net.netflow.debug = 0
net.netflow.destination = 127.0.0.1:2055
net.netflow.flush = 0
net.netflow.hashsize = 655360
net.netflow.inactive_timeout = 15
net.netflow.maxflows = 2000000
net.netflow.protocol = 5
net.netflow.refresh-rate = 20
net.netflow.scan-min = 1
net.netflow.sndbuf = 212992
net.netflow.timeout-rate = 30
[root@uz var]# sysctl -a | grep net.netflow
net.netflow.active_timeout = 1800
net.netflow.debug = 0
net.netflow.destination = 127.0.0.1:2055
net.netflow.flush = 0
net.netflow.hashsize = 655360
net.netflow.inactive_timeout = 15
net.netflow.maxflows = 2000000
net.netflow.protocol = 5
net.netflow.refresh-rate = 20
net.netflow.scan-min = 1
net.netflow.sndbuf = 212992
net.netflow.timeout-rate = 30

Is it a bug or what? Maybe I need to increase or decrease hashsize?

@aabc
Owner

aabc commented Aug 7, 2020

This looks related to #141 (sorry, non-English discussion). It seems the cause is that the iptables target is called not via the xtables kernel subsystem but via the new nftables subsystem. I was unable to reproduce it in #141 (it was reported to happen on Debian). I may try again on CentOS 8 some time.

If you have a test environment, you can help by running the -debug kernel (from the CentOS repo) and then posting the more detailed trace it reports on the lockup.

@aabc aabc self-assigned this Aug 7, 2020
@jamezbond13
Author

jamezbond13 commented Aug 7, 2020

Honestly, enabling debug right now would be close to fatal =(
There are a lot of services running on the server and a lot of users. Is there perhaps some other solution? On CentOS 7 I installed it and everything worked flawlessly. Here, yes, it looks like nftables is used instead of iptables.
I would be very grateful if you managed to reproduce the conditions and the situation on CentOS 8. I can no longer downgrade to 7.

aabc added a commit that referenced this issue Aug 9, 2020
This appeared on kernels where xtables target is called through nftables.

Lockdep message: WARNING: inconsistent lock state

  [  292.286849] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
  [  292.287776] kworker/0:0/5 [HC0[0]:SC1[1]:HE1:SE0] takes:
  [  292.288618] ffffffffc0de6a98 (&(&htable_stripes[i].lock)->rlock){+.?.}, at: netflow_target+0x604/0x4150 [ipt_NETFLOW]
  [  292.290323] {SOFTIRQ-ON-W} state was registered at:
  [  292.291095]   lock_acquire+0x14f/0x3b0
  [  292.291679]   _raw_spin_lock+0x30/0x70
  [  292.292276]   netflow_target+0x604/0x4150 [ipt_NETFLOW]
  [  292.293125]   nft_target_eval_xt+0x11a/0x220 [nft_compat]
  [  292.294027]   nft_do_chain+0x25a/0x10e0 [nf_tables]
  [  292.294813]   nft_do_chain_ipv4+0x17e/0x200 [nf_tables]
  [  292.295622]   nf_hook_slow+0xb1/0x180

Non-debug kernel error message:

  BUG: soft lockup - CPU#0 stuck for 22s!

Fixes #141 and #145.
aabc added a commit that referenced this issue Aug 9, 2020
This appeared on kernels where xtables target is called through nftables.

Lockdep message: WARNING: inconsistent lock state

  [  292.286849] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
  [  292.287776] kworker/0:0/5 [HC0[0]:SC1[1]:HE1:SE0] takes:
  [  292.288618] ffffffffc0de6a98 (&(&htable_stripes[i].lock)->rlock){+.?.}, at: netflow_target+0x604/0x4150 [ipt_NETFLOW]
  [  292.290323] {SOFTIRQ-ON-W} state was registered at:
  [  292.291095]   lock_acquire+0x14f/0x3b0
  [  292.291679]   _raw_spin_lock+0x30/0x70
  [  292.292276]   netflow_target+0x604/0x4150 [ipt_NETFLOW]
  [  292.293125]   nft_target_eval_xt+0x11a/0x220 [nft_compat]
  [  292.294027]   nft_do_chain+0x25a/0x10e0 [nf_tables]
  [  292.294813]   nft_do_chain_ipv4+0x17e/0x200 [nf_tables]
  [  292.295622]   nf_hook_slow+0xb1/0x180

Non-debug kernel error message:

  watchdog: BUG: soft lockup - CPU#6 stuck for 22s!

Fixes #141 and #145.
@aabc
Owner

aabc commented Aug 9, 2020

I made a fix; please let me know whether it helped.

@aabc aabc closed this as completed Aug 9, 2020
@jamezbond13
Author

jamezbond13 commented Aug 9, 2020

I applied the patch just now, but I have doubts about whether I did it right. Basically, I edited ipt-netflow.c by hand: inserted your lines and deleted the unneeded ones. Then I ran ./configure, make all install, depmod, and reboot again. Is that correct?
(I'm still learning, so please don't be too harsh.)
By the way, an error comes up there. I got past it as described below, and now I have doubts about that as well:

./configure
Kernel version: 4.18.0-193.14.2.el8_2.x86_64 (uname)
Kernel sources: /lib/modules/4.18.0-193.14.2.el8_2.x86_64/build (found)
Checking for presence of include/linux/netfilter.h... Yes
netfilter.h uses CONFIG_NF_NAT_NEEDED... Yes
Checking for presence of include/linux/llist.h... Yes
Checking for presence of include/linux/grsecurity.h... No
Iptables binary version: 1.8.4 (nf_tables) (detected from /usr/sbin/iptables)
pkg-config for version 1.8.4 (nf_tables) exists: No (reported: 1.8.4)
Check for working gcc: Yes (gcc)
Checking for presence of xtables.h... Yes
Searching for iptables-1.8.4 (nf_tables) sources..
! Can not find iptables source directory, you may try setting it with --ipt-src=
! This is not fatal error, yet. Will be just using default include dir.
Iptables include flags: none (default)
Iptables module path: /usr/lib64/xtables (from libxtables.so, from binary)
Searching for net-snmp-config... Yes /usr/bin/net-snmp-config
Searching for net-snmp agent... Yes.
Checking for DKMS... Yes.
Creating Makefile.. done.

If you need some options enabled run ./configure --help
Now run: make all install
make all install
Compiling for kernel 4.18.0-193.14.2.el8_2.x86_64
make -C /lib/modules/4.18.0-193.14.2.el8_2.x86_64/build M=/root/ipt-netflow-2.5 modules CONFIG_DEBUG_INFO=y
make[1]: Entering directory '/usr/src/kernels/4.18.0-193.14.2.el8_2.x86_64'
CC [M] /root/ipt-netflow-2.5/ipt_NETFLOW.o
/root/ipt-netflow-2.5/ipt_NETFLOW.c: In function ‘ipt_netflow_fini’:
/root/ipt-netflow-2.5/ipt_NETFLOW.c:5787:2: error: implicit declaration of function ‘synchronize_sched’; did you mean ‘synchronize_net’? [-Werror=implicit-function-declaration]
synchronize_sched();
^~~~~~~~~~~~~~~~~
synchronize_net
cc1: some warnings being treated as errors
make[2]: *** [scripts/Makefile.build:319: /root/ipt-netflow-2.5/ipt_NETFLOW.o] Error 1
make[1]: *** [Makefile:1545: _module_/root/ipt-netflow-2.5] Error 2
make[1]: Leaving directory '/usr/src/kernels/4.18.0-193.14.2.el8_2.x86_64'
make: *** [Makefile:25: ipt_NETFLOW.ko] Error 2

This is what comes up when I try make all install,
and specifically this line: /root/ipt-netflow-2.5/ipt_NETFLOW.c:5787:2: error: implicit declaration of function ‘synchronize_sched’; did you mean ‘synchronize_net’? [-Werror=implicit-function-declaration]

I simply did what it suggests: in the ipt-netflow.c file, at line 5782, I wrote synchronize_net instead of synchronize_sched. After that, make all install seems to complete without problems.

After the reboot, the log shows:
Aug 9 17:15:50 uz kernel: ipt_NETFLOW: loading out-of-tree module taints kernel.
Aug 9 17:15:50 uz kernel: ipt_NETFLOW: module verification failed: signature and/or required key missing - tainting kernel
Aug 9 17:15:50 uz kernel: ipt_NETFLOW version 2.5, srcversion 308C593884E337D9F1E1773
Aug 9 17:15:50 uz kernel: ipt_NETFLOW: hashsize 655360 (5120K)
Aug 9 17:15:50 uz kernel: netflow: registering: /proc/net/stat/ipt_netflow
Aug 9 17:15:50 uz kernel: netflow: registered: /proc/net/stat/ipt_netflow
Aug 9 17:15:50 uz kernel: netflow: registering: /proc/net/stat/ipt_netflow_snmp
Aug 9 17:15:50 uz kernel: netflow: registered: /proc/net/stat/ipt_netflow_snmp
Aug 9 17:15:50 uz kernel: netflow: registering: /proc/net/stat/ipt_netflow_flows
Aug 9 17:15:50 uz kernel: netflow: registered: /proc/net/stat/ipt_netflow_flows
Aug 9 17:15:50 uz kernel: netflow: registered: sysctl net.netflow
Aug 9 17:15:50 uz kernel: ipt_NETFLOW: added destination 127.0.0.1:2055
Aug 9 17:15:50 uz kernel: ipt_NETFLOW protocol version 5 (NetFlow) enabled.
Aug 9 17:15:50 uz kernel: ipt_NETFLOW is loaded.
Aug 9 17:15:51 uz kernel: xt_ratelimit: 0.3.1 load success.

This line, ipt_NETFLOW: module verification failed: signature and/or required key missing - tainting kernel - is that normal?

@aabc
Owner

aabc commented Aug 9, 2020

There is no need to apply patches by hand. Just do a fresh git clone into a new directory or, if you prefer to stay in the same one, update it with git reset --hard (to discard your changes) followed by git pull --rebase (to fetch the new commits).

module verification failed: signature and/or required key missing - tainting kernel - is that normal?

Yes, that is normal.

@jamezbond13
Author

Almost a full day under load and everything is running fine. Thank you!
One question: where and how can I donate for this excellent software?

@aabc
Owner

aabc commented Aug 11, 2020

PayPal [email protected].
