<POD NETWORK DELAY> : cmd exec failed, err: RTNETLINK answers: No such file or directory exit status 2 #173
Comments
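However, I also ran a pod-process-kill experiment:
This makes me wonder whether the problem above really lies in the error from running /opt/chaosblade/bin/nsexec on the node host.
Additional note: when running the pod-network-delay experiment:
Operator log: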
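yum install -y kernel-modules-extra installs this module. The problem appears to be related to tc, the Linux kernel traffic control tool, inside the pod: the kernel lacks the netem qdisc by default, so tc reports Error: Specified qdisc not found. Installing the module will not necessarily fix RTNETLINK answers: No such file or directory exit status 2, but you can try installing kernel-modules-extra first and see whether it resolves the problem.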
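As a rough way to check the suggestion above before and after installing kernel-modules-extra, the sketch below reports whether the netem qdisc module is visible to the running kernel. It assumes the module is named sch_netem (the usual name of the netem qdisc module) and that modinfo is installed; check_netem and netem_status are illustrative names, not part of chaosblade. A kernel with netem built in may be reported differently.

```shell
# Report whether the netem qdisc module (assumed name: sch_netem)
# can be found for the running kernel.
check_netem() {
    # Without modinfo we cannot tell either way.
    if ! command -v modinfo >/dev/null 2>&1; then
        echo "unknown"
        return
    fi
    if modinfo sch_netem >/dev/null 2>&1; then
        echo "available"
    else
        echo "missing"
    fi
}

netem_status=$(check_netem)
echo "kernel $(uname -r): sch_netem is $netem_status"
```

If the status is "missing" on the node that hosts the pod, that matches the missing-netem explanation given above.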
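Which OS version are you on, CentOS 8.x? On 8.x this package has to be installed, and the machine must be rebooted afterwards.
OK, I'll try installing the module again. I did run into problems when I tried before: every yum repository I tried reported that the package was not available.
Right, please confirm your kernel version and distribution version first.
My system is CentOS Linux release 7.6.1810 (Core)
4.18.0-193.el8.x86_64
Current progress: 1. My system was changed today. The version is now CentOS Linux release 8.2.2004 (Core)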
Operator log of the error during the recovery phase: time="2022-08-17T07:14:08Z" level=info msg="execute identifier: {ContainerObjectMeta:{Id:fff09b30e7e8f4a2 ContainerRuntime:docker ContainerId:3e1db8dce103 ContainerName:centos-tc-done PodName:centos-tc-done-6b584445b9-g5hnw NodeName:192.168.0.4 Namespace:centos-tc-done} Command: --container-label-selector io.kubernetes.pod.name=centos-tc-done-6b584445b9-g5hnw,io.kubernetes.pod.namespace=centos-tc-done,io.kubernetes.docker.type=podsandbox --container-runtime docker Error: Code:0 ChaosBladePodName:chaosblade-tool-ljpcv ChaosBladeNamespace:chaosblade ChaosBladeContainerName:chaosblade-tool}" experiment=f9e03fa41bc7c31f
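This may be an exception caused by the polling interval on the platform side being set too short. In practice the experiment is destroyed normally shortly afterwards; you can tell whether it was destroyed by observing the behavior.
Thanks for the reply. Looking at the error message "error": "pods/exec: k8s exec failed, err: command terminated with exit code 126", I suspect that recovery fails to enter the corresponding pod. Fault injection enters the pod successfully, so it is odd that recovery cannot.
Could you check the chaosblade execution log inside the pod? The logs are usually under /opt/chaosblade.
The logs are below; 10:43 is the successful execution and 10:45 is the recovery.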
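During the pod network delay experiment, destroying the experiment failed. Running /opt/chaosblade/bin/nsexec -t 11077 -p -n -- /bin/sh -c tc qdisc del dev eth0 root in the chaosblade-tool container on the corresponding node reports the same error. The command has to be wrapped in quotes, after which it runs fine. Could the exec module of the tool be formatting the command incorrectly?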
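A minimal sketch of why the quotes matter when such a command is typed by hand: sh -c takes only the next argument as the command string, and any further arguments become positional parameters ($0, $1, ...) of that shell. The example uses echo instead of tc so that it runs anywhere; whether chaosblade's exec module passes its arguments in this way is not confirmed in this thread.

```shell
# Unquoted: the command string is just "echo"; "hello" and "world"
# become $0 and $1 of the new shell and are never passed to echo.
unquoted=$(sh -c echo hello world)

# Quoted: the whole string is the command, so echo receives both words.
quoted=$(sh -c 'echo hello world')

printf 'unquoted: [%s]\n' "$unquoted"   # unquoted: []
printf 'quoted:   [%s]\n' "$quoted"     # quoted:   [hello world]
```

By the same rule, /bin/sh -c tc qdisc del dev eth0 root entered without quotes runs tc with no arguments, while /bin/sh -c 'tc qdisc del dev eth0 root' runs the full command.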
Issue Description
Type: bug report
Describe what happened (or what feature you want)
/opt/chaosblade/bin/nsexec -t 77143 -p -n -- /bin/sh -c tc qdisc add dev eth0 root netem delay 100ms 10ms
: cmd exec failed, err: RTNETLINK answers: No such file or directory exit status 2
Describe what you expected to happen
I hope you can provide a solution or some troubleshooting ideas, thanks!
How to reproduce it (as minimally and precisely as possible)
Tell us your environment
K8s:v1.18.18
chaosblade-box:v1.0.1
chaos-agent:v1.0.0
chaos-operator:v1.6.0
chaos-tool:v1.6.0
Anything else we need to know?
Machine execution information:
{
}
}