Kubernetes Pod 无法终结问题
问题现象
某pod运行出现异常,被终结后一直无法正常退出,在kubernetes端的现象:
root@h009028:~# kubectl get pods -o wide --all-namespaces | grep bsys | grep Ter
default bsysprod-5bf5f44b84-2f5qw 1/1 Terminating 0 6d 172.26.0.50 h009028
pod一直处于Terminating状态
在docker端的现象:
root@h009028:~# docker ps | grep bsys
6b7697bc161e nginx@sha256:c792a8c50279261c7c72f144f86e91c76366a95877a7ed02927ee5b15f3c67a8 "/bin/bash -c 'wge..." 6 days ago Up 6 days k8s_bsysprod_bsysprod-5bf5f44b84-2f5qw_default_80a684b9-db5e-11e9-b952-141877468256_0
排查
1.尝试docker exec 进入容器内或执行命令,均失败:
root@h009028:~# docker exec -it 6b7697bc161e bash
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:240: creating new parent process caused "container_linux.go:1254: running lstat on namespace path \"/proc/120770/ns/ipc\" caused \"lstat /proc/120770/ns/ipc: no such file or directory\""
原因是容器的初始pid已经被回收:
root@h009028:~# ps -ef | grep 120770
root 45063 124565 0 10:02 pts/5 00:00:00 grep --color=auto 120770
2.查看docker daemon日志:
root@h009028:~# journalctl -fu docker.service
Sep 27 09:50:44 h009028 dockerd[2231]: time="2019-09-27T09:50:44.274858423+08:00" level=error msg="Error running exec in container: rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:240: creating new parent process caused \"container_linux.go:1254: running lstat on namespace path \\\"/proc/120770/ns/ipc\\\" caused \\\"lstat /proc/120770/ns/ipc: no such file or directory\\\"\"\n"
Sep 27 09:51:09 h009028 dockerd[2231]: time="2019-09-27T09:51:09.694681419+08:00" level=error msg="Handler for POST /v1.27/containers/2ef9c159685fea3ae04ca6f7213c232d1e3b20c88de70a11be16361745cd5909/stop returned error: Container 2ef9c159685fea3ae04ca6f7213c232d1e3b20c88de70a11be16361745cd5909 is already stopped"
Sep 27 09:51:40 h009028 dockerd[2231]: time="2019-09-27T09:51:40.557659277+08:00" level=info msg="Container 6b7697bc161e482a761be87cb39f5d178ea6825d8a40710aec10a3a0e7988875 failed to exit within 30 seconds of signal 15 - using the force"
Sep 27 09:51:50 h009028 dockerd[2231]: time="2019-09-27T09:51:50.558415715+08:00" level=info msg="Container 6b7697bc161e failed to exit within 10 seconds of kill - trying direct SIGKILL"
## 这个post请求是用来停止pod中附属的pause容器的,pause容器正常退出没有问题
Sep 27 09:53:10 h009028 dockerd[2231]: time="2019-09-27T09:53:10.559098158+08:00" level=error msg="Handler for POST /v1.27/containers/2ef9c159685fea3ae04ca6f7213c232d1e3b20c88de70a11be16361745cd5909/stop returned error: Container 2ef9c159685fea3ae04ca6f7213c232d1e3b20c88de70a11be16361745cd5909 is already stopped"
Sep 27 09:53:41 h009028 dockerd[2231]: time="2019-09-27T09:53:41.659795797+08:00" level=info msg="Container 6b7697bc161e482a761be87cb39f5d178ea6825d8a40710aec10a3a0e7988875 failed to exit within 30 seconds of signal 15 - using the force"
Sep 27 09:53:51 h009028 dockerd[2231]: time="2019-09-27T09:53:51.660637583+08:00" level=info msg="Container 6b7697bc161e failed to exit within 10 seconds of kill - trying direct SIGKILL"
Sep 27 09:55:11 h009028 dockerd[2231]: time="2019-09-27T09:55:11.497119594+08:00" level=error msg="Handler for POST /v1.27/containers/2ef9c159685fea3ae04ca6f7213c232d1e3b20c88de70a11be16361745cd5909/stop returned error: Container 2ef9c159685fea3ae04ca6f7213c232d1e3b20c88de70a11be16361745cd5909 is already stopped"
Sep 27 09:55:42 h009028 dockerd[2231]: time="2019-09-27T09:55:42.415539477+08:00" level=info msg="Container 6b7697bc161e482a761be87cb39f5d178ea6825d8a40710aec10a3a0e7988875 failed to exit within 30 seconds of signal 15 - using the force"
Sep 27 09:55:52 h009028 dockerd[2231]: time="2019-09-27T09:55:52.416319734+08:00" level=info msg="Container 6b7697bc161e failed to exit within 10 seconds of kill - trying direct SIGKILL"
日志分析可以得知:
默认向容器发送的sig 15(SIGTREM)并没有使容器正常退出,10S后,执行SIGKILL使其强制退出,但实际的结果却是,SIGKILL也没能奏效,容器依然没有退出。
3.查看容器内现有未终结的进程
root@h009028:~# docker top 6b7697bc161e
UID PID PPID C STIME TTY TIME CMD
www-data 81091 121143 11 08:49 ? 00:08:39 php-fpm: pool www
www-data 81097 121143 13 08:49 ? 00:09:42 php-fpm: pool www
www-data 81098 121143 16 08:49 ? 00:12:14 php-fpm: pool www
www-data 81099 121143 0 08:49 ? 00:00:02 php-fpm: pool www
www-data 81100 121143 14 08:49 ? 00:10:29 php-fpm: pool www
发现容器内部应该作为PID 1的初始进程早已终结,但却有一些子进程还存在,应该就是这些进程导致容器无法彻底退出,查看它们的pid为121143的父进程的信息:
root@h009028:~# ps -ef | grep 121143
root 5930 124565 0 09:49 pts/5 00:00:00 grep --color=auto 121143
www-data 81091 121143 10 08:49 ? 00:06:08 php-fpm: pool www
www-data 81097 121143 12 08:49 ? 00:07:34 php-fpm: pool www
www-data 81098 121143 16 08:49 ? 00:09:51 php-fpm: pool www
www-data 81099 121143 0 08:49 ? 00:00:02 php-fpm: pool www
www-data 81100 121143 13 08:49 ? 00:07:48 php-fpm: pool www
www-data 81181 121143 14 08:49 ? 00:08:43 php-fpm: pool www
root 121143 121094 0 Sep20 ? 00:00:01 [bash]
root@h009028:~# cat /proc/121143/status
Name: bash
State: S (sleeping)
Tgid: 121143
Ngid: 0
Pid: 121143
PPid: 121094
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 0
Groups:
NStgid: 121143 1
NSpid: 121143 1
NSpgid: 121143 1
NSsid: 121143 1
Threads: 1
SigQ: 7/515154
SigPnd: 0000000000000000
ShdPnd: 0000000000014100
SigBlk: 0000000000010000
SigIgn: 0000000000010004
SigCgt: 0000000000000002
CapInh: 00000000a80425fb
CapPrm: 00000000a80425fb
CapEff: 00000000a80425fb
CapBnd: 00000000a80425fb
CapAmb: 0000000000000000
Seccomp: 0
Cpus_allowed: 0000,00000000,00000000,00000000,ffffffff
Cpus_allowed_list: 0-31
Mems_allowed: 00000000,00000003
Mems_allowed_list: 0-1
voluntary_ctxt_switches: 19402
nonvoluntary_ctxt_switches: 45
此进程处于sleep状态,kill掉它和它的子进程:
ps -ef | grep 121143 | awk '{print $2}' | xargs kill -s 9
再来查看一下容器状态:
root@h009028:~# docker ps -a | grep bsys
root@h009028:~#
容器已退出
原因
容器内所有的其他进程都是pid 1号进程的子进程,正常情况下,pid收到sig 15退出后,会发送信号处理好依附自身的子进程,但偶然情况下还是会出现一些问题:父进程退出前,未来得及处理完容器内的子进程,导致它们变成了孤儿进程,而容器本身的pid 1号进程此时已经退出,docker daemon向其发送sig kill,此信号只针对pid 1的主进程,对这些孤儿进程不生效,从而导致了容器始终无法终结。处理了这些孤儿进程后,容器就可以正常退出了。