Kubernetes Pod 无法终结问题

问题现象

某pod运行出现异常,被终结后一直无法正常退出,在kubernetes端的现象:

root@h009028:~# kubectl get pods -o wide --all-namespaces | grep bsys | grep Ter
default       bsysprod-5bf5f44b84-2f5qw                                  1/1       Terminating   0          6d        172.26.0.50      h009028

pod一直处于Terminating状态

在docker端的现象:

root@h009028:~# docker ps | grep bsys
6b7697bc161e       nginx@sha256:c792a8c50279261c7c72f144f86e91c76366a95877a7ed02927ee5b15f3c67a8                         "/bin/bash -c 'wge..."   6 days ago          Up 6 days                               k8s_bsysprod_bsysprod-5bf5f44b84-2f5qw_default_80a684b9-db5e-11e9-b952-141877468256_0

排查

1.尝试docker exec 进入容器内或执行命令,均失败:

root@h009028:~# docker exec -it 6b7697bc161e bash
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:240: creating new parent process caused "container_linux.go:1254: running lstat on namespace path \"/proc/120770/ns/ipc\" caused \"lstat /proc/120770/ns/ipc: no such file or directory\""

原因是容器的初始pid已经被回收:

root@h009028:~# ps -ef | grep 120770
root      45063 124565  0 10:02 pts/5    00:00:00 grep --color=auto 120770

2.查看docker daemon日志:

root@h009028:~# journalctl -fu docker.service
Sep 27 09:50:44 h009028 dockerd[2231]: time="2019-09-27T09:50:44.274858423+08:00" level=error msg="Error running exec in container: rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:240: creating new parent process caused \"container_linux.go:1254: running lstat on namespace path \\\"/proc/120770/ns/ipc\\\" caused \\\"lstat /proc/120770/ns/ipc: no such file or directory\\\"\"\n"
Sep 27 09:51:09 h009028 dockerd[2231]: time="2019-09-27T09:51:09.694681419+08:00" level=error msg="Handler for POST /v1.27/containers/2ef9c159685fea3ae04ca6f7213c232d1e3b20c88de70a11be16361745cd5909/stop returned error: Container 2ef9c159685fea3ae04ca6f7213c232d1e3b20c88de70a11be16361745cd5909 is already stopped"
Sep 27 09:51:40 h009028 dockerd[2231]: time="2019-09-27T09:51:40.557659277+08:00" level=info msg="Container 6b7697bc161e482a761be87cb39f5d178ea6825d8a40710aec10a3a0e7988875 failed to exit within 30 seconds of signal 15 - using the force"
Sep 27 09:51:50 h009028 dockerd[2231]: time="2019-09-27T09:51:50.558415715+08:00" level=info msg="Container 6b7697bc161e failed to exit within 10 seconds of kill - trying direct SIGKILL"
## 这个post请求是用来停止pod中附属的pause容器的,pause容器正常退出没有问题
Sep 27 09:53:10 h009028 dockerd[2231]: time="2019-09-27T09:53:10.559098158+08:00" level=error msg="Handler for POST /v1.27/containers/2ef9c159685fea3ae04ca6f7213c232d1e3b20c88de70a11be16361745cd5909/stop returned error: Container 2ef9c159685fea3ae04ca6f7213c232d1e3b20c88de70a11be16361745cd5909 is already stopped"
Sep 27 09:53:41 h009028 dockerd[2231]: time="2019-09-27T09:53:41.659795797+08:00" level=info msg="Container 6b7697bc161e482a761be87cb39f5d178ea6825d8a40710aec10a3a0e7988875 failed to exit within 30 seconds of signal 15 - using the force"
Sep 27 09:53:51 h009028 dockerd[2231]: time="2019-09-27T09:53:51.660637583+08:00" level=info msg="Container 6b7697bc161e failed to exit within 10 seconds of kill - trying direct SIGKILL"
Sep 27 09:55:11 h009028 dockerd[2231]: time="2019-09-27T09:55:11.497119594+08:00" level=error msg="Handler for POST /v1.27/containers/2ef9c159685fea3ae04ca6f7213c232d1e3b20c88de70a11be16361745cd5909/stop returned error: Container 2ef9c159685fea3ae04ca6f7213c232d1e3b20c88de70a11be16361745cd5909 is already stopped"
Sep 27 09:55:42 h009028 dockerd[2231]: time="2019-09-27T09:55:42.415539477+08:00" level=info msg="Container 6b7697bc161e482a761be87cb39f5d178ea6825d8a40710aec10a3a0e7988875 failed to exit within 30 seconds of signal 15 - using the force"
Sep 27 09:55:52 h009028 dockerd[2231]: time="2019-09-27T09:55:52.416319734+08:00" level=info msg="Container 6b7697bc161e failed to exit within 10 seconds of kill - trying direct SIGKILL"

日志分析可以得知:

默认向容器发送的sig 15(SIGTREM)并没有使容器正常退出,10S后,执行SIGKILL使其强制退出,但实际的结果却是,SIGKILL也没能奏效,容器依然没有退出。

3.查看容器内现有未终结的进程

root@h009028:~# docker top 6b7697bc161e
UID                 PID                 PPID                C                   STIME               TTY                 TIME                CMD
www-data            81091               121143              11                  08:49               ?                   00:08:39            php-fpm: pool www
www-data            81097               121143              13                  08:49               ?                   00:09:42            php-fpm: pool www
www-data            81098               121143              16                  08:49               ?                   00:12:14            php-fpm: pool www
www-data            81099               121143              0                   08:49               ?                   00:00:02            php-fpm: pool www
www-data            81100               121143              14                  08:49               ?                   00:10:29            php-fpm: pool www

发现容器内部应该作为PID 1的初始进程早已终结,但却有一些子进程还存在,应该就是这些进程导致容器无法彻底退出,查看它们的pid为121143的父进程的信息:

root@h009028:~# ps -ef | grep 121143
root       5930 124565  0 09:49 pts/5    00:00:00 grep --color=auto 121143
www-data  81091 121143 10 08:49 ?        00:06:08 php-fpm: pool www
www-data  81097 121143 12 08:49 ?        00:07:34 php-fpm: pool www
www-data  81098 121143 16 08:49 ?        00:09:51 php-fpm: pool www
www-data  81099 121143  0 08:49 ?        00:00:02 php-fpm: pool www
www-data  81100 121143 13 08:49 ?        00:07:48 php-fpm: pool www
www-data  81181 121143 14 08:49 ?        00:08:43 php-fpm: pool www
root     121143 121094  0 Sep20 ?        00:00:01 [bash]

root@h009028:~# cat /proc/121143/status
Name:    bash
State:    S (sleeping)
Tgid:    121143
Ngid:    0
Pid:    121143
PPid:    121094
TracerPid:    0
Uid:    0    0    0    0
Gid:    0    0    0    0
FDSize:    0
Groups:
NStgid:    121143    1
NSpid:    121143    1
NSpgid:    121143    1
NSsid:    121143    1
Threads:    1
SigQ:    7/515154
SigPnd:    0000000000000000
ShdPnd:    0000000000014100
SigBlk:    0000000000010000
SigIgn:    0000000000010004
SigCgt:    0000000000000002
CapInh:    00000000a80425fb
CapPrm:    00000000a80425fb
CapEff:    00000000a80425fb
CapBnd:    00000000a80425fb
CapAmb:    0000000000000000
Seccomp:    0
Cpus_allowed:    0000,00000000,00000000,00000000,ffffffff
Cpus_allowed_list:    0-31
Mems_allowed:    00000000,00000003
Mems_allowed_list:    0-1
voluntary_ctxt_switches:    19402
nonvoluntary_ctxt_switches:    45

此进程处于sleep状态,kill掉它和它的子进程:

 ps -ef | grep 121143 | awk '{print $2}' | xargs kill -s 9

再来查看一下容器状态:

root@h009028:~# docker ps -a | grep bsys
root@h009028:~#

容器已退出

原因

容器内所有的其他进程都是pid 1号进程的子进程,正常情况下,pid收到sig 15退出后,会发送信号处理好依附自身的子进程,但偶然情况下还是会出现一些问题:父进程退出前,未来得及处理完容器内的子进程,导致它们变成了孤儿进程,而容器本身的pid 1号进程此时已经退出,docker daemon向其发送sig kill,此信号只针对pid 1的主进程,对这些孤儿进程不生效,从而导致了容器始终无法终结。处理了这些孤儿进程后,容器就可以正常退出了。

results matching ""

    No results matching ""