k8s error: starting container process caused "process_linux.go:299: copying bootstrap data to pipe caused \"write init-p: broken pipe\"": unknown

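Solutions attempted:

  1. Upgraded docker, because on inspection the docker versions on the cluster machines were not all the same; restarted the docker daemon after the upgrade.
  2. Checked the pod with kubectl describe, which gave the following output: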
    State:          Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Mon, 23 Mar 2020 16:24:15 +0800
      Finished:     Mon, 23 Mar 2020 16:24:27 +0800
    Ready:          False
    Restart Count:  29
    
    
    Events:
  Type     Reason                  Age                      From             Message
  ----     ------                  ----                     ----             -------
  Warning  FailedCreatePodSandBox  48m (x43134 over 17h)    kubelet, master  Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "kube-flannel-ds-amd64-d7xxk": Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:303: getting the final child's pid from pipe caused \"read init-p: connection reset by peer\"": unknown
  Warning  FailedCreatePodSandBox  13m (x8967 over 16h)     kubelet, master  Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "kube-flannel-ds-amd64-d7xxk": Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:299: copying bootstrap data to pipe caused \"write init-p: broken pipe\"": unknown
  Normal   SandboxChanged          3m40s (x54265 over 22h)  kubelet, master  Pod sandbox changed, it will be killed and re-created.
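
For step 1, one way to compare the container runtime version on every machine without logging in to each one is to read what each kubelet reports. This is only a sketch; the function name node_runtime_versions is made up for illustration, and it assumes kubectl is already pointed at the cluster.

```shell
# List every node with the container runtime version its kubelet reports,
# e.g. docker://<version>. A mismatch between rows means the machines differ.
node_runtime_versions() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,RUNTIME:.status.nodeInfo.containerRuntimeVersion'
}
```

Calling node_runtime_versions before and after the upgrade shows whether all machines ended up on the same version.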

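OOMKilled: is it short on memory? Only the flannel pod on the master shows this error. The flannel pods on the worker nodes have the same memory and CPU limits, yet they show no similar messages.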
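
To put the master and the worker nodes side by side, something like the following could list every flannel pod with its node, restart count, and last termination reason. It is a rough sketch: the function name flannel_pod_states is hypothetical, and the label app=flannel is an assumption based on the upstream kube-flannel manifest, so adjust it if the pods here are labelled differently.

```shell
# Show each flannel pod with the node it runs on, how often its first
# container restarted, and why that container last terminated.
# Assumption: the pods carry the label app=flannel.
flannel_pod_states() {
  cols='POD:.metadata.name,NODE:.spec.nodeName'
  cols="$cols,RESTARTS:.status.containerStatuses[0].restartCount"
  cols="$cols,LAST_REASON:.status.containerStatuses[0].lastState.terminated.reason"
  kubectl -n kube-system get pods -l app=flannel -o custom-columns="$cols"
}
```

If only the row for the master shows OOMKilled and a high restart count, that matches what the describe output suggests.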

kubectl patch ds -n=kube-system kube-flannel-ds-amd64 -p '{"spec": {"template":{"spec":{"containers": [{"name":"kube-flannel", "resources": {"limits": {"cpu": "250m","memory": "550Mi"},"requests": {"cpu": "100m","memory": "100Mi"}}}]}}}}'

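Even so, I used the command above to raise the memory and CPU limits a little, and will check later whether the error comes back. If it does not, the resource limits were probably the problem.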
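
To confirm that the patch actually took effect, the checks might look like this. Again only a sketch: the function names are made up, and rollout status assumes the DaemonSet uses the RollingUpdate update strategy (with OnDelete the pods have to be deleted by hand before they pick up the new limits).

```shell
# Print the resources block of the kube-flannel container as stored in the
# DaemonSet, to confirm the new limits and requests are in the template.
flannel_limits() {
  kubectl -n kube-system get ds kube-flannel-ds-amd64 \
    -o jsonpath='{.spec.template.spec.containers[?(@.name=="kube-flannel")].resources}'
}

# Wait until the DaemonSet has replaced its pods with the patched template.
# Assumption: updateStrategy is RollingUpdate.
flannel_rollout() {
  kubectl -n kube-system rollout status ds/kube-flannel-ds-amd64
}
```

After flannel_rollout returns, running kubectl describe on the master's flannel pod again shows whether Restart Count keeps climbing.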