Pods evicted because a TKE node is short on disk space

2022-06-24 12:34:00 Nieweixing

I ran into a problem recently: a TKE cluster had a large number of pods in the Failed state, yet the events showed no obvious error. What is going on?

The Failed status appears because the node's memory or disk filled up and the pods were evicted. kubectl shows the pod status as Evicted, while the TKE console shows Failed. Eviction itself is usually harmless: when a node runs short of resources, evicting pods so that they land on other nodes is the expected behavior.

So what exactly is eviction?

The kubelet actively monitors the node and guards against total starvation of compute resources. When a resource runs short, the kubelet can proactively terminate one or more pods to reclaim it. When the kubelet terminates a pod, it stops all of the pod's containers and sets the pod's phase to Failed. If the evicted pod is managed by a Deployment, the Deployment creates a replacement pod for Kubernetes to schedule.

For the resource shortages that trigger eviction, and the kubelet parameters that configure them on a node, see the documentation:

https://kubernetes.io/zh/docs/tasks/administer-cluster/out-of-resource/

Node eviction most often happens because the container storage directory takes up a lot of disk space, so today's topic is eviction caused by insufficient disk space. One situation causes confusion here: when the container storage directory is mounted on a data disk, you may see either of the following two behaviors:

  • The container storage directory is full, but the node does not trigger eviction
  • The container storage directory is not full, yet the node triggers eviction

Below we look at how eviction is triggered when a TKE node's disk fills up, why the behaviors above occur when the container storage directory is mounted on a data disk, and what to do about eviction.

First, here is the default kubelet eviction configuration on a TKE node:

$ cat /etc/kubernetes/kubelet
SERIALIZE_IMAGE_PULLS="--serialize-image-pulls=false"
REGISTER_SCHEDULABLE="--register-schedulable=true"
V="--v=2"
CLOUD_PROVIDER="--cloud-provider=qcloud"
FAIL_SWAP_ON="--fail-swap-on=false"
AUTHORIZATION_MODE="--authorization-mode=Webhook"
CLOUD_CONFIG="--cloud-config=/etc/kubernetes/qcloud.conf"
CLUSTER_DNS="--cluster-dns=172.16.52.140"
IMAGE_PULL_PROGRESS_DEADLINE="--image-pull-progress-deadline=10m0s"
HOSTNAME_OVERRIDE="--hostname-override=10.0.0.3"
EVICTION_HARD="--eviction-hard=nodefs.available<10%,nodefs.inodesFree<5%,memory.available<100Mi"
CLIENT_CA_FILE="--client-ca-file=/etc/kubernetes/cluster-ca.crt"
NON_MASQUERADE_CIDR="--non-masquerade-cidr=0.0.0.0/0"
KUBE_RESERVED="--kube-reserved=cpu=90m,memory=1300Mi"
MAX_PODS="--max-pods=61"
AUTHENTICATION_TOKEN_WEBHOOK="--authentication-token-webhook=true"
POD_INFRA_CONTAINER_IMAGE="--pod-infra-container-image=ccr.ccs.tencentyun.com/library/pause:latest"
ANONYMOUS_AUTH="--anonymous-auth=false"
KUBECONFIG="--kubeconfig=/etc/kubernetes/kubelet-kubeconfig"
NETWORK_PLUGIN="--network-plugin=cni"
CLUSTER_DOMAIN="--cluster-domain=cluster.local"

The EVICTION_HARD field holds the eviction policy. By default, a TKE node evicts pods when available disk space drops below 10%, free inodes drop below 5%, or available memory drops below 100Mi. But which directory does "disk" refer to here?

The kubelet's root-dir defaults to /var/lib/kubelet, and nodefs is the filesystem that holds this directory. In other words, eviction for insufficient disk space is driven mainly by the free space of the filesystem containing /var/lib/kubelet. The following snippet from the Kubernetes source (the e2e service that starts a kubelet) passes the same path as --root-dir:

const (
	// Ports of different e2e services.
	kubeletReadOnlyPort = "10255"
	// KubeletRootDirectory specifies the directory where the kubelet runtime information is stored.
	KubeletRootDirectory = "/var/lib/kubelet"
	// Health check url of kubelet
	kubeletHealthCheckURL = "http://127.0.0.1:" + kubeletReadOnlyPort + "/healthz"
)

func (e *E2EServices) startKubelet() (*server, error) {
	// ......
	cmdArgs = append(cmdArgs,
		"--kubeconfig", kubeconfigPath,
		"--root-dir", KubeletRootDirectory,
		"--v", LogVerbosityLevel, "--logtostderr",
	)
	// ......
}
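
As a rough way to see how close a node is to the nodefs.available<10% threshold, the sketch below uses df to compute the free-space percentage of the filesystem that holds the kubelet root directory. The helper names avail_pct and check_threshold are made up for illustration, and the figure from df only approximates what the kubelet computes from its own stats, so treat the result as a hint rather than the kubelet's exact view.

```shell
#!/bin/sh
# Sketch: compare free space on the kubelet root-dir filesystem with a threshold.
# avail_pct and check_threshold are hypothetical helper names, not kubelet tools.

# Print the free-space percentage (integer) of the filesystem holding $1.
avail_pct() {
    df -P "$1" | awk 'NR == 2 { if ($2 > 0) printf "%d\n", $4 * 100 / $2; else print 0 }'
}

# Report whether directory $1 has less than $2 percent free.
check_threshold() {
    pct=$(avail_pct "$1")
    if [ "$pct" -lt "$2" ]; then
        echo "$1: ${pct}% free, below the ${2}% threshold"
    else
        echo "$1: ${pct}% free, at or above the ${2}% threshold"
    fi
}

# Default kubelet root-dir; fall back to / when run on a machine without a kubelet.
dir=/var/lib/kubelet
[ -d "$dir" ] || dir=/
check_threshold "$dir" 10
```

Running it against Docker's storage directory instead shows the free space on that filesystem, which only matters for eviction if it is the same filesystem or if imagefs thresholds are configured.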

Now that we know which directory's disk usage drives eviction, the behaviors above are easy to explain. Docker's storage directory is mounted on the data disk, but the kubelet's root-dir is not: it is still in its default location under the root filesystem, that is, on the system disk. When the data disk fills up while the system disk has not reached the eviction threshold, no eviction is triggered.

Likewise, if you write logs to a directory on the node's system disk and fill the system disk, eviction is triggered even though the data disk still has plenty of space.

When Docker's storage directory is not mounted on a data disk, both it and the kubelet's root-dir are on the system disk by default. In that case eviction is triggered as soon as the system disk fills up, and Docker is usually what takes up most of the space.
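
To see which of these cases a node is in, it helps to check whether the kubelet root directory and Docker's storage directory sit on the same filesystem. The following is a minimal sketch, assuming the default paths /var/lib/kubelet and /var/lib/docker (on a real node, `docker info --format '{{.DockerRootDir}}'` reports the actual Docker directory). The names mount_of and same_fs are illustrative helpers, and comparing mount points is only an approximation when bind mounts are involved.

```shell
#!/bin/sh
# Print the mount point of the filesystem that holds $1.
mount_of() {
    df -P "$1" | awk 'NR == 2 { print $6 }'
}

# Succeed when both paths live on the same mount point.
same_fs() {
    [ "$(mount_of "$1")" = "$(mount_of "$2")" ]
}

# Assumed default locations; override through the environment if they differ.
kubelet_dir=${KUBELET_DIR:-/var/lib/kubelet}
docker_dir=${DOCKER_DIR:-/var/lib/docker}

# Fall back to / so the sketch still runs where these paths do not exist.
[ -d "$kubelet_dir" ] || kubelet_dir=/
[ -d "$docker_dir" ] || docker_dir=/

if same_fs "$kubelet_dir" "$docker_dir"; then
    echo "same filesystem: a full Docker directory also counts against nodefs"
else
    echo "different filesystems: only $(mount_of "$kubelet_dir") counts against nodefs"
fi
```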

If your node has only a system disk and eviction was triggered because that disk filled up, start by freeing disk space:

# With -a, this command removes all stopped containers, all networks not used by any container,
# all images not used by any container (not only dangling ones), and the build cache.
# Unused volumes are removed only if you also pass --volumes.
docker system prune -a -f

If the command above does not reclaim enough disk space, see the following document for cleaning up log files that take up space: https://cloud.tencent.com/document/product/457/43126

Freeing disk space does not clean up pods in the Evicted state automatically; you have to delete them by hand:

kubectl get pods -n xxxxx | grep Evicted | awk '{print $1}' | xargs kubectl -n xxxxx delete pod
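
The one-liner above matches the word Evicted anywhere in a line. A slightly stricter sketch filters on the STATUS column of `kubectl get pods --no-headers` output and skips the delete call when nothing matches. The names evicted_names and delete_evicted are hypothetical helpers, and the namespace is whatever you pass in.

```shell
#!/bin/sh
# Read `kubectl get pods --no-headers` output on stdin and print the names
# of pods whose STATUS column (third field) is Evicted.
evicted_names() {
    awk '$3 == "Evicted" { print $1 }'
}

# Delete all evicted pods in namespace $1 (requires kubectl and cluster access).
delete_evicted() {
    names=$(kubectl get pods -n "$1" --no-headers | evicted_names)
    if [ -z "$names" ]; then
        echo "no evicted pods in namespace $1"
        return 0
    fi
    echo "$names" | xargs kubectl -n "$1" delete pod
}

# Usage on a machine with cluster access: delete_evicted my-namespace
```

Nothing is deleted until delete_evicted is called explicitly, so the filter can be tried first by piping `kubectl get pods -n xxxxx --no-headers` into evicted_names.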

If Docker's storage is on a data disk but eviction still occurred, follow the document https://cloud.tencent.com/document/product/457/43126 to free space on the system disk, then delete the pods in the Evicted state with the command above.

Some readers will now ask: I need Docker's storage directory on a data disk, but I also want eviction to be triggered when that data disk runs out of space.

There are two solutions: the first is to change the kubelet's root-dir parameter, and the second is to add kubelet eviction parameters.

The first option requires setting custom kubelet parameters when the node is initialized. You need to submit a ticket to have the custom kubelet parameters feature enabled; once it is enabled, configure the kubelet's root-dir parameter in the console.

With this option, both the kubelet's root-dir and Docker's storage directory are placed under /data on the data disk, so eviction is also triggered when Docker data fills that disk.

The second option adds the data disk that holds Docker's storage to the set of filesystems the kubelet watches for eviction. Because the images are stored on the data disk, we can configure the two image filesystem eviction signals below. With them set, the kubelet also evicts pods when the disk holding Docker's storage directory fills up.

imagefs.available := node.stats.runtime.imagefs.available
imagefs.inodesFree := node.stats.runtime.imagefs.inodesFree

EVICTION_HARD="--eviction-hard=nodefs.available<10%,nodefs.inodesFree<5%,memory.available<100Mi,imagefs.available<10%,imagefs.inodesFree<5%"

With these parameters added, the kubelet evicts pods once the available space on Docker's data disk drops below 10% or its free inodes drop below 5%.
