Pods evicted due to insufficient disk space on a TKE node
2022-06-24 12:34:00 【Nieweixing】
I ran into a problem recently: a TKE cluster had many pods in the Failed state, and the events showed no obvious error. What is going on here?
The Failed status here means the node ran out of memory or disk space and the pods were evicted. kubectl shows such pods as Evicted, while the TKE console shows them as Failed. Eviction generally has no impact: when node resources are insufficient, moving pods to other nodes is the expected behavior.
So what exactly is eviction?
The kubelet actively monitors for and prevents total starvation of compute resources. When resources run short, the kubelet can proactively terminate one or more pods to reclaim them. When the kubelet terminates a pod, it stops all of the pod's containers and sets the pod's Phase to Failed. If the evicted pod is managed by a Deployment, the Deployment creates another pod for Kubernetes to schedule.
For the resource shortages that trigger eviction, and the kubelet parameters that configure them on a node, refer to this document:
https://kubernetes.io/zh/docs/tasks/administer-cluster/out-of-resource/
Node eviction usually happens because the container storage directory takes up a lot of disk space, so today we focus on eviction caused by insufficient disk space. A common puzzle comes up when the container storage directory is mounted on a data disk. You may see either of these two phenomena:
- The container storage directory is full , The node did not trigger eviction
- The container storage directory is not full , Node triggered eviction
Below we cover when a full disk on a TKE node triggers eviction, why the two phenomena above occur when the container storage directory is mounted on a data disk, and what to do when eviction happens.
First, let's look at the default kubelet eviction configuration on a TKE node:
[[email protected] ~]# cat /etc/kubernetes/kubelet
SERIALIZE_IMAGE_PULLS="--serialize-image-pulls=false"
REGISTER_SCHEDULABLE="--register-schedulable=true"
V="--v=2"
CLOUD_PROVIDER="--cloud-provider=qcloud"
FAIL_SWAP_ON="--fail-swap-on=false"
AUTHORIZATION_MODE="--authorization-mode=Webhook"
CLOUD_CONFIG="--cloud-config=/etc/kubernetes/qcloud.conf"
CLUSTER_DNS="--cluster-dns=172.16.52.140"
IMAGE_PULL_PROGRESS_DEADLINE="--image-pull-progress-deadline=10m0s"
HOSTNAME_OVERRIDE="--hostname-override=10.0.0.3"
EVICTION_HARD="--eviction-hard=nodefs.available<10%,nodefs.inodesFree<5%,memory.available<100Mi"
CLIENT_CA_FILE="--client-ca-file=/etc/kubernetes/cluster-ca.crt"
NON_MASQUERADE_CIDR="--non-masquerade-cidr=0.0.0.0/0"
KUBE_RESERVED="--kube-reserved=cpu=90m,memory=1300Mi"
MAX_PODS="--max-pods=61"
AUTHENTICATION_TOKEN_WEBHOOK="--authentication-token-webhook=true"
POD_INFRA_CONTAINER_IMAGE="--pod-infra-container-image=ccr.ccs.tencentyun.com/library/pause:latest"
ANONYMOUS_AUTH="--anonymous-auth=false"
KUBECONFIG="--kubeconfig=/etc/kubernetes/kubelet-kubeconfig"
NETWORK_PLUGIN="--network-plugin=cni"
CLUSTER_DOMAIN="--cluster-domain=cluster.local"
EVICTION_HARD is the configured eviction policy. By default, a TKE node evicts pods when available disk space falls below 10%, free inodes fall below 5%, or available memory falls below 100Mi. But which directory's disk does this refer to?
In the Kubernetes source code, the kubelet root-dir defaults to /var/lib/kubelet. In other words, eviction for insufficient node disk space is determined mainly by the free space on the disk that holds /var/lib/kubelet, as the following snippet shows.
const (
	// Ports of different e2e services.
	kubeletReadOnlyPort = "10255"
	// KubeletRootDirectory specifies the directory where the kubelet runtime information is stored.
	KubeletRootDirectory = "/var/lib/kubelet"
	// Health check url of kubelet
	kubeletHealthCheckURL = "http://127.0.0.1:" + kubeletReadOnlyPort + "/healthz"
)

func (e *E2EServices) startKubelet() (*server, error) {
	......
	cmdArgs = append(cmdArgs,
		"--kubeconfig", kubeconfigPath,
		"--root-dir", KubeletRootDirectory,
		"--v", LogVerbosityLevel, "--logtostderr",
	)
	......
}
Now that we know which directory's disk triggers eviction, the two phenomena above can be explained. The docker storage directory is mounted on the data disk, but the kubelet root-dir is not: it is still in its default location on the system disk. When the data disk fills up but the system disk has not reached the eviction threshold, eviction is not triggered.
Likewise, if you write logs to a directory on the node's system disk and fill it up, eviction is still triggered even though the data disk has plenty of space.
If the docker storage directory is not mounted on a data disk, both it and the kubelet root-dir are on the system disk by default. In that case eviction is triggered as soon as the system disk fills up, and docker is usually what takes up most of the space.
If your node has only a system disk and it filled up and triggered eviction, you can clean up disk space first:
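To see which of these cases applies on a node, it can help to check whether the kubelet root-dir and the docker storage directory sit on the same filesystem. The following is a minimal sketch, not a TKE tool: the helper names are hypothetical, and the paths /var/lib/kubelet and /var/lib/docker are only the usual defaults, so override them if your node differs.

```shell
#!/bin/sh
# Hypothetical helper: print the mount point of the filesystem holding a path.
mount_point_of() {
    df -P "$1" | awk 'NR==2 { print $NF }'
}

# Hypothetical helper: succeed when both paths are on the same mount point.
same_filesystem() {
    [ "$(mount_point_of "$1")" = "$(mount_point_of "$2")" ]
}

# Hypothetical helper: free-space percentage of the filesystem holding a path.
fs_free_percent() {
    df -Pk "$1" | awk 'NR==2 && $2 > 0 { printf "%d\n", $4 * 100 / $2 }'
}

# Assumed default locations; set the variables to match your node.
KUBELET_ROOT=${KUBELET_ROOT:-/var/lib/kubelet}
DOCKER_ROOT=${DOCKER_ROOT:-/var/lib/docker}

for dir in "$KUBELET_ROOT" "$DOCKER_ROOT"; do
    if [ -d "$dir" ]; then
        echo "$dir is on $(mount_point_of "$dir"), $(fs_free_percent "$dir")% free"
    else
        echo "$dir not found on this machine"
    fi
done

if [ -d "$KUBELET_ROOT" ] && [ -d "$DOCKER_ROOT" ]; then
    if same_filesystem "$KUBELET_ROOT" "$DOCKER_ROOT"; then
        echo "Same filesystem: docker data counts toward nodefs thresholds."
    else
        echo "Different filesystems: a full docker disk alone will not cross nodefs thresholds."
    fi
fi
```

Comparing mount points is a simplification that bind mounts can confuse, so treat the result as a hint rather than a definitive answer.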
# By default this command removes stopped containers, networks not used by any container, all images not used by any container (because of -a), and the build cache. Volumes are not removed unless --volumes is added.
docker system prune -a -f
If the command above does not reclaim enough disk space, refer to this document to clean up the log files that occupy space: https://cloud.tencent.com/document/product/457/43126
After disk space is freed, pods in the Evicted state are not cleaned up automatically. You need to delete them manually:
kubectl get pods -n xxxxx | grep Evicted | awk '{print $1}' | xargs kubectl -n xxxxx delete pod
If your docker storage is on a data disk but eviction still occurred, refer to the document https://cloud.tencent.com/document/product/457/43126 to free up space on the system disk, then use the command above to manually delete the pods in the Evicted state.
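The cleanup pipeline above can be wrapped in a small function so the list of pods can be reviewed before anything is deleted. This is a sketch that assumes the default `kubectl get pods` column layout (NAME, READY, STATUS, ...); the function name `evicted_pod_names` and the sample data are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# names of pods whose STATUS column (third field) is exactly "Evicted".
evicted_pod_names() {
    awk '$3 == "Evicted" { print $1 }'
}

# Demo on made-up sample output, so this runs without a cluster.
sample='NAME READY STATUS RESTARTS AGE
web-5d8f7c-abcde 0/1 Evicted 0 3h
web-5d8f7c-fghij 1/1 Running 0 2h
job-7c9d-klmno 0/1 Completed 0 1h'

printf '%s\n' "$sample" | evicted_pod_names
```

On a real cluster the same function would sit between the two kubectl calls, for example `kubectl get pods -n xxxxx | evicted_pod_names | xargs kubectl -n xxxxx delete pod`. Running the first two stages alone shows what would be deleted.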
Some readers may have this requirement: the docker storage directory must be mounted on the data disk, but eviction should also be triggered when that data disk runs out of space.
There are two solutions. The first is to change the kubelet root-dir parameter; the second is to add eviction parameters to the kubelet configuration.
With the first solution, you set custom kubelet parameters when the node is initialized. You need to submit a ticket to enable the custom kubelet parameter feature. Once it is enabled, configure the kubelet root-dir parameter in the console.
Here both the kubelet root-dir and the docker storage directory are placed under /data on the data disk, so when docker data fills the disk, eviction is triggered as well.
The second solution adds the docker storage data disk to the eviction scope. Since the images are stored on the data disk, we can configure the following two image-filesystem eviction signals. Once they are set, a full disk under the docker storage directory also triggers kubelet eviction.
| Eviction signal | Description |
|---|---|
| imagefs.available | imagefs.available := node.stats.runtime.imagefs.available |
| imagefs.inodesFree | imagefs.inodesFree := node.stats.runtime.imagefs.inodesFree |
EVICTION_HARD="--eviction-hard=nodefs.available<10%,nodefs.inodesFree<5%,memory.available<100Mi,imagefs.available<10%,imagefs.inodesFree<5%"
After adding these parameters, the kubelet also evicts pods when the available space on the docker data disk falls below 10% or its free inodes fall below 5%.
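To make the combined threshold string concrete, the sketch below splits an eviction-hard value into its individual signals and evaluates one percentage threshold against sample numbers. It only illustrates the format and the comparison and is not the kubelet's own logic; the function names and the disk sizes are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: split an eviction-hard value into "signal threshold" lines.
list_thresholds() {
    printf '%s\n' "$1" | tr ',' '\n' | awk -F'<' 'NF == 2 { print $1, $2 }'
}

# Hypothetical helper: succeed when available/capacity is below the given
# percentage, i.e. the condition an entry such as imagefs.available<10% states.
below_threshold() {
    avail_kb=$1
    capacity_kb=$2
    percent=$3
    # Integer form of: avail / capacity < percent / 100
    [ $((avail_kb * 100)) -lt $((capacity_kb * percent)) ]
}

list_thresholds "nodefs.available<10%,nodefs.inodesFree<5%,memory.available<100Mi,imagefs.available<10%,imagefs.inodesFree<5%"

# Made-up numbers: a 50 GiB data disk with 4 GiB available (8% free).
if below_threshold 4194304 52428800 10; then
    echo "imagefs.available is below 10%: threshold crossed"
else
    echo "imagefs.available is at or above 10%"
fi
```

With these sample numbers the 10% threshold is crossed, which is the situation in which the second solution lets a full data disk trigger eviction.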