How to correctly calculate the CPU utilization of kubernetes container
2022-07-26 03:06:00 【JavaShark】

Parameter interpretation
When you use Prometheus to monitor container CPU usage in a Kubernetes environment, you will often see CPU usage above 100%. The metrics involved are explained below:
1.container_spec_cpu_period
The length of the CFS scheduling window that applies when the container has a CPU limit, also called the container's CPU period. It is usually 100,000 microseconds.
2.container_spec_cpu_quota
The total CPU time the container may use in each period. A quota of 700,000 means the container can use 7*100,000 microseconds of CPU time per period, that is, 7 cores. It usually corresponds to the Kubernetes resources.limits.cpu value.
3.container_spec_cpu_shares
The container's relative share of the host's CPU. For example, a setting of 500m asks the host node for 0.5 CPU, roughly 50,000 microseconds of each 100,000-microsecond window when the node is under contention (Kubernetes stores this as 512 shares, at 1024 shares per CPU). It usually corresponds to the Kubernetes resources.requests.cpu value.
4.container_cpu_usage_seconds_total
The cumulative CPU time consumed by the container, in seconds. Note that this is the total across all of the container's cores.
5.container_cpu_system_seconds_total
The cumulative CPU time the container has consumed in kernel mode, in seconds.
6.container_cpu_user_seconds_total
The cumulative CPU time the container has consumed in user mode, in seconds.
Official references:
https://docs.signalfx.com/en/latest/integrations/agent/monitors/cadvisor.html
https://github.com/google/cadvisor/blob/master/docs/storage/prometheus.md
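As a rough illustration of how the spec metrics above relate to CPU counts, the following Python sketch converts a quota and period into a core limit, and a Kubernetes CPU request into cgroup shares. The function names are hypothetical, and the shares conversion assumes the cgroup v1 convention of 1024 shares per CPU with a floor of 2.

```python
def cpu_limit_cores(quota_us: int, period_us: int = 100_000) -> float:
    """Number of cores the container may use in each CFS period."""
    if quota_us <= 0:
        # Assumption: a non-positive quota means no CFS limit is set.
        return float("inf")
    return quota_us / period_us


def request_to_shares(millicores: int) -> int:
    """cgroup v1 CPU shares for a Kubernetes CPU request in millicores."""
    # 1 CPU (1000m) maps to 1024 shares; 2 is the minimum allowed value.
    return max(2, millicores * 1024 // 1000)


# A quota of 700,000 with the usual 100,000 period allows 7 cores.
limit = cpu_limit_cores(700_000)
# A request of 500m becomes 512 shares.
shares = request_to_shares(500)
```

Shares only set a relative weight between containers competing for CPU, while the quota is a hard ceiling, which is why the two values are derived differently.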
The formulas
1. If you use container_cpu_usage_seconds_total directly, the query is:
sum(irate(container_cpu_usage_seconds_total{container="$Container",instance="$Node",pod="$Pod"}[5m])*100)by(pod)
By default this is the container's usage summed over all of its cores and expressed as a percentage of one core, which is why the result can exceed 100%.

2. To calculate the container's CPU usage accurately as a percentage of its limit, divide by the number of cores the container may use:
sum(irate(container_cpu_usage_seconds_total{container="$Container",instance="$Node",pod="$Pod"}[5m])*100)by(pod)/sum(container_spec_cpu_quota{container="$Container",instance="$Node",pod="$Pod"}/container_spec_cpu_period{container="$Container",instance="$Node",pod="$Pod"})by(pod)
Here container_spec_cpu_quota/container_spec_cpu_period is the number of cores available to the container.

3. See the official GitHub issue:
https://github.com/google/cadvisor/issues/2026#issuecomment-415819667
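The arithmetic behind the quota-based formula can be checked outside Prometheus. The sketch below assumes two samples of container_cpu_usage_seconds_total taken a known number of seconds apart; the function name and the sample numbers are hypothetical.

```python
def cpu_usage_percent_of_limit(
    usage_start_s: float,
    usage_end_s: float,
    interval_s: float,
    quota_us: int,
    period_us: int = 100_000,
) -> float:
    """CPU usage as a percentage of the container's CPU limit."""
    # Per-second rate of the counter = average number of cores in use.
    cores_used = (usage_end_s - usage_start_s) / interval_s
    # quota / period = number of cores the container is allowed to use.
    cores_limit = quota_us / period_us
    return cores_used / cores_limit * 100.0


# Hypothetical samples: 105 CPU-seconds consumed over a 30 s window
# by a container limited to 7 cores (quota 700,000 / period 100,000).
print(cpu_usage_percent_of_limit(1000.0, 1105.0, 30.0, 700_000))  # prints 50.0
```

Without the division by the core limit, the same samples give 3.5 cores, or 350% of a single core, which is the "above 100%" reading the first query produces.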
docker stats
How docker stats calculates each column of its output:
docker stats first reads a live data stream from the Docker API endpoint /containers/(id)/stats, and then aggregates it for display.
On Linux, the memory usage column (MEM USAGE) reported by docker stats does not include cache memory.
For Docker versions ≤ 19.03, cache usage is the memory_stats.stats.cache field of the API output; for versions > 19.03 it is memory_stats.stats.total_inactive_file.
The PIDS column of docker stats is the number of processes or threads the container has created. "Thread" here is a Linux kernel term, also known as a lightweight process or kernel task.
1. How to view container resource usage with the Docker API:
$ curl -s --unix-socket /var/run/docker.sock "http://localhost/v1.40/containers/10f2db238edc/stats" | jq -r
{
"read": "2022-01-05T06:14:47.705943252Z",
"preread": "0001-01-01T00:00:00Z",
"pids_stats": {
"current": 240
},
"blkio_stats": {
"io_service_bytes_recursive": [
{
"major": 253,
"minor": 0,
"op": "Read",
"value": 0
},
{
"major": 253,
"minor": 0,
"op": "Write",
"value": 917504
},
{
"major": 253,
"minor": 0,
"op": "Sync",
"value": 0
},
{
"major": 253,
"minor": 0,
"op": "Async",
"value": 917504
},
{
"major": 253,
"minor": 0,
"op": "Discard",
"value": 0
},
{
"major": 253,
"minor": 0,
"op": "Total",
"value": 917504
}
],
"io_serviced_recursive": [
{
"major": 253,
"minor": 0,
"op": "Read",
"value": 0
},
{
"major": 253,
"minor": 0,
"op": "Write",
"value": 32
},
{
"major": 253,
"minor": 0,
"op": "Sync",
"value": 0
},
{
"major": 253,
"minor": 0,
"op": "Async",
"value": 32
},
{
"major": 253,
"minor": 0,
"op": "Discard",
"value": 0
},
{
"major": 253,
"minor": 0,
"op": "Total",
"value": 32
}
],
"io_queue_recursive": [],
"io_service_time_recursive": [],
"io_wait_time_recursive": [],
"io_merged_recursive": [],
"io_time_recursive": [],
"sectors_recursive": []
},
"num_procs": 0,
"storage_stats": {},
"cpu_stats": {
"cpu_usage": {
"total_usage": 251563853433744,
"percpu_usage": [
22988555937059,
6049382848016,
22411490707722,
5362525449957,
25004835766513,
6165050456944,
27740046633494,
6245013152748,
29404953317631,
5960151933082,
29169053441816,
5894880727311,
25772990860310,
5398581194412,
22856145246881,
5140195759848
],
"usage_in_kernelmode": 30692640000000,
"usage_in_usermode": 213996900000000
},
"system_cpu_usage": 22058735930000000,
"online_cpus": 16,
"throttling_data": {
"periods": 10673334,
"throttled_periods": 1437,
"throttled_time": 109134709435
}
},
"precpu_stats": {
"cpu_usage": {
"total_usage": 0,
"usage_in_kernelmode": 0,
"usage_in_usermode": 0
},
"throttling_data": {
"periods": 0,
"throttled_periods": 0,
"throttled_time": 0
}
},
"memory_stats": {
"usage": 8589447168,
"max_usage": 8589926400,
"stats": {
"active_anon": 0,
"active_file": 260198400,
"cache": 1561460736,
"dirty": 3514368,
"hierarchical_memory_limit": 8589934592,
"hierarchical_memsw_limit": 8589934592,
"inactive_anon": 6947250176,
"inactive_file": 1300377600,
"mapped_file": 0,
"pgfault": 3519153,
"pgmajfault": 0,
"pgpgin": 184508478,
"pgpgout": 184052901,
"rss": 6947373056,
"rss_huge": 6090129408,
"total_active_anon": 0,
"total_active_file": 260198400,
"total_cache": 1561460736,
"total_dirty": 3514368,
"total_inactive_anon": 6947250176,
"total_inactive_file": 1300377600,
"total_mapped_file": 0,
"total_pgfault": 3519153,
"total_pgmajfault": 0,
"total_pgpgin": 184508478,
"total_pgpgout": 184052901,
"total_rss": 6947373056,
"total_rss_huge": 6090129408,
"total_unevictable": 0,
"total_writeback": 0,
"unevictable": 0,
"writeback": 0
},
"limit": 8589934592
},
"name": "/k8s_prod-xc-fund_prod-xc-fund-646dfc657b-g4px4_prod_523dcf9d-6137-4abf-b4ad-bd3999abcf25_0",
"id": "10f2db238edc13f538716952764d6c9751e5519224bcce83b72ea7c876cc0475"
}
2. How to calculate
Official documentation:
https://docs.docker.com/engine/api/v1.40/#operation/ContainerStats
The precpu_stats is the CPU statistic of the previous read, and is used to calculate the CPU usage percentage. It is not an exact copy of the cpu_stats field.
If either precpu_stats.online_cpus or cpu_stats.online_cpus is nil then for compatibility with older daemons the length of the corresponding cpu_usage.percpu_usage array should be used.
To calculate the values shown by the stats command of the docker cli tool the following formulas can be used:
- used_memory = memory_stats.usage - memory_stats.stats.cache
- available_memory = memory_stats.limit
- Memory usage % = (used_memory / available_memory) * 100.0
- cpu_delta = cpu_stats.cpu_usage.total_usage - precpu_stats.cpu_usage.total_usage
- system_cpu_delta = cpu_stats.system_cpu_usage - precpu_stats.system_cpu_usage
- number_cpus = length(cpu_stats.cpu_usage.percpu_usage) or cpu_stats.online_cpus
- CPU usage % = (cpu_delta / system_cpu_delta) * number_cpus * 100.0
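The formulas above can be applied to a decoded stats response as in the following Python sketch. The function name and the sample numbers are hypothetical. Treating a missing precpu_stats.system_cpu_usage as 0 and returning 0% when a delta is not positive are assumptions of this sketch, not part of the documented formulas.

```python
def docker_stats_percentages(stats: dict) -> tuple:
    """Return (cpu_percent, memory_percent) from one Docker stats sample."""
    cpu = stats["cpu_stats"]
    precpu = stats["precpu_stats"]
    mem = stats["memory_stats"]

    # Memory: usage minus cache, relative to the memory limit.
    used_memory = mem["usage"] - mem["stats"].get("cache", 0)
    memory_percent = used_memory / mem["limit"] * 100.0

    # CPU: container time delta over host time delta, scaled by CPU count.
    cpu_delta = cpu["cpu_usage"]["total_usage"] - precpu["cpu_usage"]["total_usage"]
    # Assumption: a missing previous system reading is treated as 0.
    system_delta = cpu["system_cpu_usage"] - precpu.get("system_cpu_usage", 0)
    # Fall back to the per-CPU array length when online_cpus is absent.
    number_cpus = cpu.get("online_cpus") or len(cpu["cpu_usage"].get("percpu_usage", []))

    cpu_percent = 0.0
    if system_delta > 0 and cpu_delta > 0:
        cpu_percent = cpu_delta / system_delta * number_cpus * 100.0
    return cpu_percent, memory_percent


# Hypothetical sample with small round numbers for readability.
sample = {
    "cpu_stats": {
        "cpu_usage": {"total_usage": 400_000_000},
        "system_cpu_usage": 10_000_000_000,
        "online_cpus": 4,
    },
    "precpu_stats": {
        "cpu_usage": {"total_usage": 200_000_000},
        "system_cpu_usage": 8_000_000_000,
    },
    "memory_stats": {"usage": 600, "limit": 1000, "stats": {"cache": 100}},
}
cpu_pct, mem_pct = docker_stats_percentages(sample)
```

In this sample the container used one tenth of the host's CPU time across 4 CPUs, so the CPU figure is about 40%, and 500 of 1000 bytes of memory gives about 50%.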