Prometheus TSDB analysis
2022-07-28 08:58:00 【Brother Xing plays with the clouds】
Summary
Prometheus is a well-known open-source monitoring project. Its monitoring tasks are scheduled onto specific servers, which scrape monitoring data from their targets and save it in a local TSDB. The powerful custom PromQL language queries real-time and historical time-series data and supports rich query combinations. The TSDB in Prometheus 1.0 (the V2 storage engine) was based on LevelDB and used the same compression algorithm as Facebook's Gorilla, compressing 16-byte data points to an average of 1.37 bytes. Prometheus 2.0 introduced the new V3 storage engine, which provides higher write and query performance. This article analyzes the design ideas behind that storage engine.
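The Gorilla-style compression mentioned above can be illustrated with a minimal sketch of its XOR step for sample values (a simplification for illustration only: the real encoder also delta-of-delta-encodes timestamps and bit-packs the meaningful bits, which is how 16-byte points shrink to ~1.37 bytes on average):

```python
import struct

def xor_encode(values):
    """Gorilla-style XOR step (simplified): each float is XORed with its
    predecessor. Identical values collapse to a zero word, and similar
    values share long runs of leading/trailing zero bits that a real
    encoder strips, writing only the remaining meaningful bits."""
    prev = 0
    out = []
    for v in values:
        bits = struct.unpack(">Q", struct.pack(">d", v))[0]
        x = bits ^ prev
        prev = bits
        if x == 0:
            meaningful = 0  # repeated value: a single control bit suffices
        else:
            leading = 64 - x.bit_length()
            trailing = (x & -x).bit_length() - 1
            meaningful = 64 - leading - trailing
        out.append((x, meaningful))
    return out

# Slowly changing gauge values compress well: repeats cost ~1 bit,
# and the jump to 24.0 differs from 12.0 in a single mantissa bit.
widths = [m for _, m in xor_encode([12.0, 12.0, 12.0, 24.0])]
print(widths)  # → [12, 0, 0, 1]
```

This is why monitoring data, which tends to change slowly and at regular intervals, compresses so dramatically under this scheme.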
Design ideas
Prometheus stores time-series data in 2-hour blocks. Each block is a directory containing: one or more chunk files (holding the time-series data), a metadata file, and an index file (which maps metric names and labels to the locations of the time series within the chunk files). The most recently written data is kept in an in-memory block and flushed to disk after 2 hours. To prevent data loss from a program crash, a WAL (write-ahead log) mechanism appends the raw time-series writes to a log for persistence. When time series are deleted, the deletion entries are recorded in a separate tombstones file rather than being removed from the chunk files immediately.

These 2-hour blocks are compacted in the background into larger blocks; once data has been merged into a higher-level block, the lower-level block files are deleted. This is the same idea as the LSM trees in LevelDB, RocksDB, and similar stores. The design is also highly similar to Gorilla's, so Prometheus is almost a caching TSDB. The characteristics of its local storage mean it cannot be used for long-term data retention, only for saving and querying time series over a short window, and it is not highly available (an outage makes historical data unreadable). Because of these limitations of local storage, Prometheus provides an API for integrating with long-term storage, saving data to a remote TSDB. That API uses a custom protocol buffer over HTTP; it is not yet stable, and a switch to gRPC is under consideration.
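A minimal sketch (illustrative names, not the actual Prometheus code) of how the block-per-time-range layout serves a query: pick exactly the blocks whose [min_time, max_time) window overlaps the query range, so a short query never touches most of the data on disk.

```python
from dataclasses import dataclass

@dataclass
class Block:
    min_time: int  # inclusive, Unix ms
    max_time: int  # exclusive, Unix ms

def select_blocks(blocks, start, end):
    """Return the blocks whose time range overlaps [start, end)."""
    return [b for b in blocks if b.min_time < end and b.max_time > start]

# Two consecutive 2-hour blocks; a query straddling their boundary
# must read both, while a query inside one block reads only that one.
blocks = [Block(0, 7_200_000), Block(7_200_000, 14_400_000)]
print(len(select_blocks(blocks, 7_000_000, 8_000_000)))  # → 2
print(len(select_blocks(blocks, 0, 1_000)))              # → 1
```

The same overlap test also drives retention: a block whose entire range falls before the retention cutoff can be dropped by deleting its directory.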
Disk file structure
In-memory block
While an in-memory block's data has not yet been flushed, its block directory mainly contains WAL files.
./data/01BKGV7JBM69T2G1BGBGM6KB12
./data/01BKGV7JBM69T2G1BGBGM6KB12/meta.json
./data/01BKGV7JBM69T2G1BGBGM6KB12/wal/000002
./data/01BKGV7JBM69T2G1BGBGM6KB12/wal/000001
Persistent block
In a persistent block's directory the WAL files have been deleted, and the time-series data is saved in chunk files. The index file is used to locate each time series within the chunk files.
./data/01BKGV7JC0RY8A6MACW02A2PJD
./data/01BKGV7JC0RY8A6MACW02A2PJD/meta.json
./data/01BKGV7JC0RY8A6MACW02A2PJD/index
./data/01BKGV7JC0RY8A6MACW02A2PJD/chunks
./data/01BKGV7JC0RY8A6MACW02A2PJD/chunks/000001
./data/01BKGV7JC0RY8A6MACW02A2PJD/tombstones
mmap
The compressed and merged large files are read via mmap (so as not to consume too many file handles), which establishes a mapping between the process's virtual address space and file offsets; data is only actually read into physical memory when a query touches the corresponding location. Because the mapped pages come directly from the page cache, the extra copy into a user-space buffer that read() would incur is avoided. After a query, the corresponding memory is reclaimed automatically by Linux according to memory pressure, and until it is reclaimed it can serve cache hits for subsequent queries. Using mmap to automatically manage the memory cache needed by queries is therefore both simple to manage and efficient. It is also clear from this that Prometheus is not an entirely memory-based TSDB; unlike Gorilla, querying historical data requires reading disk files.
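A small illustration of the access pattern described above (the file name mimics a chunk file but is created here for the demo): slicing a memory-mapped file pulls in only the pages backing that range, with no explicit read() call or user-space copy.

```python
import mmap
import os
import tempfile

# Create a demo file: 4 KiB of padding followed by a payload,
# standing in for a chunk file read at some known offset.
path = os.path.join(tempfile.mkdtemp(), "000001")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096 + b"chunkdata")

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Only the pages backing this slice are faulted into memory;
    # the kernel reclaims them later under memory pressure.
    data = mm[4096:4105]
    mm.close()

print(data)  # → b'chunkdata'
```

One mapping per large file keeps the file-descriptor count low regardless of how many queries are in flight.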
Compaction
Compaction mainly involves merging blocks, deleting expired data, and restructuring chunk data. Merging multiple blocks into a larger one effectively reduces the number of blocks, so that a query covering a long time range does not have to merge results from many blocks. To improve deletion efficiency, deleting time-series data only records the deleted ranges; a block's entire directory is removed only once all of its data is due for deletion. The size of merged blocks therefore also needs to be limited, to avoid retaining too much already-deleted space (extra disk usage). For long retention periods, a better approach is to cap the maximum block duration at a percentage (such as 10%) of the retention time.
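The 10% rule above is simple enough to state as a one-line sketch (illustrative helper, not the actual Prometheus function):

```python
def max_block_duration(retention_ms, pct=0.10):
    """Cap compacted block length at a fraction of total retention, so a
    single block never strands more than ~pct of the data past its
    expiry before the whole directory can be dropped."""
    return int(retention_ms * pct)

# With 15 days of retention, blocks are merged up to ~1.5 days long.
retention_ms = 15 * 24 * 3600 * 1000
print(max_block_duration(retention_ms))  # → 129600000
```

The trade-off is explicit: larger blocks mean fewer merges at query time, but also more already-expired data held hostage inside a still-live block.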
Inverted Index
An inverted index provides fast lookup of data items based on a subset of their contents. In short, it lets me find all series with the label app="nginx" without scanning every time series and checking whether it carries that label. To do this, each time-series key is assigned a unique ID through which the series can be retrieved in constant time; in this case the ID is the forward index. For example: if the series with IDs 9, 10, and 29 carry the label app="nginx", then the inverted index for that label is [9, 10, 29], which can be used to quickly find all series containing the label.
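A minimal sketch of the structure, using the example IDs from the text (plain dicts and lists here; the real index uses sorted, compressed postings so that multi-label queries can intersect them by merging):

```python
from collections import defaultdict

# Forward index: series ID -> label set (hypothetical sample data).
series = {
    9:  {"app": "nginx", "env": "prod"},
    10: {"app": "nginx", "env": "dev"},
    29: {"app": "nginx", "env": "prod"},
    31: {"app": "mysql", "env": "prod"},
}

# Inverted index: (label, value) -> sorted list of series IDs.
index = defaultdict(list)
for sid in sorted(series):
    for label, value in series[sid].items():
        index[(label, value)].append(sid)

print(index[("app", "nginx")])  # → [9, 10, 29]

# A query with two label matchers intersects the two posting lists.
matches = set(index[("app", "nginx")]) & set(index[("env", "prod")])
print(sorted(matches))  # → [9, 29]
```

Because the posting lists are kept sorted, the intersection can be done in a single linear merge pass rather than via hashing.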
Performance
In the article Writing a Time Series Database from Scratch, the author gives a benchmark result of 20 million samples written per second on a Macbook Pro. This is higher single-machine performance than the target stated in the Gorilla paper of 700 million writes per minute (over 10 million per second).