In-Depth Analysis of Apache BookKeeper, Part 3: How Reads Work
2022-06-25 07:04:00 【StreamNative】
This article is a translation of "Apache BookKeeper Internals - Part 3 - Reads" by Jack Vanlightly.
Translator: Wang Lunhui, an open source enthusiast and developer based in Guizhou.
This series is based on BookKeeper 4.14 as configured for Apache Pulsar.
In the previous article, we walked through every step of the write path, from the Netty layer down to file I/O, along with all the threads and components involved. In this article, we do the same for the read path. Read requests are submitted to the read thread pool for execution, and several threads can share the work (hence a pool). By default, long-poll reads are also submitted to the read threads, but they can be configured to run on a separate long-poll thread pool.
When a read executes, the selected read thread calls the DbLedgerStorage getEntry(long ledgerId, long entryId) method, and the rest of the read is synchronous.

Figure 1. The read path only goes into DbLedgerStorage and executes synchronously.
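The threading arrangement described above can be sketched roughly as follows. This is a minimal illustration, not BookKeeper's code: `RequestDispatcher`, `submit_read`, and `submit_long_poll_read` are hypothetical names, and the `get_entry` callable stands in for the DbLedgerStorage call.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable


class RequestDispatcher:
    """Hypothetical dispatcher: reads go to a pool, long-poll reads share it by default."""

    def __init__(self, read_threads: int, long_poll_threads: int = 0):
        self.read_pool = ThreadPoolExecutor(max_workers=read_threads)
        # Without dedicated long-poll threads, long-poll reads use the read pool.
        self.long_poll_pool = (
            ThreadPoolExecutor(max_workers=long_poll_threads)
            if long_poll_threads > 0
            else self.read_pool
        )

    def submit_read(self, get_entry: Callable[[int, int], bytes],
                    ledger_id: int, entry_id: int) -> Future:
        # The chosen read thread runs get_entry synchronously to completion.
        return self.read_pool.submit(get_entry, ledger_id, entry_id)

    def submit_long_poll_read(self, get_entry: Callable[[int, int], bytes],
                              ledger_id: int, entry_id: int) -> Future:
        return self.long_poll_pool.submit(get_entry, ledger_id, entry_id)

    def shutdown(self) -> None:
        self.read_pool.shutdown()
        self.long_poll_pool.shutdown()  # harmless if it is the same pool
```

Sharing one pool keeps the default setup simple; a separate pool would keep long-poll reads from occupying the threads that serve ordinary reads.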
A read request is processed in the following order:
1. Check the write cache. On a cache hit, return the entry.
2. On a write cache miss, check the read cache. On a cache hit, return the entry. A read cache miss means the entry exists only on disk.
3. Use the Entry Location Index (RocksDB) to obtain the entry's location on disk. The index returns a location (an entry log file and an offset within that file).
4. Read the specified entry log file at the specified offset.
5. Perform a read-ahead.
6. Load all the read-ahead entries into the read cache.
7. Return the entry.
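The lookup order above can be sketched as a toy model. Everything here is an illustrative assumption rather than BookKeeper's real classes: plain dictionaries stand in for both caches and for the RocksDB index, `read_log` is a hypothetical callable that reads one entry at a location, and `read_ahead` is a placeholder that returns nothing.

```python
from typing import Callable, Dict, Optional, Tuple

EntryKey = Tuple[int, int]   # (ledger_id, entry_id)
Location = Tuple[int, int]   # (entry_log_id, offset)


class ReadPathSketch:
    """Toy model of the read order; not the real DbLedgerStorage."""

    def __init__(self, read_log: Callable[[Location], bytes]):
        self.write_cache: Dict[EntryKey, bytes] = {}
        self.read_cache: Dict[EntryKey, bytes] = {}
        self.location_index: Dict[EntryKey, Location] = {}  # stands in for RocksDB
        self.read_log = read_log
        self.disk_reads = 0

    def read_ahead(self, ledger_id: int, entry_id: int) -> Dict[EntryKey, bytes]:
        # Placeholder: a real read-ahead scans forward in the entry log.
        return {}

    def get_entry(self, ledger_id: int, entry_id: int) -> Optional[bytes]:
        key = (ledger_id, entry_id)
        if key in self.write_cache:                 # step 1
            return self.write_cache[key]
        if key in self.read_cache:                  # step 2
            return self.read_cache[key]
        location = self.location_index.get(key)     # step 3
        if location is None:
            return None                             # entry unknown in this sketch
        entry = self.read_log(location)             # step 4
        self.disk_reads += 1
        ahead = self.read_ahead(ledger_id, entry_id)  # step 5
        self.read_cache.update(ahead)               # step 6
        return entry                                # step 7
```

Only a miss in both caches reaches the index and the disk, which is why cache hit rates dominate read cost.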
A read-ahead operation reads from the entry log file, starting at the next entry and continuing until one of the following happens:
1. The read-ahead batch size limit is reached (1000 entries by default, as configured by Pulsar);
2. The end of the file is reached;
3. An entry belonging to a different ledger is reached.
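The three stop conditions can be illustrated with a short loop. This is a sketch under simplifying assumptions: the entry log is modeled as an in-memory list of `(ledger_id, entry_id, payload)` records, and the function name `read_ahead` and its parameters are invented for the example.

```python
from typing import List, Tuple

Record = Tuple[int, int, bytes]   # (ledger_id, entry_id, payload)


def read_ahead(log: List[Record], pos: int, batch_limit: int = 1000) -> List[Record]:
    """Collect the entries that follow log[pos] until a stop condition is hit."""
    ledger_id = log[pos][0]
    batch: List[Record] = []
    nxt = pos + 1
    while len(batch) < batch_limit:        # 1. batch size limit reached
        if nxt >= len(log):                # 2. end of the file
            break
        if log[nxt][0] != ledger_id:       # 3. entry of a different ledger
            break
        batch.append(log[nxt])
        nxt += 1
    return batch


log = [(1, 0, b"a"), (1, 1, b"b"), (1, 2, b"c"), (2, 0, b"x")]
ahead = read_ahead(log, 0)   # stops at the ledger 2 entry
```

Here `ahead` holds the two entries of ledger 1 that follow the one just read.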
By default, the total size of the read caches (each DbLedgerStorage instance has its own read cache) is 25% of the available direct memory. So with two ledger directories and 2 GB of total read cache, each DbLedgerStorage instance gets a 1 GB read cache.
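The sizing arithmetic can be written out as below. The helper name is hypothetical, the 8 GB direct memory figure is an assumed example chosen so that 25% comes to 2 GB, and the code assumes one DbLedgerStorage instance per ledger directory.

```python
def read_cache_per_instance_mb(direct_memory_mb: int, ledger_dirs: int,
                               fraction: float = 0.25) -> float:
    """Split the total read cache evenly across DbLedgerStorage instances."""
    total_read_cache_mb = direct_memory_mb * fraction
    return total_read_cache_mb / ledger_dirs


# 8192 MB direct memory -> 2048 MB total read cache -> 1024 MB per instance
per_instance = read_cache_per_instance_mb(8192, 2)
```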
Read-ahead is efficient because the entries of a ledger sit in contiguous blocks on disk: entries are sorted by ledger ID and entry ID before they are written to the file. As a result, read-ahead needs no location index lookups and its reads are sequential.
Using sticky reads on the client helps performance considerably, because all reads from a given client go to a single bookie, which makes good use of read-ahead. Spreading reads across bookies makes read-ahead less effective. For example, if a client (a Pulsar broker) spreads its reads of entries 0-99 randomly across three bookies, each bookie will end up reading ahead most of those entries yet serve only about ⅓ of the reads, so across the bookies each entry has been loaded about three times.
Beyond that, other situations such as cache thrashing can reduce the effectiveness of the read cache and even hurt performance.
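A small simulation makes the cost of scattered reads concrete. It is a sketch with loud simplifications: every bookie is assumed to hold the whole ledger, read-ahead is assumed to run to the end of the ledger, caches never evict, and reads are spread round-robin instead of randomly so the result is deterministic. `entries_loaded` and `pick_bookie` are names invented for the example.

```python
from typing import Callable, List, Set


def entries_loaded(num_entries: int, num_bookies: int,
                   pick_bookie: Callable[[int], int]) -> int:
    """Count entries loaded from disk across all bookies for one sequential scan."""
    caches: List[Set[int]] = [set() for _ in range(num_bookies)]
    loaded = 0
    for entry_id in range(num_entries):
        cache = caches[pick_bookie(entry_id)]
        if entry_id in cache:
            continue                          # served from that bookie's read cache
        # Miss: read the entry, then read ahead to the end of the ledger.
        fetched = range(entry_id, num_entries)
        loaded += len(fetched)
        cache.update(fetched)
    return loaded


sticky = entries_loaded(100, 3, lambda e: 0)          # every read goes to bookie 0
scattered = entries_loaded(100, 3, lambda e: e % 3)   # reads spread over 3 bookies
# sticky == 100, scattered == 297
```

Under these assumptions the scattered client causes almost three times as many entries to be loaded from disk for the same 100 reads.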
Read cache
Each read cache is made up of one or more segments of at most 1 GB each, and can logically be viewed as a single ring buffer. Because it is a ring buffer, the memory is preallocated and newly added entries eventually overwrite old ones. Each cache has an index of the entries it contains, for fast lookup and retrieval.
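The ring buffer idea can be sketched as follows. This is a simplified single-segment model, not BookKeeper's ReadCache: the class name, the per-entry invalidation strategy, and the in-memory dictionary index are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

EntryKey = Tuple[int, int]  # (ledger_id, entry_id)


class RingReadCache:
    """Toy ring-buffer cache: preallocated bytes plus an index of offsets."""

    def __init__(self, capacity: int):
        self.buf = bytearray(capacity)                     # allocated once, up front
        self.head = 0                                       # next write offset
        self.index: Dict[EntryKey, Tuple[int, int]] = {}    # key -> (offset, length)

    def put(self, key: EntryKey, data: bytes) -> None:
        size = len(data)
        if size > len(self.buf):
            return                                          # too large to cache
        if self.head + size > len(self.buf):
            self.head = 0                                   # wrap around
        start, end = self.head, self.head + size
        # Forget indexed entries whose bytes are about to be overwritten.
        self.index = {k: (o, n) for k, (o, n) in self.index.items()
                      if o + n <= start or o >= end}
        self.buf[start:end] = data
        self.index[key] = (start, size)
        self.head = end

    def get(self, key: EntryKey) -> Optional[bytes]:
        hit = self.index.get(key)
        if hit is None:
            return None
        offset, size = hit
        return bytes(self.buf[offset:offset + size])
```

With a capacity of 8 bytes, putting three 4-byte entries wraps the write position and overwrites the first entry, while the second stays readable.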
Read cache thrashing
If the read cache is too small for the demands of read-ahead, entries are continually evicted from the read cache before they are read. This is cache thrashing. Cache thrashing causes entries that were already read from disk to be read again to repopulate the cache.
To avoid read cache thrashing, a bookie's read cache must be large enough to hold the read-aheads of all current Pulsar consumers. BookKeeper can receive heavy read load when many Pulsar consumers sit at different positions in one topic, or across topics, and those positions fall outside the broker's own caches. In these scenarios, Pulsar has to send read requests to the bookies. The worst case is that the consumers' reads do not overlap with one another's read-aheads at all, and the total size of these disjoint read-aheads exceeds the capacity of the cache.

Figure 2. Clients (consumer objects in a Pulsar broker) evict one another's read-ahead entries before they are used.
When the amount of data that needs to be cached exceeds the cache size, more and more unread entries are evicted from the cache and have to be read from disk again.

Figure 3. The relationship between the amount of read cache needed and the average number of times each entry ends up being read (and reread) from disk. The numbers are based on a real production incident.
Read cache thrashing can be eliminated or reduced in the following ways:
• Increase the amount of memory available to the read cache;
• Reduce the read-ahead batch size.
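A toy simulation shows how thrashing arises and how a larger cache or a smaller read-ahead batch affects it. The model is an assumption-heavy sketch: each consumer reads its own ledger, consumers advance in lockstep, the cache is a FIFO measured in entries (a stand-in for the ring buffer), and `avg_disk_loads` is a name invented for the example.

```python
from collections import OrderedDict


def avg_disk_loads(consumers: int, entries_per_ledger: int,
                   batch: int, cache_capacity: int) -> float:
    """Average number of times each entry is loaded from disk."""
    cache = OrderedDict()                         # insertion order = eviction order
    loads = 0
    for entry_id in range(entries_per_ledger):    # consumers advance in lockstep
        for ledger_id in range(consumers):         # one ledger per consumer
            if (ledger_id, entry_id) in cache:
                continue                           # read cache hit
            # Miss: load the entry plus up to `batch` read-ahead entries.
            last = min(entry_id + batch, entries_per_ledger - 1)
            for e in range(entry_id, last + 1):
                loads += 1
                cache[(ledger_id, e)] = None
                cache.move_to_end((ledger_id, e))
                if len(cache) > cache_capacity:
                    cache.popitem(last=False)      # evict the oldest entry
    return loads / (consumers * entries_per_ledger)


thrashing = avg_disk_loads(4, 100, batch=9, cache_capacity=20)
bigger_cache = avg_disk_loads(4, 100, batch=9, cache_capacity=1000)
smaller_batch = avg_disk_loads(4, 100, batch=4, cache_capacity=20)
```

In this model, four consumers with a batch of 9 need room for 40 entries, so a 20-entry cache evicts read-ahead entries before they are used and each entry is loaded more than nine times on average. Enlarging the cache or shrinking the batch so that all read-aheads fit brings the average back to one load per entry.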
Summary
Leaving aside generally high CPU utilization, bottlenecks in the read path are most likely caused by disk I/O. The simple case is that the underlying storage volume is not fast enough to meet demand. Another cause can be too few read threads to push the volume to its limit. Finally, there is read cache thrashing, which sharply increases the amount of disk I/O so that the storage volume reaches its limit sooner than it otherwise would.
In the next article [1], we will look at the back-pressure mechanisms BookKeeper uses to protect itself from overload.
Related reading
• Apache BookKeeper Insights (Part 1): External Consensus and Dynamic Membership
• In-Depth Analysis of Apache BookKeeper, Part 2: How Writes Work
• In-Depth Analysis of Apache BookKeeper, Part 1: Architecture
Reference links
[1] Next article: https://medium.com/splunk-maas/apache-bookkeeper-internals-part-4-back-pressure-7847bd6d1257
This article comes from the WeChat official account ApachePulsar (ApachePulsar).