Mit6.824-lab2D-2022 (a detailed explanation of log compaction)
2022-06-10 10:51:00 【Xingping XP】
Preface
Because I was busy with final exams, and because 2D required refactoring most of my existing code… this post comes quite a while after the previous one.
1. Background of log compaction
First we need to understand the background of log compaction: why is it needed at all?
The paper describes it as follows:
Raft's log grows during normal operation to incorporate more client requests, but in a practical system, it cannot grow without bound. As the log grows longer, it occupies more space and takes more time to replay. This will eventually cause availability problems without some mechanism to discard obsolete information that has accumulated in the log.
In other words: Raft's log grows during normal operation to accommodate more client requests, but in a practical system it cannot grow indefinitely. As the log grows longer, it occupies more space and takes more time to replay. Without some mechanism to discard the stale information that accumulates in the log, this eventually leads to availability problems.
So we use a snapshot as a simple way to implement log compaction.
2. Understanding Snapshot
2.1 How snapshotting works
Figure 12 in the paper illustrates very well how a snapshot works:
The figure is actually easy to understand. Suppose the log stores updates to two variables, x and y. The updates to x are, in order, 3, 2, 0, 5; the updates to y are 1, 9, 7. Entries 1 through 5 have been committed, so the node no longer needs those log entries. The snapshot therefore stores only the final state they produce, namely x=0 and y=9, together with the index of the last log entry the snapshot covers (the last included index) and that entry's term. The new log then only needs the uncommitted entries 6 and 7, and its length shrinks from 7 to 2.
Note that snapshotting is decided per Raft node: every node maintains its own snapshot, and a snapshot always corresponds to the committed prefix of the log.
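Before going further, here is a minimal sketch of the snapshot-related state a node might carry; the field names match the code later in this post, while the rest of the Raft struct is assumed:

import "sync"

type LogEntry struct {
    Term    int
    Command interface{}
}

// Sketch only: just the fields the snapshot logic touches. The other
// Raft fields (currentTerm, votedFor, peers, persister, ...) are assumed.
type Raft struct {
    mu          sync.Mutex
    logs        []LogEntry // trimmed log; index 0 is a placeholder entry
    commitIndex int
    lastApplied int
    // 2D additions: metadata describing the compacted prefix.
    lastIncludeIndex int // global index of the last entry the snapshot covers
    lastIncludeTerm  int // term of that entry
}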
2.2 How Snapshot relates to AppendEntries
Going back to Raft's log replication, recall how a commit happens: the leader sends entries to every node for synchronization, collects the RPC results, and updates matchIndex. Once more than half of the nodes have replicated an entry, the leader advances commitIndex to the latest majority-matched index, the apply ticker applies the entry, and the next heartbeat carries the new commitIndex to update the followers.
As a result, a minority of nodes (for example, nodes behind a network partition or recovering from a crash) may still be missing entries that the leader has already committed, compacted into its snapshot, and discarded. In that case the leader must send its own snapshot and install it on those followers.
2.3 Snapshot trade-offs
First, the original description:
There are two more issues that impact snapshotting performance. First, servers must decide when to snapshot. If a server snapshots too often, it wastes disk bandwidth and energy; if it snapshots too infrequently, it risks exhausting its storage capacity, and it increases the time required to replay the log during restarts. One simple strategy is to take a snapshot when the log reaches a fixed size in bytes. If this size is set to be significantly larger than the expected size of a snapshot, then the disk bandwidth overhead for snapshotting will be small.
The second performance issue is that writing a snapshot can take a significant amount of time, and we do not want this to delay normal operations. The solution is to use copy-on-write techniques so that new updates can be accepted without impacting the snapshot being written. For example, state machines built with functional data structures naturally support this. Alternatively, the operating system's copy-on-write support (e.g., fork on Linux) can be used to create an in-memory snapshot of the entire state machine (our implementation uses this approach).
As the paper notes, snapshotting raises two issues:
- Snapshotting too frequently wastes disk bandwidth and energy, while snapshotting too rarely risks exhausting storage and lengthens log replay on restart, so a sensible trigger is needed. The strategy suggested in the paper is to take a snapshot once the log exceeds a fixed size in bytes.
- Writing a snapshot takes a significant amount of time, and it must not delay normal operation, since the write would otherwise contend with updates under a lock. The paper's answer is the classic copy-on-write technique: while the snapshot of the old state is being written, new updates are accepted against a copy, e.g. via functional data structures or the operating system's fork.
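To make the first point concrete, here is a minimal sketch of the size-based trigger as it might look in a service layer sitting on top of Raft. The KVServer shape, the maxRaftState threshold, and the encodeState helper are assumptions for illustration (this pattern actually shows up later, in lab3), while persister.RaftStateSize() is provided by the lab's persister:

// Sketch only: a service-side, size-based snapshot trigger.
// KVServer, maxRaftState, and encodeState are assumed names.
type KVServer struct {
    rf           *Raft
    persister    *Persister
    maxRaftState int // threshold in bytes; < 0 disables snapshotting
}

func (kv *KVServer) maybeSnapshot(appliedIndex int) {
    if kv.maxRaftState < 0 {
        return
    }
    // Take a snapshot once the persisted Raft state grows past the threshold.
    if kv.persister.RaftStateSize() > kv.maxRaftState {
        snapshot := kv.encodeState() // serialize the state machine (assumed helper)
        kv.rf.Snapshot(appliedIndex, snapshot)
    }
}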
3. Lab2D
3.1 The interaction flow
Lab 2D does not actually require solving those performance issues. Let's start from the interaction diagram provided in the official lab guide.
The diagram is also a good way to review the whole picture of lab2. I will give a general walkthrough of the complete flow later; here I only describe the snapshot part.
As the diagram shows, within a single node the Service layer calls two functions on the Raft node, Snapshot and CondInstallSnapshot, and these are what we implement in this lab. The Raft node must also persist its snapshot via the SaveStateAndSnapshot function, and the Service reads that persisted snapshot back with ReadSnapshot. Both persistence functions are already provided for us in the lab's persister.go.
Between nodes, snapshots are exchanged through the InstallSnapshot RPC; the direction depends on who is the leader.
3.2 Snapshot and CondInstallSnapshot
- In lab2 we implement these two functions ourselves; let's go over what each one does and how.
First, the Snapshot function. The paper has a paragraph about exactly this:
This snapshotting approach departs from Raft's strong-leader principle, since followers can take snapshots without the knowledge of the leader. However, we think this departure is justified. While having a leader helps avoid conflicting decisions in reaching consensus, consensus has already been reached when snapshotting, so no decisions conflict. Data still only flows from leaders to followers, just followers can now reorganize their data.
Because Snapshot is called by the service on the Raft node, the node updates its own snapshot. One might object that this violates Raft's strong-leader principle, since followers can take snapshots without the leader's knowledge. But this is reasonable: a snapshot only compacts data that has already been agreed upon, so it cannot conflict with reaching consensus. Data still flows only from the leader downward; followers merely use snapshots to reduce their own storage burden.
Now let's look at the concrete implementation:
// Snapshot: the service says it has created a snapshot that has
// all info up to and including index. This means the service no
// longer needs the log through (and including) that index.
// Raft should now trim its log as much as possible.
//
// index is the index the snapshot has been applied up to; snapshot is
// the raw byte stream handed down by the service, covering everything
// up to and including index.
// This function discards the log entries covered by the snapshot,
// stores the snapshot data, and updates the snapshot index. It is the
// peer's own voluntary compaction and does not conflict with the
// leader sending snapshots.
func (rf *Raft) Snapshot(index int, snapshot []byte) {
    // Your code here (2D).
    if rf.killed() {
        return
    }
    rf.mu.Lock()
    defer rf.mu.Unlock()
    // A snapshot point must already be committed, and there is nothing
    // to do if our existing snapshot already covers index.
    if rf.lastIncludeIndex >= index || index > rf.commitIndex {
        return
    }
    // Rebuild the log, keeping only the entries after index.
    sLogs := make([]LogEntry, 0)
    sLogs = append(sLogs, LogEntry{}) // placeholder at position 0
    for i := index + 1; i <= rf.getLastIndex(); i++ {
        sLogs = append(sLogs, rf.restoreLog(i))
    }
    // Update the snapshot index and term.
    if index == rf.getLastIndex()+1 {
        rf.lastIncludeTerm = rf.getLastTerm()
    } else {
        rf.lastIncludeTerm = rf.restoreLogTerm(index)
    }
    rf.lastIncludeIndex = index
    rf.logs = sLogs
    // The snapshot is already-applied state, so commitIndex and
    // lastApplied must be at least index.
    if index > rf.commitIndex {
        rf.commitIndex = index
    }
    if index > rf.lastApplied {
        rf.lastApplied = index
    }
    // Persist both the Raft state and the snapshot.
    rf.persister.SaveStateAndSnapshot(rf.persistData(), snapshot)
}
The detailed explanation is in the comments; it is not hard to see what the function does. It is worth highlighting persister.SaveStateAndSnapshot(): as the lab introduction explains, if the node is killed and crashes, this is the persisted data it needs when it restarts.
- The CondInstallSnapshot function: read together with the lab introduction, this one is easy to understand too. The background is that an installed snapshot has to be delivered to the service on applyCh, while AppendEntries is delivering log entries on the same channel, which could conflict. The introduction describes the function's purpose this way:
to avoid the requirement that snapshots and log entries sent on applyCh are coordinated.
In practice, as long as the apply path is properly synchronized (protected by a mutex), the problem never arises. That is why the lab notes this API is deprecated and discourages implementing it: simply return true and move on.
// CondInstallSnapshot
// A service wants to switch to snapshot. Only do so if Raft hasn't
// had more recent info since it communicated the snapshot on applyCh.
func (rf *Raft) CondInstallSnapshot(lastIncludedTerm int, lastIncludedIndex int, snapshot []byte) bool {
    // Your code here (2D).
    return true
}
3.3 The InstallSnapshot RPC
This is the leader-to-follower arrow in the interaction diagram. As with lab2A and lab2B, the paper gives a figure (Figure 13) to implement from.
A brief note on when a snapshot needs to be sent.
As mentioned earlier, the leader sends a snapshot when the log entries a follower still needs have already been compacted away on the leader. So the call site belongs in the log-replication path, with the check rf.nextIndex[server]-1 < rf.lastIncludeIndex, i.e. the follower is behind the leader's own snapshot point:
// If rf.nextIndex[server]-1 is below lastIncludeIndex, the follower's
// log is behind our snapshot point (it is already lagging), so send
// our snapshot instead of log entries.
if rf.nextIndex[server]-1 < rf.lastIncludeIndex {
    go rf.leaderSendSnapShot(server)
    rf.mu.Unlock()
    return
}
- The RPC structures:
type InstallSnapshotArgs struct {
    Term             int    // the sending leader's term
    LeaderId         int    // the requesting leader's id
    LastIncludeIndex int    // the snapshot replaces all entries up through and including this index
    LastIncludeTerm  int    // term of the entry at LastIncludeIndex
    Data             []byte // raw byte stream of the snapshot chunk
    //Done bool
}
type InstallSnapshotReply struct {
    Term int // the receiver's currentTerm, for the leader to update itself
}
For the parameter design, see the introduction: some fields from the paper, such as offset for splitting the snapshot into chunks, are not required in this experiment.
- The RPC handler implementation:
// InstallSnapShot RPC handler
func (rf *Raft) InstallSnapShot(args *InstallSnapshotArgs, reply *InstallSnapshotReply) {
    rf.mu.Lock()
    if rf.currentTerm > args.Term {
        reply.Term = rf.currentTerm
        rf.mu.Unlock()
        return
    }
    rf.currentTerm = args.Term
    reply.Term = args.Term
    rf.status = Follower
    rf.votedFor = -1
    rf.voteNum = 0
    rf.persist()
    rf.votedTimer = time.Now()
    if rf.lastIncludeIndex >= args.LastIncludeIndex {
        rf.mu.Unlock()
        return
    }
    // Trim the log: everything up to LastIncludeIndex is covered by the
    // snapshot, so keep only the entries after it.
    index := args.LastIncludeIndex
    tempLog := make([]LogEntry, 0)
    tempLog = append(tempLog, LogEntry{}) // placeholder at position 0
    for i := index + 1; i <= rf.getLastIndex(); i++ {
        tempLog = append(tempLog, rf.restoreLog(i))
    }
    rf.lastIncludeTerm = args.LastIncludeTerm
    rf.lastIncludeIndex = args.LastIncludeIndex
    rf.logs = tempLog
    if index > rf.commitIndex {
        rf.commitIndex = index
    }
    if index > rf.lastApplied {
        rf.lastApplied = index
    }
    // Persist the state and the snapshot on every install.
    rf.persister.SaveStateAndSnapshot(rf.persistData(), args.Data)
    msg := ApplyMsg{
        SnapshotValid: true,
        Snapshot:      args.Data,
        SnapshotTerm:  rf.lastIncludeTerm,
        SnapshotIndex: rf.lastIncludeIndex,
    }
    rf.mu.Unlock()
    rf.applyChan <- msg
}
Here, the received Data can simply be applied directly; the original log may still contain entries beyond the snapshot point, and those are split off and kept. Note that every snapshot install must be persisted, as shown in the interaction diagram.
3.4 Index-offset tips
Index bookkeeping is the most tedious part of 2D: every log access must account for the snapshot offset. In the example shown earlier, after entries 1 through 5 of a 7-entry log are compacted into a snapshot, the real slice index of what used to be entry 7 is no longer 7 but 2, so the global index must be recovered by applying the snapshot offset. I recommend wrapping this translation in small helper functions; it makes the 2D changes far more readable.
4. DEBUG gossip
- Finishing lab2D actually took quite a long time. Although the index rework in 2D is the most tedious part of all of lab2, that is not what makes it hardest: 2D requires regression testing against 2A, 2B, and 2C, and several of those tests initially failed. I ran into plenty of bugs; writing this post some time after my final exams, let me recall the ones I still remember.
- My earlier code for 2A/2B/2C used select for asynchrony. It turns out select can hide blocking, deadlocks, and resource leaks, plus contention between multiple tickers; not only does it hurt readability, it can also make tests fail on timeouts. I found this by patiently printing logs: at the failing tests, a node kept looping on "send a election", and two rf nodes never received each other's replies. Each held one vote, so no leader could be elected. The nodes turned out to be stuck on the apply chan, which traced back to a select in the ticker that sent while holding the lock. I ended up reworking the ticker entirely, splitting it into three: one for applying committed entries, one for elections, and one for log replication (see the applier sketch after this list).
- Another problem: to pass both 2C's figure8 tests and 2B's "RPC counts aren't too high" test, pay close attention to the sleep intervals of the three tickers and how they coordinate. If the timing is mistuned, tests can fail outright, or one passes while another fails.
- Once the previous three experiments pass again, the key points of the fourth are the timing of sending snapshots, the index-offset changes, and what else needs to be persisted. For example, the persist from lab2C now needs two more fields: the snapshot's last included index and term.
Summary
Honestly, sticking with this series to the end has not been easy. 2C passed partly on luck, and 2D went through many revisions. Writing the three RPCs down carefully, there actually are not that many cases; the paper's own description remains the most concise reference. I may later write a post-lab2 summary of distributed systems takeaways. Then in June I will settle down, grind algorithm problems, and get back to consolidating Java.
- About tests: (screenshot of the passing test results omitted)
- About gitee: gitee6.824-2022-lab2D