当前位置:网站首页>Memtable for leveldb source code analysis
Memtable for leveldb source code analysis
2022-07-01 05:18:00 【Ethan97】
Preface
LevelDB It's a google Very efficient kv database .
LevelDB Adopted LSM tree It achieves efficient write operation and good read performance .LSM tree namely The Log-Structured Merge-Tree, It fully considers the slow random read and write speed of the disk , As far as possible, it ensures that the key value is only written to the disk in sequence during the insertion process , Without performing seek 、 Expensive operations such as moving heads .LSM tree It mainly consists of two data structures , One is memory resident memtable, The second is saved to disk sstable.LSM tree Introduction to , See also introduction to LSM.
- LevelDB Maintain in memory
memtableRecord recent writes (kTypeValue) And delete operations (kTypeDeletion); - When in memory
memtableAfter reaching a certain size , It will be saved assstable.
In this paper, LevelDB Of memtable Realization .
memtable
memtable Maintenance recently occurred in LevelDB Write operations for . Its bottom layer adopts a kind of O(logn) Find and insert data structures with time complexity :Skip List, The following is a brief introduction to its concept .

Skip List There are multiple layers , At the bottom is a well ordered linked list .
Skip list Lookup
- As shown in the figure above , Suppose you want to find 9, When looking for , From the top , Find the first one that is just less than 9 The node of , Here for 1;
- Because of the node 1 There are no more nodes after , Go down one floor , Repeat the process , Arrival node 6;
- Go on down one floor , And then look forward to , arrive 9, That is, find the target node .
Skip list Insertion
Skip list The process of inserting is similar to that of searching , But in the process of finding the insertion location , You need to record the nodes you pass “ route ”, When the insertion position is found , Walk vertically to the ground floor , Insert the new node into . Then go back layer by layer “ Node path ”, The first i+1 The nodes of the layer have a fixed probability p Extend to the upper layer .
On average , Each node appears in 1/(1-p) In layers .
memtable Realization
MemTable Class provides the API
// Auto increment reference count .
void Ref() {
++refs_; }
// Drop reference count. Delete if no more references exist.
void Unref() {
--refs_;
assert(refs_ >= 0);
if (refs_ <= 0) {
delete this;
}
}
// Back here memtable About the amount of memory space used .
size_t ApproximateMemoryUsage();
// memtable The iterator .
// The caller needs to ensure that when the iterator is alive memtable Also alive .
// This iterator returns the encoded internal key (internal key).
Iterator* NewIterator();
// Go to memtable Add a key value pair to the , Specify the serial number SequenceNumber And type .
// The type is kTypeValue Indicates that the key value pair is inserted , The type is kTypeDeletion Indicates that the key value pair is deleted .
void Add(SequenceNumber seq, ValueType type, const Slice& key,
const Slice& value);
// According to the key from memtable Get value , The value will be saved in value variable , And back to true.
// If memtable Contains a key Delete tag for , Save one NotFound() To variable s And in return true.
// Other cases return to false.
bool Get(const LookupKey& key, std::string* value, Status* s);
MemTable The member variables of are as follows :
typedef SkipList<const char*, KeyComparator> Table;
// Key comparator
KeyComparator comparator_;
// Reference counter
int refs_;
// Memory allocator
Arena arena_;
// At the bottom skip list
Table table_;
Add
Add Will a entry Insert into memtable in .
One entry The memory layout of is as follows :

here key The type is Internal Key.LevelDB There are three key:
- Lookup Key
- Internal Key
- User Key
User Key It is directly used for reading and writing databases key, Usually use one Slice Express , Such as :
db->Put(leveldb::WriteOptions(), "hello", "LevelDB"); // here "hello" namely User Key.
Internal Key It is in User Key Two fields are expanded on the basis of .Lookup Key And in the Internal Key A field is added to the header of the . The relationship among them is shown in the figure below :

Add The code implementation is as follows :
void MemTable::Add(SequenceNumber s, ValueType type, const Slice& key,
const Slice& value) {
// Format of an entry is concatenation of:
// key_size : varint32 of internal_key.size()
// key bytes : char[internal_key.size()]
// tag : uint64((sequence << 8) | type)
// value_size : varint32 of value.size()
// value bytes : char[value.size()]
size_t key_size = key.size();
size_t val_size = value.size();
size_t internal_key_size = key_size + 8; // Internal Key The occupied memory size is User Key + 8 bytes, See the figure above for details
const size_t encoded_len = VarintLength(internal_key_size) +
internal_key_size + VarintLength(val_size) +
val_size;
// to entry Allocate space .
char* buf = arena_.Allocate(encoded_len);
// take key_size Write it here entry in .
char* p = EncodeVarint32(buf, internal_key_size);
// take key Write it here entry in .
std::memcpy(p, key.data(), key_size);
p += key_size;
// take tag Write it here entry in ,tag Low with serial number 56 Bit and operation type ( Insert / Delete , Octet ).
EncodeFixed64(p, (s << 8) | type);
p += 8;
// take val_size Write it here entry in .
p = EncodeVarint32(p, val_size);
// take value Write it here entry in .
std::memcpy(p, value.data(), val_size);
assert(p + val_size == buf + encoded_len);
// entry Build complete , Insert it into skip list in .
table_.Insert(buf);
}
Add Actually according to entry The memory layout of puts the fields in the allocated buf In memory space , Then call it. skip list Methods of member variables table_.Insert(buf) Will construct entry Insert into skip list in .
Get
Get According to the key LookupKey stay memtable Search for entry.
Get The code is as follows :
bool MemTable::Get(const LookupKey& key, std::string* value, Status* s) {
Slice memkey = key.memtable_key();
// Declared a skip list Iterator on .
Table::Iterator iter(&table_);
iter.Seek(memkey.data());
if (iter.Valid()) {
// entry format is:
// klength varint32
// userkey char[klength]
// tag uint64
// vlength varint32
// value char[vlength]
// Check that it belongs to the same user key.
// Do not check the serial number , because Seek() The sequence number is too large entry.
// Iterate over the next key .
const char* entry = iter.key();
// from entry Middle readout key_length.
uint32_t key_length;
// GetVarint32Ptr from entry read out key_length, And return the position of the pointer after reading , namely key The address of .
const char* key_ptr = GetVarint32Ptr(entry, entry + 5, &key_length);
// Yes entry Medium user key and lookup key Medium user key Compare .
if (comparator_.comparator.user_comparator()->Compare(
Slice(key_ptr, key_length - 8), key.user_key()) == 0) {
// Found the right one user key.
// Take it out and follow it user key After tag, Serial number and value type .
const uint64_t tag = DecodeFixed64(key_ptr + key_length - 8);
// Take out tag low 8 The value of a , namely value type.
switch (static_cast<ValueType>(tag & 0xff)) {
case kTypeValue: {
// key_ptr + key_length That is, the position of the value length field ,GetLengthPrefixedSlice from entry It takes out the value v.
Slice v = GetLengthPrefixedSlice(key_ptr + key_length);
// Will value v Assign to value After the return true.
value->assign(v.data(), v.size());
return true;
}
case kTypeDeletion:
// The key has been deleted , return not found.
*s = Status::NotFound(Slice());
return true;
}
}
}
return false;
}
take memtable writes sstable
When one memtable Beyond a certain size ( Default 4MB) when , It will create a new one memtable And log files , And direct future updates to this new memtable And log files .
Perform the following work in the background :
- Put the previous
memtableIt sayssstable; - Discard old
memtable; - Delete old log files and
memtable; - New
sstableAdd to level-0.
The call chain is as follows :
Put() -> Write() -> MakeRoomForWrite() -> MaybeScheduleCompaction() ->
env_->Schedule(&DBImpl::BGWork, this) Asynchronous start BGWork() -> BackgroundCall() ->
BackgroundCompaction() -> CompactMemTable()
Reference material
边栏推荐
- 如何选择导电滑环材料
- Thread process foundation of JUC
- 【暑期每日一題】洛穀 P1568 賽跑
- Unit testing with mongodb
- Tcp/ip explanation (version 2) notes / 3 link layer / 3.2 Ethernet and IEEE 802 lan/man standards
- Rainbow combines neuvector to practice container safety management
- 每日一题-LeetCode1175-质数排列-数学
- Distributed transactions - Solutions
- How to meet the requirements of source code confidentiality and source code security management
- CockroachDB 分布式事务源码分析之 TxnCoordSender
猜你喜欢

Detailed explanation of distributed global unique ID solution
![Solution: thread 1:[< *> setvalue:forundefined key]: this class is not key value coding compliant for the key*](/img/88/0b99d1db2cdc70ab72d2b3c623dfaa.jpg)
Solution: thread 1:[< *> setvalue:forundefined key]: this class is not key value coding compliant for the key*

Things generated by busybox

LevelDB源码分析之LRU Cache
![[data recovery in North Asia] a data recovery case of raid crash caused by hard disk drop during data synchronization of hot spare disk of RAID5 disk array](/img/22/606ff1e8dad3d5896b32d2146b0477.jpg)
[data recovery in North Asia] a data recovery case of raid crash caused by hard disk drop during data synchronization of hot spare disk of RAID5 disk array

实战:redux的基本使用

导电滑环短路的原因以及应对措施

【暑期每日一题】洛谷 P5886 Hello, 2020!

Manually implement a simple stack

How to hide browser network IP address and modify IP internet access?
随机推荐
Application of industrial conductive slip ring
Principle, technology and implementation scheme of data consistency in distributed database
Usage and principle of synchronized
Global and Chinese market of high-end home theater 2022-2028: Research Report on technology, participants, trends, market size and share
QDataStream的簡單讀寫驗證
Global and Chinese market of 3D CAD 2022-2028: Research Report on technology, participants, trends, market size and share
Receiving package install and uninstall events
Precautions for use of conductive slip ring
Flutter 实现每次进来界面都刷新数据
What can the points mall Games bring to businesses? How to build a points mall?
Day 05 - file operation function
Solution: drag the Xib control to the code file, and an error setvalue:forundefined key:this class is not key value coding compliant for the key is reported
Global and Chinese markets of Ethernet communication modules 2022-2028: Research Report on technology, participants, trends, market size and share
複制寶貝提示材質不能為空,如何解决?
How to select conductive slip ring material
[Yugong series] February 2022 Net architecture class 005 ABP vNext Net core web application getting started configuration
Tcp/ip explanation (version 2) notes / 3 link layer / 3.2 Ethernet and IEEE 802 lan/man standards
[summer daily question] Luogu p5886 Hello, 2020!
导电滑环使用的注意事项
Mathematical knowledge: finding the number of divisors