Preface
This experiment has two tasks : Clock replacement algorithm and buffer pool manager , They correspond to each other ClockReplacer and BufferPoolManager class ,BufferPoolManager Will use ClockReplacer Select the page to be swapped out , And pass DiskManager Write the swapped out page to the database file . The following describes the implementation process of these two classes .
Code implementation
If you clone directly Bustub Warehouse , Get is fall 2021 Experimental code , about fall 2019, Can be commit Switch to 5972018: Fix typo in type.cpp(#66). But this introduces a pit , It's the need to put build_support/gtest_CMakeLists.txt.in Is changed to :
cmake_minimum_required(VERSION 3.8)
project(googletest-download NONE)
include(ExternalProject)
ExternalProject_Add(googletest
GIT_REPOSITORY [email protected]:google/googletest.git
GIT_TAG main
SOURCE_DIR "${CMAKE_BINARY_DIR}/googletest-src"
BINARY_DIR "${CMAKE_BINARY_DIR}/googletest-build"
CONFIGURE_COMMAND ""
BUILD_COMMAND ""
INSTALL_COMMAND ""
TEST_COMMAND ""
)
The main changes here are GIT_TAG by main, because googletest The warehouse seems to be master The branch was renamed main 了 .
ClockReplacer class
The project homepage introduces the implementation of this class :
The size of the
ClockReplaceris the same as buffer pool since it contains placeholders for all of the frames in theBufferPoolManager. However, not all the frames are considered as in theClockReplacer. TheClockReplaceris initialized to have no frame in it. Then, only the newly unpinned ones will be considered in theClockReplacer. Adding a frame to or removing a frame from a replacer is implemented by changing a reference bit of a frame. The clock hand initially points to the placeholder of frame 0. For each frame, you need to track two things: 1. Is this frame currently in theClockReplacer? 2. Has this frame recently been unpinned (ref flag)?In some scenarios, the two are the same. For example, when you unpin a page, both of the above are true. However, the frame stays in the
ClockReplaceruntil it is pinned or victimized, but its ref flag is modified by the clock hand.
A simple translation , Namely ClockReplacer Class maintains a frame Set , The collection size is consistent with the buffer pool size . Due to some frame Being accessed by another thread , these frame Of pin count ( Equal to the number of threads accessing the frame ) Will be greater than the 0, Now these frame It is not allowed to be replaced , To put it another way , That's all. frame be not in ClockReplacer In the set of maintenance . For those that can be replaced frame, It must meet two conditions :
pin countby 0, That is, the frame is inClockReplacerin . Once the... Of a framepin countGreater than zero , Will be removedClockReplacer( callClockReplacer::Pin)reference bitbyfalse, That is, the frame has not been accessed recently . aboutpin countJust become 0 And be joinedClockReplacerIn terms of frames , Because it has just been visited , So itsreference bitbytrue( callClockReplacer::Unpin)
As for the process of clock replacement algorithm , In fact, it is from... In order frame Select a process that meets the above exchange conditions from the set . In order to maintain the position of the clock pointer and ensure thread safety , You need to add a clock pointer member clock_hand_ And a read-write lock mutex_, Frame set frames_ Each element of the represents whether the frame is in ClockReplacer And reference bit:
/**
* ClockReplacer implements the clock replacement policy, which approximates the Least Recently Used policy.
*/
class ClockReplacer : public Replacer {
public:
/**
* Create a new ClockReplacer.
* @param num_pages the maximum number of pages the ClockReplacer will be required to store
*/
explicit ClockReplacer(size_t num_pages);
/**
* Destroys the ClockReplacer.
*/
~ClockReplacer() override;
bool Victim(frame_id_t *frame_id) override;
void Pin(frame_id_t frame_id) override;
void Unpin(frame_id_t frame_id) override;
size_t Size() override;
private:
frame_id_t clock_hand_ = 0;
std::vector<std::tuple<bool, bool>> frames_;
std::shared_mutex mutex_;
};
Each method is defined as follows , It uses std::lock_guard To ensure that the code is abnormally safe :
ClockReplacer::ClockReplacer(size_t num_pages) {
for (size_t i = 0; i < num_pages; ++i) {
frames_.push_back(std::make_tuple(false, false));
}
}
ClockReplacer::~ClockReplacer() = default;
bool ClockReplacer::Victim(frame_id_t *frame_id) {
if (Size() == 0) {
return false;
}
std::lock_guard<std::shared_mutex> lock(mutex_);
while (true) {
auto &[contains, ref] = frames_[clock_hand_];
if (contains) {
if (ref) {
ref = false;
} else {
*frame_id = clock_hand_;
contains = false;
return true;
}
}
clock_hand_ = (clock_hand_ + 1) % frames_.size();
}
}
void ClockReplacer::Pin(frame_id_t frame_id) {
assert(static_cast<size_t>(frame_id) < frames_.size());
std::lock_guard<std::shared_mutex> lock(mutex_);
auto &[contains, ref] = frames_[frame_id];
contains = false;
ref = false;
}
void ClockReplacer::Unpin(frame_id_t frame_id) {
assert(static_cast<size_t>(frame_id) < frames_.size());
std::lock_guard<std::shared_mutex> lock(mutex_);
auto &[contains, ref] = frames_[frame_id];
contains = true;
ref = true;
}
size_t ClockReplacer::Size() {
std::shared_lock<std::shared_mutex> lock(mutex_);
size_t size = 0;
for (auto &[contains, ref] : frames_) {
size += contains;
}
return size;
}
Enter the command at the terminal :
mkdir build
cd build
cmake ..
make clock_replacer_test
./test/clock_replacer_test
The test results are as follows :

BufferPoolManager class
Here, the mutex lock is replaced by a read-write lock , To protect page_table_、pages_ and free_list_, At the same time, an auxiliary function is introduced GetVictimFrameId():
class BufferPoolManager {
// Omitted code
protected:
/**
* select a victim frame from the free list or replacer.
* @return the frame id, INVALID_PAGE_ID if the victim could not be found
*/
frame_id_t GetVictimFrameId();
/** This latch protects shared data structures. We recommend updating this comment to describe what it protects. */
std::shared_mutex latch_;
};
BufferPoolManager Class requires us to implement five functions :
FetchPageImpl(page_id)NewPageImpl(page_id)UnpinPageImpl(page_id, is_dirty)FlushPageImpl(page_id)DeletePageImpl(page_id)FlushAllPagesImpl()
The following functions will be implemented one by one .
FetchPageImpl(page_id)
This function implements the main functions of the buffer pool : Provide the specified to the upper layer page. The buffer pool manager starts with page_table_ Search for page_id Does the key exist :
- If it exists, it is based on
page_idCorrespondingframe_idFrom buffer poolpages_Take outpage - If it doesn't exist, go through
GetVictimFrameId()Function to select the frame to be swapped out , This function starts withfree_list_Find the empty bit of buffer pool in , If you can't find a vacant seat, you have to rely on the previous section to realizeClockReplacerSelect the wronged leader who has been replaced
The specific code is as follows :
Page *BufferPoolManager::FetchPageImpl(page_id_t page_id) {
// 1. Search the page table for the requested page (P).
std::lock_guard<std::shared_mutex> lock(latch_);
Page *page;
// 1.1 If P exists, pin it and return it immediately.
auto it = page_table_.find(page_id);
if (it != page_table_.end()) {
page = &pages_[it->second];
if (page->pin_count_++ == 0) {
replacer_->Pin(it->second);
}
return page;
}
// 1.2 If P does not exist, find a replacement page (R) from either the free list or the replacer.
// Note that pages are always found from the free list first.
frame_id_t frame_id = GetVictimFrameId();
if (frame_id == INVALID_PAGE_ID) {
return nullptr;
}
// 2. If R is dirty, write it back to the disk.
page = &pages_[frame_id];
if (page->IsDirty()) {
disk_manager_->WritePage(page->page_id_, page->data_);
}
// 3. Delete R from the page table and insert P.
page_table_.erase(page->GetPageId());
page_table_[page_id] = frame_id;
// 4. Update P's metadata, read in the page content from disk, and then return a pointer to P.
disk_manager_->ReadPage(page_id, page->data_);
page->update(page_id, 1, false);
replacer_->Pin(frame_id);
return page;
}
frame_id_t BufferPoolManager::GetVictimFrameId() {
frame_id_t frame_id;
if (!free_list_.empty()) {
frame_id = free_list_.front();
free_list_.pop_front();
} else {
if (!replacer_->Victim(&frame_id)) {
return INVALID_PAGE_ID;
}
}
return frame_id;
}
The above code also uses a Page::update Auxiliary function , Used to update the page Metadata :
/**
* update the meta data of page
* @param page_id the page id
* @param pin_count the pin count
* @param is_dirty is page dirty
* @param reset_memory whether to reset the memory of page
*/
void update(page_id_t page_id, int pin_count, bool is_dirty, bool reset_memory = false) {
page_id_ = page_id;
pin_count_ = pin_count;
is_dirty_ = is_dirty;
if (reset_memory) {
ResetMemory();
}
}
NewPageImpl(page_id)
This function inserts a new page into the buffer pool , If all the pages in the buffer pool are being accessed by threads , Insert the failure , Otherwise, it depends on GetVictimFrameId() Calculate the insertion position :
Page *BufferPoolManager::NewPageImpl(page_id_t *page_id) {
// 0. Make sure you call DiskManager::AllocatePage!
std::lock_guard<std::shared_mutex> lock(latch_);
// 1. If all the pages in the buffer pool are pinned, return nullptr.
if (free_list_.empty() && replacer_->Size() == 0) {
*page_id = INVALID_PAGE_ID;
return nullptr;
}
// 2. Pick a victim page P from either the free list or the replacer. Always pick from the free list first.
frame_id_t frame_id = GetVictimFrameId();
if (frame_id == INVALID_PAGE_ID) {
*page_id = INVALID_PAGE_ID;
return nullptr;
}
// 3. Update P's metadata, zero out memory and add P to the page table.
Page *page = &pages_[frame_id];
if (page->IsDirty()) {
disk_manager_->WritePage(page->page_id_, page->data_);
}
*page_id = disk_manager_->AllocatePage();
page_table_.erase(page->GetPageId());
page_table_[*page_id] = frame_id;
// Need to put dirty bit Set to false Can pass IsDirty The test case
page->update(*page_id, 1, true, true);
// 4. Set the page ID output parameter. Return a pointer to P.
return page;
}
DeletePageImpl(page_id)
This function deletes a... From the buffer pool and database files page, And its page_id Set to INVALID_PAGE_ID:
bool BufferPoolManager::DeletePageImpl(page_id_t page_id) {
// 0. Make sure you call DiskManager::DeallocatePage!
std::lock_guard<std::shared_mutex> lock(latch_);
// 1. search the page table for the requested page (P).
// If P does not exist, return true.
auto it = page_table_.find(page_id);
if (it == page_table_.end()) {
return true;
}
// 2. If P exists, but has a non-zero pin-count, return false. Someone is using the page.
Page &page = pages_[it->second];
if (page.pin_count_ > 0) {
return false;
}
// 3. Otherwise, P can be deleted. Remove P from the page table, reset its metadata and return it to the free list.
disk_manager_->DeallocatePage(page_id);
page_table_.erase(page_id);
page.update(INVALID_PAGE_ID, 0, false, true);
free_list_.push_back(it->second);
return true;
}
UnpinPageImpl(page_id, is_dirty)
This function is used to reduce the number of references to a page pin count, When pin_count by 0 You need to add it to ClockReplacer in :
bool BufferPoolManager::UnpinPageImpl(page_id_t page_id, bool is_dirty) {
std::lock_guard<std::shared_mutex> lock(latch_);
auto it = page_table_.find(page_id);
if (it == page_table_.end()) {
return false;
}
Page &page = pages_[it->second];
if (page.pin_count_ <= 0) {
return false;
}
// add page to replacer when the pin count is 0
page.is_dirty_ |= is_dirty;
if (--page.pin_count_ == 0) {
replacer_->Unpin(it->second);
}
return true;
}
FlushPageImpl(page_id)
If the buffer pool page Has been modified , You need to write it to disk to keep it synchronized :
bool BufferPoolManager::FlushPageImpl(page_id_t page_id) {
// Make sure you call DiskManager::WritePage!
std::shared_lock<std::shared_mutex> lock(latch_);
auto it = page_table_.find(page_id);
if (it == page_table_.end()) {
return false;
}
// write page to disk if it's dirty
Page &page = pages_[it->second];
if (page.IsDirty()) {
disk_manager_->WritePage(page_id, pages_[it->second].data_);
page.is_dirty_ = false;
}
return true;
}
FlushAllPagesImpl()
This function will buffer all page Write to disk :
void BufferPoolManager::FlushAllPagesImpl() {
// You can do it!
std::lock_guard<std::shared_mutex> lock(latch_);
for (size_t i = 0; i < pool_size_; ++i) {
Page &page = pages_[i];
if (page.page_id_ != INVALID_PAGE_ID && page.IsDirty()) {
disk_manager_->WritePage(i, page.data_);
page.is_dirty_ = false;
}
}
}
test
Input instructions at the terminal :
cd build
make buffer_pool_manager_test
./test/buffer_pool_manager_test
The test results are as follows :

summary
This experiment examines students' understanding of concurrency and STL The degree of mastery , Because the implementation steps are listed in the comments ( What's the trouble You can do it! notes ), So the code can be written smoothly , above ~~









