Masterless replication system (1) - write DB when node fails
2022-07-31 16:39:00 [Huawei Cloud]
In single-leader and multi-leader replication, the client sends a write request to a leader node, and the DB system is responsible for copying that write to the other replicas. The leader decides the order of writes, and the followers apply the leader's write log in that same order.
Some data storage systems take a different approach: they abandon the leader entirely and allow any replica to accept writes directly from clients. The earliest replicated data systems were masterless (also called decentralized or leaderless replication), but the idea was largely forgotten during the era of relational-database dominance. After Amazon used it for its in-house Dynamo system[^vi], it became a popular DB architecture once again. Riak, Cassandra, and Voldemort are open-source data stores whose leaderless replication models were inspired by Dynamo, so such databases are also known as Dynamo-style.
[^vi]: Dynamo is not available to users outside of Amazon. Confusingly, AWS offers a managed database product called DynamoDB that uses a completely different architecture: it is based on single-leader replication.
In some leaderless implementations, the client sends write requests directly to multiple replicas; in others, a coordinator node writes on behalf of the client. Unlike the leader in a leader-based database, however, the coordinator does not enforce any particular write order. As we will see, this design difference has profound implications for how the DB is used.
4.1 Writing to the DB when a node fails
Imagine a three-replica DB in which one replica is currently unavailable, perhaps because it is rebooting to install a system update. Under a leader-based replication model, a failover must be performed to continue processing writes.
Under a leaderless model, no such failover exists.
Figure-10: The client (user 1234) sends the write request to all three replicas in parallel; the two available replicas accept the write, while the unavailable replica misses it. Suppose two confirmations out of three replicas are enough: once user 1234 receives two OK responses, the write is considered successful. The fact that one replica missed the write can simply be ignored.
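The write path in Figure-10 can be sketched as a simple write quorum. This is a minimal illustration, assuming n=3 replicas and a write quorum of w=2; the dict-based replica model and the `send_write()` helper are hypothetical, not any real client API.

```python
def send_write(replica, key, value, version):
    """Pretend to send a write; an unavailable replica raises OSError."""
    if not replica["up"]:
        raise OSError("replica unavailable")
    replica["data"][key] = (value, version)
    return True

def quorum_write(replicas, key, value, version, w=2):
    acks = 0
    for replica in replicas:
        try:
            if send_write(replica, key, value, version):
                acks += 1
        except OSError:
            pass  # ignore the failed replica, as in Figure-10
    return acks >= w  # success once w replicas have confirmed

replicas = [
    {"up": True, "data": {}},
    {"up": True, "data": {}},
    {"up": False, "data": {}},  # down for a system update
]
quorum_write(replicas, "cart:1234", "milk", version=7)  # → True
```

Note that the unavailable replica simply never sees this write, which is exactly why stale reads become possible when it comes back online.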
When the failed node comes back online, clients start reading from it. Any writes that occurred while the node was down are missing from it, so reads may return stale data.
To solve this problem, a client reading from the DB does not send its request to a single replica but to several replicas in parallel. The client may then get different responses from different nodes, i.e. the latest value from one node and a stale value from another. Version numbers are used to determine which value is newer.
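The read side can be sketched the same way: query several replicas and resolve disagreement by version number. This is a hedged sketch assuming an r=2 read quorum over the same illustrative dict-based replicas; `read_one()` is a hypothetical helper.

```python
def read_one(replica, key):
    """Pretend to read from a replica; an unavailable one raises OSError."""
    if not replica["up"]:
        raise OSError("replica unavailable")
    return replica["data"].get(key)  # (value, version) or None

def quorum_read(replicas, key, r=2):
    responses = []
    for replica in replicas:
        try:
            resp = read_one(replica, key)
            if resp is not None:
                responses.append(resp)
        except OSError:
            pass
    if len(responses) < r:
        raise RuntimeError("not enough replicas answered")
    # The response carrying the highest version number is the newest value.
    return max(responses, key=lambda vv: vv[1])

replicas = [
    {"up": True, "data": {"cart:1234": ("milk", 7)}},
    {"up": True, "data": {"cart:1234": ("milk", 7)}},
    {"up": True, "data": {"cart:1234": ("eggs", 6)}},  # rejoined with stale data
]
quorum_read(replicas, "cart:1234")  # → ("milk", 7)
```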
4.1.1 Read Repair and Anti-Entropy
The replication model should ensure that all data is eventually copied to every replica. After a failed node comes back online, how does it catch up on the writes it missed?
Dynamo-style data stores use two mechanisms:
Read repair
When a client reads from multiple replicas in parallel, it can detect stale responses. In Figure-10, user 2345 gets version 6 from replica 3 but version 7 from replicas 1 and 2. The client concludes that replica 3 holds a stale value and writes the newer value back to it. This approach works well for values that are read frequently.
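Read repair can be sketched by extending the quorum read with a write-back step. This continues the illustrative dict-based replica model; the function name and data layout are assumptions for the sketch, not a real API.

```python
def read_with_repair(replicas, key):
    responses = []  # (replica, value, version) triples
    for replica in replicas:
        if replica["up"] and key in replica["data"]:
            responses.append((replica, *replica["data"][key]))
    # Resolve the conflict: the highest version number wins.
    _, newest_value, newest_version = max(responses, key=lambda t: t[2])
    # Repair: push the newest value back to replicas that answered stale.
    for replica, _, version in responses:
        if version < newest_version:
            replica["data"][key] = (newest_value, newest_version)
    return newest_value, newest_version

replicas = [
    {"up": True, "data": {"cart:2345": ("v7", 7)}},
    {"up": True, "data": {"cart:2345": ("v7", 7)}},
    {"up": True, "data": {"cart:2345": ("v6", 6)}},  # stale, as in Figure-10
]
read_with_repair(replicas, "cart:2345")  # → ("v7", 7)
```

After the call, the third replica holds version 7 again; the repair happened as a side effect of the read, which is why this mechanism only helps data that is actually read.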
Anti-entropy process
Some data stores run a background process that constantly looks for differences between replicas and copies any missing data from one replica to another. Unlike the replication log in leader-based replication, this anti-entropy process does not copy writes in any particular order, and there may be a significant delay before data is propagated.
Not all systems implement both schemes. For example, Voldemort currently has no anti-entropy process. Note that without an anti-entropy process, read repair can only fix values at the moment they are read; values that are rarely read may be missing from some replicas without ever being detected, which reduces the durability of writes.
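An anti-entropy pass can be sketched as a background sweep that compares replicas pairwise and copies across any key the destination is missing or holds at an older version. This is a deliberately naive sketch over the same illustrative dict-based replicas; real Dynamo-style systems avoid scanning all data by comparing Merkle trees instead.

```python
def anti_entropy_pass(replicas):
    """One background sweep: copy newer/missing entries between replicas.
    Note: entries are copied in arbitrary key order, not write order."""
    for src in replicas:
        for dst in replicas:
            if src is dst or not (src["up"] and dst["up"]):
                continue
            for key, (value, version) in src["data"].items():
                current = dst["data"].get(key)
                if current is None or current[1] < version:
                    dst["data"][key] = (value, version)

replicas = [
    {"up": True, "data": {"a": ("1", 1)}},
    {"up": True, "data": {"a": ("2", 2), "b": ("x", 1)}},
    {"up": True, "data": {}},  # just rejoined, missed everything
]
anti_entropy_pass(replicas)
# All three replicas now hold a=("2", 2) and b=("x", 1)
```

Unlike read repair, this sweep eventually propagates even values that are never read, which is what restores write durability for cold data.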