Introduction to Erasure Coding (EC)
2022-07-27 22:13:00 【Samooyou】
What is EC?
EC (erasure code) achieves higher data reliability with less data redundancy than multi-replica replication, but the encoding is more complex and requires considerably more computation. An erasure code can tolerate only data loss (erasure), not data tampering, which is where the name comes from.
Principle of erasure coding
An erasure code divides stored content into data blocks and parity blocks (also called check blocks). Suppose the input data is represented by the vector D1, D2, ..., D5 and the matrix B is the coding matrix. Encoding with B produces a result made up of D and C, where D is the data blocks and C is the parity blocks. Data must be encoded before it can be written to storage.
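Written out, the encoding step looks roughly like the following. This is only a sketch under assumptions the text above does not state: B is taken to be a systematic matrix (an identity block I_5 stacked on a parity sub-matrix P), and the number of parity blocks is written as m.

```latex
B \cdot D =
\begin{bmatrix} I_5 \\ P \end{bmatrix}
\begin{bmatrix} D_1 \\ D_2 \\ \vdots \\ D_5 \end{bmatrix}
=
\begin{bmatrix} D_1 \\ \vdots \\ D_5 \\ C_1 \\ \vdots \\ C_m \end{bmatrix},
\qquad
\begin{bmatrix} C_1 \\ \vdots \\ C_m \end{bmatrix}
= P \begin{bmatrix} D_1 \\ \vdots \\ D_5 \end{bmatrix}
```

With a suitably chosen P (as in Reed-Solomon codes), any 5 rows of B form an invertible matrix, so the original D can be solved for from any 5 surviving blocks.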
Redundancy comparison
Traditionally, a file is split into data blocks for storage. An EC-encoded file is instead divided into block groups, and each block group consists of data blocks and parity blocks. When blocks in a group are lost, they can be recovered from the remaining blocks in the group as long as the number of lost blocks does not exceed a certain limit. For example, RS(6,3) means that a block group consists of 6 data blocks and 3 parity blocks. The number of lost blocks that can be tolerated equals the number of parity blocks, that is, 3: as long as no more than 3 of the 9 blocks are lost, the data can be recovered by the relevant algorithm. (RS stands for Reed-Solomon code.)
Take three-replica storage and RS(6,3) as examples and compare their data redundancy.
With classic three-replica storage, a file is divided into several blocks, and each block has three replicas. Two of the three replicas are redundant, giving a redundancy rate of 200%.
With RS(6,3), 6 of the 9 blocks in a block group are data blocks, so only the 3 parity blocks are redundant, giving a redundancy rate of 50%.
In principle, therefore, EC can greatly reduce redundancy and improve storage efficiency.
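The arithmetic behind the two redundancy figures can be checked with a few lines of Python. The function names here are illustrative only and are not part of any HDFS API.

```python
def replication_overhead(replicas: int) -> float:
    """Redundant storage as a fraction of the original data for n-way replication."""
    # One copy is the data itself; every additional copy is redundancy.
    return float(replicas - 1)


def rs_overhead(data_blocks: int, parity_blocks: int) -> float:
    """Redundant storage as a fraction of the original data for RS(data, parity)."""
    # Only the parity blocks are redundant.
    return parity_blocks / data_blocks


print(f"3 replicas: {replication_overhead(3):.0%}")  # 200%
print(f"RS(6,3):    {rs_overhead(6, 3):.0%}")        # 50%
```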
Storage of EC files
EC files can be stored in two layouts: contiguous storage and striped storage (stripe cell storage). Because HDFS does not yet support the contiguous layout, the concepts that follow all concern striped storage.
Stripe unit:
An EC-encoded file is divided into stripe units. Data is stored stripe unit by stripe unit, and the stripe units are spread across multiple DataNodes (DNs).
EC encoding:
Stripe units are the input of the encoder and parity units are its output. The process of generating parity units from stripe units is called EC encoding.
Every stripe that is written is encoded to generate parity units, and these parity units are written to the parity blocks.
EC decoding:
The remaining stripe units and parity units are fed to the decoder, which produces the complete data. This process of recovering data is called decoding.
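To make the encoder/decoder roles concrete, here is a deliberately simplified sketch that uses a single XOR parity unit instead of Reed-Solomon. It is not the HDFS codec: it tolerates only one lost unit, and the function names are hypothetical. The shape of the process is the same, though: stripe units go in, a parity unit comes out, and a lost unit is rebuilt from the survivors plus parity.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length units."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(stripe_units: List[bytes]) -> bytes:
    """Encoder: stripe units in, one parity unit out."""
    return reduce(xor_bytes, stripe_units)


def decode(stripe_units: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Decoder: rebuild at most one missing stripe unit (marked as None)."""
    missing = [i for i, unit in enumerate(stripe_units) if unit is None]
    if len(missing) > 1:
        raise ValueError("a single XOR parity unit can repair only one lost unit")
    repaired = list(stripe_units)
    if missing:
        survivors = [unit for unit in stripe_units if unit is not None]
        # XOR of the parity with all survivors yields the lost unit.
        repaired[missing[0]] = reduce(xor_bytes, survivors, parity)
    return repaired


units = [b"abcd", b"efgh", b"ijkl"]
parity = encode(units)
recovered = decode([b"abcd", None, b"ijkl"], parity)
print(recovered == units)
```

Reed-Solomon generalizes this idea so that several parity units can repair several simultaneous losses.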
Architectural changes introduced by EC
NameNode (NN) extension: A block group normally contains several blocks. The developers introduced a new block naming scheme that allows the block group to be derived from the block name, which makes management at the block group level possible.
Client extension: The client can read and write in parallel. DFSStripedOutputStream manages a set of data streamers, one for each DN that stores an internal block of the current block group. The data streamers work mostly asynchronously, and a coordinator coordinates the whole write process, including finishing the current block group and allocating a new one. Writes are therefore parallel at the block level, while block groups are still written serially. On the read side, DFSStripedInputStream converts the requested byte range of the file into byte ranges of the internal blocks in the block group and then reads them in parallel. HDFS uses online EC, that is, data is encoded as it is written.
DataNode (DN) extension: Each DN runs an ECWorker task, which handles failed EC blocks in the background. Failed EC blocks are detected by the NN, which then chooses a DN to do the recovery work. Recovery tasks are passed to the DN in the heartbeat response. The process is very similar to the re-replication of failed replicated blocks. Reconstruction performs three key tasks:
1. Read data from the source nodes. The input data is read in parallel using a dedicated thread pool. Based on the EC policy, only the minimum number of input blocks needed for recovery is read.
2. Decode the data and generate the output data. Decoding the input data yields the new data and parity blocks. All missing data blocks and parity blocks are decoded together.
3. Transfer the generated blocks to the target nodes. Once decoding is complete, the recovered blocks are transferred to the target DNs.
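The byte-range conversion on the client read path (file offset to internal block and offset within that block) can be sketched as follows. This is an illustration, not the DFSStripedInputStream code: the function name locate is hypothetical, the layout is assumed to place consecutive cells on consecutive data blocks in round-robin order, and block group boundaries are ignored.

```python
from typing import Tuple


def locate(offset: int, cell_size: int, data_blocks: int) -> Tuple[int, int]:
    """Map a logical offset inside one block group to (block index, offset in block)."""
    cell_index = offset // cell_size          # which cell the byte falls into
    stripe_index = cell_index // data_blocks  # which full stripe that cell belongs to
    block_index = cell_index % data_blocks    # which internal data block holds the cell
    # Each earlier stripe contributed exactly one cell to this internal block.
    offset_in_block = stripe_index * cell_size + offset % cell_size
    return block_index, offset_in_block


MIB = 1024 * 1024
print(locate(0, MIB, 6))            # first byte of the file
print(locate(MIB, MIB, 6))          # first byte of the second cell
print(locate(6 * MIB + 5, MIB, 6))  # a byte in the second stripe
```

Because neighbouring cells land on different internal blocks, a large sequential read touches several DNs at once, which is what makes parallel reads possible.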
An EC policy encapsulates how data is encoded and decoded. Each policy is defined by the following information:
1. The EC schema, which includes the number of data blocks and parity blocks in a block group and the codec algorithm. RS(6,3), for example, means 6 data blocks and 3 parity blocks encoded with Reed-Solomon.
2. The size of the striping cell, which determines the granularity of striped reads and writes, including the client buffer size and the encoding work.
For example, RS-6-3-1024k denotes the RS codec with 6 data blocks and 3 parity blocks, where each striping cell is 1024k in size.
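A policy name of this form can be split into its parts with a short parser. The sketch below is illustrative and not an HDFS API: the ECPolicy type and parse_policy function are hypothetical, and it assumes the trailing "k" means multiples of 1024 bytes.

```python
import re
from typing import NamedTuple


class ECPolicy(NamedTuple):
    codec: str
    data_blocks: int
    parity_blocks: int
    cell_size_bytes: int


# <codec>-<data blocks>-<parity blocks>-<cell size>k
_POLICY_NAME = re.compile(r"^(.+)-(\d+)-(\d+)-(\d+)k$")


def parse_policy(name: str) -> ECPolicy:
    """Split a policy name such as RS-6-3-1024k into its components."""
    match = _POLICY_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized EC policy name: {name!r}")
    codec, data, parity, cell_k = match.groups()
    # Assumption: "k" denotes 1024 bytes.
    return ECPolicy(codec, int(data), int(parity), int(cell_k) * 1024)


print(parse_policy("RS-6-3-1024k"))
```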
EC policies and replication can coexist: a directory can be set to force replication instead of EC. Likewise, a specific EC policy is set on a directory. When a file is created, it uses the EC policy of its nearest ancestor directory. A directory-level EC policy affects only files newly created in that directory. Once a file has been created, its EC policy does not change unless the file is copied (for example with distcp), which rewrites its data. Renaming the file or moving it to another directory has no effect on its policy.
Users can also define custom EC policies in an XML file. This is not covered here; see the official documentation for details.
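The nearest-ancestor rule can be modeled with a small lookup. This is a toy model, not how the NameNode stores policies: the dictionary of directory policies, the "REPLICATION" marker, and the function name effective_policy are all made up for illustration.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional


def effective_policy(path: str, dir_policies: Dict[str, str]) -> Optional[str]:
    """Return the policy a new file at `path` would get, or None if no ancestor sets one."""
    # PurePosixPath.parents yields ancestors from nearest to farthest.
    for ancestor in PurePosixPath(path).parents:
        policy = dir_policies.get(str(ancestor))
        if policy is not None:
            return policy
    return None


policies = {
    "/data": "RS-6-3-1024k",
    "/data/hot": "REPLICATION",  # hypothetical marker for a directory forced to replication
}
print(effective_policy("/data/hot/a.log", policies))
print(effective_policy("/data/cold/b.log", policies))
print(effective_policy("/tmp/c.log", policies))
```

The lookup applies only at file creation time; as described above, an existing file keeps its policy even if it is later renamed or moved.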
Deployment
EC places higher demands on the cluster's CPU and network.
Encoding and decoding consume additional CPU, mainly on the client and the DNs.
An EC policy requires at least as many DNs as there are blocks in a block group. RS(6,3), for example, needs at least 9 DNs (6 data blocks plus 3 parity blocks).
To achieve rack-level fault tolerance, EC files are spread across racks, so most reads and writes of stripe units cross racks. Network bisection bandwidth is therefore very important.
Having enough racks is also important for rack-level fault tolerance: no rack may hold more blocks than the number of parity blocks. This means at least (data blocks + parity blocks) / parity blocks racks, rounded up; otherwise rack-level fault tolerance cannot be achieved. Even if there are not enough racks, a striped file is still spread across multiple nodes to preserve node-level fault tolerance as far as possible.
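The two sizing rules above reduce to simple arithmetic. The helper names below are illustrative only.

```python
import math


def min_datanodes(data_blocks: int, parity_blocks: int) -> int:
    """One DN per block in a block group."""
    return data_blocks + parity_blocks


def min_racks(data_blocks: int, parity_blocks: int) -> int:
    """Fewest racks such that no rack has to hold more blocks than there are parity blocks."""
    return math.ceil((data_blocks + parity_blocks) / parity_blocks)


print(min_datanodes(6, 3))  # 9
print(min_racks(6, 3))      # 3
```

For RS(6,3) this gives 9 DNs and 3 racks: with 3 racks holding 3 blocks each, losing a whole rack costs exactly 3 blocks, which is still recoverable.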
Limitations
Once EC is applied, operations such as append, truncate, concat, hsync, hflush, and setReplication are no longer supported on EC files for technical reasons.
References:
Apache Hadoop 3.3.3 – HDFS Erasure Coding
Hadoop 3.0 EC technology - Programming knowledge
Introduction to HDFS Erasure Coding in Apache Hadoop - Cloudera Blog
Erasure-Code- Erasure code -1- Principles - You know
Talk again HDFS Erasure Coding_Android Blogs of people on the road -CSDN Blog