当前位置：网站首页>What is etcd and its application scenarios

What is etcd and its application scenarios

2022-06-24 02:43:00 【Ask North】

From the official account ：BiggerBoy

One 、 What is? etcd？

etcd Pronunciation is /ˈɛtsiːdiː/, The origin of the name ,“distributed etc directory.”, intend “ Distributed etc Catalog ”, It shows that it stores the configuration information of large-scale distributed systems .
A sentence on the official website

A distributed, reliable key-value store for the most critical data of a distributed system.

Translated and understood as ： A warehouse for storing the most critical data in a distributed system , It's distributed 、 Reliable key value pairs for the warehouse .
First, it is a data storage warehouse , Its features are distributed 、 Reliable , The data storage format is key value pair storage , It is mainly used to store key data in distributed systems .

In addition, in the official FAQ In the document

https://etcd.io/docs/v3.5/faq/ That's the explanation etcd Of ：
etcd Is a consistent distributed key value store . It is mainly used as a separate coordination service in a distributed system . And it is designed to save a small amount of data that can be completely put into memory .

This indicates that its data is stored in memory , And it is recommended to save a small amount of data . Seeing the coordination service, we will soon think of Zookeeper, also Zookeeper It is also a distributed coordinator of distributed high availability , Data also exists in memory . What's the difference between them ？ I'll talk about .

Two 、etcd The origin of

With CoreOS and Kubernetes And other projects are increasingly popular in the open source community , They're all used in projects etcd Component as a highly available 、 Highly consistent service discovery repository , Gradually become the focus of developers . In the era of cloud computing , How to make services access to computing cluster quickly and transparently , How to make shared configuration information quickly discovered by all machines in the cluster , more importantly , How to build such a set of high availability 、 Security 、 Easy to deploy and fast response service cluster , Has become an urgent problem to be solved .etcd It's good news to solve these problems

actually ,etcd As a subject Zookeeper And doozer Inspired projects , Besides having similar functions , It has the following 4 Characteristics {![ To quote Docker Official documents ]}.

Simple ： be based on HTTP+JSON Of API Let you use it curl The command is easy to use .
Security ： Optional SSL Customer authentication mechanism .
Fast ： Each instance supports 1000 write operations per second .
trusted ： Use Raft The algorithm fully realizes the distributed .

With the development of cloud computing , People pay more and more attention to the problems involved in distributed systems . Accept the last article ZooKeeper Application scenario summary （ Hyperdetail ） Inspired by the article （ Some cases are quoted from this article .）, According to my own understanding, I also summarized some etcd The classic usage scenarios of . It is worth noting that , The data in distributed system is divided into control data and application data . Use etcd The data processed in the scenario of is the control data by default , For application data , Only a small amount of data is recommended , However, frequent update visits .

3、 ... and 、etcd Application scenarios of

3.1 Scene one ： Service discovery

Service discovery （Service Discovery） To solve one of the most common problems in Distributed Systems , That is, how can processes or services in the same distributed cluster find each other and establish connections . essentially , Service discovery is to find out whether there are processes listening in the cluster udp or tcp port , And you can search and connect by name . To solve the problem of service discovery , We need to have three pillars , Be short of one cannot .
A strong consistency 、 Highly available service storage directory . be based on Raft Algorithm etcd By nature, it is such a highly consistent and available service storage directory .

A mechanism for registering services and monitoring the health status of services . Users can go to etcd Registration service in , And set the registered service key TTL, Keep the heartbeat of the service regularly to achieve the effect of monitoring the health status .
A mechanism for finding and connecting services . By means of etcd The services registered under the specified topic can also be found under the corresponding topic . To make sure the connection , We can deploy one... On every service machine proxy Mode etcd, This ensures access to etcd The services of the cluster can be connected to each other .

chart 1 The service discovery diagram is shown .

chart 1 Service discovery diagram

Let's take a look at the specific application scenarios corresponding to service discovery .

In the framework of microservice collaborative work , Service dynamic add . With Docker The popularity of containers , A variety of micro services work together , There are more and more cases that constitute a relatively powerful architecture . The need to add these services transparently and dynamically is growing . Through the service discovery mechanism , stay etcd The directory in which a service name is registered , Store the... Of available service nodes in this directory IP. In the process of using the service , Just find the available service nodes from the service directory to use . The collaboration of microservices is shown in the figure 2 Shown .

chart 2 Microservices work together

PaaS Application multi instance and instance failure restart transparency in the platform .PaaS Applications in the platform generally have multiple instances , By domain name , Not only can multiple instances be accessed transparently , It can also realize load balancing . However, an instance of the application may fail to restart at any time , At this point, you need to dynamically configure domain name resolution （ route ） Information in . adopt etcd The service discovery function of can easily solve this dynamic configuration problem , Pictured 33 Shown .
Transparent cloud platform

chart 3 Transparent cloud platform

3.2 Scene two ： Message publishing and subscription

In distributed systems , The most suitable communication mode between components is message publishing and subscription mechanism . To be specific , Build a configuration sharing center , The data provider publishes messages in this configuration center , And message users subscribe to topics they care about , Once there's a news release on the subject , Will notify subscribers in real time . In this way, centralized management and real-time dynamic update of distributed system configuration can be realized .

Some configuration information used in the application is stored in etcd Centralized management . This kind of scenario is usually used like this ： The app takes the initiative to start from etcd Get the configuration information once , meanwhile , stay etcd Register one on the node Watcher And wait for , Every time the configuration is updated later ,etcd Will notify subscribers in real time , So as to obtain the latest configuration information .

Distributed search services , The meta information of the index and the node status information of the server cluster machine are stored in etcd in , For each client subscription . Use etcd Of key TTL The function ensures that the machine status is updated in real time .

Distributed log collection system . The core work of this system is to collect logs distributed on different machines . Collectors are usually based on application （ Or topic ） To assign collection task units , So you can etcd Create one on to apply （ Or topic ） Named directory P, And put this application （ Or topic ） All related machines ip, Stored in the directory as a subdirectory P Next , Then set a recursive etcd Watcher, Monitor applications recursively （ Or topic ） Changes to all information in the directory . In this way, the machine IP（ news ） In the event of change , It can inform the collector to adjust the task allocation in real time .

Information in the system needs dynamic automatic acquisition and manual intervention to modify information request content . The usual solution is to expose the interface , for example JMX Interface , To get some runtime information or submit a modification request . And introduce etcd after , Just store the information in the designated etcd Directory , You can pass HTTP The interface is accessed directly from the outside .

chart 4 Message publishing and subscription

3.3 Scene three ： Load balancing

stay Scene one Load balancing is also mentioned in , Load balancing mentioned in this article refers to soft load balancing . In distributed systems , To ensure high availability of services and data consistency , Data and services are usually deployed in multiple copies , So as to achieve peer-to-peer service , Even if one of the services fails , It does not affect the use of . This kind of implementation will lead to the decline of data writing performance to a certain extent , But it can realize the load balance of data access . Because every peer service node has complete data , So the user's access traffic can be divided into different machines .

etcd The information access stored in the distributed architecture supports load balancing .etcd After clustering , Every etcd All of the core nodes can handle the user's requests . therefore , The message data with small amount of data but frequent access is directly stored in etcd It's also a good choice , For example, the second level code table commonly used in business system . The working process of secondary code table is generally like this , Store the code in the table , stay etcd The specific meaning represented by the stored code in , The business system calls the table lookup process , You need to look up the meaning of the code in the table . So if you store a small amount of data in the secondary code table in etcd in , It is not only convenient to modify , It's also easy to access a lot .

utilize etcd Maintain a load balancing node table .etcd You can monitor the status of multiple nodes in a cluster , When a request is sent , The request can be polled and forwarded to multiple surviving nodes . similar KafkaMQ, adopt Zookeeper To maintain the load balance between producers and consumers . You can also use it etcd To do it Zookeeper The job of .

chart 5 Load balancing

3.4 Scene 4 ： Distributed notification and coordination

Distributed notification and coordination discussed here , Similar to message publishing and subscription . Both use etcd Medium Watcher Mechanism , Through registration and asynchronous notification mechanism , Realize the notification and coordination between different systems in the distributed environment , So as to process the data change in real time . The implementation is usually ： Different systems are in etcd Register the same directory on , Simultaneous setting Watcher Monitor changes in the directory （ If there is a need for changes to subdirectories , Can be set to recursive mode ）, When a system is updated etcd The catalog of , So set up Watcher The system will be notified , And deal with it accordingly .

adopt etcd Perform low coupling heartbeat detection . The detection system and the detected system pass etcd A directory on the is associated rather than directly associated with , This can greatly reduce the coupling of the system .

adopt etcd Complete system scheduling . A system has two parts: console and push system , The responsibility of the console is to control the push system to carry out the corresponding push work . Some operations that managers do on the console , In fact, it just needs to be modified etcd The status of some directory nodes on , and etcd Will automatically notify the registration of these changes Watcher Push system client , The push system makes the corresponding push task .

adopt etcd Complete work report . Most of the similar task distribution systems , After the subtask starts , To etcd To register a temporary working directory , And regularly report their progress （ Write progress to this temporary directory ）, In this way, the task manager can know the progress of the task in real time .

chart 6 Distributed collaborative work

3.5 Scene five ： Distributed lock

because etcd Use Raft The algorithm keeps the strong consistency of data , The values stored in the cluster for an operation must be globally consistent , So it's easy to implement distributed locks . There are two ways to use the lock service , One is to keep exclusive , Second, control the timing .

Keep exclusive , That is, only one user who tries to obtain the lock can get .etcd For this purpose, a set of distributed lock atom operations is provided CAS（CompareAndSwap） Of API. By setting prevExist value , It can ensure that when multiple nodes create a directory at the same time , There is only one success , The user can be considered to have obtained the lock .

Control timing , That is, all users who try to obtain the lock will enter the waiting queue , The order in which locks are obtained is globally unique , At the same time, it determines the execution order of the queue .etcd A set of API（ Automatically create ordered keys ）, When creating a value for a directory, specify POST action , such etcd Will automatically generate a current maximum value of key in the directory , Store this new value （ Client number ）. You can also use API List all key values in the current directory in order . At this time, the value of these keys is the timing of the client , The value stored in these keys can be the number representing the client .

chart 7 Distributed lock

3.6 Scene six ： Distributed queues

The general usage of distributed queues is similar to the control timing usage of distributed locks described in scenario 5 , Create a first in, first out queue , Ensure that the order . Another interesting implementation is to ensure that the queue reaches a certain condition, and then execute in order . The implementation of this method can be found in /queue Create another... In this directory /queue/condition node .

condition It can represent the queue size . For example, a large task can only be executed when many small tasks are ready , Every time a small task is ready , Here it is condition Digital plus 1, Until the big task figures , Then start to perform a series of small tasks in the queue , Finally carry out the big task .

condition It can indicate that a task is not in the queue . This task can be the first executor of all sorting tasks , It can also be a point in the topology where there is no dependency . Usually , You must perform these tasks before you can perform other tasks in the queue .

condition It can also represent other kinds of notifications to start executing tasks . It can be specified by the control program , When condition When there is a change , Start Queue task .

chart 8 Distributed queues

3.7 Scene seven ： Cluster monitoring and Leader campaign for

adopt etcd It's very simple and real-time to monitor , The following two features are used .

The previous scenes have already mentioned Watcher Mechanism , When a node disappears or changes ,Watcher It will discover and inform the user as soon as possible .
Nodes can be set TTL key, Like every other 30s towards etcd Sending a heartbeat means that the node is still alive , Otherwise, the node disappears .

In this way, the health status of each node can be detected in the first time , To complete the monitoring requirements of the cluster .

in addition , Using distributed locks , Can finish Leader campaign for . For some long time CPU To calculate or use IO operation , It only needs to be elected Leader To calculate or process once , Then copy the results to others Follower that will do , So as to avoid repeated labor , Save computing resources .

Leader The classic application scenario is to build a full index in the search system . If each machine is indexed separately , Not only does it take time , And there is no guarantee of index consistency . By means of etcd Of CAS Mechanism election Leader, from Leader Index calculation , Then distribute the calculation results to other nodes .

chart 9 Leader campaign for

3.8 Scene 8 ： Why etcd without Zookeeper？

Read “ZooKeeper Application scenario summary （ Hyperdetail ）” Readers of an article may find ,etcd To achieve these functions ,Zookeeper Can achieve . So why use etcd Not directly Zookeeper Well ？

By contrast ,Zookeeper It has the following disadvantages ：

complex .Zookeeper The deployment and maintenance of is complex , Administrators need to master a series of knowledge and skills ; and Paxos Strong consistency algorithm is also famous for its complexity and incomprehension ; in addition ,Zookeeper The use of is also complex , The client needs to be installed , The official only offers java and C Interface between two languages .
Java To write . This is not right Java Biased , It is Java Itself is biased towards heavy applications , It introduces a lot of dependencies . The operation and maintenance personnel generally hope that the machine cluster is as simple as possible , It is not easy to make mistakes in maintenance .
Slow development .Apache Unique to foundation projects “Apache Way” Controversial in the open source community , One of the major reasons is the slow development of the project due to the huge structure and loose management of the foundation .

and etcd As a rising star , Its advantages are obvious .

Simple . Use Go The language is easy to write and deploy ; Use HTTP Easy to use as an interface ; Use Raft The algorithm ensures strong consistency and makes it easy for users to understand .
Data persistence .etcd The default data is persisted as soon as it is updated .
Security .etcd Support SSL Client security authentication .

Last ,etcd As a young project , In high-speed iteration and development , This is both an advantage , It's also a disadvantage . The advantage is that its future has unlimited possibilities , The disadvantage is that the iteration of the version makes the reliability of its use impossible to guarantee , Unable to get the inspection of large projects used for a long time . However , at present CoreOS、Kubernetes and Cloudfoundry And other well-known projects are used in the production environment etcd, So in general ,etcd It's worth trying .

Reference resources https://etcd.io/docs/v3.5/faq/

Excerpt from

http://www.sel.zju.edu.cn/blog/2015/02/01/etcd%E4%BB%8E%E5%BA%94%E7%94%A8%E5%9C%BA%E6%99%AF%E5%88%B0%E5%AE%9E%E7%8E%B0%E5%8E%9F%E7%90%86%E7%9A%84%E5%85%A8%E6%96%B9%E4%BD%8D%E8%A7%A3%E8%AF%BB/

原网站

版权声明
本文为[Ask North]所创，转载请带上原文链接，感谢
https://yzsam.com/2022/02/202202211646161296.html