Distributed - summary list

2022-07-01 04:49:00 【Hello，C++！】

One. Principles for splitting applications

Single responsibility
Moderate service granularity
Take team structure into account
Start from the business model
Split incrementally
Avoid circular and bidirectional dependencies

1. The personnel perspective

Taobao maintained a single project named Denali, a monster of millions of lines of code (although it was physically deployed as separate units). From release to launch, hundreds of people developed on the one project at the same time, and once something went wrong online, all the code had to be rolled back. From the personnel perspective, this had been pushed about as far as it could go.

2. The business perspective

Taobao covers many businesses: users, goods, transactions, payments and so on. In the early days all of this code lived in the one Denali project, and the code base was seriously hurting business efficiency. Each business has its own requirements, so each needs its own application to deploy and its own development schedule.

3. The architectural perspective

On the database side, the centralized Oracle architecture had become the bottleneck: the connection pool was limited (the Oracle database provided about 5,000 connections) and database CPU usage had reached 90%. The database side also needed vertical splitting.

4. There was an urgent need to move to a large-scale distributed architecture, and the first step was to split the applications.

Three types of clusters ：

Load balancing clusters, abbreviated LBC
High-availability clusters, abbreviated HAC
High-performance computing clusters, abbreviated HPC

Two. How single sign-on (SSO) is implemented

Single sign-on (SSO) means, put simply, that in a multi-system environment a user who has logged in once does not have to log in again to the other systems. In other words, one login earns the trust of all the other systems.

The essence of single sign-on is sharing login state among multiple application systems. If the user's login state is recorded in the Session, then sharing login state first requires sharing the Session.

Implementation principle

The principle is fairly simple. SSO is implemented with a shared cookie: sso-server stores each user's ticket in Redis, and app-a and app-b use a Spring interceptor to filter user requests. Every request has its ticket verified against sso-server, and if verification fails the user is redirected to the login page (with the source URL attached).

SSO uses a client/server architecture. We first look at the functions that sso-client and sso-server need to implement (below, "SSO authentication center" means sso-server).
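The ticket check described above can be sketched in a few lines. This is a minimal illustration, not the article's implementation: the class `SsoServer`, the function `intercept`, the cookie name `ticket` and the `/login?source=` URL are all hypothetical, and a plain dict stands in for Redis.

```python
import secrets


class SsoServer:
    """Hypothetical stand-in for sso-server; a dict replaces Redis."""

    def __init__(self):
        self._tickets = {}  # ticket -> username

    def login(self, username, password):
        # Real credential checking is out of scope: any non-empty password passes.
        if not password:
            return None
        ticket = secrets.token_hex(16)
        self._tickets[ticket] = username
        return ticket

    def verify(self, ticket):
        # Returns the username for a known ticket, otherwise None.
        return self._tickets.get(ticket)


def intercept(sso, cookies, source_url):
    """Mimics the interceptor in app-a / app-b: verify the ticket on every request."""
    user = sso.verify(cookies.get("ticket"))
    if user is None:
        # Verification failed: redirect to login and carry the source URL along.
        return ("redirect", "/login?source=" + source_url)
    return ("ok", user)
```

Because both applications call the same `verify`, a ticket obtained through one login is accepted by app-a and app-b alike, which is the shared login state the text describes.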

Differences between CAS and OAuth2

(1)

CAS single sign-on protects user resources that live on the client side.

OAuth2 protects user resources that live on the server side.

(2)

What a CAS client ultimately needs to learn is whether this user is allowed to access my (the CAS client's) resources.

What OAuth2 ultimately decides is whether I (the OAuth2 service provider) may let you (the OAuth2 client) access the user's resources.

(3)

With CAS single sign-on, all resources sit on the client side, not on the CAS server. When the user gives the CAS server a username and password, the CAS client knows nothing about it; the client is simply handed an ST. The client cannot tell whether that ST was forged by the user or is genuinely valid, so it takes the ST to the server and asks whether what the user presented is a valid ST. Only if it is valid does the client let the user in.

With OAuth2 authentication, the resources all sit on the OAuth2 service provider's side, and the client wants to request the user's resources. In the safest mode, after the user grants authorization the server must not return the token directly in the redirect to the client, because the token could be intercepted by an attacker, and an attacker holding the token has access to the user's resources. Instead the server sends an authorization code to the client (through a redirect). The client then, in the background and over HTTPS, presents this code together with a secret pre-negotiated between client and server to obtain the token and the refresh token. This process is very safe: an attacker who intercepts the code does not have the pre-negotiated secret and cannot obtain the token. In this way OAuth2 guarantees that the request for resources has the user's consent and that the client is recognized, so the resources can safely be sent to this client.

Summary: the biggest difference between the CAS login flow and the OAuth2 flow is whether a pre-negotiated secret is needed when the ST or the code is taken back to the server for validation.
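To make the role of the pre-negotiated secret concrete, here is a small in-memory sketch of the code-for-token exchange. The class `AuthServer` and its methods `authorize` and `exchange` are hypothetical names chosen for illustration; no real OAuth2 library is used, and transport (redirects, HTTPS) is left out.

```python
import secrets


class AuthServer:
    """Hypothetical OAuth2 service provider, kept entirely in memory."""

    def __init__(self, client_secrets):
        self._client_secrets = client_secrets  # client_id -> pre-negotiated secret
        self._codes = {}  # code -> (client_id, user)

    def authorize(self, client_id, user):
        # Step 1: the user approves; a one-time code is handed over via redirect.
        code = secrets.token_urlsafe(8)
        self._codes[code] = (client_id, user)
        return code

    def exchange(self, client_id, client_secret, code):
        # Step 2: back-channel call. The code alone is not enough.
        expected = self._client_secrets.get(client_id)
        if expected is None or expected != client_secret:
            return None
        issued = self._codes.get(code)
        if issued is None or issued[0] != client_id:
            return None
        del self._codes[code]  # a code can be redeemed only once
        return {
            "access_token": secrets.token_urlsafe(16),
            "refresh_token": secrets.token_urlsafe(16),
            "user": issued[1],
        }
```

An attacker who captures the code but not the secret gets `None` from `exchange`, which is the property the paragraph above relies on. A CAS service ticket check would look like `exchange` without the `client_secret` argument.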

Three. Distributed transactions: solutions

There are several solutions for distributed transactions ：

Global messages
Distributed transactions based on a reliable message service
TCC
Best-effort notification

1. Two-phase commit / XA

An XA transaction consists of one or more resource managers (RM), a transaction manager (TM) and an application program (AP).

Phase one (prepare): every participating RM prepares to execute the transaction and locks the resources it needs. When a participant is ready, it reports to the TM that it is ready.
Phase two (commit/rollback): once the transaction manager has confirmed that all participants (RMs) are ready, it sends the commit command to all of them. If any participant fails to prepare, it sends rollback instead.
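The two phases can be sketched as a toy coordinator. This is an illustration only: `Participant` and `two_phase_commit` are hypothetical names, there is no network, no persistence and no timeout handling, and a real XA implementation has to deal with all three.

```python
class Participant:
    """Hypothetical resource manager (RM)."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.state = "idle"

    def prepare(self):
        # Phase one: lock resources and report readiness to the TM.
        self.state = "prepared" if self.can_prepare else "failed"
        return self.can_prepare

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    """Plays the transaction manager (TM); returns True if the transaction committed."""
    # Phase one: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase two: commit only if every vote was "ready", otherwise roll everyone back.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

Note that every participant keeps its locks from `prepare` until the TM's decision arrives, which is why XA is described later in this section as holding resources for a long time.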

2. SAGA

Saga is a scheme described in the database paper "Sagas". Its core idea is to split a long transaction into multiple short local transactions that are coordinated by a Saga transaction coordinator. If every step finishes normally, the saga completes normally; if a step fails, the compensation operations are invoked once each, in reverse order.

Characteristics of SAGA transactions:

High concurrency: resources are not locked for a long time as they are in an XA transaction
Both the normal operation and the compensation operation must be defined, so there is more development work than with XA
Consistency is weak: in a transfer, for example, user A's account may already have been debited before the transfer finally fails
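The reverse-order compensation can be sketched as follows. `run_saga` and the transfer steps are hypothetical, written only to illustrate the idea; a real coordinator would also persist its progress so that it can resume after a crash.

```python
def run_saga(steps):
    """steps is a list of (action, compensation) pairs of zero-argument callables."""
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # A step failed: undo the completed steps in reverse order.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


# Hypothetical transfer of 30 from A to B in which crediting B fails.
balances = {"A": 100, "B": 0}


def debit_a():
    balances["A"] -= 30


def refund_a():
    balances["A"] += 30


def credit_b():
    raise RuntimeError("account B is frozen")


def undo_credit_b():
    balances["B"] -= 30


ok = run_saga([(debit_a, refund_a), (credit_b, undo_credit_b)])
```

Between `debit_a` and `refund_a` the debit is visible to other readers of `balances`, which is the weak consistency mentioned in the list above.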

3. TCC

The concept of TCC (Try-Confirm-Cancel) was first proposed by Pat Helland in a 2007 paper titled "Life beyond Distributed Transactions: an Apostate's Opinion".

TCC is divided into 3 stages:

Try stage: attempt execution. Complete all business checks (consistency) and reserve the necessary business resources (quasi-isolation).
Confirm stage: actually execute the business. No business checks are performed, and only the resources reserved in the Try stage are used. The Confirm operation must be idempotent, because it is retried after a failure.
Cancel stage: cancel execution and release the business resources reserved in the Try stage. Exceptions in the Cancel stage are handled in basically the same way as in the Confirm stage, so Cancel must be idempotent as well.

Characteristics of TCC:

High concurrency, with no long-term resource locking.
A large amount of development work: Try/Confirm/Cancel interfaces must be provided.
Good consistency: the SAGA situation where money is debited and the transfer then fails does not occur.
TCC suits order-type business, that is, business with constraints on intermediate states.
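A minimal sketch of the three interfaces on one account, assuming a hypothetical `Account` class in which `frozen` holds the amounts reserved by Try. It only shows the reservation and the idempotency requirement; coordinating several services is not covered.

```python
class Account:
    """Hypothetical account exposing Try / Confirm / Cancel for a debit."""

    def __init__(self, balance):
        self.balance = balance
        self.frozen = {}  # tx_id -> amount reserved by Try

    def try_debit(self, tx_id, amount):
        # Try: run the business check and reserve the amount.
        if self.balance < amount:
            return False
        self.balance -= amount
        self.frozen[tx_id] = amount
        return True

    def confirm(self, tx_id):
        # Confirm: consume the reservation. Safe to retry (idempotent).
        self.frozen.pop(tx_id, None)

    def cancel(self, tx_id):
        # Cancel: give the reserved amount back. Safe to retry (idempotent).
        self.balance += self.frozen.pop(tx_id, 0)
```

Because the amount is only reserved until Confirm, other transactions never see money that has left one account without a matching decision, which is the consistency advantage over SAGA claimed above.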

4. Local message table / transactional messages

The core of this design is to use messages to guarantee, asynchronously, that tasks requiring distributed processing get executed.

Characteristics of the local message table:

A long transaction only needs to be split into multiple tasks, so it is easy to use
Producers need to create an additional message table
Each local message table needs to be polled
If the consumer's logic cannot succeed even after retries, further mechanisms are needed to roll the operation back
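The producer side of a local message table can be sketched with SQLite from the Python standard library. The table names `orders` and `outbox`, the payload format, and the functions `place_order` and `poll_outbox` are assumptions made for this example; the source does not prescribe a schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT);
CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT,
                     payload TEXT, sent INTEGER DEFAULT 0);
""")


def place_order(order_id, item):
    # The business row and the message row commit in one local transaction.
    with db:
        db.execute("INSERT INTO orders VALUES (?, ?)", (order_id, item))
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (f"order_created:{order_id}",))


def poll_outbox(publish):
    # Poller: deliver unsent messages; mark them sent only after publish succeeds.
    rows = db.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id").fetchall()
    for msg_id, payload in rows:
        try:
            publish(payload)
        except Exception:
            continue  # leave the row unsent so the next poll retries it
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (msg_id,))
```

If the process crashes after `publish` but before the `UPDATE`, the message is delivered again on the next poll, so consumers in this scheme are expected to handle duplicates.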

5. Best-effort notification

The notifying party does its best, through some mechanism, to inform the receiving party of the result of business processing:

Provide a query interface so that the receiver can look up the business processing result itself
Use a message queue ACK mechanism: the queue re-sends at intervals of 1 min, 5 min, 10 min, 30 min, 1 h, 2 h, 5 h and 10 h, gradually widening the gap until the time window required for the notification is reached. After that, no further notifications are sent
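The widening retry schedule can be sketched as below. The function name `notify_with_retries` is hypothetical, and the sketch assumes one immediate attempt before the first 1-minute wait, which the source does not state. The `sleep` parameter is injected so the example does not actually wait.

```python
# Intervals from the text, in minutes: 1min, 5min, 10min, 30min, 1h, 2h, 5h, 10h.
RETRY_DELAYS_MIN = [1, 5, 10, 30, 60, 120, 300, 600]


def notify_with_retries(send, delays=RETRY_DELAYS_MIN, sleep=lambda minutes: None):
    """Call send() until it returns True (an ACK) or the schedule is exhausted.

    Returns the number of attempts used, or None if every attempt failed.
    """
    attempts = 0
    for delay in [0] + list(delays):
        sleep(delay)  # wait before this attempt (0 for the first one)
        attempts += 1
        if send():
            return attempts
    return None  # window exhausted: no further notifications
```

When `None` comes back, the receiver is expected to fall back on the query interface from the first item in the list above.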

6. AT transaction mode (Seata, an open-source distributed transaction solution)

This is a transaction mode in Alibaba's open-source project Seata; Ant Financial also calls it FMT. Its advantage is that it is used much like XA mode: the business does not need to write compensation operations, and rollback is done automatically by the framework. Its disadvantage is also similar to XA: locks are held for a long time, so it does not suit high-concurrency scenarios. Interested readers can refer to seata-AT.

Seata provides AT, TCC, SAGA and XA transaction modes, aiming to give users a one-stop distributed transaction solution.

Four. The four database isolation levels

A database has four isolation levels:

1. Read uncommitted
  At this level, while one transaction is modifying a row, other transactions may not modify that row but may read it.
  So at this level there are no lost updates, but dirty reads and non-repeatable reads can occur.
2. Read committed
  At this level, an uncommitted write transaction blocks other transactions from accessing the row, so there are no dirty reads. A transaction that is only reading the row, however, still lets other transactions access it, so non-repeatable reads can occur.
3. Repeatable read
  At this level, a read transaction blocks write transactions but allows other read transactions, so the same transaction never reads different data on two reads (a non-repeatable read). A write transaction blocks all other transactions.
4. Serializable
  This level requires all transactions to execute serially. It avoids every problem caused by concurrency, but it is inefficient.

Five. CAP theory

CAP theory states that a distributed system can satisfy at most two of the three requirements C, A and P.

The meaning of CAP:

- C: Consistency
  Whether multiple copies of the same data are identical in real time.
- A: Availability
  The system is called available if it returns a definite result within a certain period of time.
- P: Partition tolerance
  The same service is distributed across multiple systems, so that if one system goes down or becomes unreachable, other systems still provide the same service.

BASE theory

CAP theory tells us an unwelcome fact that we have to accept: we can only choose two of C, A and P. Business systems usually choose to sacrifice consistency in exchange for availability and partition tolerance. The point to note is that "sacrificing consistency" does not mean giving up data consistency altogether; it means trading strong consistency for weak consistency. This leads to BASE theory.

- BA: Basically Available
  - When the system hits force majeure, it can still guarantee "availability", that is, it still returns a definite result within a certain period of time. The differences between "basically available" and "highly available" are:
    - "A certain period of time" may be stretched
      During a big promotion, the response time may be extended appropriately
    - Some users get a degraded page
      A degraded page is returned directly to some users to relieve pressure on the servers. Note that returning a degraded page still counts as returning a definite result.
- S: Soft State
  Different copies of the same data do not need to be consistent in real time.
- E: Eventual Consistency
  Different copies of the same data do not need to be consistent in real time, but they must become consistent after a certain period of time.

Six. Implementing distributed locks

1: SQL optimization

update stock set num=num-1 where id = #{id};

Pessimistic lock based on MySQL

select * from stock where id=#{id} for update;

Optimistic lock based on MySQL

select num,version from stock where id=#{id};

update stock set num=new_num, version=version+1 where id=#{id} and version=#{version};
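The version check in the optimistic-lock statements above can be tried out with SQLite from the Python standard library (standing in for MySQL, with `?` placeholders instead of `#{...}`). The helper names `read_stock` and `write_stock` are made up for this sketch; the table and columns follow the SQL above.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE stock (id INTEGER PRIMARY KEY, num INTEGER, version INTEGER)")
db.execute("INSERT INTO stock VALUES (1, 10, 0)")
db.commit()


def read_stock(stock_id):
    # Step 1: read the current value together with its version.
    return db.execute(
        "SELECT num, version FROM stock WHERE id = ?", (stock_id,)).fetchone()


def write_stock(stock_id, new_num, seen_version):
    # Step 2: write only if nobody changed the row since it was read.
    cur = db.execute(
        "UPDATE stock SET num = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_num, stock_id, seen_version))
    db.commit()
    return cur.rowcount == 1  # 0 rows updated means the version was stale
```

A caller that gets `False` typically re-reads the row and tries again, or reports the conflict to the user.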

2: Distributed lock based on Redis

Redis natively provides the SETNX command, which is atomic: it sets the specified value for a key only when that key does not already exist.
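The set-if-absent behaviour can be illustrated without a Redis server. `FakeRedis` below is an in-memory stand-in written for this example, not the real Redis client API; the expiry time and the owner check on release are additions that locks built on this idea commonly use, and the `now` parameter exists only to make the example deterministic.

```python
import time


class FakeRedis:
    """In-memory stand-in for the set-if-absent semantics; not a Redis client."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def set_if_absent(self, key, value, ttl, now=None):
        now = time.monotonic() if now is None else now
        current = self._data.get(key)
        if current is not None and current[1] > now:
            return False  # key exists and has not expired: lock is held
        self._data[key] = (value, now + ttl)
        return True

    def delete_if_owner(self, key, value):
        # Release only if we still own the lock, so we never free someone else's.
        current = self._data.get(key)
        if current is not None and current[0] == value:
            del self._data[key]
            return True
        return False
```

The expiry is what keeps a crashed holder from blocking everyone forever, and it is also the source of the weakness discussed in the comparison further down: a holder that runs longer than `ttl` loses the lock without noticing.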

3: Redisson in Java

In Java code, the Redisson package can be used to implement distributed locks.

4: Distributed lock based on ZooKeeper

Database lock:

Advantage: it uses the database directly and is simple to use.

Disadvantage: most bottlenecks in distributed systems are in the database, and a database lock adds to the database's load.

Cache lock:

Advantages: high performance and fairly easy to implement. A cache lock is recommended when an occasional lock failure is acceptable and does not affect normal use of the system.

Disadvantage: the lock timeout mechanism is not very reliable. If a thread holds the lock but its processing takes too long, the lock times out and stops protecting anything.

ZooKeeper lock:

Advantages: it does not rely on timeouts to release locks and is highly reliable. A ZooKeeper lock is recommended when the system requires high reliability.

Disadvantage: performance is not as good as a cache lock, because nodes have to be created and deleted frequently.

Seven. Data consistency in distributed databases: implementation schemes | Data-layer middleware

Service discovery based on ZooKeeper (CP)

Service discovery based on Eureka (AP)

Distributed data layer middleware

Dynamic data sources
Read/write splitting
Distributed unique primary key generator
Database and table sharding
Connection pool and SQL monitoring
Dynamic configuration, etc.

1. TDDL

TDDL is a framework Taobao developed around its own business characteristics. It mainly addresses making database and table sharding transparent to applications, and data replication between heterogeneous databases. It is a JDBC datasource implementation based on centralized configuration.

Features

Implements dynamic data sources, read/write splitting, and database and table sharding.

Disadvantages

The sharding functionality is not open source yet, few documents have been published, and it depends on diamond (a persistent configuration management system used internally at Taobao).

2. DRDS

Alibaba's Distributed Relational Database Service (DRDS) is an online distributed database service offering horizontal splitting, smooth scale-out and scale-in, and read/write splitting.

Its predecessor is Taobao's TDDL, and DRDS is the next generation. It integrates Cobar and TDDL, is integrated with cloud services, is a paid commercial product, and is the preferred choice.

3. Atlas

Atlas is a data middle-tier project based on the MySQL protocol, developed and maintained by the infrastructure team of the Web Platform Department at Qihoo 360.

It builds on version 0.8.2 of MySQL-Proxy, the official MySQL release, fixing a large number of bugs and adding many features. The project is widely used inside 360: many MySQL services are connected to the Atlas platform, which carries billions of read and write requests every day.

Main features:

1. Read/write splitting

2. Load balancing across replicas

3. IP filtering

4. Automatic table sharding

5. DBAs can bring DBs online and offline smoothly

6. Automatic removal of DBs that are down

4. MTDDL (Meituan Distributed Data Layer)

The distributed data access layer middleware of Meituan-Dianping.

Features

Implements dynamic data sources, read/write splitting, and database and table sharding, similar to TDDL.

MTDDL is used below as the example (Taobao's TDDL can also serve as a reference) to explain the architecture of distributed data-layer middleware in detail.

Eight. ZooKeeper principles and architecture

The ZooKeeper distributed service framework is a subproject of Apache Hadoop. It is mainly used to solve data management problems that distributed applications often run into, such as:

Unified naming service
State synchronization service
Cluster management
Management of distributed application configuration items

Basic principles and architecture of ZooKeeper

1. ZooKeeper roles

» Leader: initiates votes and decides their outcome, and updates the system state.

» Learner: includes followers and observers. A follower accepts client requests, returns results to the client, and takes part in voting.

» Observer: accepts client connections and forwards write requests to the leader, but does not take part in voting and only synchronizes the leader's state. Observers exist to scale the system and improve read speed.

» Client: the originator of requests.

At the heart of ZooKeeper is atomic broadcast, the mechanism that keeps the servers synchronized. The protocol that implements it is called the Zab protocol.

The Zab protocol has two modes: recovery mode (leader election) and broadcast mode (synchronization).

Each server is in one of three states while it works:

LOOKING: the server does not know who the leader is and is searching for one

LEADING: the server is the elected leader

FOLLOWING: a leader has been elected and the server is synchronizing with it

1、 Distributed coordination technology

2、 Implementation of distributed lock

The ZooKeeper data model: Znode

A Znode in the ZooKeeper namespace has the characteristics of both a file and a directory. Like a file, it maintains data, metadata, ACLs, timestamps and similar structures; like a directory, it can be part of a path. Each node in the tree is called a Znode. Every Znode is made up of 3 parts:

① stat: status information describing the Znode's version, permissions and so on

② data: the data associated with the Znode

③ children: the child nodes under the Znode

(4) Node types

**① Ephemeral nodes:** The lifecycle of this kind of node depends on the session that created it. Once the session ends, the ephemeral node is deleted automatically; it can of course also be deleted manually. Although every ephemeral Znode is bound to one client session, it is still visible to all clients. In addition, ZooKeeper's ephemeral nodes are not allowed to have child nodes.

**② Persistent nodes:** The lifecycle of this kind of node does not depend on the session. It is deleted only when a client explicitly performs a delete operation.

Sequential nodes

Operations in the ZooKeeper service

Nine. Distributed globally unique IDs: solutions in detail

An ID is the unique identifier of a piece of data. The traditional approaches are UUIDs and database auto-increment IDs. Most Internet companies use MySQL, and because they need transaction support they usually use the InnoDB storage engine. UUIDs are too long and unordered, so they are a poor fit for an InnoDB primary key; auto-increment IDs are more appropriate. But as the business grows, the data volume keeps growing and tables have to be split. After the split, each table increments at its own pace, so ID conflicts become very likely.
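The conflict is easy to reproduce, and one common remedy can be shown alongside it. The sketch below is an assumption of this example rather than a scheme named in the text: each table keeps auto-incrementing, but with a different start offset and a step equal to the number of tables. The function name `id_sequence` is hypothetical.

```python
import itertools


def naive_sequence():
    # Every table counts 1, 2, 3, ... on its own, so tables collide immediately.
    return itertools.count(start=1)


def id_sequence(shard_index, shard_count):
    # Table k of n yields k+1, k+1+n, k+1+2n, ... so no two tables share an ID.
    return itertools.count(start=shard_index + 1, step=shard_count)


table_0 = list(itertools.islice(id_sequence(0, 2), 3))
table_1 = list(itertools.islice(id_sequence(1, 2), 3))
```

The trade-off of this approach is that the number of tables is baked into the step, so adding a table later means renumbering or reserving spare offsets in advance.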

Ten. Comparison of microservice configuration centers

1. Disconf

A configuration management center open-sourced by Baidu in July 2014. It has configuration management capabilities, but it is no longer maintained; the most recent commit was two years ago.

2. Spring Cloud Config

Open-sourced in September 2014. It is a component of the Spring Cloud ecosystem and integrates seamlessly with the rest of Spring Cloud.

3. Apollo

A configuration management center open-sourced by Ctrip in May 2016, with features such as standardized permissions and process governance.

4. Nacos

A configuration center open-sourced by Alibaba in June 2018. It can also do DNS-based and RPC-based service discovery.

Comparison of core concepts across configuration centers

The comparison covers applications, clusters, grayscale release, permission management, version management and rollback, multiple environments, and real-time push of configuration.

6. Deployment structure and high-availability comparison

Spring Cloud Config

Spring Cloud Config consists of three components, config-server, Git and Spring Cloud Bus:

config-server serves configuration to clients;
Git stores the configuration and is where it is modified;
Spring Cloud Bus notifies clients of configuration changes;

Apollo

Apollo is divided into four modules: MySQL, Config Service, Admin Service and Portal:

MySQL stores Apollo's metadata and the users' configuration data;
Config Service provides configuration reads, pushes and similar functions; all client requests land on Config Service;
Admin Service provides configuration changes, releases and similar functions; it is the service that Portal operates on;
Portal provides the configuration management UI for users;

Apollo supports Key-Value configuration along four dimensions

Application: the application that actually uses the configuration. The Apollo client needs to know at runtime which application it is, so that it can fetch the matching configuration. Each application has its own identity, appId, which must be configured in the code.
Environment: the environment the configuration belongs to. The Apollo client needs to know which environment the application is running in, so that it can fetch that application's configuration. The environment is independent of the code: the same code deployed to different environments should obtain each environment's configuration. By default the environment is read from configuration on the machine (the env property in server.properties).
Cluster: a grouping of instances within one application, for example by data center, with the instances in the Shanghai data center forming one cluster and those in the Shenzhen data center another. The same configuration item can have different values in different clusters. By default the cluster is read from configuration on the machine (the idc property in server.properties).
Namespace: a grouping of configuration within one application, that is, a collection of configuration items. A Namespace can be thought of simply as a (configuration) file, with different kinds of configuration stored in different files, such as a database configuration file, an RPC configuration file, and the application's own configuration file. An application can read the configuration namespace of a public component directly, for example DAL or RPC. An application can also inherit a public component's configuration namespace in order to adjust that component's settings, such as DAL's initial number of database connections.

Main features:

Unified management of configuration across environments and clusters
Configuration changes take effect in real time (hot release)
Release version management
Grayscale release
Permission management, release review and operation auditing
Monitoring of client configuration information
Native Java and .Net clients
An open platform API
Simple deployment with few dependencies

Nacos

Deploying Nacos requires the Nacos service and MySQL:

Nacos serves external requests and supports configuration management and service discovery;
MySQL provides persistent storage for Nacos data;

Configuration center comparison

There are many configuration centers on the market. This article compares a few widely used ones on key points, as shown in the following table:

Feature	spring-cloud-config	ctrip-apollo	disconf
Grayscale release	Not supported	Supported	Partial updates not supported
Alert notifications	Not supported	Supported	Supported
Instance configuration monitoring	Requires springadmin	Supported	Supported
Configuration effective time	Takes effect via refresh	Real time	Real time
Configuration update push	Manually triggered	Supported	Supported
Scheduled configuration pull	None	Supported	Relies on event-driven updates
Local configuration cache	None	Supported	Supported
Spring Boot support	Native	Supported	Not supported
Spring Cloud support	Native	Supported	Not supported
Business intrusiveness	Low	Low	Low; supports annotations and XML
Unified management	None; operated through git	Unified UI	Unified UI

Eleven. Distributed transactions: 2PC and 3PC principles, TCC transactions

Common solutions for distributed transactions:

2PC, the two-phase commit protocol
3PC, the three-phase commit protocol (which makes up for shortcomings of two-phase commit)
TCC or GTS (Alibaba)
Eventual consistency through message middleware
LCN, whose motto is "LCN does not produce transactions; LCN is only the porter of local transactions".

Twelve. Understanding database and table sharding

1. Vertical splitting

Vertical splitting comes in two kinds: vertical database splitting and vertical table splitting.

Advantages of vertical splitting:

It removes coupling at the business-system level and keeps each business clear
Much like microservice governance, the data of different businesses can be managed, maintained, monitored and scaled separately
Under high concurrency, vertical splitting eases bottlenecks in IO, database connections and single-machine hardware resources to some extent

Disadvantages:

Some tables can no longer be joined; this can only be solved by aggregating through interfaces, which makes development more complex
Distributed transaction handling is complex
A single table can still hold too much data (which calls for horizontal splitting)

2. Horizontal splitting

When an application can no longer be split vertically at a finer grain, or when the row count after splitting is still huge and a single database hits read/write or storage bottlenecks, horizontal splitting is needed.

Advantages of horizontal splitting:

No single database suffers from excessive data volume or high-concurrency bottlenecks, which improves system stability and load capacity
Changes on the application side are small, and business modules do not need to be split

Disadvantages:

Transaction consistency across shards is hard to guarantee
Cross-database join queries perform poorly
Repeated data expansion is difficult and the maintenance workload is very large

3. Typical data sharding rules:

By value range

By taking a value modulo the shard count
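The two rules can be written as small routing functions. The boundaries, the shard count and the function names below are hypothetical values picked for the example; real systems choose them from data volume and growth.

```python
import bisect

# Hypothetical upper bounds: ids below 1,000,000 go to shard 0,
# ids below 2,000,000 go to shard 1, everything else goes to shard 2.
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000]


def shard_by_range(record_id):
    # Range rule: find the first boundary that lies above the id.
    return bisect.bisect_right(RANGE_UPPER_BOUNDS, record_id)


def shard_by_modulo(record_id, shard_count=4):
    # Modulo rule: spread ids evenly over a fixed number of shards.
    return record_id % shard_count
```

Range sharding keeps neighbouring ids together and grows by adding a boundary, but recent data tends to land on one hot shard. Modulo sharding spreads load evenly, but changing `shard_count` moves most records, which is the expansion difficulty listed among the disadvantages above.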

Problems introduced by database and table sharding

1. Transaction consistency

2. Join queries across nodes

3. Paging, sorting and aggregate functions across nodes

4. Avoiding collisions in global primary keys

5. Data migration and capacity expansion

Middleware that supports database and table sharding

Standing on the shoulders of giants saves a lot of effort. Several mature open-source sharding solutions exist today:

sharding-jdbc (Dangdang)
TSharding (Mogujie)
Atlas (Qihoo 360)
Cobar (Alibaba)
MyCAT (based on Cobar)
Oceanus (58.com)
Vitess (Google)

Client mode and proxy mode

Whether in client mode or proxy mode, the core steps are the same: SQL parsing, rewriting, routing, execution, and result merging.

Thirteen. Cache architecture in large distributed systems

1. CDN cache [figure: advantages of CDN caching]

2. Reverse proxy cache

The reverse proxy sits in the application's server room and handles all requests to the Web servers.

If the page a user requests is cached on the proxy server, the proxy sends the cached copy straight to the user.

If it is not cached, the proxy first sends a request to the Web server, retrieves the data, caches it locally and then sends it to the user. Fewer requests reach the Web server, which reduces its load.

Application scenario: generally only small static resources are cached, such as css, js and images.
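The hit and miss paths just described fit in a few lines. `ReverseProxyCache` and its `origin_requests` counter are hypothetical names for this sketch; expiry, invalidation and size limits, which a real proxy needs, are left out.

```python
class ReverseProxyCache:
    """Hypothetical reverse proxy cache sitting in front of a Web server."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: path -> response body
        self._cache = {}
        self.origin_requests = 0  # how many requests reached the Web server

    def get(self, path):
        if path in self._cache:
            # Hit: serve the cached copy without touching the Web server.
            return self._cache[path]
        # Miss: fetch from the Web server, cache locally, then return.
        self.origin_requests += 1
        body = self._fetch(path)
        self._cache[path] = body
        return body
```

Repeated requests for the same path leave `origin_requests` unchanged after the first one, which is the load reduction the text refers to.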

3. Local application cache

Ehcache [figure: main features of Ehcache]

Guava Cache

Introduction: Guava Cache is one of the caching tools in Guava, Google's open-source library of reusable Java utilities.

4. Distributed cache

Memcached

Redis

Common problems mainly include the following:

Data consistency
Cache penetration
Cache avalanche
Cache high availability
Cache hotspots

Fourteen. A brief introduction to distributed NoSQL

Technical features of traditional relational databases (TRDB):

(1) A strong storage schema. Here this means that database tables, rows and fields must all be defined in advance, along with their attribute constraints.

(2) The SQL standard is used to define and operate the database.

(3) Strong transactions are used to ensure availability and security.

(4) Processing is mainly centralized on a single machine (CP, Centralized Processing).

Technical features of NoSQL databases:

(1) A weak storage schema.

(2) No SQL standard is used to define and operate the database.

(3) Weak transactions are used to ensure data availability and security, or there is no transaction mechanism at all.

(4) Processing is mainly distributed across multiple machines (DP, Distributed Processing).

Common distributed file systems

GFS, HDFS, Lustre, Ceph, GridFS, mogileFS, TFS, FastDFS and others, each suited to different fields. None of them is a system-level distributed file system; they are application-level distributed file storage services.

Fifteen. Distributed relational database solutions

The key points of a distributed relational database are:

Database sharding
Table sharding
Master-slave (M-S)
Clustering
Load balancing
Programming interface (API)

1. MyCat

Features

Supports read/write splitting, with MySQL in dual-master multi-slave mode as well as one-master multi-slave mode
Supports global tables, and shards data automatically across multiple nodes for efficient join queries
Supports its own E-R relationship based sharding strategy, which enables efficient join queries
Automatic failover and high availability
Provides highly available sharded data clusters
Supports connecting to ORACLE, DB2 and SQL Server over JDBC and presenting them as a MySQL server
Supports MySQL clusters and can be used as a proxy
Developed on top of Alibaba's open-source Cobar, which is known for stability, reliability, and excellent architecture and performance

2. Atlas

Atlas is a data middle-tier project based on the MySQL protocol, developed and maintained by the infrastructure team of the Web Platform Department at Qihoo 360. It builds on version 0.8.2 of MySQL-Proxy, the official MySQL release, fixing a large number of bugs and adding many features. The project is widely used inside 360: many MySQL services are connected to the Atlas platform, which carries billions of read and write requests every day.

Features

Read/write splitting
Load balancing across replicas
IP filtering
SQL statement blacklist and whitelist
Automatic table sharding

3、Cobar

Cobar is middleware that provides distributed services for a relational database (MySQL). It lets a traditional database scale out close to linearly while still looking like a single database, so it stays transparent to applications.

The product has run stably at Alibaba for more than 3 years.
It has taken over 3000+ MySQL database schemas.
The cluster handles more than 5 billion online SQL requests per day.
The cluster handles online data traffic at the TB level or above per day.

4、MySQL Proxy

Overview

MySQL Proxy is a simple program that sits between the client and the MySQL server and can monitor, analyze, or modify their communication. It is flexible and has few restrictions; common uses include load balancing, failover, query analysis, and query filtering and modification.

In short, MySQL Proxy is a middle-tier proxy that behaves like a connection pool: it forwards connection requests from front-end applications to the back-end databases, and Lua scripts let it implement complex connection control and filtering, which makes read/write splitting and load balancing possible. MySQL Proxy is completely transparent to applications, which only need to connect to the port it listens on. The proxy machine can of course become a single point of failure, but this can be avoided by running several redundant proxy machines and listing the connection parameters of each proxy in the application server's connection pool configuration.

A more powerful feature of MySQL Proxy is read/write splitting. The basic principle is to let the master database handle transactional queries and let the slaves handle SELECT queries. Database replication synchronizes the changes made by transactional queries to the slaves in the cluster.

Features

- Load balancing
- Read/write splitting
- No support for table sharding
- Proxy-layer monitoring
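
The read/write splitting rule described above (transactional statements go to the master, plain SELECT queries go to a slave) can be sketched in a few lines. This is only an illustration in Python, not MySQL Proxy's Lua scripting interface; the class name `ReadWriteRouter`, the host strings, and the simple "starts with SELECT" classifier are all assumptions made for the example.

```python
import itertools
import re

# Assumption for this sketch: a statement that starts with SELECT is a read.
_READ_RE = re.compile(r"^\s*select\b", re.IGNORECASE)


class ReadWriteRouter:
    """Route reads to slaves (round robin) and everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves) if slaves else None
        self.in_transaction = False

    def route(self, sql):
        stmt = sql.strip().lower()
        # Statements inside an explicit transaction stay on the master so
        # the transaction can read its own writes.
        if stmt.startswith(("begin", "start transaction")):
            self.in_transaction = True
            return self.master
        if stmt.startswith(("commit", "rollback")):
            self.in_transaction = False
            return self.master
        if self.in_transaction or self._slaves is None:
            return self.master
        if _READ_RE.match(sql):
            return next(self._slaves)
        return self.master


# Hypothetical back ends, used only to show the routing decision.
router = ReadWriteRouter("master:3306", ["slave1:3306", "slave2:3306"])
```

Because replication to the slaves is asynchronous, a read routed to a slave may briefly return stale data, which is why the sketch pins everything inside a transaction to the master.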

Sixteen 、 Distributed transaction solutions in detail

Transaction: a transaction is a reliable, independent unit of work made up of a set of operations. Transactions have the ACID properties: atomicity, consistency, isolation, and durability.
Local transaction: a transaction managed locally by a single resource manager. Local transactions support strict ACID properties, are efficient and reliable, keep their state only in the resource manager, and have a simple programming model. However, they cannot handle distributed transactions, and the smallest unit of isolation is limited to the resource manager.
Global transaction: a transaction managed globally by a global transaction manager. The transaction manager manages the global transaction state and the participating resources, and coordinates the consistent commit or rollback of those resources.
TX protocol: the interface between the application (or application server) and the transaction manager.
XA protocol: the interface between the global transaction manager and the resource managers. XA is a distributed transaction specification proposed by the X/Open organization. It mainly defines the interface between the global transaction manager and the local resource managers, and mainstream database products implement it. The XA interface is a bidirectional system interface that acts as a communication bridge between the transaction manager and multiple resource managers. XA is needed because, in theory, two machines in a distributed system cannot reach a consistent state on their own, so a single coordinating point is introduced. Transactions managed and coordinated by the global transaction manager can span multiple resources and processes. The global transaction manager generally uses the XA two-phase protocol to interact with the databases.
AP: application program, which can be understood as the program that uses DTP (Distributed Transaction Processing).
RM: resource manager. This can be a DBMS or a message server management system. The application controls resources through the resource manager, and the resources must implement the interfaces defined by XA. The resource manager is responsible for controlling and managing the actual resources.
TM: transaction manager. It coordinates and manages transactions, provides the programming interface to the AP, and manages the resource managers. The transaction manager controls the global transaction, manages the transaction life cycle, and coordinates resources.
Two-phase commit protocol: the mechanism XA uses to coordinate multiple resources in a global transaction. The TM and the RMs use two-phase commit to solve the consistency problem. Two-phase commit requires a coordinator (TM) that collects the operation results of all participant (RM) nodes and tells those nodes whether to finally commit. Its limitations are the cost of the protocol itself, the persistence cost of the prepare phase, the persistence cost of the global transaction state, the fragility caused by multiple potential points of failure, and the isolation and recovery problems caused by a failure after prepare but before commit.
BASE theory: BA stands for basically available, meaning the business stays available and partition failures are tolerated. S stands for soft state, meaning a short period of being out of sync is allowed. E stands for eventual consistency, meaning the data is consistent in the end but may be inconsistent in real time. Atomicity and durability must be fundamentally guaranteed; to meet the needs of availability, performance, and graceful degradation, only the requirements for consistency and isolation are relaxed.
CAP theorem: a shared-data system can have at most two of the three CAP properties, and each pairing has scenarios it suits. A real business system is usually a mixture of ACID and CAP. The most important thing in a distributed system is to meet the business requirements rather than to pursue highly abstract, absolute system properties. C stands for consistency: all users see the same data. A stands for availability: an available copy of the data can always be found. P stands for partition tolerance: the system tolerates network interruptions and similar failures.
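
The two-phase commit flow between a coordinator (TM) and its participants (RMs) can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: the `Participant` class and its `prepare`, `commit`, and `rollback` methods are hypothetical names, and the sketch leaves out the persistence, timeouts, and recovery that the limitations above refer to.

```python
class Participant:
    """Hypothetical resource manager (RM) that can vote in phase one."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: do the work, hold the resources, and vote yes or no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    """Coordinator (TM): commit only if every participant votes yes."""
    # Phase 1: collect votes; all() stops at the first "no".
    if not all(p.prepare() for p in participants):
        # Phase 2 (abort): every participant is told to roll back.
        for p in participants:
            p.rollback()
        return False
    # Phase 2 (commit): every participant voted yes.
    for p in participants:
        p.commit()
    return True
```

If the coordinator crashed between the two phases, prepared participants would be left holding their resources, which is the isolation and recovery problem mentioned in the definition.
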
Service patterns in flexible transactions:
1. Queryable operation: the service operation has a globally unique identifier and a uniquely determined operation time.
2. Idempotent operation: repeated calls produce the same business result as a single call. Idempotency can be achieved either through the business operation itself, or by having the system cache all requests and their results and, on detecting a repeated request, automatically return the earlier result.
3. TCC operation: Try phase: attempt to execute the business, complete all business checks (consistency), and reserve the necessary business resources (quasi-isolation). Confirm phase: actually execute the business without any further checks, using only the resources reserved in the Try phase; Confirm must be idempotent. Cancel phase: cancel the business execution and release the resources reserved in the Try phase; Cancel must be idempotent. Differences between TCC and the 2PC (two-phase commit) protocol: TCC sits at the business service layer rather than the resource layer; TCC has no separate prepare phase, because the Try operation also prepares the resources; and in TCC the Try operation can flexibly choose the lock granularity of business resources. TCC costs more to develop than 2PC. TCC is in fact also a two-phase operation, but it is not the same as 2PC.
4. Compensable operation: Do phase: actually execute the business processing, with results that are visible externally. Compensate phase: offset or partially cancel the business result of the forward operation; the compensation operation must be idempotent. Constraints: the compensation operation must be feasible in business terms, and the risk and cost caused by non-isolated business results or incomplete compensation must be controllable. In fact, the Confirm and Cancel operations of TCC can be regarded as compensation operations.
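
A small sketch may help show how Try, Confirm, and Cancel fit together and why Confirm and Cancel need to be idempotent. The `InventoryService` class, its method names, and the stock-reservation scenario are assumptions chosen for illustration; a real TCC participant would also persist its state.

```python
class InventoryService:
    """Hypothetical TCC participant that reserves stock in the Try phase."""

    def __init__(self, stock):
        self.available = stock
        self.reserved = {}     # tx_id -> quantity held by Try
        self.finished = set()  # tx_ids already confirmed or cancelled

    def try_reserve(self, tx_id, qty):
        # Try: run the business check and reserve the resource.
        if tx_id in self.reserved:
            return True   # duplicate Try for the same transaction
        if tx_id in self.finished:
            return False  # a Try arriving after Confirm/Cancel is rejected
        if qty > self.available:
            return False
        self.available -= qty
        self.reserved[tx_id] = qty
        return True

    def confirm(self, tx_id):
        # Confirm: consume only what Try reserved; repeats are no-ops.
        if tx_id in self.finished:
            return
        self.reserved.pop(tx_id, None)
        self.finished.add(tx_id)

    def cancel(self, tx_id):
        # Cancel: release what Try reserved; repeats are no-ops.
        if tx_id in self.finished:
            return
        self.available += self.reserved.pop(tx_id, 0)
        self.finished.add(tx_id)
```

Keying every step on a transaction ID is what lets a coordinator safely retry Confirm or Cancel after a timeout without changing the stock twice.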

Seventeen 、 Distributed session solutions in detail

Option one ： client-side storage

Store the information directly in a cookie.
A cookie is a small piece of data stored on the client. The client exchanges cookies with the server over HTTP, and cookies are usually used to store non-sensitive information.

Disadvantages:

- The data is stored on the client, which is a security risk
- Cookies are limited in size and in the types of data they can hold
- If a request's cookie is too large, it adds network overhead

Option two ： session replication

Session replication is a session management mechanism for server clusters that is used mostly by small enterprise applications and is not common in real-world development. It is enabled by configuring the web servers (for example, Tomcat) as a cluster.

Problems:

Sessions are synchronized asynchronously by broadcasting within the same LAN. As the number of servers grows, the volume of session data to synchronize grows with it, because every server must receive the sessions held by all the others. This adds network overhead, and with a very large number of users the servers can run out of memory.

Option three ： session binding

Nginx is a free, open-source, high-performance HTTP server and reverse proxy server.

What Nginx can do:
Reverse proxying, load balancing, HTTP serving (separating dynamic and static content), and forward proxying.

How to use Nginx for session binding
With Nginx acting as reverse proxy and load balancer, each client request is assigned to one of the servers, and which server handles it depends on the load balancing algorithm (round robin, random, ip-hash, weighted, and so on). With the ip-hash strategy, requests from the same client IP are always sent to the same server, so the session stays bound to that server.
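
The idea behind ip-hash binding can be sketched in Python: hash the client IP and use the result to pick a server, so the same client keeps landing on the same node. This is an illustration of the idea only, not Nginx's actual hashing scheme; the function name `pick_server` and the server names are assumptions.

```python
import hashlib


def pick_server(client_ip, servers):
    """Map a client IP to one server so repeat requests hit the same node."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Use the first 4 bytes of the digest as an integer index.
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


# Hypothetical application servers behind the proxy.
servers = ["app1:8080", "app2:8080", "app3:8080"]
```

One consequence of this design is that adding or removing a server changes the modulus, so many clients are remapped and lose the session held on their previous server.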

Option four ： Redis-based session storage

Advantages:

This is the approach most commonly used in enterprises
Spring provides spring-session, which only needs to be added as a dependency
Data is stored in Redis with seamless integration, without the security risks of client-side storage
Redis can be clustered with master-slave replication and is easy to manage

Disadvantages:

One extra network call, because the web container has to access Redis
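
The shape of this approach can be sketched as follows: every web server reads and writes sessions through one shared store, so any server can serve any request. To keep the example self-contained, `InMemoryStore` is a hypothetical stand-in for Redis that only imitates "set a key with an expiry" and "get a key"; it is not a Redis client, and `SessionManager` and the `session:` key prefix are likewise assumptions.

```python
import json
import time
import uuid


class InMemoryStore:
    """Hypothetical stand-in for a shared store such as Redis."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, ttl_seconds):
        # Remember the value together with its expiry time.
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key):
        item = self._data.get(key)
        if item is None or item[1] < time.monotonic():
            self._data.pop(key, None)  # drop expired entries lazily
            return None
        return item[0]


class SessionManager:
    """Session access as one web server would see it."""

    def __init__(self, store, ttl_seconds=1800):
        self.store = store
        self.ttl = ttl_seconds

    def create(self, user):
        session_id = uuid.uuid4().hex
        self.store.set("session:" + session_id, json.dumps(user), self.ttl)
        return session_id

    def load(self, session_id):
        raw = self.store.get("session:" + session_id)
        return json.loads(raw) if raw is not None else None
```

Two `SessionManager` instances that share one store behave like two web servers behind a load balancer: a session created through one can be loaded through the other, at the price of one store access per request.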

Eighteen 、 Load balancing ： algorithms 、 implementation 、 and solutions for hundred-million-scale load in detail

Load balancing (Load Balance) means distributing load (such as access requests from the front end) evenly, by means of a load balancing algorithm, across multiple operating units (servers, middleware) for execution. It is a fundamental solution for high performance, single points of failure (high availability), and scalability (horizontal scaling). In other words, load balancing is a technique that both high availability and high concurrency rely on.

What load balancing provides:

1、 Higher throughput, relieving concurrency pressure (high performance);

2、 Failover (high availability);

3、 Site scalability by adding or removing servers (scalability);

4、 Security protection (filtering on the load balancing device, blacklists and whitelists, and so on).

Hardware load balancing

Implemented with hardware such as F5, A10, and Citrix NetScaler.

Software load balancing

Implemented with software such as LVS, Nginx, and HAProxy.
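
As an example of a load balancing algorithm, here is a sketch of smooth weighted round robin, which spreads requests in proportion to server weights while interleaving the heavier server instead of sending it a burst. It is written as a generic illustration and is not taken from any of the products listed above; the class name and the weights are assumptions.

```python
class SmoothWeightedRoundRobin:
    """Pick servers in proportion to their weights, interleaved smoothly."""

    def __init__(self, weights):
        self.weights = dict(weights)            # server -> configured weight
        self.current = {s: 0 for s in weights}  # server -> running score
        self.total = sum(self.weights.values())

    def pick(self):
        # Every server gains its weight, the highest score wins,
        # and the winner pays back the total weight.
        for server, weight in self.weights.items():
            self.current[server] += weight
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= self.total
        return chosen


# Hypothetical weights: server "a" should get 5 of every 7 requests.
balancer = SmoothWeightedRoundRobin({"a": 5, "b": 1, "c": 1})
```

Over any window of 7 consecutive picks with these weights, "a" is chosen 5 times and "b" and "c" once each.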

Nineteen 、 Implementation principles of distributed consistency protocols

Classification of consistency

Strong consistency
- Description: the cluster state is guaranteed to change immediately after a change is committed.
- Models:
  - Paxos
  - Raft (multi-paxos)
  - ZAB (multi-paxos)
Weak consistency
- Description: also called eventual consistency. The system does not guarantee that the cluster state changes immediately after a change is committed, but the state becomes consistent over time.
- Models:
  - DNS system
  - Gossip protocol

Each node in a Redis Cluster maintains its own view of the current state of the whole cluster, which mainly includes:

The current cluster state
The slots each node in the cluster is responsible for, and their migration state
The master-slave state of each node in the cluster
The liveness of each node in the cluster and the votes on unreachable nodes

Redis Cluster is decentralized: nodes synchronize state with each other through the gossip protocol. The cluster uses the following message types:

Meet: with the 「cluster meet ip port」 command, a node of the existing cluster sends an invitation to a new node to join the cluster.
Ping: every second, a node sends ping messages to other nodes in the cluster. The message carries the address, slots, state information, last communication time, and so on of two nodes it knows about.
Pong: a node that receives a ping replies with a pong, which likewise carries information about two known nodes.
Fail: when a node cannot reach another node with ping, it broadcasts to all nodes in the cluster that the node is down. The other nodes mark that node as offline when they receive the message.
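
How gossip leads to eventual consistency can be shown with a small simulation: each node repeatedly exchanges its view with one random peer, and the views converge without any central coordinator. This is a simplified sketch rather than the Redis Cluster implementation. In particular, it sends the whole view in each exchange instead of information about two nodes, and the `Node` class and version numbers are assumptions.

```python
import random


class Node:
    def __init__(self, name):
        self.name = name
        # view: node name -> (version, status); the higher version wins.
        self.view = {name: (1, "ok")}

    def merge(self, other_view):
        for node, (version, status) in other_view.items():
            if node not in self.view or self.view[node][0] < version:
                self.view[node] = (version, status)


def gossip_round(nodes, rng):
    """Each node exchanges views with one random peer (ping/pong style)."""
    for node in nodes:
        peer = rng.choice([n for n in nodes if n is not node])
        peer.merge(node.view)  # the "ping" carries the sender's view
        node.merge(peer.view)  # the "pong" carries the peer's view back


def converge(nodes, rng, max_rounds=100):
    """Gossip until every node holds the same view; return the round count."""
    for round_no in range(1, max_rounds + 1):
        gossip_round(nodes, rng)
        if all(n.view == nodes[0].view for n in nodes):
            return round_no
    return None
```

Marking a node as failed in one view with a higher version number and calling `converge` again spreads that status to every node in the same way, which mirrors how a Fail notice propagates.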

Copyright notice
This article was written by [Hello，C++！]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/182/202207010440195184.html
