Distributed - summary list
2022-07-01 04:49:00 【Hello,C++!】
One. Service splitting principles
- Single responsibility
- Moderate service granularity
- Take the team structure into account
- Start from the business model
- Split evolutionarily
- Avoid circular and bidirectional dependencies
1. The people perspective
Everyone maintained a single project named Denali, a monster with millions of lines of code (although it was physically deployed as separate parts). From release to launch, hundreds of people developed in the same project at the same time, and once something went wrong online, all the code had to be rolled back. From the personnel perspective, this had basically been endured to the limit.
2. The business perspective
Taobao contains many businesses: users, products, transactions, payments, and so on. Early on, all the code lived in the single Denali project, and the code base was seriously hurting business efficiency. Each business has its own requirements and needs its own application deployment and its own development schedule.
3. The architecture perspective
On the database side, the centralized Oracle architecture had hit a bottleneck: the number of connections was limited (the Oracle database provided about 5,000 connections), and database CPU usage had reached 90%. The database side also needed vertical splitting.
4. There was an urgent need to move into the era of large-scale distribution, and the first step was to split the application
Three types of clusters:
- Load balancing clusters, abbreviated LBC
- High-availability clusters, abbreviated HAC
- High-performance computing clusters, abbreviated HPC
Two. How single sign-on (SSO) works
Single sign-on (SSO) means, put simply, that in a multi-system environment a user who has logged in once does not have to log in again to the other systems. One login is trusted by all the other systems.
The essence of single sign-on is sharing login state among multiple application systems. If the user's login state is recorded in the Session, then sharing login state means sharing the Session first.
Implementation principle
The principle is relatively simple: SSO is implemented with a shared cookie. sso-server stores user tickets in Redis, and app-a and app-b use a Spring interceptor to filter user requests. Every request has its ticket verified against sso-server; if verification fails, the user is redirected to the login page (with the source URL attached).
SSO uses a client/server architecture. Let's first look at the functions that sso-client and sso-server need to implement (below, "SSO authentication center" means sso-server).
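The ticket check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain dict stands in for Redis, and the names `SsoServer`, `issue_ticket`, `verify` and `intercept`, as well as the `/login?redirect=` URL, are hypothetical and do not come from any real SSO library.

```python
import secrets
from urllib.parse import quote


class SsoServer:
    """Hypothetical sso-server; a dict stands in for the Redis ticket store."""

    def __init__(self):
        self._tickets = {}  # ticket -> user name

    def issue_ticket(self, user):
        ticket = secrets.token_hex(16)
        self._tickets[ticket] = user
        return ticket

    def verify(self, ticket):
        # Returns the user for a valid ticket, or None.
        return self._tickets.get(ticket)


def intercept(sso, cookies, request_url):
    """Plays the role of the interceptor in app-a / app-b."""
    user = sso.verify(cookies.get("ticket", ""))
    if user is None:
        # Verification failed: redirect to login, attaching the source URL.
        return ("redirect", "/login?redirect=" + quote(request_url, safe=""))
    return ("ok", user)


sso = SsoServer()
ticket = sso.issue_ticket("alice")
print(intercept(sso, {"ticket": ticket}, "http://app-a/orders"))
print(intercept(sso, {}, "http://app-b/cart"))
```

Because both applications ask the same server about the same ticket, a ticket issued after one login is accepted by app-a and app-b alike.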
Differences between CAS and OAuth2
1.
CAS protects the user's resources on the client side during single sign-on.
OAuth2 protects the user's resources on the server side.
2.
With CAS, what the client ultimately needs to learn is whether this user is allowed to access my (the CAS client's) resources.
With OAuth2, what is ultimately learned is whether I (the OAuth2 service provider) may let you (the OAuth2 client) access the user's resources.
3.
In CAS single sign-on, all resources are on the client side, not on the CAS server. After the user gives the CAS server a user name and password, the CAS client knows nothing about it. The client is only handed a service ticket (ST), so it cannot tell whether that ST was forged by the user or is really valid. It therefore takes the ST to the server and asks whether the ST this user presented is valid, and lets the user in only if it is.
In OAuth2 authentication, the resources all sit on the OAuth2 service provider's side, and the client wants to obtain the user's resources. In the safest mode, after the user authorizes, the server cannot return the token directly in the redirect to the client, because the token could be intercepted by an attacker, and the user's resources would then be exposed to that attacker. So the server sends an authorization code to the client (via the redirect). The client then, in the background and over HTTPS, uses this code together with a secret agreed in advance between client and server to obtain the token and the refresh token. This process is very safe: an attacker who intercepts the code does not have the pre-agreed secret and cannot get the token. In this way OAuth2 can guarantee that the resource request was approved by the user and that the client is also recognized, so the resources can safely be sent to this client.
Summary: the biggest difference between the CAS login flow and the OAuth2 flow is whether a pre-agreed secret is needed when the ST or the code is taken back for verification.
Three. Distributed transactions: solutions
There are several solutions for distributed transactions:
- Global messages
- Distributed transactions based on a reliable message service
- TCC
- Best-effort notification
1. Two-phase commit / XA
An XA transaction consists of one or more resource managers (RM), a transaction manager (TM), and an application program (AP).
Phase one (prepare): all participants (RMs) prepare to execute the transaction and lock the resources it needs. When a participant is ready, it reports to the TM that it is ready.
Phase two (commit/rollback): once the transaction manager confirms that all participants (RMs) are ready, it sends the commit command to all of them. If any participant fails to prepare, it sends rollback instead.
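The two phases can be illustrated with a small in-memory sketch. It is not an XA implementation: `Participant` and `two_phase_commit` are hypothetical names, and a real resource manager would lock rows and persist its state rather than flip a string.

```python
class Participant:
    """Hypothetical resource manager (RM)."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.state = "init"

    def prepare(self):
        # Phase one: lock resources and report readiness to the TM.
        self.state = "prepared" if self.can_prepare else "failed"
        return self.can_prepare

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    """Hypothetical transaction manager (TM) driving both phases."""
    votes = [p.prepare() for p in participants]  # ask every RM
    if all(votes):
        for p in participants:
            p.commit()      # phase two: everyone is ready, so commit
        return True
    for p in participants:
        p.rollback()        # phase two: at least one RM failed, so roll back
    return False


ok = two_phase_commit([Participant("orders"), Participant("stock")])
bad = two_phase_commit([Participant("orders"), Participant("stock", can_prepare=False)])
print(ok, bad)
```

Note that every participant holds its resources from `prepare` until the TM's decision arrives, which is why XA is described below as locking resources for a long time.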
2. SAGA
Saga is a scheme proposed in the database paper "Sagas". Its core idea is to split a long transaction into multiple short local transactions coordinated by a Saga transaction coordinator. If every step completes normally, the transaction completes normally; if a step fails, the compensating operations are invoked in reverse order.
Characteristics of SAGA transactions:
- High concurrency: resources are not locked for a long time as they are in an XA transaction
- Both the normal operation and the compensating operation must be defined, so there is more development work than with XA
- Consistency is weak: for a transfer, user A may already have been debited when the transfer ultimately fails
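The reverse-order compensation can be sketched as below. `run_saga` and the account functions are hypothetical and everything lives in one process; a real Saga coordinator would also persist progress so that compensation survives a crash.

```python
def run_saga(steps):
    """steps: list of (action, compensation) pairs of zero-argument callables."""
    done = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            # A step failed: undo the completed steps in reverse order.
            for undo in reversed(done):
                undo()
            return False
        done.append(compensation)
    return True


balances = {"A": 100, "B": 0}


def debit_a():
    balances["A"] -= 30


def refund_a():
    balances["A"] += 30


def credit_b():
    raise RuntimeError("account B is frozen")


def undo_credit_b():
    balances["B"] -= 30


result = run_saga([(debit_a, refund_a), (credit_b, undo_credit_b)])
print(result, balances)
```

Between `debit_a` and `refund_a`, other readers can see A's balance at 70. That window is the weak consistency mentioned in the list above.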
3. TCC
The concept of TCC (Try-Confirm-Cancel) was first proposed by Pat Helland in the 2007 paper "Life beyond Distributed Transactions: an Apostate's Opinion".
TCC is divided into 3 phases
- Try phase: attempt execution, complete all business checks (consistency), and reserve the necessary business resources (quasi-isolation)
- Confirm phase: actually execute the business without any business checks, using only the business resources reserved in the Try phase. The Confirm operation must be idempotent, and Confirm must be retried after a failure.
- Cancel phase: cancel execution and release the business resources reserved in the Try phase. Exceptions in the Cancel phase are handled in basically the same way as in the Confirm phase, and Cancel must also be idempotent.
The characteristics of TCC are as follows:
- High concurrency, with no long-held resource locks.
- A large amount of development work: Try/Confirm/Cancel interfaces must be provided.
- Good consistency: the SAGA situation where the transfer fails after the debit does not occur
- TCC suits order-type business and business with constraints on intermediate states
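A minimal sketch of the three phases for a single participant, assuming a hypothetical `Account` class that reserves funds per transaction id. It is meant to show the reservation and the idempotent Confirm/Cancel, not a full TCC framework.

```python
class Account:
    """Hypothetical TCC participant holding a balance and reserved funds."""

    def __init__(self, balance):
        self.balance = balance
        self.frozen = {}  # transaction id -> reserved amount

    def try_debit(self, tx_id, amount):
        # Try: run the business check and reserve the resource.
        if self.balance - sum(self.frozen.values()) < amount:
            return False
        self.frozen[tx_id] = amount
        return True

    def confirm(self, tx_id):
        # Confirm: use only what Try reserved; a retry finds nothing left to do.
        amount = self.frozen.pop(tx_id, 0)
        self.balance -= amount

    def cancel(self, tx_id):
        # Cancel: release the reservation; a retry finds nothing left to do.
        self.frozen.pop(tx_id, None)


acct = Account(100)
if acct.try_debit("tx-1", 30):
    acct.confirm("tx-1")
    acct.confirm("tx-1")  # a retried Confirm changes nothing
print(acct.balance)
```

Because the money is only reserved during Try, the balance is never debited for a transfer that later fails, which is the consistency advantage over SAGA noted above.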
4. Local message table | Transactional messages
The core of the design is to use messages to asynchronously guarantee the execution of tasks that need distributed processing.
Characteristics of the local message table:
- A long transaction only needs to be split into multiple tasks, so it is easy to use
- Producers need to create an additional message table
- Each local message table needs to be polled
- If the consumer's logic still cannot succeed after retries, an additional mechanism is needed to roll back the operation
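The idea can be sketched with SQLite standing in for the producer's database. The table names `orders` and `outbox` and the functions `create_order` and `poll_outbox` are assumptions made for this illustration; `publish` represents whatever message queue client is actually used.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
db.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT, sent INTEGER DEFAULT 0)"
)


def create_order(item):
    # The business row and the message row commit in one local transaction.
    with db:
        db.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        db.execute("INSERT INTO outbox (payload) VALUES (?)", ("order created: " + item,))


def poll_outbox(publish):
    # Polling job: deliver unsent messages; rows that fail stay unsent for the next poll.
    rows = db.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for msg_id, payload in rows:
        try:
            publish(payload)
        except Exception:
            continue
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (msg_id,))


delivered = []
create_order("book")
poll_outbox(delivered.append)
print(delivered)
```

A message can be published and then fail to be marked as sent, so delivery is at least once and the consumer side should be idempotent.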
5. Best-effort notification
The party initiating the notification makes its best effort, through some mechanism, to inform the receiver of the business processing result:
- Provide an interface through which the notification receiver can query the business processing result
- Message queue ACK mechanism: the message queue re-notifies at intervals of 1min, 5min, 10min, 30min, 1h, 2h, 5h and 10h, gradually increasing the interval until the time window required by the notification is reached, after which no more notifications are sent
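The retry schedule above can be sketched as a small function. It assumes the first attempt happens immediately and simulates elapsed time instead of sleeping; `notify_with_retries` and `send` are hypothetical names, and `send` returns True when the receiver acknowledges.

```python
# Retry intervals from the text, in seconds: 1min, 5min, 10min, 30min, 1h, 2h, 5h, 10h.
INTERVALS = [60, 300, 600, 1800, 3600, 7200, 18000, 36000]


def notify_with_retries(send, window_seconds, intervals=INTERVALS):
    """Return the simulated elapsed time at which `send` was acknowledged, or None."""
    elapsed = 0
    if send():
        return elapsed
    for wait in intervals:
        elapsed += wait
        if elapsed > window_seconds:
            break  # past the notification window: give up
        if send():
            return elapsed
    return None


attempts = iter([False, False, True])
print(notify_with_retries(lambda: next(attempts), window_seconds=24 * 3600))
```

When the function returns None, the receiver is expected to fall back on the query interface from the first bullet.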
6. AT transaction mode (the open-source distributed transaction solution Seata)
This is a transaction mode in Alibaba's open-source project Seata; at Ant Financial it is also called FMT. Its advantage is that it is used much like XA mode: the business does not need to write compensating operations, and rollback is done automatically by the framework. Its disadvantage is also similar to XA: locks are held for a relatively long time, so it does not suit high-concurrency scenarios. Interested readers can refer to seata-AT.
Seata provides users with the AT, TCC, SAGA and XA transaction modes, offering a one-stop distributed solution.
Four. The four isolation levels of a database
A database has four isolation levels:
- Read uncommitted
At this level, while one transaction is modifying a row, other transactions may not modify that row but may read it.
So at this level lost updates do not occur, but dirty reads and non-repeatable reads do.
- Read committed
At this level, an uncommitted write transaction blocks other transactions from accessing the row, so dirty reads do not occur; but a transaction that is reading the data allows other transactions to access the row, so non-repeatable reads do occur.
- Repeatable read
At this level, a read transaction blocks write transactions but allows other read transactions, so the same transaction never reads different data on two reads (a non-repeatable read); a write transaction blocks all other transactions.
- Serializable
This level requires all transactions to execute serially, so it avoids every problem caused by concurrency, but it is inefficient.
Five. CAP theory
CAP theory states that a distributed system can satisfy at most two of the three requirements C, A and P.
The meaning of CAP:
- C: Consistency
Whether multiple copies of the same data are identical in real time.
- A: Availability
The system counts as available if it returns a definite result within a certain time.
- P: Partition tolerance
The same service is distributed across multiple systems, so that when one system goes down, other systems still provide the same service.
BASE theory
CAP theory tells us an unfortunate fact that we have to accept: we can only choose two of C, A and P. For business systems, we often sacrifice consistency in exchange for availability and partition tolerance. Note, however, that "sacrificing consistency" does not mean giving up data consistency entirely; it means giving up strong consistency in exchange for weak consistency. This leads to BASE theory.
- BA: Basically Available
  - When some force majeure occurs, the system as a whole can still guarantee "availability", that is, it can still return a definite result within a certain time. The differences between "basically available" and "highly available" are:
    - "A certain time" can be extended appropriately
    During a big promotion, the response time can be extended appropriately.
    - A degraded page is returned to some users
    Returning a degraded page directly to some users relieves pressure on the servers. Note that returning a degraded page still counts as returning a definite result.
- S: Soft State
The states of different copies of the same data do not need to be consistent in real time.
- E: Eventual Consistency
The states of different copies of the same data do not need to be consistent in real time, but they must be consistent after a certain period of time.
Six. Implementing distributed locks
1: SQL optimization
update stock set num=num-1 where id = #{id};
Pessimistic lock based on MySQL
select * from stock where id=#{id} for update;
Optimistic lock based on MySQL
select num,version from stock where id=#{id};
update stock set num=new_num, version=version+1 where id=#{id} and version=#{version};
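The version check in the optimistic-lock statements above can be tried out with SQLite standing in for MySQL. The `stock` table follows the SQL in the text; `read_stock` and `write_stock` are hypothetical helper names, and the `#{...}` placeholders become `?` parameters.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE stock (id INTEGER PRIMARY KEY, num INTEGER, version INTEGER)")
db.execute("INSERT INTO stock VALUES (1, 10, 0)")
db.commit()


def read_stock(stock_id):
    return db.execute(
        "SELECT num, version FROM stock WHERE id = ?", (stock_id,)
    ).fetchone()


def write_stock(stock_id, new_num, seen_version):
    # The update only matches if nobody bumped the version since it was read.
    with db:
        cur = db.execute(
            "UPDATE stock SET num = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_num, stock_id, seen_version),
        )
    return cur.rowcount == 1


num_a, ver_a = read_stock(1)  # buyer A reads version 0
num_b, ver_b = read_stock(1)  # buyer B reads version 0 as well
print(write_stock(1, num_a - 1, ver_a))  # A's update matches and bumps the version
print(write_stock(1, num_b - 1, ver_b))  # B's version is stale, so B must re-read and retry
```

No row is locked while the buyers are thinking; the conflict is detected only at write time through the affected-row count.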
2: Distributed lock based on Redis
Redis natively provides the setnx command, which is guaranteed to be atomic: it sets the specified value for a key only when that key does not exist.
3: Redisson in Java
In Java code, the Redisson package can be used to implement distributed locks.
4: Distributed lock based on ZooKeeper
Database lock:
Advantage: it uses the database directly, so it is simple to use.
Disadvantage: most bottlenecks in distributed systems are in the database, and database locks add to the database's load.
Cache lock:
Advantage: high performance and fairly easy to implement. A cache lock is recommended when an occasional lock failure is acceptable and does not affect normal use of the system.
Disadvantage: the lock timeout mechanism is not very reliable. If a thread holds the lock and its processing takes too long, the lock times out and no longer protects anything.
ZooKeeper lock:
Advantage: it does not rely on a timeout to release the lock and is highly reliable. A ZooKeeper lock is recommended when the system needs high reliability.
Disadvantage: performance is lower than a cache lock, because nodes must be created and deleted frequently.
Seven. Data consistency techniques for distributed databases | Data-layer middleware
Service discovery based on ZooKeeper (CP)
Service discovery based on Eureka (AP)
Distributed data-layer middleware
- Dynamic data sources
- Read/write separation
- Distributed unique primary key generator
- Database and table sharding
- Connection pool and SQL monitoring
- Dynamic configuration, etc.
1.TDDL
Taobao developed the TDDL framework according to its own business characteristics. It mainly solves application-transparent database and table sharding and data replication between heterogeneous databases. It is a JDBC datasource implementation based on centralized configuration.
Features
Implements dynamic data sources, read/write separation, and database and table sharding.
Disadvantages
The sharding feature is not yet open source, few documents have been published so far, and it depends on diamond (a persistent configuration management system used internally at Taobao)
2.DRDS
Alibaba Distributed Relational Database Service (DRDS) is an online distributed database service offering horizontal splitting, smooth scale-out and scale-in, and read/write separation.
Its predecessor is Taobao's TDDL, and DRDS is the next generation. It integrates with cloud services, is a paid commercial offering that merges Cobar and TDDL, and is the preferred choice.
3.Atlas
Atlas is a data middle-tier project based on the MySQL protocol, developed and maintained by the infrastructure team of the Web Platform Department at Qihoo 360.
It is built on the officially released MySQL-Proxy 0.8.2, with many bugs fixed and many features added. The project is widely used inside 360; many MySQL services are connected to the Atlas platform, which carries billions of read and write requests every day.
Main functions:
1. Read/write separation
2. Load balancing across slaves
3. IP filtering
4. Automatic table sharding
5. DBAs can smoothly bring DBs online and offline
6. Automatic removal of failed DBs
4.MTDDL (Meituan Distributed Data Layer)
The distributed data access layer middleware of Meituan-Dianping
Features
Implements dynamic data sources, read/write separation, and database and table sharding, similar to TDDL.
The following takes MTDDL as an example (Taobao's TDDL can also serve as a reference) to explain the architecture design of distributed data-layer middleware in detail.
Eight. ZooKeeper principles and architecture design
The ZooKeeper distributed service framework began as a subproject of Apache Hadoop. It is mainly used to solve data management problems that distributed applications often encounter, such as:
- Unified naming service
- State synchronization service
- Cluster management
- Management of distributed application configuration items
Basic principles and architecture of ZooKeeper
1. ZooKeeper roles
» Leader: responsible for initiating and resolving votes and for updating the system state.
» Learner: includes followers and observers. A follower accepts client requests, returns results to the client, and takes part in voting.
» Observer: can accept client connections and forwards write requests to the leader, but does not take part in voting and only synchronizes the leader's state. Observers exist to scale the system out and improve read speed.
» Client: the originator of requests
At the core of ZooKeeper is atomic broadcast, a mechanism that keeps the servers in sync. The protocol that implements it is called the Zab protocol.
The Zab protocol has two modes: recovery mode (leader election) and broadcast mode (synchronization).
Every server is in one of three states while working:
LOOKING: the server does not know who the leader is and is searching for it
LEADING: the server is the elected leader
FOLLOWING: a leader has been elected and the server is synchronizing with it
1. Distributed coordination technology
2. Implementing distributed locks
The ZooKeeper data model: Znode
In the ZooKeeper namespace, a Znode has the characteristics of both a file and a directory. Like a file, it maintains data, metadata, ACLs, timestamps and other data structures; like a directory, it can serve as part of a path. Each node in the figure is called a Znode. Every Znode consists of 3 parts:
① stat: the status information, describing the Znode's version, permissions and so on
② data: the data associated with the Znode
③ children: the child nodes under the Znode
(4) Node types
**① Ephemeral nodes:** The life cycle of such a node depends on the session that created it. Once the session ends, the ephemeral node is deleted automatically; it can of course also be deleted manually. Although every ephemeral Znode is bound to a client session, it is still visible to all clients. In addition, ZooKeeper's ephemeral nodes are not allowed to have child nodes.
**② Persistent nodes:** The life cycle of such a node does not depend on the session; it is deleted only when a client explicitly performs a delete operation.
Sequential nodes
Operations in the ZooKeeper service
Nine. Distributed globally unique ID solutions in detail
An ID is the unique identifier of a piece of data. The traditional approaches are UUIDs and database auto-increment IDs. Most Internet companies use MySQL and, because they need transaction support, usually the InnoDB storage engine. UUIDs are too long and unordered, so they are not suitable as InnoDB primary keys; auto-increment IDs are more suitable. But as the business grows, the data volume keeps increasing and the tables have to be sharded. After sharding, each table's IDs increase at their own pace, so ID conflicts are very likely.
Ten. Comparison of microservice configuration centers
1. Disconf
A configuration management center open-sourced by Baidu in July 2014. It also provides configuration management, but it is no longer maintained; the most recent commit was two years ago.
2. Spring Cloud Config
Open-sourced in September 2014. It is a Spring Cloud ecosystem component and integrates seamlessly with the Spring Cloud stack.
3. Apollo
A configuration management center open-sourced by Ctrip in May 2016, with features such as standardized permission and process governance.
4. Nacos
A configuration center open-sourced by Alibaba in June 2018. It can also do DNS-based and RPC-based service discovery.
Comparison of core configuration center concepts
Applications, clusters, grayscale release, permission management, version management & rollback, multiple environments, and real-time configuration push
6 Deployment structure & high availability comparison
Spring Cloud Config
Spring Cloud Config consists of three components: config-server, Git and Spring Cloud Bus:
- config-server serves configuration to clients;
- Git stores the configuration and is where it is modified;
- Spring Cloud Bus notifies clients of configuration changes;
Apollo
Apollo is divided into four modules: MySQL, Config Service, Admin Service and Portal:
- MySQL stores Apollo's metadata and the users' configuration data;
- Config Service provides configuration reads, pushes and similar functions; all client requests land on Config Service;
- Admin Service provides configuration changes, releases and similar functions; it is the service that Portal operates on;
- Portal provides the configuration management interface for users;
Apollo supports Key-Value configuration along four dimensions
- Application: the application that actually uses the configuration. The Apollo client needs to know at runtime which application it is running in so that it can fetch the corresponding configuration. Each application has its own identifier, the appId, which must be configured in the code
- Environment: the environment the configuration belongs to. The Apollo client needs to know which environment the current application is in so that it can fetch that environment's configuration. The environment is independent of the code: the same code deployed in different environments should fetch different configuration. By default the environment is specified by configuration on the machine (the env property in server.properties)
- Cluster: a grouping of different instances of one application, for example by data center, with the instances in the Shanghai machine room as one cluster and those in the Shenzhen machine room as another. The same configuration item can have different values for different clusters. By default the cluster is specified by configuration on the machine (the idc property in server.properties)
- Namespace: a grouping of different configurations under one application, that is, a collection of configuration items. A namespace can simply be thought of as a (configuration) file, with different types of configuration stored in different files, such as the database configuration file, the RPC configuration file, and the application's own configuration file. An application can directly read the namespaces of public components, such as DAL and RPC; it can also inherit a public component's namespace in order to adjust that component's configuration, for example DAL's initial number of database connections
Main features:
- Unified management of configuration for different environments and different clusters
- Configuration changes take effect in real time (hot release)
- Release version management
- Grayscale release
- Permission management, release review, and operation auditing
- Monitoring of client configuration information
- Native Java and .Net clients
- An open platform API
- Simple deployment with few dependencies
Nacos
A Nacos deployment requires the Nacos service and MySQL:
- Nacos serves external requests and supports configuration management and service discovery;
- MySQL provides persistent storage for Nacos data;
Configuration center comparison
There are many configuration centers on the market. This article selects a few widely used ones and compares them on key points, as shown in the following table
Feature | spring-cloud-config | ctrip-apollo | disconf |
---|---|---|---|
Grayscale release | Not supported | Supported | Partial updates not supported |
Alert notification | Not supported | Supported | Supported |
Instance configuration monitoring | Requires springadmin | Supported | Supported |
Time for configuration to take effect | Takes effect via refresh | Real time | Real time |
Configuration update push | Manually triggered | Supported | Supported |
Scheduled configuration pull | None | Supported | Relies on event-driven updates |
Local configuration cache | None | Supported | Supported |
Spring Boot support | Native | Supported | Not supported |
Spring Cloud support | Native | Supported | Not supported |
Business intrusiveness | Low | Low | Low; supports annotations and XML |
Unified management | None; operated through Git | Unified interface | Unified interface |
Eleven. Distributed transactions: 2PC and 3PC principles, TCC transactions
Common solutions for distributed transactions:
- 2PC, the two-phase commit protocol
- 3PC, the three-phase commit protocol (addresses shortcomings of the two-phase commit protocol)
- TCC or GTS (Alibaba)
- Eventual consistency through message middleware
- Use LCN to solve distributed transactions; its motto is "LCN does not produce transactions; LCN is just the porter of local transactions".
Twelve. Database and table sharding
1. Vertical splitting
Vertical splitting comes in two kinds: vertical database splitting and vertical table splitting.
Advantages of vertical splitting:
- Removes coupling at the business system level and makes the business clearer
- Similar to microservice governance, the data of different businesses can be managed, maintained, monitored and extended separately
- In high-concurrency scenarios, vertical splitting relieves IO, database connection and single-machine hardware bottlenecks to some extent
Disadvantages:
- Some tables can no longer be joined; this can only be solved through interface aggregation, which increases development complexity
- Distributed transaction processing is complex
- A single table can still hold too much data (horizontal splitting is needed)
2. Horizontal splitting
When an application is hard to split vertically at a finer granularity, or the number of rows after splitting is still huge and a single database hits read/write and storage performance bottlenecks, horizontal splitting is needed.
Advantages of horizontal splitting:
- No single database holds too much data or suffers high-concurrency bottlenecks, which improves system stability and load capacity
- Little change on the application side; business modules do not need to be split
Disadvantages:
- Transaction consistency across shards is hard to guarantee
- Cross-database join queries perform poorly
- Repeated data expansion is difficult and the maintenance effort is large
3. Typical data sharding rules:
By value range
By value modulo
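The two rules can be sketched as routing functions. The range boundaries and the shard count of 4 are made-up values for illustration, and `shard_by_range` and `shard_by_modulo` are hypothetical names.

```python
import bisect

# Hypothetical range rule: exclusive upper bound of each shard's id range.
RANGE_BOUNDS = [1_000_000, 2_000_000, 3_000_000]


def shard_by_range(record_id):
    # ids 0..999999 -> shard 0, 1000000..1999999 -> shard 1, and so on.
    index = bisect.bisect_right(RANGE_BOUNDS, record_id)
    if index >= len(RANGE_BOUNDS):
        raise ValueError("id is beyond the last configured range")
    return index


def shard_by_modulo(record_id, shard_count=4):
    return record_id % shard_count


print(shard_by_range(1_500_000), shard_by_modulo(1_500_000))
```

A range rule keeps neighbouring ids together and can grow by appending a new range, at the cost of possible hot spots on the newest shard. A modulo rule spreads rows evenly, but changing `shard_count` moves most rows to a different shard.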
Problems caused by database and table sharding
1. Transaction consistency
2. Cross-node join queries
3. Cross-node paging, sorting, and functions
4. Avoiding duplicate global primary keys
5. Data migration and capacity expansion
Middleware that supports database and table sharding
Standing on the shoulders of giants saves a lot of effort. There are already some fairly mature open-source solutions for database and table sharding:
- sharding-jdbc (Dangdang)
- TSharding (Mogujie)
- Atlas (Qihoo 360)
- Cobar (Alibaba)
- MyCAT (based on Cobar)
- Oceanus (58.com)
- Vitess (Google)
Client mode and proxy mode
Whether in client mode or proxy mode, the core steps are the same: SQL parsing, rewriting, routing, execution, and result merging.
Thirteen. Cache architecture in large distributed systems
1. The advantages of CDN caching are shown in the figure below:
2. Reverse proxy cache
The reverse proxy sits in the application servers' machine room and handles all requests to the web servers.
If the page a user requests is cached on the proxy server, the proxy sends the cached content directly to the user.
If it is not cached, the proxy first sends a request to the web server, retrieves the data, caches it locally, and then sends it to the user. By reducing the number of requests to the web server, it lowers the web server's load.
**Application scenarios:** Generally only small static file resources are cached, such as CSS, JS and images.
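The hit and miss paths of a reverse proxy cache can be sketched as follows. `ReverseProxyCache` is a hypothetical in-memory class; it has no expiry or invalidation, which a real proxy would need.

```python
class ReverseProxyCache:
    """Hypothetical reverse proxy that caches origin responses in memory."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._cache = {}
        self.origin_requests = 0

    def get(self, path):
        if path in self._cache:
            return self._cache[path]  # hit: served directly by the proxy
        self.origin_requests += 1     # miss: ask the web server
        body = self._fetch(path)
        self._cache[path] = body      # cache locally before replying
        return body


proxy = ReverseProxyCache(lambda path: "contents of " + path)
proxy.get("/static/site.css")
proxy.get("/static/site.css")
print(proxy.origin_requests)
```

The second request for the same file never reaches the origin, which is how the proxy lowers the web server's load.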
3. Local application cache
The main features of Ehcache are shown in the figure below:
Guava Cache
**Basic introduction:** Guava Cache is one of the caching tools in Guava, Google's open-source library of reusable Java utilities.
4. Distributed cache
Memcached
Redis
Common problems mainly include the following:
- Data consistency
- Cache penetration
- Cache avalanche
- Cache high availability
- Cache hot spots
Fourteen. A brief introduction to distributed NoSQL
Features of TRDB (traditional relational database) technology:
(1) Uses strong storage schema technology. This refers specifically to the fact that database tables, rows and fields must all be defined in advance, along with the related attribute constraints.
(2) Uses the SQL standard to define and operate the database.
(3) Uses strong transactions to guarantee availability and security
(4) Mainly uses single-machine centralized processing (CP, Centralized Processing).
Features of NoSQL database technology:
(1) Uses weak storage schema technology
(2) Does not use the SQL standard to define and operate the database
(3) Uses weak transactions to guarantee data availability and security, or has no transaction mechanism at all.
(4) Mainly uses multi-machine distributed processing (DP, Distributed Processing).
Common distributed file systems
GFS, HDFS, Lustre, Ceph, GridFS, mogileFS, TFS, FastDFS and so on, each suited to different domains. None of them is a system-level distributed file system; they are application-level distributed file storage services.
Fifteen. Distributed relational database solutions
The key points of a distributed relational database are as follows:
- Database sharding
- Table sharding
- M-S (master-slave)
- Clustering
- Load balancing
- Programming interface (API)
1. MyCat
Features
- Supports read/write separation, with MySQL dual-master multi-slave and single-master multi-slave modes
- Supports global tables, with data automatically sharded to multiple nodes, for efficient table join queries
- Supports a unique sharding strategy based on E-R relationships, enabling efficient table join queries
- Automatic failover and high availability
- Provides a highly available data-sharding cluster
- Supports connecting to Oracle, DB2 and SQL Server through JDBC and presenting them as a MySQL Server
- Supports MySQL clusters and can be used as a proxy
- Developed on the basis of Alibaba's open-source Cobar product, inheriting Cobar's stability, reliability, excellent architecture and performance
2、Atlas
Atlas is a data middle-tier project based on the MySQL protocol, developed and maintained by the infrastructure team of the Web Platform Department at Qihoo 360. It builds on the officially released MySQL-Proxy 0.8.2, fixing a large number of bugs and adding many features. The project is widely used inside 360: many MySQL services are connected to the Atlas platform, which carries billions of read and write requests every day.
Features
- Read/write separation
- Load balancing across slave databases
- IP filtering
- SQL statement blacklist and whitelist
- Automatic table sharding
3、Cobar
Cobar is middleware that provides distributed services for relational databases (MySQL). It lets a traditional database scale out close to linearly while still looking like a single database, so it is transparent to applications.
- The product has run stably at Alibaba for more than 3 years.
- It has taken over more than 3,000 MySQL database schemas.
- The cluster handles more than 5 billion online SQL requests per day.
- The cluster handles online data traffic at the TB level or above per day.
4、MySQL Proxy
Overview
MySQL Proxy is a simple program that sits between the client and the MySQL server and can monitor, analyze or modify their communication. It is flexible and places no restrictions on how it is used. Common uses include load balancing, failover, query analysis, and query filtering and modification.
MySQL Proxy is a middle-tier proxy. Put simply, it acts as a connection pool that forwards connection requests from front-end applications to the back-end databases. With Lua scripts it can implement complex connection control and filtering, and thereby read/write separation and load balancing. To the application MySQL Proxy is completely transparent: the application only needs to connect to the port MySQL Proxy listens on. The proxy machine can of course become a single point of failure, but this can be avoided by running several redundant proxy machines and listing the connection parameters of all of them in the application server's connection pool configuration.
One of MySQL Proxy's more powerful functions is read/write separation. The basic principle is that the master database handles transactional queries and the slave databases handle SELECT queries. Database replication then synchronizes the changes made by transactional queries to the slave databases in the cluster.
Features
- Load balancing
- Read/write separation
- Table splitting is not supported
- Proxy-layer monitoring
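The read/write separation rule described above (transactional statements to the master, plain SELECTs to the slaves) can be sketched in a few lines. This is an illustration in Python, not MySQL Proxy's Lua scripting interface; `ReadWriteRouter` and the backend names are hypothetical.

```python
import itertools


class ReadWriteRouter:
    """Choose a backend per statement: writes and transactions go to the
    master, SELECTs outside a transaction are spread over the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over the slaves; fall back to the master if there are none.
        self._slaves = itertools.cycle(slaves or [master])
        self.in_transaction = False

    def route(self, sql):
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb in ("BEGIN", "START"):
            self.in_transaction = True
            return self.master
        if verb in ("COMMIT", "ROLLBACK"):
            self.in_transaction = False
            return self.master
        if verb == "SELECT" and not self.in_transaction:
            return next(self._slaves)
        return self.master


if __name__ == "__main__":
    router = ReadWriteRouter("master", ["slave1", "slave2"])
    for statement in ("SELECT 1", "UPDATE t SET a = 1", "SELECT 2"):
        print(statement, "->", router.route(statement))
```

Looking only at the first keyword is a simplification: a statement such as `SELECT ... FOR UPDATE` would need to go to the master, and replication lag means a read issued right after a write may not yet see that write on a slave.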
Sixteen 、 Distributed transaction solutions in detail
- Transaction: a transaction is a reliable, independent unit of work made up of a set of operations. Transactions have the ACID properties: atomicity, consistency, isolation and durability.
- Local transaction: a transaction managed locally by a single resource manager is called a local transaction. Its advantages are strict ACID support, efficiency, reliability, state that only needs to be maintained inside the resource manager, and a simple application programming model. However, a local transaction cannot handle distributed transactions, and its smallest unit of isolation is limited to the resource manager.
- Global transaction: a transaction managed globally by a global transaction manager is called a global transaction. The transaction manager manages the global transaction state and the participating resources, and coordinates a consistent commit or rollback across those resources.
- TX protocol: the interface between the application (or application server) and the transaction manager.
- XA protocol: the interface between the global transaction manager and the resource managers. XA is a distributed transaction specification proposed by the X/Open organization. It mainly defines the interface between the global transaction manager and the local resource managers, and mainstream database products implement it. The XA interface is a bidirectional system interface that acts as a communication bridge between the transaction manager and multiple resource managers. XA is needed because, in theory, two machines in a distributed system cannot reach a consistent state by themselves, so a single point is introduced to coordinate them. Transactions managed and coordinated by the global transaction manager can span multiple resources and processes. The global transaction manager generally uses the XA two-phase protocol to interact with the databases.
- AP: the application program, which can be understood as a program that uses DTP (Distributed Transaction Processing).
- RM: the resource manager, which can be a DBMS or a message server management system. The application controls resources through the resource manager, and the resources must implement the interface defined by XA. The resource manager is responsible for controlling and managing the actual resources.
- TM: the transaction manager, responsible for coordinating and managing transactions. It provides the programming interface to the AP and manages the resource managers. The transaction manager controls the global transaction, manages the transaction life cycle and coordinates the resources.
- Two-phase commit protocol: the mechanism XA uses to coordinate multiple resources in a global transaction. The TM and the RMs use two-phase commit to solve the consistency problem. Two-phase commit needs a coordinator (TM) that collects the operation results of all participant (RM) nodes and tells those nodes whether to finally commit. The limitations of two-phase commit are the cost of the protocol itself, the persistence cost of the prepare phase, the persistence cost of the global transaction state, the fragility caused by multiple potential failure points, and the isolation and recovery problems caused by a failure after prepare but before commit.
- BASE theory: BA stands for basically available, meaning partition failures are tolerated; S stands for soft state, meaning data may be out of sync for a short time; E stands for eventual consistency, meaning the data ends up consistent but is not consistent in real time. Atomicity and durability must always be guaranteed; to meet the needs of availability, performance and service degradation, only the requirements on consistency and isolation are relaxed.
- CAP theorem: a shared-data system can have at most two of the three CAP properties. Every pair has scenarios it suits, and real business systems are usually a mixture of ACID and CAP. The most important thing in a distributed system is to meet the business requirements, not to pursue highly abstract, absolute system properties. C stands for consistency: all users see the same data. A stands for availability: a usable copy of the data can always be found. P stands for partition tolerance: the system tolerates network interruptions and similar failures.
- Service patterns in flexible transactions:
- Queryable operation: the service operation has a globally unique identifier and a uniquely determined operation time.
- Idempotent operation: the business result of repeated calls is the same as that of a single call. One approach is to make the business operation itself idempotent. Another is for the system to cache every request together with its processing result and, when it detects a repeated request, return the previous result automatically.
- TCC operation: in the Try phase, the service attempts the business operation, completes all business checks (to achieve consistency) and reserves the necessary business resources (to achieve quasi-isolation). In the Confirm phase, the business is actually executed without any further checks, using only the resources reserved in the Try phase; the Confirm operation must be idempotent. In the Cancel phase, the execution is cancelled and the resources reserved in the Try phase are released; the Cancel operation must be idempotent. TCC differs from the 2PC (two-phase commit) protocol in several ways: TCC lives in the business service layer rather than the resource layer; TCC has no separate prepare phase, because the Try operation both operates on resources and prepares them; and the Try operation can choose the lock granularity of business resources flexibly. TCC costs more to develop than 2PC. TCC is in fact also a two-phase operation, but it is not the same as 2PC.
- Compensable operation: in the Do phase, the business processing is actually executed and its result is visible externally. In the Compensate phase, the business result of the forward operation is offset, fully or partially; the compensation operation must be idempotent. Constraints: the compensation must be feasible from a business point of view, and the risks and costs caused by the lack of isolation of business results, or by incomplete compensation, must be controllable. In fact, the Confirm and Cancel operations of TCC can be regarded as compensation operations.
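The TCC pattern from the list above can be sketched as a toy account service plus a coordinator. This is a minimal in-memory sketch under stated assumptions: `AccountService`, `run_tcc` and the transaction ids are hypothetical names, and a real implementation would also persist the transaction state and retry Confirm or Cancel after failures.

```python
class AccountService:
    """Toy TCC participant: Try freezes funds, Confirm spends them,
    Cancel releases them. Confirm and Cancel are idempotent."""

    def __init__(self, balance):
        self.balance = balance   # funds still available
        self.reserved = {}       # tx_id -> amount frozen by Try
        self.finished = {}       # tx_id -> "confirmed" or "cancelled"

    def try_(self, tx_id, amount):
        if tx_id in self.finished:
            return False         # transaction already ended: refuse a late Try
        if tx_id in self.reserved:
            return True          # duplicate Try: resources already reserved
        if amount > self.balance:
            return False         # business check failed
        self.balance -= amount   # reserve the resource
        self.reserved[tx_id] = amount
        return True

    def confirm(self, tx_id):
        if tx_id not in self.reserved:
            return               # repeated or unknown Confirm: nothing to do
        del self.reserved[tx_id]  # the frozen funds are now spent
        self.finished[tx_id] = "confirmed"

    def cancel(self, tx_id):
        if tx_id not in self.reserved:
            self.finished.setdefault(tx_id, "cancelled")
            return               # repeated Cancel, or Cancel without a Try
        self.balance += self.reserved.pop(tx_id)  # release the frozen funds
        self.finished[tx_id] = "cancelled"


def run_tcc(tx_id, steps):
    """steps is a list of (participant, amount). Confirm everything if every
    Try succeeds, otherwise cancel whatever was reserved."""
    tried = []
    for participant, amount in steps:
        if not participant.try_(tx_id, amount):
            for reserved_participant in tried:
                reserved_participant.cancel(tx_id)
            return False
        tried.append(participant)
    for participant in tried:
        participant.confirm(tx_id)
    return True
```

Because Confirm and Cancel check the recorded state before acting, the coordinator can safely call them again after a timeout, which is the reason the text requires them to be idempotent.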
Seventeen 、 Distributed session solutions in detail
Option one: client-side storage
Store the information directly in a cookie.
A cookie is a small piece of data stored on the client. The client exchanges cookies with the server over the HTTP protocol. Cookies are usually used to store non-sensitive information.
Disadvantages:
- The data is stored on the client, which is a security risk
- Cookies are limited in size and in the type of data they can hold
- Because the data travels in the cookie, an oversized cookie adds network overhead to every request
Option two: session replication
Session replication is a session management mechanism used mostly by server clusters in small enterprise applications; it is not used much in real development. It is set up by configuring a cluster on the web servers (for example Tomcat).
The problem:
- Sessions are synchronized by broadcasting within the same LAN and replicating asynchronously. As the number of servers grows, the volume of session data to synchronize becomes large, because every server has to hold the sessions of all the other servers. This adds network overhead, and with a very large number of users the servers can run out of memory
Option three: session binding
Nginx is a free, open-source, high-performance HTTP server and reverse proxy server.
What Nginx can do:
Reverse proxying, load balancing, HTTP serving (separating dynamic and static content), and forward proxying.
How to use Nginx for session binding
We use Nginx's reverse proxy and load balancing. Normally a client is assigned to one of the servers for processing, and which server it is depends on the load balancing algorithm (round robin, random, ip-hash, weight and so on). With session binding, each client is pinned to one server, for example with the ip-hash strategy, so that its session is always found on that server.
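The idea behind ip-hash binding can be sketched as follows. This is not Nginx's actual hash function; `SERVERS` and `pick_server` are hypothetical names, and CRC32 is used only because it is stable and in the standard library.

```python
import zlib

# Hypothetical backend servers behind the proxy.
SERVERS = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]


def pick_server(client_ip, servers=SERVERS):
    """Always map the same client IP to the same backend server."""
    return servers[zlib.crc32(client_ip.encode("ascii")) % len(servers)]


if __name__ == "__main__":
    for ip in ("192.168.1.7", "192.168.1.8", "192.168.1.7"):
        print(ip, "->", pick_server(ip))
```

A drawback of this simple modulo scheme is that adding or removing a server remaps many clients to a different backend, and those clients lose the session that was bound to their old server.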
Option four: storing sessions in Redis
Advantages:
- This is the approach most commonly used in enterprises
- Spring provides spring-session, so it is enough to add the dependency
- The data is stored in Redis, access is seamless, and there is no security risk
- Redis can be clustered and set up as master-slave, and is easy to manage
Disadvantages:
- One more network call: the web container has to access Redis
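The shape of a shared session store can be sketched like this. To keep the example runnable without a server, a plain dictionary stands in for Redis; in a real deployment the two methods would be backed by Redis commands with a key expiry, and with Spring the spring-session library handles this for you. `SessionStore`, its parameters and the 30-minute default are assumptions made for illustration.

```python
import secrets
import time


class SessionStore:
    """Shared session store. A dict stands in for Redis here, so every
    web server holding a reference sees the same sessions."""

    def __init__(self, ttl_seconds=1800, clock=time.time):
        self._data = {}          # session_id -> (expires_at, attributes)
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested

    def create(self, attributes):
        session_id = secrets.token_hex(16)   # unguessable id sent to the client
        self._data[session_id] = (self._clock() + self._ttl, dict(attributes))
        return session_id

    def get(self, session_id):
        entry = self._data.get(session_id)
        if entry is None:
            return None
        expires_at, attributes = entry
        if self._clock() >= expires_at:
            del self._data[session_id]       # expired: behave like a missing key
            return None
        # Sliding expiration: every access renews the session.
        self._data[session_id] = (self._clock() + self._ttl, attributes)
        return attributes
```

Only the session id travels in the client's cookie; whichever web server receives the request looks the attributes up in the shared store, which is the extra network call listed as the disadvantage above.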
Eighteen 、 Load balancing: algorithms, implementations, and solutions for hundred-million-scale load in detail
Load balancing (Load Balance) means distributing load (such as access requests from the front end) evenly, by means of a load balancing algorithm, across multiple operating units (servers, middleware) for execution. It is the ultimate solution for high performance, single points of failure (high availability) and scalability (horizontal scaling). In other words, load balancing is a technique used to achieve high availability and high concurrency.
The role of load balancing:
1、 Increase throughput and relieve concurrency pressure (high performance);
2、 Provide failover (high availability);
3、 Provide site scalability by adding or removing servers (scalability);
4、 Provide security protection (filtering on the load balancing device, blacklists and whitelists, and so on).
Hardware load balancing
Load balancing implemented with hardware such as F5, A10 and Citrix NetScaler.
Software load balancing
Load balancing implemented with software such as LVS, Nginx and HAProxy.
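To make the "weight" strategy concrete, here is a sketch of one common variant, smooth weighted round robin, in which heavier servers are chosen more often but their turns are interleaved rather than bunched together. The class name and the server weights are made up for this example, and no claim is made that any particular load balancer implements it exactly this way.

```python
class WeightedRoundRobin:
    """Smooth weighted round robin over named servers."""

    def __init__(self, weights):
        self._weights = dict(weights)                      # name -> static weight
        self._current = {name: 0 for name in self._weights}
        self._total = sum(self._weights.values())

    def next(self):
        # Every server gains its weight, the current leader is chosen,
        # and the leader then pays back the total weight.
        for name, weight in self._weights.items():
            self._current[name] += weight
        chosen = max(self._current, key=self._current.get)
        self._current[chosen] -= self._total
        return chosen


if __name__ == "__main__":
    balancer = WeightedRoundRobin({"a": 5, "b": 1, "c": 1})
    print([balancer.next() for _ in range(7)])
    # prints ['a', 'a', 'b', 'a', 'c', 'a', 'a']
```

Over any window of seven picks server `a` receives five requests and `b` and `c` one each, matching the 5:1:1 weights.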
Nineteen 、 Implementation principles of distributed consistency protocols
Classification of consistency
Strong consistency
Description: the cluster state is guaranteed to change immediately after a change is committed.
Models:
- Paxos
- Raft (multi-Paxos)
- ZAB (multi-Paxos)
Weak consistency
Description: also called eventual consistency. The system does not guarantee that the cluster state changes immediately after a change is committed, but the state becomes consistent over time.
Models:
- DNS system
- Gossip protocol
Each node in a Redis Cluster maintains its own view of the current state of the whole cluster, which mainly includes:
- The current cluster state
- The slots each node in the cluster is responsible for, and their migration state
- The master-slave state of each node in the cluster
- The liveness of each node in the cluster and the votes on unreachable nodes
Redis Cluster is decentralized, and the nodes synchronize state with each other through the gossip protocol. The cluster uses several types of messages:
- Meet: with the 「cluster meet ip port」 command, a node of the existing cluster sends an invitation to a new node to join the cluster.
- Ping: every second a node sends ping messages to other nodes in the cluster. The message contains the address, slots, state information and last communication time of two nodes that it knows.
- Pong: a node that receives a ping message replies with a pong message, which also contains information about two nodes that it knows.
- Fail: when a node cannot ping another node, it broadcasts to all nodes in the cluster that the node is down. The other nodes mark it offline after receiving the message.
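The way gossip spreads state until every node holds the same view can be simulated in a few lines. This is a generic push-pull gossip sketch, not the Redis Cluster bus protocol: the per-node `(version, status)` records, the function names and the fixed random seed are all assumptions made for illustration.

```python
import random


def merge(local, remote):
    """Merge a peer's view into ours, keeping the higher version per node."""
    for node, (version, status) in remote.items():
        if node not in local or local[node][0] < version:
            local[node] = (version, status)


def gossip_round(views, rng):
    """Each node exchanges views with one randomly chosen peer (push-pull)."""
    names = list(views)
    for name in names:
        peer = rng.choice([n for n in names if n != name])
        merge(views[peer], views[name])   # push our view to the peer
        merge(views[name], views[peer])   # pull the peer's view back


def converged(views):
    """True once every node holds exactly the same view."""
    first = next(iter(views.values()))
    return all(view == first for view in views.values())


def simulate(node_count=8, seed=0, max_rounds=1000):
    rng = random.Random(seed)
    # At the start each node only knows about itself.
    views = {f"node{i}": {f"node{i}": (1, "ok")} for i in range(node_count)}
    rounds = 0
    while not converged(views) and rounds < max_rounds:
        gossip_round(views, rng)
        rounds += 1
    return views, rounds


if __name__ == "__main__":
    final_views, rounds = simulate()
    print("converged after", rounds, "rounds")
```

No node ever talks to all the others in one step, yet the views still converge after a small number of rounds; this is the eventual consistency that the weak consistency model above describes.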