What is Mycat? A quick introduction
2022-06-11 00:44:00 【A music loving programmer】
Table of contents
- Preface
- One. What is Mycat?
- Two. Understanding Mycat from different roles
- Three. How Mycat works
- Four. Core concepts of Mycat
Preface
This article only introduces Mycat and gives an understanding of what it is. It contains no hands-on operations.
One. What is Mycat?
1. What is Mycat? By definition and classification, it is an open-source distributed database system. It is a server that implements the MySQL protocol, so front-end users can treat it as a database proxy and access it with MySQL client tools and the command line. On the back end it can talk to multiple MySQL servers over the native MySQL protocol, or to most mainstream database servers over JDBC. Its core function is sharding databases and tables: a large table is split horizontally into N smaller tables, which are stored on back-end MySQL servers or other databases.
2. As of the current version, Mycat is no longer just a MySQL proxy. Its back end supports MySQL, SQL Server, Oracle, DB2, PostgreSQL, and other mainstream databases, as well as MongoDB, a newer NoSQL store, and more storage types will be supported in the future. From the end user's point of view, whatever the storage method, inside Mycat it is a traditional database table that can be manipulated with standard SQL statements. For front-end business systems this greatly reduces development difficulty and speeds up development. During the testing phase, a table can be defined on any storage method Mycat supports, such as a MySQL MyISAM table, a memory table, MongoDB, LevelDB, or MemSQL, which is billed as the fastest in-memory database in the world. Just imagine: the user table lives in MemSQL; large volumes of data that are read far more often than written, such as order snapshots, live in InnoDB; some log data lives in MongoDB; and an Oracle table can even be joined with a MySQL table. Breathtaking, isn't it? In the future, Mycat will also be able to load computed and analyzed data into Hadoop automatically, and Mycat + Storm/Spark Streaming engines can be used for large-scale data analysis. Having read this far, you probably understand what Mycat is: Mycat is BigSQL, Big Data On SQL Database.
3. Many readers may still be confused after reading the description above and wonder what Mycat really is. The next section explains it from the point of view of different roles.
Two. Understanding Mycat from different roles
1. For a DBA, Mycat can be understood like this:
Mycat is the MySQL server, and the MySQL servers connected behind Mycat are like MySQL storage engines such as InnoDB and MyISAM. Mycat therefore does not store data itself. Data is stored on the back-end MySQL servers, so data reliability and transactions are guaranteed by MySQL. In short, Mycat is MySQL's best companion, giving MySQL the ability to compete with Oracle.
2. For a software engineer, Mycat can be understood like this:
Mycat is a database server that closely resembles MySQL. You can connect to Mycat the same way you connect to MySQL, except for the port: Mycat's default port is 8066 rather than MySQL's 3306, so you need to add the port to the connection string. In most cases you can keep using the object mapping framework you are familiar with on top of Mycat, but for sharded tables it is recommended to use basic SQL statements where possible, because this gives the best performance, especially with tens of millions or even tens of billions of records.
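The port difference can be illustrated with a small sketch. The host, the physical database name db1, the logical schema name TESTDB, and the helper to_mycat are all hypothetical; only the two default ports come from the text above, and the URL follows the usual MySQL JDBC shape.

```python
from dataclasses import dataclass

MYSQL_DEFAULT_PORT = 3306
MYCAT_DEFAULT_PORT = 8066  # default Mycat port mentioned in the text


@dataclass(frozen=True)
class Endpoint:
    host: str
    port: int
    schema: str

    def jdbc_url(self) -> str:
        # Usual MySQL JDBC URL shape, with the port always spelled out.
        return f"jdbc:mysql://{self.host}:{self.port}/{self.schema}"


def to_mycat(endpoint: Endpoint, logical_schema: str) -> Endpoint:
    """Hypothetical helper: point an existing MySQL endpoint at Mycat."""
    return Endpoint(endpoint.host, MYCAT_DEFAULT_PORT, logical_schema)


mysql = Endpoint("127.0.0.1", MYSQL_DEFAULT_PORT, "db1")
mycat = to_mycat(mysql, "TESTDB")
print(mysql.jdbc_url())
print(mycat.jdbc_url())
```

Apart from the port and the schema name, the connection string keeps the same shape, which is why existing MySQL clients and drivers can usually be reused.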
3. For an architect, Mycat can be understood like this:
Mycat is a powerful database middleware. It can be used not only for read/write splitting, database and table sharding, and disaster-recovery backup, but also for multi-tenant application development and cloud platform infrastructure, giving your architecture strong adaptability and flexibility. With the forthcoming Mycat intelligent optimization module, the system's data access bottlenecks and hotspots become clear at a glance. Based on these statistics you can adjust the back-end storage automatically or manually, mapping different tables to different storage engines, without changing a single line of application code.
Three. How Mycat works
The principle behind Mycat is not complicated. The complexity is in the code; if the code were not complicated, it would have become a legend long ago.
The most important action in Mycat's principle is "interception". Mycat intercepts the SQL statement sent by the user and first analyzes it: sharding analysis, routing analysis, read/write splitting analysis, cache analysis, and so on. It then sends the SQL to the real back-end database, processes the returned results appropriately, and finally returns them to the user.
In the figure above, the orders table is split into three shard nodes (dataNode, abbreviated dn). These three shards are distributed across two MySQL servers (dataHost), that is, the [email protected] form, so you can use anywhere from one to N servers for sharding. The sharding rule is a typical string-enumeration rule. A rule is defined as a sharding column plus a rule function; here the sharding column is prov and the rule function is string enumeration.
When Mycat receives a SQL statement, it first parses the statement to find the tables involved and then looks up each table's definition. If the table has a sharding rule, Mycat extracts the value of the sharding column from the SQL, applies the rule function to it, and obtains the list of shards the statement maps to. It then sends the SQL to those shards for execution, and finally collects and processes the result data returned by all shards and outputs it to the client. Take select * from orders where prov = ? as an example: with prov=wuhan, the rule function returns dn1 for wuhan, so the SQL is sent to mysql1, the query runs against db1, and the result is returned to the user.
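The routing step described above can be sketched in a few lines of Python. This is a simplified illustration, not Mycat's actual code: the regular expressions stand in for a real SQL parser, and the node layout and the province shanghai are assumptions chosen to match the wuhan and beijing example in the text.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical topology: three data nodes on two MySQL hosts.
DATA_NODES: Dict[str, Tuple[str, str]] = {
    "dn1": ("mysql1", "db1"),
    "dn2": ("mysql1", "db2"),
    "dn3": ("mysql2", "db3"),
}
# String-enumeration rule: every prov value is listed with its data node.
PROV_TO_NODE = {"wuhan": "dn1", "shanghai": "dn2", "beijing": "dn3"}


def extract_prov_values(sql: str) -> List[str]:
    """Tiny stand-in for SQL parsing: find the values compared with prov."""
    eq = re.search(r"\bprov\s*=\s*'([^']*)'", sql, re.IGNORECASE)
    if eq:
        return [eq.group(1)]
    in_list = re.search(r"\bprov\s+in\s*\(([^)]*)\)", sql, re.IGNORECASE)
    if in_list:
        return [v.strip().strip("'") for v in in_list.group(1).split(",")]
    return []


def route(sql: str) -> List[str]:
    """Return the data nodes that have to execute the statement."""
    values = extract_prov_values(sql)
    if not values:
        return sorted(DATA_NODES)  # no sharding column: ask every node
    return sorted({PROV_TO_NODE[v] for v in values})


print(route("select * from orders where prov = 'wuhan'"))
print(route("select * from orders where prov in ('wuhan', 'beijing')"))
```

A statement with no condition on the sharding column falls back to every node, which is why queries that omit the sharding column tend to be the expensive ones.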
If the SQL above is changed to select * from orders where prov in (wuhan,beijing), it is sent to both MySQL1 and MySQL2 for execution, and the result sets are merged before being output to the user. In practice, however, business SQL usually contains order by and limit pagination syntax, which requires secondary processing of the result set on the Mycat side, and that part of the code is more complex. The most complex case is a join between two tables. For this, Mycat offers a whole arsenal of techniques: innovative ER sharding, global tables, the HBT (Human Brain Tech) Catlet, and integration with Storm/Spark engines, among others. This is why it has been called the most powerful solution in the industry. This is the power of open source.
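To make the "secondary processing" for order by and limit concrete, here is a minimal sketch of the general scatter-gather merge technique. It is not taken from Mycat's source. It assumes each shard has already returned its rows sorted by the same key and has returned at least the first offset + limit rows; the function name and sample rows are made up.

```python
import heapq
from itertools import islice
from typing import Callable, Iterable, List


def merge_order_by_limit(
    shard_results: Iterable[List[dict]],
    key: Callable[[dict], object],
    offset: int,
    limit: int,
) -> List[dict]:
    """Merge per-shard sorted rows, then apply the global offset and limit."""
    merged = heapq.merge(*shard_results, key=key)  # lazy k-way merge
    return list(islice(merged, offset, offset + limit))


# Each list is what one back-end server returned for
# "select * from orders order by amount".
mysql1_rows = [{"id": 1, "amount": 10}, {"id": 4, "amount": 40}]
mysql2_rows = [{"id": 2, "amount": 20}, {"id": 3, "amount": 30}]

page = merge_order_by_limit(
    [mysql1_rows, mysql2_rows], key=lambda row: row["amount"], offset=1, limit=2
)
print([row["id"] for row in page])
```

The offset cannot be pushed down to the shards as is, because the middleware does not know in advance how many of the skipped rows live on each shard. That is one reason deep pagination over sharded tables is costly.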
Application scenarios
Mycat is now used in a wide range of scenarios, and new users keep coming up with new and innovative solutions. The following are some typical application scenarios:
1. Simple read/write splitting. This is the simplest configuration, with support for read/write splitting and master-slave switching.
2. Database and table sharding. Tables with more than 10 million rows are sharded, with support for sharding a single table of up to 100 billion rows.
3. Multi-tenant applications. Each application has its own database, but the application only connects to Mycat, so multi-tenancy is achieved without changing the program itself.
4. Reporting systems. Mycat's table sharding capability is used to handle large-scale report statistics.
5. Integrating multiple data sources.
6. A simple and effective way to query massive data in real time. For example, 100 million frequently queried records need to return results within 3 seconds, and besides primary-key queries there may also be range queries or queries on other attributes. In that case Mycat is probably the simplest and most effective option.
7. Database router. Mycat reuses connection pools per MySQL instance, so each application can share all the connection pools of a MySQL instance to the greatest extent, which greatly improves the database's concurrent access capability.
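The first scenario, read/write splitting with master-slave switching, can be sketched as follows. This is an illustrative simplification under stated assumptions, not Mycat's implementation: the class ReadWriteRouter and the host names are hypothetical, only statements starting with select count as reads, and cases such as select ... for update or reads inside a transaction, which a real router would send to the master, are ignored.

```python
import itertools
from typing import List


class ReadWriteRouter:
    """Send reads to slaves in round-robin order and writes to the master."""

    def __init__(self, master: str, slaves: List[str]) -> None:
        self.master = master
        self.slaves = list(slaves)
        self._next_slave = itertools.cycle(self.slaves)

    def pick(self, sql: str) -> str:
        words = sql.split()
        is_read = bool(words) and words[0].lower() == "select"
        if is_read and self.slaves:
            return next(self._next_slave)
        return self.master

    def promote(self, slave: str) -> None:
        """Master-slave switch: the given slave becomes the write host."""
        self.slaves.remove(slave)
        self.slaves.append(self.master)
        self.master = slave
        self._next_slave = itertools.cycle(self.slaves)


router = ReadWriteRouter("hostM1", ["hostS1", "hostS2"])
for statement in ("select * from orders", "insert into orders values (1)", "SELECT 1"):
    print(statement, "->", router.pick(statement))
```

Because the application only sees the router, a switch performed with promote changes where writes go without any change on the application side.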
Why use Mycat
1. Java code is tightly coupled to the database
2. High traffic and high concurrency put pressure on the database
3. Read and write requests see inconsistent data
Database middleware comparison

Four. Core concepts of Mycat
Mycat is database middleware: an intermediate service that sits between the database and the application and handles data processing and interaction. The original database is split into multiple shard databases, and all the shard databases together make up the complete data store.
As shown in the figure above, once the data is split across multiple shard databases, an application that needs to read it has to work with several data sources. Without database middleware, the application faces the sharded cluster directly and has to handle data source switching, transaction processing, and data aggregation itself. An application that is supposed to focus on business logic ends up spending a lot of effort on the consequences of sharding, and worst of all, every application has to reinvent the same wheel.
1. Logical database
An application does not actually need to know that middleware exists. Developers only need to know the concept of a database, so database middleware can be regarded as a logical database made up of one or more database clusters.
In the era of cloud computing, database middleware can serve one or more applications in a multi-tenant fashion, and each application may access an independent or a shared physical database. A common example is the Alibaba Cloud database service RDS.
2. Logical table
Since there is a logical database, there is also a logical table. In a distributed database, the tables an application reads and writes are logical tables. A logical table can be split, with its data distributed across one or more shard databases, or it can be left unsplit, in which case it consists of a single table.
3. Sharded table
A sharded table is an originally large table whose data needs to be split across multiple databases. Each shard holds part of the data, and all the shards together make up the complete data.
4. Non-sharded table
Not every table in a database is large, and some tables do not need to be split. A non-sharded table is the counterpart of a sharded table: a table whose data does not need to be split.
5. ER table
Relational databases are based on the entity-relationship model, which describes things and their relationships in the real world. The ER table in Mycat comes from this idea. Following it, Mycat offers a sharding strategy based on ER relationships: the records of a child table are stored in the same data shard as the associated parent record, that is, the child table depends on the parent table. Grouping tables this way guarantees that joins never cross databases.
Table grouping is a very good way to solve the problem of joins across shards, and it is also an important rule in planning data sharding.
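A minimal sketch of the idea follows, under stated assumptions: the customers and orders tables, the modulo rule, and all function names are made up for illustration and are not Mycat's configuration or code. The point is only that the child row is placed by the parent's key, so every join can run inside a single node.

```python
from typing import Dict, List, Tuple

NODES = ["dn1", "dn2", "dn3"]


def node_for_customer(customer_id: int) -> str:
    """Parent table rule (assumed here): modulo on the customer id."""
    return NODES[customer_id % len(NODES)]


def node_for_order(order: dict) -> str:
    """Child rows follow the parent: shard by customer_id, not by order id."""
    return node_for_customer(order["customer_id"])


customers = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}, {"id": 3, "name": "carol"}]
orders = [
    {"id": 101, "customer_id": 1},
    {"id": 102, "customer_id": 3},
    {"id": 103, "customer_id": 1},
    {"id": 104, "customer_id": 2},
]

shards: Dict[str, Dict[str, List[dict]]] = {n: {"customers": [], "orders": []} for n in NODES}
for c in customers:
    shards[node_for_customer(c["id"])]["customers"].append(c)
for o in orders:
    shards[node_for_order(o)]["orders"].append(o)


def local_join(shard: Dict[str, List[dict]]) -> List[Tuple[str, int]]:
    """Join customers and orders using only the rows stored on one node."""
    return [
        (c["name"], o["id"])
        for c in shard["customers"]
        for o in shard["orders"]
        if o["customer_id"] == c["id"]
    ]


joined = sorted(pair for shard in shards.values() for pair in local_join(shard))
print(joined)
```

Every order finds its customer locally, so the union of the per-node joins equals the full join and no row has to be shipped between nodes.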
6. Global table
A real business system usually contains a large number of dictionary-like tables that rarely change. A dictionary table has the following characteristics:
1. Changes are infrequent
2. The total amount of data changes little
3. The data volume is small, rarely exceeding a few hundred thousand records
When business tables are sharded because of their size, joining them with these attached dictionary tables becomes a tricky problem. Mycat solves joins for this kind of table through data redundancy: every shard holds a full copy of the data. All dictionary tables, and tables that share the characteristics of dictionary tables, are defined as global tables.
Data redundancy is a good way to solve the problem of joins across shards, and it is another important principle in planning data sharding.
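The redundancy idea can be tried out with two in-memory SQLite databases standing in for two shard nodes. This is only an illustration of the principle: the table names, the helper write_global, and the use of SQLite instead of MySQL are assumptions.

```python
import sqlite3

# Two in-memory databases play the role of two shard nodes.
nodes = {name: sqlite3.connect(":memory:") for name in ("dn1", "dn2")}
for conn in nodes.values():
    conn.execute("CREATE TABLE province (code TEXT PRIMARY KEY, label TEXT)")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, prov TEXT)")


def write_global(sql: str, params: tuple = ()) -> None:
    """A write to a global table is applied to every node."""
    for conn in nodes.values():
        conn.execute(sql, params)


# The dictionary table is copied to all nodes.
write_global("INSERT INTO province VALUES (?, ?)", ("wuhan", "Wuhan"))
write_global("INSERT INTO province VALUES (?, ?)", ("beijing", "Beijing"))

# The business table is sharded: each row lives on exactly one node.
nodes["dn1"].execute("INSERT INTO orders VALUES (1, 'wuhan')")
nodes["dn2"].execute("INSERT INTO orders VALUES (2, 'beijing')")

JOIN_SQL = "SELECT o.id, p.label FROM orders o JOIN province p ON p.code = o.prov"
rows = sorted(row for conn in nodes.values() for row in conn.execute(JOIN_SQL))
print(rows)
```

The join runs unchanged on each node because the dictionary rows it needs are always local. The price is that every write to the global table has to reach all nodes, which is acceptable only because such tables change rarely.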
7. Shard node (dataNode)
After data sharding, a large table is spread over different shard databases. The database that holds each table shard is a shard node (dataNode).
8. Node host (dataHost)
After data sharding, each shard node (dataNode) does not necessarily occupy a machine of its own, and several shard databases can live on the same machine. The machine that hosts one or more shard nodes (dataNode) is the node host (dataHost). To avoid being limited by the concurrency of a single node host, shard nodes (dataNode) with high read/write pressure should be spread evenly across different node hosts (dataHost).
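The advice to spread busy dataNodes over different dataHosts can be sketched as a simple greedy placement. This is a planning aid written for illustration, not something Mycat does automatically as far as the text says; the load figures, node names, and the function balance are hypothetical.

```python
from typing import Dict, List


def balance(node_load: Dict[str, int], hosts: List[str]) -> Dict[str, str]:
    """Greedy placement: the busiest data nodes are placed first, each on the
    host that currently carries the least total load."""
    totals = {host: 0 for host in hosts}
    placement: Dict[str, str] = {}
    for node, load in sorted(node_load.items(), key=lambda kv: (-kv[1], kv[0])):
        host = min(hosts, key=lambda h: (totals[h], h))
        placement[node] = host
        totals[host] += load
    return placement


# Estimated read/write pressure per data node (made-up numbers).
node_load = {"dn1": 900, "dn2": 800, "dn3": 100, "dn4": 50}
placement = balance(node_load, ["host1", "host2"])
print(placement)
```

With these numbers the two busiest nodes, dn1 and dn2, end up on different hosts, which is the outcome the paragraph above recommends.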
9. Sharding rule
Data sharding means splitting a large table into several shard tables, and this requires rules. The rule that assigns data to a particular shard is the sharding rule. Choosing an appropriate sharding rule is very important, because it greatly reduces the difficulty of later data processing.
10. Global sequence number
After data sharding, the primary key constraint of the original relational database can no longer be relied on in a distributed setting, so an external mechanism is needed to guarantee that data is unique. The mechanism that guarantees globally unique identifiers is the global sequence number.
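One common way to build such a mechanism is segment allocation: each process reserves a block of numbers from a central counter and hands them out locally. The sketch below illustrates that general technique under stated assumptions and is not a description of Mycat's own sequence implementations. SequenceStore stands in for the central counter, for example a row in a shared table, and both class names are hypothetical.

```python
import threading
from typing import Tuple


class SequenceStore:
    """Stand-in for a central counter shared by all processes."""

    def __init__(self) -> None:
        self._next = 1
        self._lock = threading.Lock()

    def reserve(self, size: int) -> Tuple[int, int]:
        """Atomically reserve the half-open range [start, start + size)."""
        with self._lock:
            start = self._next
            self._next += size
            return start, start + size


class SegmentSequence:
    """Hands out ids from a locally cached segment, refilling when empty."""

    def __init__(self, store: SequenceStore, step: int = 100) -> None:
        self._store = store
        self._step = step
        self._current = 0
        self._end = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            if self._current >= self._end:
                self._current, self._end = self._store.reserve(self._step)
            value = self._current
            self._current += 1
            return value


store = SequenceStore()
node_a = SegmentSequence(store, step=3)
node_b = SegmentSequence(store, step=3)
ids = [seq.next_id() for _ in range(5) for seq in (node_a, node_b)]
print(ids)
```

The ids are unique across both generators but not strictly ordered by time, and numbers left in a segment are lost if a process restarts. Those are the usual trade-offs of this approach.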
11. Multi-tenancy
Multi-tenancy, also called multi-tenant technology, is a software architecture technique. It deals with how to share the same system or program components in a multi-user environment while still guaranteeing data isolation between users. In the era of cloud computing, multi-tenant technology uses a single system architecture and service in a shared data center to provide the same, or even customized, services to the majority of clients, while still guaranteeing the isolation of customer data. Cloud computing services of all kinds currently fall into this category, for example the Alibaba Cloud database service (RDS) and Alibaba Cloud servers.
There are three main solutions for multi-tenant data storage:
1. Independent database
One database per tenant. This solution has the highest level of user data isolation and the best security, but also a high cost.
Advantages: providing an independent database for each tenant simplifies extending the data model to meet the unique needs of different tenants, and data is easy to recover if a failure occurs.
Disadvantages: more database installations, which increases maintenance and purchase costs.
2. Shared database, isolated schema
Multiple or all tenants share a database, but each tenant has its own schema.
Advantages: provides a degree of logical data isolation for tenants with higher security requirements, although the isolation is not complete; each database can support more tenants.
Disadvantages: data recovery after a failure is difficult, because restoring the database affects other tenants' data; cross-tenant statistics are also somewhat difficult.
3. Shared database, shared schema
Tenants share the same database and the same schema, and tenant data is distinguished by a tenantID column in each table. This is the mode with the highest level of sharing and the lowest level of isolation.
Advantages: lowest maintenance and purchase costs; each database supports the largest number of tenants.
Disadvantages: lowest isolation level and lowest security, so more security work is needed during design and development; data backup and recovery are the most difficult, because records have to be backed up and restored one by one.
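The third option can be illustrated with a small SQLite example. It is a sketch under stated assumptions: the invoice table, the tenant names, and the helper invoices_for are made up, and the column is spelled tenant_id here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice ("
    " id INTEGER PRIMARY KEY,"
    " tenant_id TEXT NOT NULL,"  # every row carries its tenant
    " amount REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO invoice (tenant_id, amount) VALUES (?, ?)",
    [("acme", 10.0), ("acme", 5.0), ("globex", 7.5)],
)


def invoices_for(tenant_id: str) -> list:
    """All data access goes through a tenant filter; forgetting it leaks data."""
    cursor = conn.execute(
        "SELECT id, amount FROM invoice WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    )
    return cursor.fetchall()


print(invoices_for("acme"))
print(invoices_for("globex"))
```

Isolation here depends entirely on every query carrying the tenant filter, which is why this mode needs the extra security work mentioned above.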