Database: Data Field Changes Under High Concurrency
2022-06-09 18:27:00 【Linux server development】
1 Background
This situation comes up often: the business has been running steadily for some time and traffic has gradually grown, and now, for some reason (a feature adjustment or business expansion, say), the data tables need to change: adding fields or modifying the table structure. Many people would say alter table add column ... / alter table modify ... and consider the problem solved. That is actually risky for a complex table holding a large volume of data. Adjusting the table structure, creating or dropping indexes, or adding triggers can lock the table, and how long the lock lasts depends on the actual state of the table. I learned this lesson the hard way: the data scale was poorly estimated at a product's first launch, and as a result business data could not be written for a long time. So what are the ways to upgrade the business tables seamlessly, making the change transparent to users? Let's discuss them one by one.
2 Add an Associated Table
The simplest approach: store the new fields in a separate secondary table, linked by a foreign key to the primary key of the main table, thereby achieving dynamic extension. After the new feature goes live, the new data is written to the secondary table and the main table needs no change at all: transparent and lossless.
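For illustration, here is a minimal sketch of such a secondary table, assuming a main table t_user(id, ...) like the one used in the next section (the table, column, and constraint names here are illustrative, not from the original):

-- An extension table keyed to the main table's primary key.
CREATE TABLE `t_user_ext` (
  `user_id` bigint(20) NOT NULL,           -- same value as t_user.id
  `tel` varchar(20) DEFAULT NULL,          -- an example of a newly required field
  PRIMARY KEY (`user_id`),
  CONSTRAINT `fk_user_ext` FOREIGN KEY (`user_id`) REFERENCES `t_user` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Reads then require a join:
SELECT u.id, u.`name`, e.tel
FROM `t_user` u LEFT JOIN `t_user_ext` e ON e.user_id = u.id;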

The problems:
Reads require a join, which is inefficient; the more data there is and the more complex it becomes, the more pronounced this drawback.
It does not solve the problem completely: when yet more new fields arrive, we again face the choice of adding another table or modifying the original one, and even appending the later fields to the secondary table still risks locking it.
The secondary table only addresses adding new fields; it does nothing for field updates (renaming a field, changing its data type, and so on).
3 Add a Generic Extension Column
Suppose our original table structure is as shown below. To keep the business sustainable, fields will need to be extended in the future, so it is worth adding a general-purpose field that can grow and shrink on demand.

Taking MySQL as an example: version 5.7 introduced the JSON field type, which makes it convenient to store complex JSON object data.
use test;
DROP TABLE IF EXISTS `t_user`;
CREATE TABLE `t_user` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `name` varchar(20) NOT NULL,
  `age` int(11) DEFAULT NULL,
  `address` varchar(255) DEFAULT NULL,
  `sex` int(11) DEFAULT '1',
  `ext_data` json DEFAULT NULL COMMENT 'json string',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8;

-- ----------------------------
-- Records of t_user
-- ----------------------------
INSERT INTO `t_user` VALUES ('1', 'brand', '21', 'fuzhou', '1', '{"tel": "13212345678", "name": "brand", "address": "fuzhou"}');

In this schema, ext_data uses the JSON data type as an extensible object carrier, storing supplementary information alongside the queried data. Along with the data type itself, MySQL also provides a very powerful set of JSON functions to operate on it.
SELECT id, `name`, age, address FROM `t_user` WHERE json_extract(ext_data, '$.tel') = '13212345678';

The result is as follows:

In an earlier MySQL series on this blog, a reader asked me to summarize the usage of MySQL JSON. I have not had time to do so; the documentation on the official website covers it quite clearly.
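For reference, a couple of other commonly used JSON operations, run against the t_user table above (illustrative queries, not from the original):

-- ->> extracts and unquotes a value, shorthand for JSON_UNQUOTE(JSON_EXTRACT(...)):
SELECT id, ext_data->>'$.tel' AS tel FROM `t_user`;

-- JSON_SET adds or replaces a single attribute in place:
UPDATE `t_user` SET ext_data = JSON_SET(ext_data, '$.age', 22) WHERE id = 1;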
JSON structures are generally backward compatible, so when designing field extensions the usual advice is to add attributes and avoid deleting old ones. This has its own problem: the more complex the business becomes, the more complex the JSON grows, and the more redundant attributes accumulate. For example, suppose our JSON has three attributes: tel, name, address. Later the business changes, tel turns out to be useless, and an age attribute is needed. Should tel be deleted? A better way is to add a version attribute to the table: each phase of the business corresponds to one version, and each version corresponds to its own JSON structure.
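A minimal sketch of the version idea (the column name and version values are illustrative; note that if the version column was not part of the original design, adding it later runs into the very ALTER problem this article is about):

-- Tag each row's ext_data with the JSON structure version it follows.
ALTER TABLE `t_user` ADD COLUMN `ext_version` int(11) NOT NULL DEFAULT 1;

-- version 1 rows: {"tel": ..., "name": ..., "address": ...}
-- version 2 rows: {"name": ..., "address": ..., "age": ...}
INSERT INTO `t_user` (`name`, age, address, sex, ext_data, ext_version)
VALUES ('june', 25, 'xiamen', 0,
        '{"name": "june", "address": "xiamen", "age": 25}', 2);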

Advantages:
Attributes can be extended dynamically at any time.
Old and new data can coexist.
Data migration is convenient: write a program that converts an old-version ext into the new-version ext and updates its version accordingly.
Drawbacks:
Fields inside ext_data cannot be indexed directly.
The keys inside ext_data take up a lot of space; keep them short.
Aggregating over a field buried in the JSON is troublesome and inefficient.
Queries are comparatively inefficient and the operations are complicated.
Updating a single field inside the JSON is inefficient, so it is not suitable for data with complex business logic.
Statistics become complicated; data that needs to be reported on should not be stored in the JSON.
Improvement:
If the attributes inside ext need to be indexed, a NoSQL store (such as MongoDB) is probably a better fit.
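As an aside (our note, not from the original): staying within MySQL, 5.7's generated columns can also give an indirect index over a single JSON attribute, at the cost of one extra column per indexed attribute:

-- A virtual column materializes the JSON attribute, and the index is built on it.
ALTER TABLE `t_user`
  ADD COLUMN `tel` varchar(20)
    GENERATED ALWAYS AS (json_unquote(json_extract(ext_data, '$.tel'))) VIRTUAL,
  ADD INDEX `idx_tel` (`tel`);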

4 New Table + Data Migration
4.1 Data Migration with Triggers

The whole process is as follows (a SQL sketch of these steps appears after the list):
1. Create a new table t_user_v1 (id, name, age, address, sex, ext_column) containing the extended field ext_column.
2. Add triggers to the existing table, so that DML on the original table (mainly INSERT, UPDATE, DELETE) carries the data over into the new table t_user_v1 as it happens.
3. Migrate the existing data in the old table step by step until it is complete.
4. Drop the triggers and remove the original table (by default, drop it).
5. Rename the new table t_user_v1 to the original name t_user.

Following these steps, the data is migrated gradually into the new table, which then replaces the old one. The whole operation needs no maintenance window and does no harm to the business.
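A minimal sketch of these steps, assuming the t_user schema from section 3 (the trigger names, batch range, and type of ext_column are illustrative; here the old table is also kept as t_user_old via an atomic rename rather than dropped outright):

-- Step 1: the new table, with the extended field ext_column.
CREATE TABLE `t_user_v1` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `name` varchar(20) NOT NULL,
  `age` int(11) DEFAULT NULL,
  `address` varchar(255) DEFAULT NULL,
  `sex` int(11) DEFAULT '1',
  `ext_column` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Step 2: mirror every DML change on the old table into the new one.
CREATE TRIGGER trg_user_ins AFTER INSERT ON t_user FOR EACH ROW
  REPLACE INTO t_user_v1 (id, `name`, age, address, sex)
  VALUES (NEW.id, NEW.name, NEW.age, NEW.address, NEW.sex);
CREATE TRIGGER trg_user_upd AFTER UPDATE ON t_user FOR EACH ROW
  REPLACE INTO t_user_v1 (id, `name`, age, address, sex)
  VALUES (NEW.id, NEW.name, NEW.age, NEW.address, NEW.sex);
CREATE TRIGGER trg_user_del AFTER DELETE ON t_user FOR EACH ROW
  DELETE FROM t_user_v1 WHERE id = OLD.id;

-- Step 3: backfill old rows in small batches, advancing the id range each run.
INSERT IGNORE INTO t_user_v1 (id, `name`, age, address, sex)
SELECT id, `name`, age, address, sex FROM t_user WHERE id BETWEEN 1 AND 10000;

-- Steps 4-5: once the backfill has caught up, drop the triggers and swap the tables.
DROP TRIGGER trg_user_ins;
DROP TRIGGER trg_user_upd;
DROP TRIGGER trg_user_del;
RENAME TABLE t_user TO t_user_old, t_user_v1 TO t_user;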
4.2 Using the Binlog for Data Migration
If the database is MySQL, you can subscribe to the binlog to drive the data migration instead. The effect is the same, and compared with triggers it is more stable (this triggerless, binlog-based approach is, for example, what GitHub's gh-ost uses).

4.3 Problems
The operation is cumbersome and inefficient.
There is an operational gap between migrating the data and switching over the tables; for tables under high concurrency and high-frequency writes this carries risk and can cause brief connection failures and data inconsistencies.
For large tables, synchronization takes a long time.
5 Reserved Fields
Reserve spare fields in the table up front, together with a mapping scheme that records what each reserved field in each table is actually used for.
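What this might look like (a sketch only; the spare-column names and the mapping table are illustrative, not from the original):

-- A table designed up front with spare columns of a few common types.
CREATE TABLE `t_user_r` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `name` varchar(20) NOT NULL,
  `reserved_str1` varchar(255) DEFAULT NULL,  -- unused until claimed
  `reserved_str2` varchar(255) DEFAULT NULL,
  `reserved_int1` int(11) DEFAULT NULL,
  `reserved_int2` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- A mapping table records the business meaning once a reserved column is claimed.
CREATE TABLE `t_field_mapping` (
  `table_name` varchar(64) NOT NULL,
  `column_name` varchar(64) NOT NULL,
  `meaning` varchar(255) NOT NULL,  -- e.g. 'age of the user'
  PRIMARY KEY (`table_name`, `column_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;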

5.1 Problems
As before, query efficiency is low.
The future is unknown by default: the preset fields may prove too few, or they may sit there as redundant space.
The redundant empty columns occupy storage and get in the way of performance improvements.
The approach is rather clumsy and does not fit the way programmers think.
6 Multi-Master Mode with Staged Updates
If business traffic is small, you can add or modify fields on the table directly; a short write lock is bearable. But in a high-concurrency, clustered, distributed system, the data layer should already be managed as master-slave replicas or sharded databases and tables. Below is the process of upgrading the table structure under a typical multi-master setup.

1. In the usual dual-master mode with master-master synchronization, data middleware such as DBproxy or Fabric can be used for load balancing, or you can define your own load policies, such as Range or Hash.
2. Change the configuration so that all traffic is routed to one instance, then upgrade the data tables on the other (for example, take DB1 out of rotation and serve from DB2 only; see the sketch after this list). Remember to do this during off-peak hours, so that heavy traffic does not overload the single remaining instance.
3. Repeat the operation the other way round, except that this time DB2 does not need to be upgraded by hand: with master-master synchronization, instance DB1 already carries the new table structure, and both the schema and the data changes will be propagated to DB2.
4. Once the two database instances are consistent, change the configuration again to spread the load back over both instances, returning to the previous state.
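A rough SQL-level sketch of steps 2 and 3 (the DDL and the check are illustrative; the traffic switch itself happens in the middleware configuration, which is not shown here):

-- With all traffic routed to DB2, run the DDL on DB1:
ALTER TABLE `t_user` ADD COLUMN `ext_column` varchar(255) DEFAULT NULL;

-- Before routing traffic back, confirm each instance has caught up with its peer:
SHOW SLAVE STATUS\G   -- Seconds_Behind_Master should be 0 on both sides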
Reference material
Original article: Database Series: Data Field Changes Under High Concurrency - Hello-Brand - 博客园 (cnblogs)