No matter how good your kung fu is, you still fear the kitchen knife: the pros and cons of partitioning, database splitting, table sharding, and distributed databases
2022-08-02 22:32:00 【Ink Sky Wheel】
Anyone who has worked with databases will be familiar with the terms above. Essentially all mainstream databases support partitioning, which is a way of managing a table that is transparent to the application. There are plenty of posts online criticizing MySQL partitioning, but to this day I have not run into any problems with MySQL partitions, nor with PostgreSQL or Oracle partitions. My guess is that those problems come from improper usage. I once built a public security system in which single tables held billions to tens of billions of rows; I used secondary partitions (subpartitions) there as well, and the results were very good.
Partitions are generally defined by time, most often by month. For example, with one partition per month and 5 years of data, there are 60 partitions. When you want to archive the earliest month, you can use techniques such as partition exchange to move it out in seconds, then migrate or compress that data, which archives it and slims down the online database. What if there are no partitions? Then it is painful: the data can only be exported logically by condition, which is inefficient. After exporting, you delete the exported rows, but DELETE does not free the space, so the table has to be defragmented. Fortunately, some databases can do this without affecting the business. I previously had an Oracle database of more than 20 TB with 880 GB of fragmentation, on a RAID 10 array of 15k RPM disks. The defragmentation was done online and ran for 55 hours without affecting the business. A colleague with a poorer array suffered through 550 hours (almost a month). You can see how important partitioning is. Many tables are designed without planning, so no partitions are created, and the non-partitioned table later has to be converted into a partitioned one. I described how to do the conversion in an earlier post on my public account; if you are interested, scroll back a few days. But if the time column is not stored as a date type, this does not work. That shows how important design is: if the design is bad, every optimization is powerless.
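The monthly layout described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not database DDL: the function name `month_partitions`, the `pYYYYMM` naming scheme, and the 2018 start date are assumptions made for the example.

```python
from datetime import date

def month_partitions(start, months):
    """Return (name, lower_inclusive, upper_exclusive) for consecutive monthly partitions.

    The naming scheme pYYYYMM is hypothetical, chosen only for this sketch.
    """
    parts = []
    year, month = start.year, start.month
    for _ in range(months):
        lower = date(year, month, 1)
        # Advance to the first day of the following month.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
        upper = date(year, month, 1)
        parts.append((f"p{lower:%Y%m}", lower, upper))
    return parts

# Five years of data, one partition per month (start date is an assumed example).
partitions = month_partitions(date(2018, 1, 1), 60)

# Archiving the earliest month only needs to address one whole partition,
# instead of selecting and deleting individual rows by condition.
oldest_name, oldest_from, oldest_to = partitions[0]
print(len(partitions), oldest_name, oldest_from, oldest_to)
```

The point of the sketch is that the unit of archiving is a whole partition with known bounds, which is what makes the operation cheap compared with a conditional export followed by DELETE.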
Partitioning has its pitfalls. For example, if the data reaches the last defined month and you forgot to add a new partition, inserts fail with an error. Oracle solves this with automatic (interval) partitioning, so the problem never occurs; this feature really does prevent a lot of failures. However, when you drop a partition you need to include UPDATE GLOBAL INDEXES, otherwise the global indexes become unusable. I have always wanted to suggest to Oracle: could this be applied implicitly whenever a partition is dropped? That would really prevent failures.
Another weakness of partitioning: suppose I split the table into 60 monthly partitions, but the query does not filter on time, for example a LIKE with % on both sides, or no condition at all. Then it still scans all 60 months. If the table is 600 GB without partitioning, it is still 600 GB in total after partitioning, and a SQL statement that scans 600 GB is slow either way. The speed benefit of partitioning is that a query confined to one partition is somewhat faster (though I could not perceive the difference in my own measurements). In other words, you should restrict the time range so that the query stays within one partition and does not cross partitions. In practice this is about the same as fetching one month of data from a non-partitioned table. So I think the biggest contribution of partitioning is still life cycle management.
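A minimal Python sketch of the pruning argument above, assuming 60 monthly partitions of equal size (so 600 GB / 60 = 10 GB each). The helper names `monthly_bounds` and `partitions_to_scan` and the 2018 start year are hypothetical; a real optimizer is far more involved.

```python
from datetime import date

def monthly_bounds(first_year, years):
    """Half-open [lower, upper) date ranges, one per month."""
    firsts = [date(first_year + m // 12, m % 12 + 1, 1) for m in range(years * 12 + 1)]
    return list(zip(firsts[:-1], firsts[1:]))

def partitions_to_scan(bounds, start=None, end=None):
    """Indexes of partitions a query must read for the time range [start, end)."""
    if start is None or end is None:
        # No usable time predicate (e.g. only LIKE '%x%'): nothing can be pruned.
        return list(range(len(bounds)))
    return [i for i, (lo, hi) in enumerate(bounds) if lo < end and start < hi]

GB_PER_PARTITION = 600 / 60  # assumes evenly sized partitions

bounds = monthly_bounds(2018, 5)
one_month = partitions_to_scan(bounds, date(2020, 3, 1), date(2020, 4, 1))
no_filter = partitions_to_scan(bounds)

print(len(one_month) * GB_PER_PARTITION, "GB scanned with a one-month filter")
print(len(no_filter) * GB_PER_PARTITION, "GB scanned without a time filter")
```

With a one-month time range only a single partition is read; without a time predicate all 60 are read, which is the same 600 GB as the unpartitioned table.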
Database splitting (sub-databases) usually comes along with microservices and middle-platform architectures. A system that ran perfectly well as a single unit now starts getting taken apart. I read in Lao Bai's article on database selection for microservices that it is a moment of fun for development and two lines of tears for operations. My own experience is two lines of tears for development and two lines of tears for operations. Once the databases are split, data exchange between them happens either through interfaces or through synchronization. As for interfaces, the ones I have dealt with are not efficient, and they are necessarily slower than not splitting at all. As for synchronization, no CDC tool is cheap to run, and I have never seen one that was free of problems; all one can say is that some tools are well made and have fewer problems, and in fact most of the problems come from improper use. After the data is split along one dimension, it can no longer be queried efficiently along another dimension. It is the same as with partitions: a lookup on another dimension has to search all nodes. Like partitioning, it is all or nothing, and on top of that the network makes it slower. Although the three mainstream databases MySQL, PG, and Oracle have their factional disputes, the consensus on database splitting is the same: don't split if you can avoid it.
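The "split along one dimension, query along another" problem can be illustrated with a toy in-memory model. Everything here is an assumption for the sake of the example: the `ShardedOrders` class, routing by `customer_id` modulo the node count, and the sample rows. It only counts how many nodes a query has to touch.

```python
class ShardedOrders:
    """Toy model: rows are (customer_id, product), routed by customer_id."""

    def __init__(self, n_nodes):
        self.nodes = [[] for _ in range(n_nodes)]

    def _node_for(self, customer_id):
        # Hypothetical routing rule: modulo on the split dimension.
        return customer_id % len(self.nodes)

    def insert(self, customer_id, product):
        self.nodes[self._node_for(customer_id)].append((customer_id, product))

    def by_customer(self, customer_id):
        """Query on the split dimension: exactly one node is consulted."""
        node = self.nodes[self._node_for(customer_id)]
        return [r for r in node if r[0] == customer_id], 1

    def by_product(self, product):
        """Query on another dimension: every node must be consulted."""
        rows = [r for node in self.nodes for r in node if r[1] == product]
        return rows, len(self.nodes)

db = ShardedOrders(4)
for cid, prod in [(1, "a"), (2, "b"), (5, "a"), (7, "c")]:
    db.insert(cid, prod)

rows_c, touched_c = db.by_customer(5)
rows_p, touched_p = db.by_product("a")
print(touched_c, "node touched by customer;", touched_p, "nodes touched by product")
```

In a real deployment each of those extra node visits is also a network round trip, which is the "plus the network is slower" part of the argument.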
Table sharding (sub-tables) necessarily involves database splitting, but database splitting does not necessarily involve table sharding. With sharding, a table is divided into N parts and distributed across N nodes, each storing one N-th of the data. Don't believe in horizontal scaling; at least I don't. Say 10 million rows are stored on 100 nodes. Now the data has grown to 20 million rows. What do I do, add another 100 nodes? Then the data has to be redistributed across 200 nodes. If you have run a Redis cluster, you know how badly things stall during rebalancing. So rebalancing sharded tables means stopping the service. And if a query has % on both sides of the pattern, the effect is the same as before: every node scans its whole table. (Of course, since the scans run in parallel, this can be up to N times faster.)
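To get a feel for why rebalancing is so disruptive, here is a small Python sketch that counts how many rows change nodes when the node count grows. It assumes the simplest placement rule, `key % node_count`, which the article does not specify, and it uses 20,000 integer keys instead of 20 million so that it runs instantly.

```python
def moved_fraction(n_keys, old_nodes, new_nodes):
    """Fraction of keys whose node changes under modulo placement (an assumed rule)."""
    moved = sum(1 for k in range(n_keys) if k % old_nodes != k % new_nodes)
    return moved / n_keys

# Scaled-down version of the example: doubling from 100 to 200 nodes.
doubling = moved_fraction(20_000, 100, 200)

# Adding a single node is even worse under this rule.
plus_one = moved_fraction(20_000, 100, 101)

print(f"100 -> 200 nodes: {doubling:.0%} of rows move")
print(f"100 -> 101 nodes: {plus_one:.0%} of rows move")
```

Under this placement rule, half of all rows have to be copied to a different node when the cluster doubles, and nearly all of them when a single node is added, while the system is still expected to serve traffic.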
As for distributed databases: I consider MySQL's MGR to be distributed, and Oracle RAC as well. TiDB, OceanBase, PolarDB, and TDSQL are all built on Paxos or Raft. When the workload exceeds the capacity of a single machine, consider a distributed database. So once it is distributed, can you do whatever you like, for example LIKE with % on both sides? The answer is no. That will bring a distributed database down just the same.
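Why a pattern with % on both sides hurts regardless of the architecture can be sketched with a sorted list standing in for an index. This is a simplified model, not how any particular database implements LIKE; the function names and the `userNNNN` sample data are made up for the illustration.

```python
from bisect import bisect_left

def prefix_search(sorted_names, prefix):
    """Model of LIKE 'prefix%': binary-search to the first candidate, then read neighbours."""
    i = bisect_left(sorted_names, prefix)
    matches, examined = [], 0
    while i < len(sorted_names):
        examined += 1
        if not sorted_names[i].startswith(prefix):
            break  # sorted order guarantees no further matches
        matches.append(sorted_names[i])
        i += 1
    return matches, examined

def infix_search(sorted_names, fragment):
    """Model of LIKE '%fragment%': sort order does not help, every entry is checked."""
    matches = [n for n in sorted_names if fragment in n]
    return matches, len(sorted_names)

names = sorted(f"user{i:04d}" for i in range(1000))

prefix_hits, prefix_examined = prefix_search(names, "user012")
infix_hits, infix_examined = infix_search(names, "012")
print(prefix_examined, "entries examined for the prefix pattern")
print(infix_examined, "entries examined for the %...% pattern")
```

The prefix search reads only the handful of entries around the match, while the infix search reads everything. Spreading the data over partitions, shards, or distributed nodes does not change that; it only spreads the full scan around.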
Summary: partitioning, database splitting, table sharding, and distributed databases all fear development that ignores the rules, such as full table scans. No matter how good the kung fu, it still fears the kitchen knife. Partitioning and distributed databases are worth adopting. Everyone who does distributed well was already good on a single machine, such as Alibaba and Tencent. I don't believe that someone who cannot handle a single machine can handle distributed well. It is like a man who already cuts himself with one knife: give him two, and he will be bleeding all over the place before he has slashed anyone. Database splitting and table sharding are not advisable, and if you then run into % on both sides of a pattern, it only makes things worse.