Why is it not recommended to use products like MongoDB in place of a time series database?
2022-07-05 09:43:00 【Tdengine】
Editor's note from Little T: Someone previously posted this question on a Q&A site: since some time series databases, such as InfluxDB and TimescaleDB, were developed on top of PostgreSQL, a relational database that is not itself a time series database, can a database like MySQL or MongoDB be used in place of a time series database (Time-Series Database) for time series workloads? A senior R&D engineer at TAOS Data answers this question from the perspective of both principle and practice, for anyone who has wondered the same thing.
By definition, a database is a data management system: software that stores data and supports adding, modifying, deleting, and querying it. In that sense, time series databases and relational/non-relational databases are the same, since all of them exist to store data. But because the data they store has different characteristics, they suit different application scenarios:
- Relational databases: mainly used to store structured data. They use transactions to guarantee data consistency and are queried with SQL. Typical examples include MySQL, Oracle, and SQL Server.
- Non-relational databases: mainly used to store unstructured data. Data can be stored without validation and is queried through JSON data objects. Typical examples include MongoDB and Redis.
Time series databases are mainly used to store real-time data. Their most obvious feature is that every record carries a timestamp. They are widely used in electric power, petrochemicals, metallurgy, smart vehicles, monitoring, and other fields. Typical examples include TDengine, InfluxDB, and TimescaleDB. With that background, let's get to the point and discuss whether a relational/non-relational database can replace a time series database.
Can a relational/non-relational database replace a time series database?
In fact, if data is collected infrequently and the data volume is small, there is no problem using a relational/non-relational database in place of a time series database. In the long run, however, this practice carries considerable risk, and the reason lies in the characteristics of time series data.
Time series data is collected at high frequency and in large volumes. The workload is dominated by writes, with reads secondary, and updates and deletes are rare, yet real-time computations such as statistical aggregation are required. Relational/non-relational databases struggle to meet such performance demands. In a big data scenario, a database whose performance falls short and which cannot store the data efficiently cannot replace a time series database.
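To make "statistical aggregation" concrete, here is a minimal Python sketch of downsampling: averaging raw timestamped points into fixed time windows. The function name `downsample` and the sample readings are hypothetical, chosen only for illustration; a time series database performs this kind of operation internally and at far larger scale.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, window_seconds):
    """Average (timestamp, value) points into fixed-size time windows."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of the window it falls in.
        window_start = ts - (ts % window_seconds)
        buckets[window_start].append(value)
    # Return windows in chronological order.
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Hypothetical readings: one point every 10 seconds for two minutes.
readings = [(t, 20.0 + (t // 10) % 3) for t in range(0, 120, 10)]
per_minute = downsample(readings, 60)
print(per_minute)
```

With a general-purpose database, this grouping typically has to be expressed over one large table that is also absorbing high-frequency writes, which is where the performance pressure described above tends to appear.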
Here is a simple example. In the same test environment (16 cores, 64 GB of memory), we ran a benchmark comparison between the traditional relational database MySQL and the time series database TDengine.
We used each product's bundled benchmark tool, mysqlslap for MySQL and taosBenchmark for TDengine, with 16 threads writing 100,000 records into a single table. The table has 1 timestamp column, 2 int columns, and 2 string columns. The tests were run as follows:
MySQL:
mysqlslap -uroot -p1234 --concurrency=16 --number-of-queries=100000 --create-schema=tests --query="INSERT INTO meters(c0, c1, c2, c3) VALUES (RAND() * 100, RAND() * 100, uuid(), uuid())"
TDengine:
taosBenchmark -b int,int,binary\(128\),binary\(128\) -n 100000 -t 1 -T 16
The comparison shows that for the same 100,000 records, MySQL with its native mysqlslap tool needed 75 seconds, while TDengine with its native taosBenchmark needed less than 1 second. With such a gap, we can conclude that it is difficult to use MySQL in place of a time series database for processing time series data. Of course, because the testing tools differ, this is only an illustration and the test itself is not rigorous. Next, I will analyze some specific enterprise cases.
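The figures above translate into write throughput as in the short calculation below. It uses only the numbers reported in the test; since TDengine's time is given only as "less than 1 second", 1 second is treated as an upper bound, so the TDengine throughput and the speedup are lower bounds rather than measured values.

```python
RECORDS = 100_000                    # rows written in each test
MYSQL_SECONDS = 75.0                 # reported mysqlslap duration
TDENGINE_SECONDS_UPPER_BOUND = 1.0   # "less than 1 second": an upper bound

# Rows per second implied by each result.
mysql_rows_per_sec = RECORDS / MYSQL_SECONDS
tdengine_rows_per_sec_min = RECORDS / TDENGINE_SECONDS_UPPER_BOUND

# Lower bound on the speedup, because TDengine's real time was under 1 s.
min_speedup = MYSQL_SECONDS / TDENGINE_SECONDS_UPPER_BOUND

print(f"MySQL: about {mysql_rows_per_sec:.0f} rows/s")
print(f"TDengine: more than {tdengine_rows_per_sec_min:.0f} rows/s")
print(f"Speedup: more than {min_speedup:.0f}x")
```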
Big data storage as seen through real cases
In fact, real enterprise practice provides the best and most credible answer to this question. People in the industry know that time series databases have only gradually become popular in recent years, with the development of the Internet of Things and related technologies. Before that, enterprises in every industry had very limited database options. Take Internet of Vehicles companies as an example: the most common choices in the industry were traditional big data solutions such as MongoDB and HBase.
But as business grew and data volumes kept rising, these companies ran into data architecture crises to varying degrees, sometimes to the point of hindering business development, and had to consider iterating on or migrating their data architecture. Below I use enterprise cases for three databases, MySQL, MongoDB, and HBase, to illustrate.
MySQL
In LiuGong iLink, LiuGong's industrial Internet of Vehicles application, poorly designed complex queries in the application layer combined with high-frequency writes of historical data made MySQL slow and even prone to downtime, seriously affecting the user experience. After analyzing the cause, the team concluded that a relational database is not suitable for storing massive time series data and is very inefficient at aggregation, downsampling, and similar operations over massive data. Based on this conclusion, they began evaluating time series databases.
Their business scenario closely matched TDengine's "one table per data collection point" design, and TDengine supports aggregation and downsampling queries over large data sets, which addressed their MySQL pain points. After rigorous research and testing, they decided to migrate to TDengine.
Here is the effect of the migration in a real scenario. Before the switch to TDengine, the project had business reports to present every day, and every hour the data of all devices in the time zone had to be aggregated. In MySQL this process often took about 1 hour, which blocked subsequent business steps. After the switch to TDengine, the whole reporting process takes only about 10 seconds.
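The "one table per data collection point" idea can be sketched in a few lines of Python. This is a toy in-memory model, not TDengine's implementation: the class `DeviceStore`, its methods, and the device names are all hypothetical. The point it illustrates is that each device's data lives in its own append-only sequence, so a per-device downsampling query never has to scan other devices' rows.

```python
from collections import defaultdict


class DeviceStore:
    """Toy store that keeps a separate append-only 'table' per device."""

    def __init__(self):
        # device_id -> list of (timestamp_seconds, value), in arrival order
        self.tables = defaultdict(list)

    def write(self, device_id, ts, value):
        # Writes only ever append to the table of the device they belong to.
        self.tables[device_id].append((ts, value))

    def hourly_max(self, device_id):
        """Downsample one device's data to the maximum value per hour."""
        result = {}
        for ts, value in self.tables[device_id]:
            hour = ts // 3600
            result[hour] = max(value, result.get(hour, value))
        return result


store = DeviceStore()
for ts in range(0, 7200, 1800):          # four points over two hours
    store.write("loader-001", ts, ts // 1800)
    store.write("loader-002", ts, 10)

print(store.hourly_max("loader-001"))
```

In a single shared MySQL table, the same query would have to filter the target device out of every device's rows, which is one plausible reason the hourly reports described in this case became slow.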
[Figure: query comparison]
Reference: https://www.taosdata.com/blog/2022/05/17/8473.html
MongoDB & HBase
Leapmotor has plenty to say about these two databases. As a typical new energy vehicle company, Leapmotor had long used MongoDB and HBase. As its business expanded rapidly, writes became too slow and support costs too high.
With MongoDB, all of the data was held in memory, and the resulting high storage cost meant data could only be retained for a limited period. The stored data format also had to be organized by the business side, so business changes were inflexible and the range of feasible business was limited. HBase, for its part, is a very heavyweight database: it requires a full HDFS deployment underneath, and the costs of usage, operations and maintenance, and staffing are all very high.
After upgrading the architecture with TDengine, compression improved by a factor of 10 to 20, which relieved storage pressure and solved the high data storage cost. It also solved the earlier problem of data not being written into HBase in time. More data can now be stored on fewer server resources, saving further cost. Business flexibility has also improved greatly: unlike with MongoDB, there is no need to pre-process the required data according to business needs before querying. With TDengine's columnar storage, the computation can be done directly in SQL.
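One general reason columnar layouts tend to compress time series data well can be illustrated with the sketch below. It is not TDengine's actual compression scheme: it uses Python's `zlib` and simple delta encoding as stand-ins, and the synthetic data (one point per second, slowly cycling values) is an assumption chosen for illustration. Regularly spaced timestamps stored together as deltas become a long run of identical numbers, which a general-purpose compressor handles far better than the same data interleaved row by row.

```python
import struct
import zlib


def row_oriented(points):
    """Serialize points row by row: timestamp then value, repeated."""
    return b"".join(struct.pack("<qi", ts, value) for ts, value in points)


def column_oriented_delta(points):
    """Serialize each column separately, storing differences between rows."""
    timestamps = [ts for ts, _ in points]
    values = [value for _, value in points]

    def deltas(seq):
        # Keep the first element as the base, then successive differences.
        return [seq[0]] + [b - a for a, b in zip(seq, seq[1:])]

    ts_bytes = struct.pack(f"<{len(points)}q", *deltas(timestamps))
    value_bytes = struct.pack(f"<{len(points)}i", *deltas(values))
    return ts_bytes + value_bytes


# Synthetic data: millisecond timestamps one second apart, cycling values.
points = [(1_656_000_000_000 + i * 1000, 20 + i % 7) for i in range(10_000)]

row_size = len(zlib.compress(row_oriented(points)))
col_size = len(zlib.compress(column_oriented_delta(points)))
print(f"row-oriented: {row_size} bytes, columnar with deltas: {col_size} bytes")
```

Real sensor data is noisier than this synthetic series, so the ratio seen here should not be read as a prediction of the 10 to 20 times improvement reported in the case above.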
Closing thoughts
From the arguments above, we can draw the final conclusion: if you are facing a time series big data scenario, a time series database is the most appropriate and reasonable choice. If you pick a general-purpose database because the data volume is still small, all kinds of thorny problems will follow, including slow development, low runtime efficiency, high operations and maintenance cost, slow application rollout, and private deployments that are too heavy for small-data scenarios. In database selection, "prescribing the right medicine for the illness" is a sound strategy for business development.