What does it take to "far surpass" a general-purpose database in time-series scenarios?
2022-06-11 13:04:00 【Deep learning and python】
author | Liu Jicong
Small T's introduction: In recent years, with the rapid development of IoT technology and its market and the accelerated expansion of enterprise business, the problem of processing time-series data has drawn more and more attention from industries and enterprises. General-purpose databases struggle in time-series scenarios, and all kinds of time-series database products have sprung up. But is it really easy to build a high-quality time-series database? From the perspective of a database developer, this article dissects the data-processing requirements of time-series scenarios, analyzes the design ideas behind time-series databases, and offers readers some hard-core technical thinking.
1 How can a time-series database "far surpass" a general-purpose database in time-series scenarios?
Building a prototype or demo is easy, but building a really good time-series database product is hard.
The reason a prototype is easy is that there are some kinds of data a time-series database is simply not expected to handle, such as transactional business data. Because of this, we can drastically cut important features of general-purpose databases, such as transactions, MVCC, and ACID (Facebook's Gorilla paper even suggests that Durability need not be guaranteed). The storage engines of some time-series databases cannot even handle out-of-order data; without out-of-order writes, the storage engine almost degenerates into a log with an index. So, from this point of view, a time-series database can be made very simple.
On the other hand, it is hard to make a good time-series database product. Imagine that, after slashing features such as transactions and ACID, the design still cannot perform far better than a general-purpose database in time-series scenarios; then building a dedicated time-series database is pointless. In that case it is better not to build one at all and just use a general-purpose database directly.
The so-called "far surpassing in time-series scenarios" should be all-round: write latency and throughput, query performance, real-time processing, and even the operation and maintenance cost of the clustered deployment should all improve by leaps and bounds. Moreover, since time-series data is huge in volume and low in per-record value, compression ratio matters a great deal; general-purpose databases rarely emphasize compression ratio, which shows that it is a requirement that grows out of real time-series scenarios.
There is no black magic behind a high compression ratio, and no need to reinvent compression algorithms: store data column by column and apply the best-suited compression algorithm to each column type. The rest is engineering: write good code, optimize carefully, and balance write performance against compression ratio.
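To make the "best-suited algorithm per column type" idea concrete, here is a minimal illustrative sketch in Python. It delta-encodes a timestamp column before handing it to a generic compressor; all names are made up for illustration, and real engines typically go further (delta-of-delta, bit packing, per-type codecs for floats and strings):

```python
import struct
import zlib

def delta_encode(timestamps):
    """Delta-encode a sorted list of integer timestamps.

    Regularly sampled series produce long runs of identical deltas,
    which a generic compressor such as zlib then squeezes very well.
    """
    deltas = [timestamps[0]]
    for prev, cur in zip(timestamps, timestamps[1:]):
        deltas.append(cur - prev)
    return deltas

def compress_timestamp_column(timestamps):
    """Delta-encode, then compress the packed little-endian int64s."""
    deltas = delta_encode(timestamps)
    raw = struct.pack(f"<{len(deltas)}q", *deltas)
    return zlib.compress(raw)

# 1 Hz samples starting at a fixed epoch (milliseconds)
ts = [1_600_000_000_000 + i * 1000 for i in range(10_000)]

plain = struct.pack(f"<{len(ts)}q", *ts)
print(len(zlib.compress(plain)))           # compressing the raw column
print(len(compress_timestamp_column(ts)))  # delta first, then compress
```

On this regularly sampled column the delta-then-compress variant is far smaller than compressing the raw values, which is exactly why a columnar layout pays off: each column gets the transformation that fits its type.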
Besides, "far surpassing" in time-series scenarios is grounded in the distinctly patterned write and query distributions of time-series data. When the data itself has such pronounced characteristics, we can naturally exploit them to the fullest and build different storage engines and index structures.
Let's start with writes. The write throughput of a time-series database is much higher than that of a general-purpose database, especially with IoT devices: the device fleet may reach tens of millions or even hundreds of millions, and the data is generated automatically. Assuming one sample per second, that means tens of millions or even hundreds of millions of writes per second, which an ordinary database cannot bear. With such throughput, how the data is partitioned and how indexes are built in real time are both challenging questions. On the write path, the time-series database takes the place an OLTP database would occupy, but the read-write latency the latter incurs under transactions and a strong-consistency model cannot support the high-throughput writes of a time-series database.
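The partitioning question raised above is usually answered along two axes: hash the device identifier so each device's data lands on one shard, and partition by time range so expired data can be dropped wholesale. A minimal sketch in Python (the function names, shard count, and day-granularity partitioning are all illustrative assumptions, not any particular database's scheme):

```python
import hashlib
from datetime import datetime, timezone

NUM_SHARDS = 8  # hypothetical number of shards / vnodes

def shard_for(device_id: str) -> int:
    """Hash-partition by device: all data of one device lands on one
    shard, so per-device reads never fan out across the cluster."""
    h = hashlib.md5(device_id.encode()).digest()
    return int.from_bytes(h[:4], "little") % NUM_SHARDS

def partition_for(ts_ms: int) -> str:
    """Coarse time partition (one per day): retiring expired data
    becomes dropping a whole partition instead of row-by-row deletes."""
    day = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
    return day.strftime("%Y-%m-%d")

print(shard_for("device-0001"), partition_for(1_600_000_000_000))
```

The two axes are independent: the shard decides *where* a write goes, the time partition decides *which file* it goes into, and both decisions are O(1) per point, which is what makes real-time ingestion at this scale feasible.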
Now for queries. Given such large write throughput, the freshness requirement on the data is also very high. For example, when we run statistics over time-series data for monitoring and alerting, the tolerable delay may be on the order of seconds. The query pattern is usually aggregation, such as statistics over a certain time period, rather than fetching an exact single record. In general, the query mode of a time-series database is interactive analysis, which differs both from a T+1 offline data warehouse and from OLAP queries that often run for hours; the response time of interactive analysis is usually at the second or sub-second level.
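The aggregate-over-a-time-window pattern described above can be sketched in a few lines of Python. This is only an in-memory illustration of what such a query computes (the function name and data shape are made up); a real engine would push this down to its storage layer:

```python
from collections import defaultdict

def window_avg(points, window_ms):
    """Downsample (ts_ms, value) pairs into per-window averages --
    the typical aggregate query a monitoring dashboard issues."""
    buckets = defaultdict(list)
    for ts, val in points:
        buckets[ts - ts % window_ms].append(val)  # bucket by window start
    return {start: sum(vals) / len(vals)
            for start, vals in sorted(buckets.items())}

# one point every 100 ms, aggregated into 1-second windows
points = [(i * 100, float(i)) for i in range(20)]
print(window_avg(points, 1000))
```

Note that no individual point is ever returned, only per-window summaries; this is why time-series engines can afford block-granular storage and indexing, as discussed in the storage-engine section below.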
Having clarified the write and query requirements above, let's take the storage engine as an example and see how one part of a time-series database should be designed.
2 To build the best storage engine, you have to develop it yourself
At present, database storage engines can be roughly divided into two categories: those based on B-Trees and those based on LSM-Trees. The former are common as the default engines of traditional OLTP databases such as MySQL and PostgreSQL and suit read-heavy, write-light workloads; databases such as HBase, LevelDB, and RocksDB use LSM-Trees, which suit write-heavy, read-light workloads. In practice, modern storage engines blend the two to some degree. Why can't an LSM-Tree build B-Tree indexes (HBase does maintain B-Tree indexes within a region)? And why must a B-Tree write directly to disk, instead of writing a WAL first and going through an in-memory cache?
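To make the LSM-style write path concrete, here is a deliberately tiny, purely illustrative sketch in Python. Every name is made up; a real engine would use an fsync'd log file for the WAL, a skip list for the memtable, and on-disk sorted runs with compaction, rather than Python lists and dicts:

```python
class MiniLSM:
    """Toy LSM-style write path: append to a WAL for durability,
    insert into an in-memory structure, and flush to an immutable
    sorted run when the memtable fills up."""

    def __init__(self, memtable_limit=4):
        self.wal = []        # stands in for an fsync'd log file
        self.memtable = {}   # stands in for a skip list
        self.sstables = []   # immutable sorted runs on "disk"
        self.limit = memtable_limit

    def put(self, key, value):
        self.wal.append((key, value))  # 1. durability first
        self.memtable[key] = value     # 2. then the fast in-memory write
        if len(self.memtable) >= self.limit:
            self._flush()

    def _flush(self):
        run = sorted(self.memtable.items())
        self.sstables.append(run)      # newest run last
        self.memtable.clear()
        self.wal.clear()               # these entries are now persisted

    def get(self, key):
        if key in self.memtable:       # freshest data wins
            return self.memtable[key]
        for run in reversed(self.sstables):  # then newest run first
            for k, v in run:
                if k == key:
                    return v
        return None

db = MiniLSM()
for i in range(6):
    db.put(f"k{i}", i)
print(db.get("k0"), db.get("k5"))
```

Even this toy shows why LSM favors writes: every `put` is an append plus an in-memory insert, with all random I/O deferred to sequential flushes. It also shows the blending the text alludes to: nothing stops a flush from emitting an index alongside each sorted run, just as nothing stops a B-Tree engine from buffering writes behind a WAL.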
As for storage engines, the time-series database pioneer InfluxDB made many attempts, jumping back and forth among engines (LevelDB, RocksDB, BoltDB, and so on) and hitting plenty of problems: random IO in BoltDB's mmap+BTree model led to low throughput; a pure LSM-Tree engine like RocksDB could not delete by time partition gracefully and quickly; and the multiple-LevelDB-instances-with-time-partitioning approach produced huge numbers of file handles. After stepping on this series of pits, InfluxDB finally switched to a self-developed storage engine, TSM. This shows how important, and how rare, a good storage engine is for a time-series database: to be the best, you have to develop it yourself.
Unlike InfluxDB, TDengine's storage engine was self-developed from the beginning. It borrows the WAL and the write-to-memory-first skip list from the LSM-Tree, but removes the tree's level hierarchy: data is simply partitioned by time range and logged in per-table blocks.
Reading this far, careful readers may notice that this per-table block design is somewhat similar to OpenTSDB's row aggregation. OpenTSDB takes the data for one tag over a one-hour time range and stores it all in a single row, which greatly reduces the amount of data an aggregate query must scan. The difference is that TDengine uses a multi-column model while OpenTSDB uses a single-column model: the single-column model aggregates multiple rows into wide rows, while in the multi-column model aggregation naturally forms data blocks.
Friends familiar with the KV-separation design used with LSM-Trees should also recognize something in TDengine's storage engine. If the data block serves as the value, then the key should be the block's start and end times; pull the keys out and you get TDengine's BRIN index. From this perspective, TDengine's .head file is the key while its .data and .last files are the value, and the keys can exploit the ordering of time-series data to form an ordered file. In a time-series scenario, with a BRIN index you don't even need a Bloom filter. Seen this way, the design of TDengine's storage engine is very clear.
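The BRIN idea mentioned above is simple enough to sketch: keep only a (start, end) time range per block, and prune every block whose range cannot intersect the query range. This is a generic illustration of the technique, not TDengine's actual on-disk format; all names are invented:

```python
class BrinIndex:
    """Block-range index over time-ordered data blocks: store only
    (start_ts, end_ts, block_id) per block, then prune the blocks
    whose range cannot overlap the query's time range."""

    def __init__(self):
        self.entries = []  # one tiny entry per (possibly huge) block

    def add_block(self, start_ts, end_ts, block_id):
        self.entries.append((start_ts, end_ts, block_id))

    def blocks_for_range(self, lo, hi):
        """Return only the blocks whose [start, end] overlaps [lo, hi]."""
        return [bid for start, end, bid in self.entries
                if start <= hi and end >= lo]

idx = BrinIndex()
idx.add_block(0, 999, "blk0")
idx.add_block(1000, 1999, "blk1")
idx.add_block(2000, 2999, "blk2")
print(idx.blocks_for_range(500, 1500))  # blk2 is pruned without being read
```

Because timestamps arrive in (roughly) increasing order, block ranges barely overlap, so this tiny index prunes almost as precisely as a per-row index at a fraction of the cost; that ordering is also why a Bloom filter adds little here.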
Besides, TDengine separates tag data from time-series data, which greatly reduces the storage space occupied by tags; the saving is especially significant with large data volumes.
TDengine's split between tags and time-series data resembles the split between dimension tables and fact tables in data-warehouse dimensional modeling: tag data is like a dimension table, and time-series data is like a fact table. But there is a difference. In TDengine the number of tables equals the number of devices, so hundreds of millions of devices mean hundreds of millions of tables (the in-development TDengine 3.0 aims to support 10 billion tables). Frequently creating tables, and managing such an enormous number of them, is not easy to handle: the main trouble is the huge amount of metadata generated, which exceeds the processing capacity of a single node and requires TDengine to shard this metadata as well.
Once data and metadata are sharded and replicated, consistency and availability naturally come up. In a time-series database, the time-series data itself is usually replicated with eventual consistency, because eventually consistent algorithms offer higher throughput, lower latency, and better availability than strongly consistent ones; the clustered version of InfluxDB, for example, uses Dynamo-style leaderless data synchronization. Metadata (the tag and table data mentioned above), however, needs strong consistency, which is usually guaranteed with algorithms such as Raft or Paxos.
Because the huge volume of metadata must itself be sharded, time-series data and metadata are both partitioned (and time-series data should even sit in the same shard as its associated metadata), yet they carry different consistency requirements. As a result, replica replication in TDengine cannot simply be handled by an algorithm like Raft, unless we sacrifice the write throughput and availability of the time-series data by replicating it with strong consistency too. This is the root cause of TDengine's self-developed replication algorithm. Of course, proving the consistency guarantees of such algorithms in a complex distributed environment is another problem, and a challenge we must focus on solving.
3 Conclusion
A good time-series database originates from insight into the characteristics of data in the time-series domain, grows up through the test of a large number of real scenarios and user feedback, and keeps learning from the most advanced technology in the database field. Only then can it finally "far surpass" general-purpose databases in time-series scenarios and become the preferred database there. And doing all of this is not easy.