Author: Chen Yu, TAOS Data
WAL (Write Ahead Log) is an important TDengine feature. It provides fault tolerance for data and helps ensure high availability.
It sounds complicated, but it is actually simple. For users of relational databases, it is roughly equivalent to the redo log in Oracle, or the binlog and redo log in MySQL: it records every update and modification made to the database.
"Write Ahead Log" means that before data is written to storage, a record is first appended to a log in chronological order. This lets the database be restored from the log, in principle to any earlier state. Even if the database goes down unexpectedly, for example because of a power failure, data loss can be avoided.
At present, the TDengine community edition does not yet support rolling data back to a specified point in time. If you need this, you can contact our enterprise team.
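To make the idea concrete, here is a minimal sketch of write-ahead logging in Python. It is a toy key-value store, not TDengine code: the class name TinyStore, the JSON-lines log format, and the file name toy.wal are all assumptions made for illustration. The point it shows is the ordering: each change is appended to the log and synced to disk before it is applied in memory, so a restart can rebuild the state by replaying the log.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store illustrating write-ahead logging (not TDengine code)."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.data = {}      # in-memory state, lost on a crash
        self._replay()      # rebuild state from the log on startup

    def _replay(self):
        # Apply every logged change in the order it was written.
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path, "r", encoding="utf-8") as wal:
            for line in wal:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # 1. Append the change to the log and force it to disk first.
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(json.dumps({"key": key, "value": value}) + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        # 2. Only then apply it to the in-memory state.
        self.data[key] = value


def demo():
    wal_path = os.path.join(tempfile.mkdtemp(), "toy.wal")
    store = TinyStore(wal_path)
    store.put("temperature", 21.5)
    store.put("temperature", 22.0)
    # Simulate a crash: ignore the old object and start again from the log.
    recovered = TinyStore(wal_path)
    return recovered.data
```

Calling demo() writes two updates, then builds a fresh TinyStore from the same log file; the recovered state holds the latest value because the replay applies the entries in the order they were logged.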
The WAL implementation in TDengine is slightly unusual. It splits the WAL into two parts: the WAL in the mnode directory and the WAL in the vnode directory. (To follow this article, it helps to first understand TDengine's basic architecture, in particular the concepts of the management node (mnode) and the virtual data node (vnode): https://www.taosdata.com/docs...)
Under TDengine's data file path (the default is /var/lib/taos), you can see the directory structure described above.
The mnode WAL is persisted on disk. Because the mnode is the most important management node, its WAL records all DDL operations on the database: create and delete operations (create dnode, create account, create mnode, create user, create table, drop dnode, drop table, and so on) as well as modify operations (alter database, alter table, alter user, and so on).
The WAL in the vnode directory is mainly responsible for recording data writes. It also records DDL operations on tables, and it is cleared once a flush to disk has been triggered. (Our earlier articles, such as "Storage costs are only 1/10 of OpenTSDB: what is TDengine's biggest trump card?", mention this often; reading them alongside this article may deepen your understanding.)
After the flush, the time-series data written to the vnode is moved into the data file directory /vnode/vnodeX/tsdb/data. The DDL operations on tables (that is, the table metadata) are flushed to the file /vnode/vnodeX/tsdb/meta, as shown below:
The second picture above is the vnode workflow. It shows more clearly how time-series data and metadata are flushed to disk once one third of the buffer pool has been filled (one meta file, and one data file group: the .data, .head, and .last files).
To summarize: the mnode uses its WAL to record the metadata of the cluster, users, databases, and tables. The vnode uses its WAL to record data and table metadata, and that WAL is cleared after a flush is triggered; the recorded table metadata is written to the meta file, and the time-series data is written to the data directory.
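The vnode behavior summarized above can be sketched as a small Python model. This is only an illustration under stated assumptions, not TDengine's implementation: the class ToyVnode, its attribute names, and the use of plain lists in place of files are hypothetical, and the one-third threshold simply follows the description in the text.

```python
class ToyVnode:
    """Toy model of a vnode: a WAL plus an in-memory buffer (not TDengine code)."""

    def __init__(self, buffer_capacity):
        # Flush once one third of the buffer is used, as the text describes.
        self.flush_threshold = max(1, buffer_capacity // 3)
        self.wal = []          # stand-in for the vnode WAL file
        self.buffer = []       # stand-in for the in-memory buffer pool
        self.data_files = []   # stand-in for tsdb/data
        self.meta_file = []    # stand-in for tsdb/meta

    def write(self, kind, payload):
        # kind is "data" for time-series rows or "meta" for table DDL.
        self.wal.append((kind, payload))      # log first
        self.buffer.append((kind, payload))   # then buffer in memory
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Route buffered entries to their destination, then clear the WAL:
        # once the entries are on disk, the log is no longer needed for recovery.
        for kind, payload in self.buffer:
            if kind == "meta":
                self.meta_file.append(payload)
            else:
                self.data_files.append(payload)
        self.buffer.clear()
        self.wal.clear()
```

With a buffer_capacity of 9, the third write triggers a flush: the table definition ends up in meta_file, the rows in data_files, and the WAL is empty again.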
Now that we understand what the WAL does, let's look at some concrete scenarios:
First, a WAL is a log file to which database update operations are appended in chronological order. At startup, the database process taosd reads the WAL file under mnode line by line and applies each operation until it reaches the last line; only then can the service start normally.
This structure can lead to the following problems:
1. When too many DDL operations have accumulated, TDengine starts up more slowly. How can this be avoided?
First, when the number of sub-tables is extremely large, this cannot be avoided. But that takes a very large order of magnitude, which the vast majority of users will never reach.
A more likely case is this: the environment does not contain many tables, but databases and tables are frequently dropped and recreated. For example, you create a super table with 100,000 sub-tables, delete it, and then recreate the super table. When the database loads the WAL at startup, the earlier create table and drop table operations no longer have any effect, yet they are still replayed. Drop-table operations are themselves slower to load, so TDengine's startup slows down considerably.
To avoid this, be careful in the production environment. In particular, avoid major operations such as dropping databases and dropping super tables as far as possible. Debugging and testing that involve repeated rebuilding must be done in a test environment. A test environment should not be carried over into production, with databases and tables created and put straight into use, unless TDengine on that server has first been uninstalled and cleaned up.
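The create, drop, and recreate scenario above can be illustrated with a small Python model. It is a sketch only: the (operation, table name) log format and the functions replay, compact, and build_history are hypothetical, and compact is not a description of how TDengine compacts its WAL internally. It shows why a log full of cancelled operations costs more to replay than the final state requires.

```python
def replay(wal_entries):
    """Apply every entry in order, as a startup replay would, and count the work."""
    tables = set()
    applied = 0
    for op, name in wal_entries:
        if op == "create":
            tables.add(name)
        elif op == "drop":
            tables.discard(name)
        applied += 1
    return tables, applied


def compact(wal_entries):
    """Keep only the entries needed to reproduce the final state."""
    tables, _ = replay(wal_entries)
    return [("create", name) for name in sorted(tables)]


def build_history(n):
    """Create n sub-tables, drop them all, then create them again."""
    names = [f"t{i}" for i in range(n)]
    history = [("create", t) for t in names]
    history += [("drop", t) for t in names]
    history += [("create", t) for t in names]
    return history
```

For n tables, replaying the full history applies 3n operations, while the compacted log needs only n to reach the same set of tables.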
2. If the system has been in use for a long time, or operations that change table structure are unavoidable, and the WAL under mnode grows too large, is there no solution?
For this case, after version 2.1.5, we provide an offline solution for compacting the mnode WAL:
1) Stand-alone:
- systemctl stop taosd.
- taosd --compact-mnode-wal. If it runs normally, an mnode_bak directory is created in the data file directory to keep the original data.
- systemctl start taosd. TDengine then uses the compacted WAL to start the database service process.
2) Cluster:
- show mnodes: identify the servers that host mnode nodes, and which mnode is the master and which are slaves.
- systemctl stop taosd: stop all nodes of the cluster.
- [Optional] Remove the mnode_bak directory on all mnode nodes.
- On the mnode master server, run taosd --compact-mnode-wal with root privileges.
- Copy the compacted mnode/wal/* files from the mnode master to the corresponding directory on the other slave nodes.
- Restart the cluster.
Note that the first run of taosd --compact-mnode-wal takes roughly as long as a cluster startup did before compaction. Startup only becomes faster from the next start onward.
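The stand-alone procedure above can be wrapped in a short Python script. This is a sketch under assumptions: the commands are exactly those listed in this article, while the names STANDALONE_STEPS and run_steps and the dry-run behavior are hypothetical additions for illustration. By default nothing is executed, so the plan can be reviewed before it is run with sufficient privileges.

```python
import subprocess

# Commands taken from the stand-alone procedure in this article.
STANDALONE_STEPS = [
    ["systemctl", "stop", "taosd"],
    ["taosd", "--compact-mnode-wal"],
    ["systemctl", "start", "taosd"],
]


def run_steps(steps, dry_run=True):
    """Run the steps in order, stopping at the first failure.

    With dry_run=True (the default) nothing is executed; the commands are
    only returned as strings so they can be reviewed first.
    """
    planned = []
    for argv in steps:
        planned.append(" ".join(argv))
        if not dry_run:
            # check=True raises CalledProcessError if a command fails,
            # so the service is not restarted after a failed compaction.
            subprocess.run(argv, check=True)
    return planned
```

Calling run_steps(STANDALONE_STEPS) returns the three commands in order without touching the system; passing dry_run=False executes them and stops at the first one that fails.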
In addition, this will be one of our major optimizations in the upcoming TDengine 3.0. The WAL will also become distributed storage, and by then, even with table counts in the hundreds of millions, TDengine's start and stop speed will no longer be a problem. This optimization is only the tip of the iceberg among the many features of 3.0. Behind it lies a refactoring of many TDengine modules; stability and performance will improve substantially, and a number of major features will be released.
Long-time users who have followed us since 1.0 will have been struck by the progress made in 2.0, and 3.0 can be said to surpass its predecessor.
Let's look forward to it.
If you want to know more of the specific details of TDengine, you are welcome to view the source code on GitHub.