Mo Tianlun Salon | Qiao Jialin of Tsinghua University: Apache IoTDB, Born at Tsinghua, Building an Open Source Ecosystem
2022-07-01 19:45:00 【-Xiaolan】
On June 8, at the Mo Tianlun Database Salon No. 7 (special session on open source ecosystems), Dr. Qiao Jialin, assistant researcher at Tsinghua University and Apache IoTDB PMC member, delivered the keynote speech "Apache IoTDB: Born at Tsinghua, Building an Open Source Ecosystem". This article is a write-up of that talk.
Reading guide: Hello everyone, I am Qiao Jialin from Tsinghua University. Apache IoTDB is an open source project that originated in a Tsinghua University laboratory and was later open-sourced and donated to the Apache Software Foundation. My talk today has four parts: the background and origins of IoTDB, an introduction to IoTDB, open source community building, and how to join us.
Background and origins
1. What is time series data
First, IoTDB manages time series data, that is, curve data that changes along the time axis. The candlestick (K-line) chart of a stock is a typical example. Time series data makes up a large share of the data in the Internet of Things: it is a digital record of the physical quantities measured on equipment and a faithful depiction of the physical world.
Figure 1 Time series data diagram
Time series data is used in four main scenarios: monitoring, alarming, forecasting, and tracing.
The first is monitoring. We all want to monitor a system visually so that its running state can be seen clearly.
The second is alarming. When industrial enterprises monitor machinery and equipment, anomalies in operation are hard to spot with the naked eye, so effective rules need to be defined: when the data exceeds a preset threshold, an alarm is raised.
Time series data also supports forecasting. When equipment is running poorly, historical experience and the trend of the data can be used to predict whether the device is about to fail, helping enterprises avoid unnecessary losses.
Finally, time series data supports tracing. When a fault is found, the historical data leading up to it can be analyzed to find how the data changed and why the fault occurred, yielding knowledge that helps prevent the same fault from recurring.
Figure 2 Uses of time series data
2. The origin and development of IoTDB
The development of IoTDB has gone through six stages.
2011, gestation: under the national 863 Program, we put massive machine data management solutions into practice at Sany Heavy Industry and other enterprises.
The industrial Internet of Things requires complex metadata management, massive data storage, rich data processing, and edge-cloud collaboration, all of which pose great challenges for data management.
Figure 3 Data management requirements of the industrial Internet of Things
Against this background, traditional relational databases showed clear pain points: single-point bottlenecks, data models that are hard to modify, and write performance that cannot keep up. From 2011 we therefore began trying big data management solutions such as Cassandra and HBase, but these also had bottlenecks.
We then surveyed how different databases manage time series data, as shown in the figure below.
Figure 4 Limitations of existing systems for managing time series data
So in 2015 IoTDB entered its in-house development period, and development of "Tsinghua IoTDB" began. In March 2016 we proposed TsFile, a compact columnar file storage format for time series data, and later released version 0.7.0.
IoTDB's in-house development started with the design of the data file format. The format diagram below has two parts. On the left is the data area, which uses columnar storage: the timestamps and values of each time series are stored separately, which allows better encoding and compression. On the right is the index area, which makes it possible to query a massive number of time series quickly.
Figure 5 Format diagram of data management
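To make the point about columnar storage concrete, here is a minimal Python sketch of why a separate timestamp column encodes well. It is not the TsFile format; the function names and the sample data are assumptions chosen only for illustration, and delta encoding stands in for whatever encodings the real format uses.

```python
# Hypothetical sketch: why storing timestamps in their own column helps encoding.
# This is NOT the TsFile format; it only illustrates delta encoding of one column.

def delta_encode(values):
    """Keep the first value, then store each value as the difference from its predecessor."""
    if not values:
        return []
    deltas = [values[0]]
    for prev, cur in zip(values, values[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas):
    """Rebuild the original values by accumulating the differences."""
    values = []
    total = 0
    for i, d in enumerate(deltas):
        total = d if i == 0 else total + d
        values.append(total)
    return values


# Row-oriented records of one series: (timestamp in ms, temperature)
rows = [(1000, 20.5), (2000, 20.6), (3000, 20.6), (4000, 20.8)]

# Column-oriented layout: timestamps and values are kept in separate lists
timestamps = [t for t, _ in rows]
temperatures = [v for _, v in rows]

# Regularly sampled timestamps collapse into small, repetitive numbers
encoded_timestamps = delta_encode(timestamps)
```

With regularly sampled data the encoded timestamp column becomes a run of identical small integers, which a general-purpose compressor can then shrink much further than interleaved rows of mixed types.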
IoTDB's first real-world project was the Qinghai New Energy Big Data Platform, which manages the data of the power generation groups in Qinghai. In the course of that project we also ran into several problems with time series data in industrial settings, such as out-of-order data, low data quality, and large data volumes. Discovering these problems gave us valuable experience for later system upgrades and improvements.
Figure 6 IoTDB in production on the Qinghai New Energy Big Data Platform
In 2018 IoTDB entered its open source incubation period. In November of that year IoTDB became an Apache Incubator project and attracted attention from international peers in Germany, the United States, Australia, and other countries.
Why did IoTDB go open source? Here is our thinking.
First, IoTDB originated in a university, and we wanted it to prove itself in real projects. IoTDB should therefore be not only a research project but also an industrial-grade product that can be deployed in users' actual projects and deliver value there.
Second, as basic software, IoTDB needs the participation of a wider range of contributors and users.
Beyond that, we looked to UC Berkeley abroad, which produced Spark, a standard product for computation. We hope Chinese universities can also create open source software of this kind and raise their international influence.
So why did we choose the Apache Software Foundation? Because Apache is home to a family of big data systems: the Hadoop, Spark, HBase, and Flink projects we all know are Apache projects. Time series data is a kind of big data, and we wanted this project to develop fully, so we chose the Apache Software Foundation.
That is IoTDB's open source route.
Figure 7 Time series databases have gradually risen from obscurity to popularity
2019, rapid growth: the project successively won an excellent big data product award and the first prize among China's excellent open source projects, and was released as a major achievement at the China Industrial Internet Summit.
2020, successful graduation: Apache IoTDB became an Apache Top-Level Project. This marked that IoTDB had built a globally recognized international open source community, and it became the only project led by a Chinese university to be successfully incubated in the Apache community.
2021, selected as an achievement of the 13th Five-Year Plan: Apache IoTDB was shown at the national "13th Five-Year Plan" Scientific and Technological Innovation Achievement Exhibition.
Looking back on IoTDB's history, it has truly been "ten years to sharpen a sword".
Figure 8 Apache IoTDB development history
Introduction to IoTDB
1. What is Apache IoTDB
Apache IoTDB (Internet of Things Database) is a software system that integrates the collection, storage, management, and analysis of IoT time series data. It offers high performance and rich functionality, is deeply integrated with Apache Hadoop, Spark, Flink, and others, and can meet the needs of massive data storage, high-speed data reads, and complex data analysis in the industrial Internet of Things.
Apache IoTDB is also simple and easy to use, low cost and high performance, easy to migrate, backed by a rich data processing ecosystem, and provides a one-stop "device-edge-cloud" solution.
Figure 9 Apache IoTDB system architecture
2. Apache IoTDB features
As a lightweight, high-performance, low-cost time series database, Apache IoTDB features an open system architecture, lightweight deployment, a rich ecosystem, an IoT-specific data model, a high compression ratio, low-latency queries, rich data processing, and an efficient storage engine.
Figure 10 The eight features of Apache IoTDB
The following figure compares IoTDB with other time series databases in terms of open source status, data model, queries, and file format.
Figure 11 Comparison of Apache IoTDB with other databases: open source status, data model, queries, and file format
3. Apache IoTDB functions
IoTDB supports multiple query views, each with its own SQL query logic. Data is written according to the IoT metadata model, but at query time it can be presented through several views, each with its own SQL result columns and its own filter conditions. This lets different business systems query along different dimensions, which makes the model very flexible.
Figure 12 Apache IoTDB supports multiple query views
Beyond multiple query views, IoTDB also has rich query functions, including downsampling, data alignment, and filling. A query can downsample to one data point per minute, align multiple series by time, and fill in missing data.
Figure 13 Apache IoTDB rich query functions
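The alignment and filling steps mentioned above can be sketched in a few lines of Python. This is only an illustration of the idea, not IoTDB's query engine or SQL syntax; the function names, the dict-based series, and the choice of "previous value" as the fill strategy are all assumptions.

```python
# Hypothetical sketch of two query-time operations: align by time, then fill gaps.
# Plain Python, not IoTDB's query engine or SQL syntax.

def align_by_time(series_by_name):
    """Merge several {timestamp: value} series into rows that share one time axis.

    Columns follow the sorted series names; missing points become None.
    """
    names = sorted(series_by_name)
    all_times = sorted({t for s in series_by_name.values() for t in s})
    return [(t, [series_by_name[n].get(t) for n in names]) for t in all_times]


def fill_previous(rows):
    """Replace each None with the most recent earlier value in the same column."""
    filled = []
    last_seen = None
    for t, values in rows:
        if last_seen is None:
            last_seen = list(values)
        else:
            last_seen = [v if v is not None else p for v, p in zip(values, last_seen)]
        filled.append((t, list(last_seen)))
    return filled


speed = {1: 10.0, 2: 12.0, 4: 11.0}
temperature = {1: 70.0, 3: 71.0, 4: 72.0}

aligned = align_by_time({"speed": speed, "temperature": temperature})
filled = fill_previous(aligned)
```

Here `aligned` has one row per distinct timestamp with gaps marked as `None`, and `filled` carries the last known value forward into each gap.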
In addition to the functions above, IoTDB supports user-defined functions (UDFs): users can develop and register their own functions to meet custom computing needs. There are also currently 11 categories of built-in UDF libraries, with 75 functions in total, available for users.
Figure 14 Apache IoTDB UDF function types and names
IoTDB also has custom triggers, which enable real-time computation. Time series data often comes with alarm requirements, so IoTDB supports triggers: when a data point is written to the database, the trigger's checking logic runs, and if the value reaches a certain threshold an alarm can be sent to other systems.
Figure 15 Apache IoTDB custom trigger functions
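As a rough illustration of the trigger idea described above (run a check on every write, raise an alarm past a threshold), consider the following Python sketch. `TinyStore`, `register_trigger`, and `make_threshold_alarm` are hypothetical names invented here; they do not correspond to IoTDB's trigger API.

```python
# Hypothetical sketch of a write-time trigger; names are illustrative only
# and do not correspond to IoTDB's trigger API.

class TinyStore:
    """An in-memory store that runs registered callbacks on every insert."""

    def __init__(self):
        self.points = []      # (series, timestamp, value)
        self.triggers = []    # callables invoked after each insert

    def register_trigger(self, callback):
        self.triggers.append(callback)

    def insert(self, series, timestamp, value):
        self.points.append((series, timestamp, value))
        for callback in self.triggers:
            callback(series, timestamp, value)


def make_threshold_alarm(series_name, threshold, sink):
    """Build a trigger that appends an alarm to `sink` when a value exceeds `threshold`."""
    def check(series, timestamp, value):
        if series == series_name and value > threshold:
            sink.append((series, timestamp, value))
    return check


alarms = []   # stands in for "other systems" that receive the alarm
store = TinyStore()
store.register_trigger(make_threshold_alarm("engine.temperature", 90.0, alarms))

store.insert("engine.temperature", 1, 85.0)
store.insert("engine.temperature", 2, 93.5)   # exceeds the threshold
store.insert("engine.speed", 2, 120.0)        # different series, ignored
```

The point of the design is that the check runs at write time, so the alarm does not depend on anyone polling the data later.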
The function shown in the following figure is the materialized view. Suppose we want the average of the speeds of device A and device B. We can query it with the average function and then write the result back to the database, so that the next time it is needed the result can be read directly without recomputing it. This is the scenario that the materialized view function, select into, is designed for.
Figure 16 Apache IoTDB materialized view function
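The "compute once, write back, read later" pattern behind select into can be sketched as follows. This is a plain Python illustration under stated assumptions: the dict-based store, the series names, and the interpretation of "average" as a per-timestamp average across the two devices are all choices made here, not IoTDB's API or semantics.

```python
# Hypothetical sketch of "compute once, write back, read later".
# The series names and the dict-based store are assumptions, not IoTDB's API.

store = {
    "deviceA.speed": {1: 10.0, 2: 20.0, 3: 30.0},
    "deviceB.speed": {1: 30.0, 2: 40.0, 3: 50.0},
}


def select_avg_into(store, sources, target):
    """Average the source series at each shared timestamp and store the result as `target`."""
    shared_times = set.intersection(*(set(store[s]) for s in sources))
    store[target] = {
        t: sum(store[s][t] for s in sources) / len(sources)
        for t in sorted(shared_times)
    }


select_avg_into(store, ["deviceA.speed", "deviceB.speed"], "derived.avg_speed")

# Later reads hit the stored result directly instead of recomputing it.
avg_speed = store["derived.avg_speed"]
```

The trade-off is the usual one for materialized results: extra storage and a write step in exchange for cheaper repeated reads.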
The last function is the continuous query, which is widely used in time series data management. Data is usually collected at high frequency so that no point is missed, but analysis often needs various kinds of downsampling or segmented aggregation. If the data can be aggregated by segment in advance and the results saved, subsequent analysis becomes much faster. A continuous query is therefore a user-defined background job that periodically computes over the data of a recent time range.
Figure 16 Apache IoTDB continuous query function
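A continuous query can be thought of as a scheduled job that aggregates each time window once and remembers where it stopped. The sketch below illustrates that idea in Python; the class name `ContinuousAverage`, the fixed window length, and the use of an average as the aggregate are assumptions for illustration and are not IoTDB's continuous query syntax.

```python
# Hypothetical sketch of a continuous query: each run aggregates only the
# time windows completed since the previous run. Not IoTDB's CQ syntax.

class ContinuousAverage:
    def __init__(self, window):
        self.window = window          # window length, same unit as timestamps
        self.next_start = 0           # start of the first window not yet processed
        self.results = {}             # window start -> average value

    def run(self, raw_points, now):
        """Process every full window that ended at or before `now`."""
        while self.next_start + self.window <= now:
            start, end = self.next_start, self.next_start + self.window
            values = [v for t, v in raw_points if start <= t < end]
            if values:
                self.results[start] = sum(values) / len(values)
            self.next_start = end


raw = [(0, 1.0), (20, 3.0), (59, 5.0), (60, 10.0), (90, 20.0), (130, 7.0)]

cq = ContinuousAverage(window=60)
cq.run(raw, now=60)    # first scheduled run: window [0, 60) is complete
first_run = dict(cq.results)
cq.run(raw, now=125)   # second run: window [60, 120) is complete, [120, 180) is not
```

Because `next_start` advances after every processed window, a later run never recomputes a window that has already been saved, which is where the speed-up for subsequent analysis comes from.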
Open source community building
1. About the Apache Software Foundation
The Apache Software Foundation was founded in 1999 and has a history of 22 years. It hosts 351 projects with about 220 million lines of code in total, valued at about 22 billion US dollars, and has 8,200 committers.
Figure 17 Apache Software Foundation development
2. Open source community building of IoTDB
In the 2021 ranking of the Apache Software Foundation's 351 projects worldwide, IoTDB ranked seventh, ahead of Hadoop and HBase and just behind Spark. IoTDB's code contributors are spread across China, the United States, Germany, the United Kingdom, Australia, and other countries, making it the only time series database open source community in China with an international character.
Join the community
Developers are the beneficiaries of open source and should also be its contributors. This is another reason IoTDB chose open source.
The channels for participating in the community are shown below. You are welcome to join in building the project.
Figure 18 Join the IoTDB organization
That is all I have to share today. Thank you!
For more, see the video replay and the conference materials:
- Video replay: https://www.modb.pro/video/6499
- Conference materials: https://www.modb.pro/doc/64961
- Original article: https://www.modb.pro/db/421250
- Articles and video replays from the 【Domestic database salon】 special session on open source ecosystems: https://www.modb.pro/topic/412121