A white paper uncovers the technology behind ByteDance's 10,000-node ClickHouse deployment
2022-07-07 02:32:00 【Taro source code】
ClickHouse was open-sourced in 2016. Thanks to its outstanding performance, it has risen quickly in the analytical database field, and many leading companies in China and abroad now use it heavily.
In terms of performance, ClickHouse outperforms comparable products several times over in OLAP scenarios. It lets a system generate reports from PB-scale raw data with sub-second latency, with server throughput reaching hundreds of millions of rows per second.
Bringing ClickHouse into an enterprise production environment is still not trouble-free, however. Not every team needs to hit the pitfalls of real-world adoption on its own, and not every team can afford that cost. What teams need is to learn from enough existing experience and then choose the more practical route, whether that is building in-house or buying a solution.
On this front, ByteDance is undoubtedly a highly representative Chinese company. It began using ClickHouse at large scale in 2017 and, as a heavy user, operates the largest ClickHouse deployment in China.
Today, ByteDance runs more than 18,000 ClickHouse nodes internally, managing more than 700 PB of data in total; the largest single cluster has roughly 2,400 nodes.
After five years of customizing ClickHouse, ByteDance has consolidated that work into ByteHouse, which is now offered as an official service through Volcano Engine.
Going from adopting and modifying an open-source product to launching a commercial version for external customers is a very difficult road, and that difficulty is what makes the resulting practical thinking and experience valuable.
Recently, Volcano Engine ByteHouse and InfoQ jointly published the white paper "From ClickHouse to ByteHouse", which describes in depth the technical implementation behind ByteDance's 10,000-node ClickHouse deployment. The white paper is divided into roughly four chapters:
- Introduction to ClickHouse;
- Typical ClickHouse scenarios;
- ByteHouse's thinking on optimizing ClickHouse;
- ByteHouse design and evolution.
Starting from chapter three, "From ClickHouse to ByteHouse" focuses on ByteHouse's optimization ideas.
ByteHouse has made many upgrades and optimizations to ClickHouse. The white paper picks three especially important areas and covers them in detail:
- Self-developed table engines;
- Query optimizer;
- Elastic scalability.
On self-developed table engines: ClickHouse provides dozens of table engines, including the MergeTree family, Memory, File, and Interface. In ByteDance's actual usage, however, some of these engines clearly fell short of business needs, so corresponding optimizations were made.
The white paper focuses on three table engines: HaMergeTree, HaUniqueMergeTree, and HaKafka.
Excerpt from the white paper: how HaMergeTree replicas coordinate
On the query optimizer: ByteHouse has invested more than a year in its optimizer to upgrade the product's capabilities across the board, and the white paper details the changes and optimizations ByteHouse made to it.
In pursuit of maximum performance, ClickHouse tightly couples compute and storage on the same nodes, so the two cannot be scaled independently according to actual demand. In addition, data is not automatically redistributed after nodes are added, which makes scaling a ClickHouse cluster operationally painful.
In improving ClickHouse, ByteHouse also focused on changing this architecture: it decouples storage from compute in order to achieve elastic, scalable operation.
Excerpt from the white paper: compute-storage separation architecture
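To make the redistribution problem concrete, here is a minimal sketch. It assumes a simple modulo placement rule, which is a simplification chosen for illustration and not ClickHouse's or ByteHouse's actual placement logic; `shard_of` and `moved_fraction` are hypothetical helper names.

```python
# Hypothetical illustration of static sharding; not ClickHouse or ByteHouse code.

def shard_of(row_id: int, num_nodes: int) -> int:
    """Assign a row to a node by taking its id modulo the node count."""
    return row_id % num_nodes


def moved_fraction(row_ids, old_nodes: int, new_nodes: int) -> float:
    """Fraction of rows whose owning node changes when the cluster is resized."""
    ids = list(row_ids)
    moved = sum(
        1 for r in ids if shard_of(r, old_nodes) != shard_of(r, new_nodes)
    )
    return moved / len(ids)


if __name__ == "__main__":
    # Growing a 4-node cluster to 5 nodes under modulo placement.
    print(moved_fraction(range(10_000), 4, 5))
```

Under these assumptions, growing from 4 to 5 nodes changes the owner of 80% of the rows. When each node stores its own data locally, rebalancing means physically copying those rows between machines. When storage is shared and separate from compute, a resize can in principle be handled by updating which compute node reads which data.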
In addition, "From ClickHouse to ByteHouse" presents practical case studies from three major industries: advertising, finance, and the industrial Internet, all of which are typical OLAP application areas. From both a technical and an enterprise-adoption perspective, it also sets out the three core concerns enterprises currently have when choosing an OLAP data engine.