A white paper reveals the technology behind ByteDance's 10,000-node ClickHouse deployment
2022-07-07 02:32:00 【Taro source code】
ClickHouse has been open source since 2016. Thanks to its outstanding performance, it has risen quickly in the analytical database field, and many leading companies in China and abroad now use it heavily.
In OLAP scenarios, ClickHouse outperforms comparable products several times over. It can generate reports from PB-scale raw data with sub-second latency, and server throughput reaches hundreds of millions of rows per second.
Introducing ClickHouse into an enterprise production environment still poses problems, however. Not every team needs to fall into the pitfalls of adoption on its own, and not every team can afford that cost. What teams need is to learn enough from others' experience and then choose the more practical route, whether that is building in-house or purchasing.
In this respect, ByteDance is undoubtedly a representative domestic company. It began using ClickHouse at large scale in 2017 and, as a heavy user, runs the largest ClickHouse deployment in China.
Inside ByteDance, the total number of ClickHouse nodes currently exceeds 18,000, the total amount of data managed exceeds 700 PB, and the largest single cluster has more than 2,400 nodes.
ByteDance has spent five years customizing ClickHouse and has distilled that work into ByteHouse, which is now officially offered as a service through Volcano Engine.
Going from adopting and modifying an open source product to launching a commercial version for external customers is a very difficult road, which also makes the practical thinking and experience gained along the way more valuable.
Recently, Volcano Engine ByteHouse and InfoQ jointly published the white paper 《From ClickHouse to ByteHouse》, which introduces in depth the technical implementation behind ByteDance's 10,000-node ClickHouse deployment. The white paper is divided into roughly four chapters:
- Introduction to ClickHouse;
- Typical ClickHouse scenarios;
- ByteHouse's thinking on optimizing ClickHouse;
- ByteHouse design and evolution.
Starting from chapter three, 《From ClickHouse to ByteHouse》 focuses on ByteHouse's optimization ideas.
ByteHouse has made many upgrades and optimizations to ClickHouse. The white paper picks three especially important ones and covers them in detail:
- Self-developed table engines;
- Query optimizer;
- Elastic scalability.
On self-developed table engines: although ClickHouse provides dozens of table engines, including the MergeTree family, Memory, File, and Interface, ByteDance found in actual use that some of them were not enough to meet business needs, so it made corresponding optimizations.
The white paper focuses on three table engines: HaMergeTree, HaUniqueMergeTree, and HaKafka.
Excerpt from the white paper: HaMergeTree replica coordination principle
On the query optimizer: ByteHouse spent more than a year working on the optimizer and comprehensively upgraded the product's capabilities. The white paper details the changes and optimizations ByteHouse made to the query optimizer.
In pursuit of ultimate performance, ClickHouse adopts an architecture in which compute and storage are tightly coupled on the same nodes, so neither can be scaled independently according to actual need. In addition, data is not automatically redistributed after nodes are added, which makes scaling a ClickHouse cluster troublesome to operate and maintain.
In improving and optimizing ClickHouse, ByteHouse also focused on changing this architecture, for example by decoupling storage from compute to achieve elastic scaling.
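The redistribution problem described above can be illustrated with a small sketch. It assumes plain modulo sharding over an integer key, which is a simplification and not a description of ByteHouse's or ClickHouse's internals; the function names `shard_for` and `moved_fraction` are hypothetical and exist only for this example. It counts how many keys would map to a different shard once a node is added, which is the amount of data that would have to be moved to keep placement consistent.

```python
def shard_for(key: int, num_shards: int) -> int:
    """Hypothetical placement rule: pick a shard by simple modulo."""
    return key % num_shards


def moved_fraction(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    keys = list(keys)
    moved = sum(
        1 for k in keys
        if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


if __name__ == "__main__":
    # Grow an illustrative cluster from 4 shards to 5 shards.
    fraction = moved_fraction(range(100_000), old_shards=4, new_shards=5)
    print(f"keys that would change shard: {fraction:.0%}")
```

Under this assumption, most keys change shard after a single node is added, which is why scaling a tightly coupled cluster without automatic rebalancing is costly, and why separating storage from compute is attractive: compute nodes can be added without relocating stored data.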
Excerpt from the white paper: compute-storage separation architecture
In addition, 《From ClickHouse to ByteHouse》 presents practical cases from three industries: advertising, finance, and the industrial internet, all of which are typical OLAP application areas. From the perspectives of technology and enterprise adoption, it also sets out three core concerns for enterprises when selecting an OLAP data engine.