The white paper is here: uncovering the technology behind ByteDance's 10,000-node ClickHouse!
2022-07-07 02:32:00 【Taro source code】
ClickHouse has been open source since 2016. Thanks to its outstanding performance, it has grown rapidly in the analytical database field, and many leading companies in China and abroad now use it heavily.
In terms of performance, ClickHouse outperforms comparable products several times over in OLAP scenarios. It lets a system generate reports from petabyte-scale raw data with sub-second latency, and server throughput reaches hundreds of millions of rows per second.
But introducing ClickHouse into an enterprise production environment still has its problems. Not every team needs to hit the pitfalls of real-world adoption for itself, and not every team can afford that cost. What teams need is to learn enough from others' experience and choose a more practical path, whether building in-house or purchasing.
Here, ByteDance is undoubtedly a highly representative domestic company: it began using ClickHouse at scale in 2017, and as a heavy user it runs the largest ClickHouse deployment in China.
Today, ByteDance's internal ClickHouse deployment exceeds 18,000 nodes in total and manages more than 700 PB of data, and its largest single cluster has more than about 2,400 nodes.
After five years of customizing ClickHouse, ByteDance has distilled that work into ByteHouse, which it now officially offers as a service through Volcano Engine.
Going from adopting and modifying an open-source product to launching a commercial version for external customers is a very hard road, which also makes the practical thinking and experience gained along the way more valuable.
Recently, Volcano Engine ByteHouse and InfoQ jointly published the white paper "From ClickHouse to ByteHouse", an in-depth introduction to the technical implementation behind ByteDance's 10,000-node ClickHouse deployment. The white paper is divided into roughly four chapters:
- An introduction to ClickHouse;
- Typical ClickHouse scenarios;
- ByteHouse's thinking on optimizing ClickHouse;
- ByteHouse's design and evolution.
From chapter three onward, "From ClickHouse to ByteHouse" focuses on ByteHouse's optimization ideas.
ByteHouse has made many upgrades and optimizations to ClickHouse. The white paper picks three especially important areas and covers them in detail:
- Self-developed table engines;
- Query optimizer;
- Elastic scalability.
On table engines: although ClickHouse provides dozens of table engines, including the MergeTree family, Memory, File, and Interface engines, ByteDance found in practice that some of them did not meet its business needs, so it made corresponding optimizations.
The white paper focuses on three table engines: HaMergeTree, HaUniqueMergeTree, and HaKafka.
Excerpt from the white paper: HaMergeTree replica collaboration principle
On the query optimizer: ByteHouse has worked on its optimizer for more than a year, comprehensively upgrading the product's capabilities. The white paper details the changes and optimizations ByteHouse made to the query optimizer.
In pursuit of peak performance, ClickHouse adopts an architecture that tightly couples compute and storage on the same nodes. Compute and storage cannot be scaled separately according to actual needs, and data cannot be automatically redistributed after nodes are added, which makes scaling ClickHouse a significant operations burden.
In improving and optimizing ClickHouse, ByteHouse also focused on changing this architecture. For example, it decouples storage from compute to achieve elastic scalability.
Excerpt from the white paper: compute-storage separation architecture
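To make the redistribution problem concrete, here is a minimal sketch of why adding a node to a tightly coupled cluster is costly. It assumes a hypothetical placement rule where each row lives on the shard given by its key modulo the shard count. This is a toy model for illustration only, not ClickHouse's or ByteHouse's actual implementation, and the function names are invented for this example.

```python
def shard_for(key: int, num_shards: int) -> int:
    """Hypothetical placement rule: a row lives on shard (key mod shard count)."""
    return key % num_shards


def fraction_moved(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard would change if the shard count changed."""
    keys = list(keys)
    moved = sum(
        1 for k in keys
        if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


if __name__ == "__main__":
    # Growing a 4-shard cluster to 5 shards under this toy rule.
    print(fraction_moved(range(10_000), 4, 5))
```

Under this toy rule, growing from 4 to 5 shards would relocate 80% of the keys. When data sits on the local disks of compute nodes, that movement has to be carried out by operators; with storage decoupled from compute, compute nodes can be added without physically moving the stored data.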
In addition, "From ClickHouse to ByteHouse" presents practice cases from three major industries: advertising, finance, and the industrial internet, all typical OLAP application areas. From the perspectives of technology and enterprise adoption, it also lays out the three core concerns enterprises currently have when choosing an OLAP data engine.