A white paper uncovers the technology behind ByteDance's 10,000-node ClickHouse deployment
2022-07-07 02:32:00 【Taro source code】
ClickHouse was open-sourced in 2016. Thanks to its outstanding performance, it has grown rapidly in the analytical database field, and many leading companies in China and abroad now use it heavily.
In terms of performance, ClickHouse outperforms comparable products several times over in OLAP scenarios. It lets a system generate reports from petabyte-scale raw data with sub-second latency, and server throughput can reach hundreds of millions of rows per second.
But bringing ClickHouse into an enterprise production environment still poses problems. Not every team needs to discover the pitfalls of real-world deployment firsthand, and not every team can afford the cost of doing so. What teams need is to learn from enough existing experience and then choose the most practical route, whether that is building in-house or purchasing a solution.
In this respect, ByteDance is a highly representative Chinese company: it began using ClickHouse at scale in 2017 and, as a heavy user, runs the largest ClickHouse cluster in China.
Currently, ByteDance's internal ClickHouse deployment exceeds 18,000 nodes in total and manages more than 700 PB of data; the largest single cluster has more than 2,400 nodes.
After five years of customizing ClickHouse, ByteDance has distilled that work into ByteHouse, which is now officially offered as a service through Volcano Engine.
Going from adopting and modifying an open-source product to launching a commercial version for external customers is a difficult road, which makes the practical thinking and experience gained along the way all the more valuable.
Recently, Volcano Engine ByteHouse and InfoQ jointly published the white paper "From ClickHouse to ByteHouse", which introduces in depth the technical implementation behind ByteDance's 10,000-node ClickHouse deployment. The white paper is divided into roughly four chapters:
- Introduction to ClickHouse;
- Typical ClickHouse scenarios;
- ByteHouse's thinking on optimizing ClickHouse;
- ByteHouse design and evolution.
Starting from chapter three, "From ClickHouse to ByteHouse" focuses on ByteHouse's optimization approach.
ByteHouse has made many upgrades and optimizations to ClickHouse. The white paper selects three especially important ones and expands on them in detail:
- Self-developed table engines;
- Query optimizer;
- Elastic scalability.
On the table engine side, although ClickHouse provides dozens of table engines, including the MergeTree family, Memory, File, and Interface, ByteDance found in practice that some of them did not meet its business needs and optimized them accordingly.
The white paper focuses on three table engines: HaMergeTree, HaUniqueMergeTree, and HaKafka.
Excerpts from the white paper :HaMergeTree Replica collaboration principle
For the query optimizer, ByteHouse spent more than a year working on its Optimizer, comprehensively upgrading the product's capabilities. The white paper details the changes and optimizations ByteHouse made to the query optimizer.
In pursuit of maximum performance, ClickHouse adopts an architecture that tightly couples compute and storage on the same nodes. As a result, compute and storage cannot each be scaled according to actual need, and data is not automatically redistributed after nodes are added; both issues make scaling ClickHouse troublesome to operate and maintain.
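The redistribution problem can be illustrated with a small Python sketch. It assumes a simplified scheme in which each row is routed to a shard by taking its key modulo the shard count; the functions `shard_for` and `fraction_relocated` are hypothetical names for this illustration, and the sketch is not a description of ClickHouse or ByteHouse internals. It only shows why rows already stored under the old layout no longer sit where the new layout expects them once a node is added.

```python
# Illustrative only: modulo routing over a fixed shard count is an assumption
# made for this sketch, not a description of ClickHouse or ByteHouse internals.

def shard_for(key: int, shard_count: int) -> int:
    """Route a row to a shard by taking its key modulo the shard count."""
    return key % shard_count


def fraction_relocated(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys whose target shard changes when the cluster grows."""
    keys = list(keys)
    moved = sum(
        1 for k in keys if shard_for(k, old_count) != shard_for(k, new_count)
    )
    return moved / len(keys)


if __name__ == "__main__":
    share = fraction_relocated(range(20_000), old_count=4, new_count=5)
    # prints "80% of keys map to a different shard after adding one node"
    print(f"{share:.0%} of keys map to a different shard after adding one node")
```

Under these assumptions, growing from four to five shards changes the target shard for 80% of the keys, so either the existing data is moved or it stays where it was and the cluster remains unbalanced.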
In improving and optimizing ClickHouse, ByteHouse also focused on changing this architecture, for example by decoupling storage from compute to achieve elastic scalability.
Excerpts from the white paper: compute-storage separation architecture
In addition, "From ClickHouse to ByteHouse" presents practical cases from three industries (advertising, finance, and the industrial internet), all of them typical OLAP application areas. From the perspectives of technology and enterprise adoption, it also sets out the three core concerns enterprises currently have when selecting an OLAP data engine.