Clickhouse 20.x distributed table testing and chproxy deployment (II)
2022-07-27 15:58:00 【51CTO】
Tags (space-separated): clickhouse series
One: ClickHouse 20.x distributed table testing
1.1: Preparing test data for the ClickHouse distributed table
Prepare the test file:
Refer to the official documentation:
https://clickhouse.com/docs/en/getting-started/example-datasets/metrica
Download the file:
curl https://datasets.clickhouse.com/hits/tsv/hits_v1.tsv.xz | unxz --threads=`nproc` > hits_v1.tsv
# Validate the checksum
md5sum hits_v1.tsv
# Checksum should be equal to: f3631b6295bf06989c1437491f7592cb
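The manual checksum comparison above can be scripted. The following is a minimal sketch, assuming `md5sum` and `awk` are available; the helper name `verify_md5` is hypothetical and not part of the original article.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if the file's MD5 matches the expected value.
verify_md5() {
  local file="$1" expected="$2" actual
  actual="$(md5sum "$file" | awk '{print $1}')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (got $actual, expected $expected)" >&2
    return 1
  fi
}

# Only run the check if the dataset has already been downloaded.
if [ -f hits_v1.tsv ]; then
  verify_md5 hits_v1.tsv f3631b6295bf06989c1437491f7592cb
fi
```

A non-zero return code makes it easy to abort a load script before importing a corrupted file.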
1.2: Create the database and the distributed table

Create the local table:
clickhouse-client -h node03 -u default --password tzck123.com
CREATE TABLE datasets.hits_v1 on cluster tzcluster3s2r02 ( WatchID UInt64, JavaEnable UInt8, Title String, GoodEvent Int16, EventTime DateTime, EventDate Date, CounterID UInt32, ClientIP UInt32, ClientIP6 FixedString(16), RegionID UInt32, UserID UInt64, CounterClass Int8, OS UInt8, UserAgent UInt8, URL String, Referer String, URLDomain String, RefererDomain String, Refresh UInt8, IsRobot UInt8, RefererCategories Array(UInt16), URLCategories Array(UInt16), URLRegions Array(UInt32), RefererRegions Array(UInt32), ResolutionWidth UInt16, ResolutionHeight UInt16, ResolutionDepth UInt8, FlashMajor UInt8, FlashMinor UInt8, FlashMinor2 String, NetMajor UInt8, NetMinor UInt8, UserAgentMajor UInt16, UserAgentMinor FixedString(2), CookieEnable UInt8, JavascriptEnable UInt8, IsMobile UInt8, MobilePhone UInt8, MobilePhoneModel String, Params String, IPNetworkID UInt32, TraficSourceID Int8, SearchEngineID UInt16, SearchPhrase String, AdvEngineID UInt8, IsArtifical UInt8, WindowClientWidth UInt16, WindowClientHeight UInt16, ClientTimeZone Int16, ClientEventTime DateTime, SilverlightVersion1 UInt8, SilverlightVersion2 UInt8, SilverlightVersion3 UInt32, SilverlightVersion4 UInt16, PageCharset String, CodeVersion UInt32, IsLink UInt8, IsDownload UInt8, IsNotBounce UInt8, FUniqID UInt64, HID UInt32, IsOldCounter UInt8, IsEvent UInt8, IsParameter UInt8, DontCountHits UInt8, WithHash UInt8, HitColor FixedString(1), UTCEventTime DateTime, Age UInt8, Sex UInt8, Income UInt8, Interests UInt16, Robotness UInt8, GeneralInterests Array(UInt16), RemoteIP UInt32, RemoteIP6 FixedString(16), WindowName Int32, OpenerName Int32, HistoryLength Int16, BrowserLanguage FixedString(2), BrowserCountry FixedString(2), SocialNetwork String, SocialAction String, HTTPError UInt16, SendTiming Int32, DNSTiming Int32, ConnectTiming Int32, ResponseStartTiming Int32, ResponseEndTiming Int32, FetchTiming Int32, RedirectTiming Int32, DOMInteractiveTiming Int32, DOMContentLoadedTiming Int32, DOMCompleteTiming 
Int32, LoadEventStartTiming Int32, LoadEventEndTiming Int32, NSToDOMContentLoadedTiming Int32, FirstPaintTiming Int32, RedirectCount Int8, SocialSourceNetworkID UInt8, SocialSourcePage String, ParamPrice Int64, ParamOrderID String, ParamCurrency FixedString(3), ParamCurrencyID UInt16, GoalsReached Array(UInt32), OpenstatServiceName String, OpenstatCampaignID String, OpenstatAdID String, OpenstatSourceID String, UTMSource String, UTMMedium String, UTMCampaign String, UTMContent String, UTMTerm String, FromTag String, HasGCLID UInt8, RefererHash UInt64, URLHash UInt64, CLID UInt32, YCLID UInt64, ShareService String, ShareURL String, ShareTitle String, ParsedParams Nested(Key1 String, Key2 String, Key3 String, Key4 String, Key5 String, ValueDouble Float64), IslandID FixedString(16), RequestNum UInt32, RequestTry UInt8) ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{layer}-{shard}/datasets/hits_v1','{replica}') PARTITION BY toYYYYMM(EventDate) ORDER BY (CounterID, EventDate, intHash32(UserID)) SAMPLE BY intHash32(UserID) SETTINGS index_granularity = 8192
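The queries later in this article read from a distributed table named hits_v1_all, but its DDL does not appear in the text. The following is a minimal sketch of how such a table could be defined over the local table. The cluster, database and table names are taken from the article; the `rand()` sharding key and the helper name `distributed_ddl` are assumptions.

```shell
#!/bin/bash
# Hypothetical helper: prints the DDL for a Distributed table over a local table.
# The "_all" suffix follows the table name used in the article; rand() as the
# sharding key is an assumption, not something the article states.
distributed_ddl() {
  local cluster="$1" db="$2" local_table="$3"
  printf 'CREATE TABLE IF NOT EXISTS %s.%s_all ON CLUSTER %s AS %s.%s ENGINE = Distributed(%s, %s, %s, rand())\n' \
    "$db" "$local_table" "$cluster" "$db" "$local_table" "$cluster" "$db" "$local_table"
}

# Print the statement for review.
distributed_ddl tzcluster3s2r02 datasets hits_v1

# To apply it against a reachable server (not executed here):
# clickhouse-client -h node03 -u default --password 'tzck123.com' \
#   --query "$(distributed_ddl tzcluster3s2r02 datasets hits_v1)"
```

The Distributed table stores no data itself; it only routes reads and writes to the local hits_v1 tables on each shard.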
The local hits_v1 tables on node01 and node05 each contain 1674680 rows.
Now check how many rows the distributed table hits_v1_all contains.
Query the distributed table from any node:
clickhouse-client -h node01 -u default -d datasets --password tzck123.com --query "select count(1) from datasets.hits_v1_all"
clickhouse-client -h node02 -u default -d datasets --password tzck123.com --query "select count(1) from datasets.hits_v1_all"
clickhouse-client -h node03 -u default -d datasets --password tzck123.com --query "select count(1) from datasets.hits_v1_all"
clickhouse-client -h node04 -u default -d datasets --password tzck123.com --query "select count(1) from datasets.hits_v1_all"
clickhouse-client -h node05 -u default -d datasets --password tzck123.com --query "select count(1) from datasets.hits_v1_all"
clickhouse-client -h node06 -u default -d datasets --password tzck123.com --query "select count(1) from datasets.hits_v1_all"
The distributed table has 5024040 rows, exactly 3 times 1674680, because the cluster has 3 shards with 2 replicas each.
Based on these tests and the characteristics of the cluster, a production ClickHouse deployment can write to the local tables and read from the distributed table.
For load balancing in front of the cluster, OpenResty can act as a TCP proxy for port 8123.
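The row count above can be checked with simple arithmetic. This sketch assumes every shard holds the same number of rows as the local tables measured on node01 and node05; the helper name `expected_distributed_rows` is hypothetical.

```shell
#!/bin/bash
# Rows visible through a Distributed table = sum of one replica's rows per shard.
# Replicas hold copies of the same data, so they do not multiply the count.
expected_distributed_rows() {
  local rows_per_shard="$1" shards="$2"
  echo $(( rows_per_shard * shards ))
}

expected_distributed_rows 1674680 3   # prints 5024040
```

If the distributed count were 6 times the local count instead, it would suggest replicas were being treated as separate shards in the cluster definition.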
Two: Using OpenResty as a load-balancing proxy for ClickHouse
OpenResty installation is omitted here; see flyfish's article: https://blog.51cto.com/flyfish225/3108573
OpenResty must be built with the --with-stream module to support TCP proxying.
A reference OpenResty configuration file:
cd /usr/local/openresty/nginx/conf
vim nginx.conf
-----
#user nobody;
worker_processes 8;
error_log /usr/local/openresty/nginx/logs/error.log;
#error_log logs/error.log notice;
#error_log logs/error.log info;
pid logs/nginx.pid;
events {
    worker_connections 1024;
}
stream {
    log_format proxy '$remote_addr [$time_local] '
                     '$protocol $status $bytes_sent $bytes_received '
                     '$session_time "$upstream_addr" '
                     '"$upstream_bytes_sent" "$upstream_bytes_received" "$upstream_connect_time"';
    access_log /usr/local/openresty/nginx/logs/tcp-access.log proxy;
    open_log_file_cache off;
    include /usr/local/openresty/nginx/conf/conf.d/*.stream;
}
http {
    include mime.types;
    default_type application/octet-stream;
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log /usr/local/openresty/nginx/logs/access.log main;
    sendfile on;
    #tcp_nopush on;
    #keepalive_timeout 0;
    keepalive_timeout 60;
    gzip on;
    server {
        listen 18080;
        server_name localhost;
        #charset koi8-r;
        #access_log logs/host.access.log main;
        location / {
            root html;
            index index.html index.htm;
        }
        #error_page 404 /404.html;
        # redirect server error pages to the static page /50x.html
        #
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root html;
        }
    }
}
-----
The important part is the stream block, which enables OpenResty's TCP proxy module.

cd /usr/local/openresty/nginx/conf/conf.d
vim ck_prod.stream
----
upstream ck {
    server 192.168.100.142:8123 weight=25 max_fails=3 fail_timeout=60s;
    server 192.168.100.143:8123 weight=25 max_fails=3 fail_timeout=60s;
    server 192.168.100.144:8123 weight=25 max_fails=3 fail_timeout=60s;
    server 192.168.100.145:8123 weight=25 max_fails=3 fail_timeout=60s;
    server 192.168.100.146:8123 weight=25 max_fails=3 fail_timeout=60s;
}
server {
    listen 18123;
    proxy_pass ck;
}
----
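Once the stream proxy is listening on port 18123, it can be exercised over ClickHouse's HTTP interface in the same way as a direct connection to port 8123. The following is a rough sketch: the helper name `ck_http_url` and the `PROXY_HOST` variable are hypothetical, since the article does not state which machine runs OpenResty.

```shell
#!/bin/bash
# Hypothetical helper: builds the ClickHouse HTTP URL for a given endpoint.
ck_http_url() {
  local host="$1" port="$2" user="$3" password="$4"
  printf 'http://%s:%s/?user=%s&password=%s\n' "$host" "$port" "$user" "$password"
}

# PROXY_HOST is a placeholder for the machine running OpenResty (assumption).
PROXY_HOST="${PROXY_HOST:-127.0.0.1}"
url="$(ck_http_url "$PROXY_HOST" 18123 default 'tzck123.com')"
echo "$url"

# Repeating the request shows which backend answered each time (not executed here):
# for i in 1 2 3 4 5; do
#   echo 'select hostName()' | curl -s "$url" --data-binary @-
# done
```

Seeing different host names across requests indicates that the upstream is spreading connections over the ClickHouse nodes.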
Three: About the chproxy proxy
3.1 Introduction to chproxy:
chproxy is a powerful HTTP proxy and load-balancing middleware for ClickHouse.
It is written in Go and offers a rich set of features.
It is configured with YAML and is well suited to handling traffic for multiple clusters.
GitHub:
https://github.com/Vertamedia/chproxy
Official website:
https://www.chproxy.org/cn


3.2 Deploying chproxy:

Configure the chproxy proxy:
mkdir /etc/chproxy/
cd /etc/chproxy/
vim chproxy.yml
-----------------
server:
  http:
    listen_addr: ":19000"
    allowed_networks: ["192.168.100.0/24", "192.168.120.0/24"]
users:
  - name: "distributed-write"
    to_cluster: "distributed-write"
    to_user: "default"
  - name: "replica-write"
    to_cluster: "replica-write"
    to_user: "default"
  - name: "distributed-read"
    to_cluster: "distributed-read"
    to_user: "default"
    max_concurrent_queries: 6
    max_execution_time: 1m
clusters:
  - name: "replica-write"
    replicas:
      - name: "replica"
        nodes: ["node01:8123", "node02:8123", "node03:8123", "node04:8123", "node05:8123", "node06:8123"]
    users:
      - name: "default"
        password: "tzck123.com"
  - name: "distributed-write"
    nodes: [
      "node01:8123",
      "node02:8123",
      "node03:8123",
      "node04:8123",
      "node05:8123",
      "node06:8123"
    ]
    users:
      - name: "default"
        password: "tzck123.com"
  - name: "distributed-read"
    nodes: [
      "node01:8123",
      "node02:8123",
      "node03:8123",
      "node04:8123",
      "node05:8123",
      "node06:8123"
    ]
    users:
      - name: "default"
        password: "tzck123.com"
caches:
  - name: "shortterm"
    dir: "/etc/chproxy/cache/shortterm"
    max_size: 150Mb
    expire: 130s
-----------------
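Startup script:
vim chproxy.sh
-------
#!/bin/bash
cd /etc/chproxy
# Stop any running chproxy instance, matching the exact process name so that
# neither grep nor this script (chproxy.sh) is killed by mistake
pkill -9 -x chproxy
# Start chproxy with the config file created above (chproxy.yml)
nohup /usr/bin/chproxy -config=/etc/chproxy/chproxy.yml >> ./chproxy.out 2>&1 &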
-----
chmod +x chproxy.sh
./chproxy.sh

Test chproxy
Query directly on a ClickHouse node:
echo 'select * from system.clusters' | curl 'http://localhost:8123/?user=default&password=tzck123.com' --data-binary @-
Query through the chproxy proxy:
echo 'select * from system.clusters' | curl 'http://192.168.100.120:19000/?user=distributed-read&password=' --data-binary @-
echo 'select * from system.clusters' | curl 'http://192.168.100.120:19000/?user=distributed-write&password=' --data-binary @-







