Clickhouse2 fragment 2 replica high availability cluster setup and chproxy proxy configuration
2022-06-09 06:40:00 【shy_snow】
ClickHouse clusters are defined at the table level. At the node level, every ClickHouse node is independent, and it stays independent even after joining a cluster, so a client always connects to a single node. Much like building blocks (or a Redis cluster), nodes can be assembled into arbitrary topologies through configuration: install standalone ClickHouse on every node, then edit config.xml and /etc/metrika.xml to describe the connection information of the other nodes and the shard/replica layout.
Reference: ClickHouse single-node deployment: https://blog.csdn.net/shy_snow/article/details/123477519
## 1. Modify config.xml
## 2. Install ZooKeeper
Reference: ZooKeeper cluster installation and deployment: https://blog.csdn.net/shy_snow/article/details/124547559
The ZooKeeper connection information then needs to be added to ClickHouse's config.xml.
## 3. /etc/metrika.xml cluster configuration
Since I have one ClickHouse instance on each of the four machines 106-109, I build a cluster with 2 shards and 2 replicas: shard 1 on 192.168.129.106 and 192.168.129.108, shard 2 on 192.168.129.107 and 192.168.129.109; the two nodes within a shard are replicas of each other.
Replica: a ClickHouse replicated table synchronizes its data to all replicas. With 2 replicas, if one node fails, the other node still holds a complete copy, so no data is lost.
Shard: sharding splits the rows of a table across multiple shards, each holding part of the data; queries against the cluster merge the results from all shards.
In the standalone config.xml, the incl attribute can be used to reference configuration defined in /etc/metrika.xml.

vi /etc/clickhouse-server/config.xml

Add the following inside the root tag (<clickhouse> in current releases, <yandex> in older ones):

```xml
<include_from>/etc/metrika.xml</include_from>

<!-- ZooKeeper is used to synchronize replicated tables -->
<zookeeper>
    <node>
        <host>zk1</host>
        <port>2181</port>
    </node>
    <node>
        <host>zk2</host>
        <port>2181</port>
    </node>
    <node>
        <host>zk3</host>
        <port>2181</port>
    </node>
</zookeeper>

<remote_servers incl="clickhouse_remote_servers" />
<macros incl="macros" optional="true" />
```
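Once the service has been restarted with the new configuration (see the restart note after metrika.xml below), a quick way to confirm that a node can reach ZooKeeper is to query the system.zookeeper table. This is a minimal check of my own, not from the original post; it errors out if the ZooKeeper section is missing or the ensemble is unreachable:

```bash
# List the ZooKeeper root through ClickHouse; fails if ZooKeeper is not configured or not reachable
clickhouse-client --query "SELECT name FROM system.zookeeper WHERE path = '/'"
```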
vi /etc/metrika.xml
```xml
<yandex>
    <clickhouse_remote_servers>
        <!-- The tag name cluster_2shards_2replicas is the cluster name and can be
             customized (query cluster names with: select * from system.clusters).
             With 2 shards x 2 replicas, when one node of a shard fails, the other
             node still holds complete replica data. -->
        <cluster_2shards_2replicas>
            <!-- shard1 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.129.106</host>
                    <port>9000</port>
                    <user>default</user>
                    <password></password>
                </replica>
                <replica>
                    <host>192.168.129.108</host>
                    <port>9000</port>
                    <user>default</user>
                    <password></password>
                </replica>
            </shard>
            <!-- shard2 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.129.107</host>
                    <port>9000</port>
                    <user>default</user>
                    <password></password>
                </replica>
                <replica>
                    <host>192.168.129.109</host>
                    <port>9000</port>
                    <user>default</user>
                    <password></password>
                </replica>
            </shard>
        </cluster_2shards_2replicas>
    </clickhouse_remote_servers>

    <!-- macros must be configured differently on each node; the values are
         substituted when creating replicated tables.
         Verify with: SELECT * FROM system.macros; -->
    <macros>
        <shard>02</shard>
        <replica>host107</replica>
    </macros>

    <networks>
        <ip>::/0</ip>
    </networks>

    <!-- compression settings -->
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
        </case>
    </clickhouse_compression>
</yandex>
```
After copying metrika.xml to the other nodes, modify the replica value under macros in each node's file: it must be different on every node, and the node's host value in the cluster works well. The shard value must match the shard the node belongs to.
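For example, a consistent assignment for the four nodes, matching the shard layout above (the host-based replica names are just one convention):

- 192.168.129.106: shard=01, replica=host106
- 192.168.129.108: shard=01, replica=host108
- 192.168.129.107: shard=02, replica=host107
- 192.168.129.109: shard=02, replica=host109

After editing config.xml and /etc/metrika.xml, restart the service on every node (assuming a systemd-managed install):

```bash
# Apply the new configuration
sudo systemctl restart clickhouse-server
```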
Verify the cluster:

```sql
-- Query the cluster names, e.g. cluster_2shards_2replicas
select * from system.clusters;
-- Check that the macros were picked up
SELECT * FROM system.macros;
```
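The same checks can be run from the shell; a sketch using one of this setup's hosts (adjust host and credentials to your environment):

```bash
# Shard/replica layout of the new cluster as seen by one node
clickhouse-client -h 192.168.129.106 --query "SELECT cluster, shard_num, replica_num, host_name FROM system.clusters WHERE cluster = 'cluster_2shards_2replicas'"

# This node's macro values
clickhouse-client -h 192.168.129.106 --query "SELECT * FROM system.macros"
```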
```sql
drop database if exists test on cluster cluster_2shards_2replicas;
-- create the test database on every node of the cluster
create database test on cluster cluster_2shards_2replicas;

-- Local (replicated) table: on cluster creates it on every node of the cluster,
-- but an insert into it only lands on the current node
drop table if exists test.cmtest on cluster cluster_2shards_2replicas;
CREATE TABLE test.cmtest on cluster cluster_2shards_2replicas (
    `id` String COMMENT 'id',
    `nginxTime` DateTime COMMENT 'nginxTime'
) ENGINE = ReplicatedMergeTree()
partition by toYYYYMMDD(nginxTime) primary key (id) ORDER BY (id);
-- Equivalent explicit form using the macros from metrika.xml:
-- ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/cmtest', '{replica}')

-- Distributed table: similar to a view, it stores no data itself; queries merge
-- the data of every local table, and inserts are routed to a shard by the
-- sharding key (rand() here). With internal_replication=true, each write goes to
-- one replica per shard and ReplicatedMergeTree propagates it to the other.
drop table if exists test.cmtest_dist on cluster cluster_2shards_2replicas;
create TABLE test.cmtest_dist on cluster cluster_2shards_2replicas as test.cmtest
ENGINE = Distributed('cluster_2shards_2replicas', 'test', 'cmtest', rand());
```
```sql
-- Inserting into the local table stores the rows on the current node only
insert into test.cmtest values ('109', now());
-- Inserting into the distributed table routes the rows to a shard by the sharding key
insert into test.cmtest_dist values ('1004000', now() + 3600*24);
-- The local table can only query the data on the current node
select * from test.cmtest;
-- The distributed table queries the data on all nodes
select * from test.cmtest_dist;
```
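To make the difference visible, compare row counts on one node; illustrative commands using this setup's hosts:

```bash
# Local table: only the rows stored on this node's shard
clickhouse-client -h 192.168.129.106 --query "SELECT count() FROM test.cmtest"
# Distributed table: merged rows from every shard
clickhouse-client -h 192.168.129.106 --query "SELECT count() FROM test.cmtest_dist"
```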
```sql
-- Dropping the distributed table does not delete any data; recreate it and the
-- full data set is queryable again
drop table if exists test.cmtest_dist on cluster cluster_2shards_2replicas;
-- Dropping a local table does delete its data
drop table if exists test.cmtest on cluster cluster_2shards_2replicas;
drop database if exists test on cluster cluster_2shards_2replicas;
```
Download chproxy:
https://github.com/ContentSquare/chproxy/releases
chproxy is a widely recommended proxy for ClickHouse; only config.yml needs to be configured. Official site: https://www.chproxy.org/cn
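For example, fetching the release used later in this post (the URL pattern is my assumption; take the exact link from the releases page above):

```bash
# Download a chproxy release (1.15.1 is the version unpacked below)
wget https://github.com/ContentSquare/chproxy/releases/download/v1.15.1/chproxy_1.15.1_linux_amd64.tar.gz
```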
An example config.yml:

vi config.yml

```yaml
server:
  http:
    listen_addr: ":9090"
    allowed_networks: ["192.168.0.0/16","2.0.0.0/16"]

users:
  - name: "clickhouse"
    password: "123456"
    to_cluster: "cluster_2shards_2replicas"
    to_user: "default"

clusters:
  - name: "cluster_2shards_2replicas"
    replicas:
      - name: "replica1"
        nodes: ["192.168.129.106:8123","192.168.129.107:8123"]
      - name: "replica2"
        nodes: ["192.168.129.108:8123","192.168.129.109:8123"]
    users:
      - name: "default"
        password: ""
        max_concurrent_queries: 8
        max_execution_time: 2m
```
If the cluster has no replicas, the replicas section can be omitted and the nodes listed directly. Since this cluster has two replicas, two replicas entries are configured, each containing one node from every shard (106+107 form one complete copy of the data, 108+109 the other).
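With chproxy started (next section), a minimal smoke test sends a query through the proxy over HTTP, authenticating as the proxy user defined above (run this on the chproxy host; port and credentials are from this config):

```bash
# Query ClickHouse through chproxy on port 9090 as the "clickhouse" proxy user
echo 'SELECT 1' | curl -s 'http://clickhouse:123456@localhost:9090/' --data-binary @-
```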

Unpacking, starting/stopping, and hot-reloading chproxy's config.yml:
```bash
tar -zxvf chproxy_1.15.1_linux_amd64.tar.gz
chmod +x chproxy
# start
./chproxy -config=/usr/local/chproxy/config.yml
```
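To keep chproxy running after the shell exits, one option (my addition, not from the original; a systemd unit would also work) is to start it in the background with nohup:

```bash
# Run chproxy in the background and capture its log
nohup ./chproxy -config=/usr/local/chproxy/config.yml > /usr/local/chproxy/chproxy.log 2>&1 &
```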
```bash
# Hot-reload config.yml without stopping chproxy: send the process a SIGHUP signal
kill -SIGHUP <pid>
ps -ef | grep chproxy | grep -v grep | head -1 | awk '{print $2}' | xargs kill -SIGHUP

# Stop: simply kill the chproxy process
ps -ef | grep chproxy | grep -v grep | head -1 | awk '{print $2}' | xargs kill
```
killChproxy.sh:

```bash
#!/bin/bash
# Kill the first chproxy process found
pid=`ps -ef | grep chproxy | grep -v grep | head -1 | awk '{print $2}'`
if [ -z "$pid" ]; then
  echo "No pid found."
  exit 1
fi
echo "kill ${pid}"
kill $pid
```
reloadConfig.sh:

```bash
#!/bin/bash
# Send SIGHUP to chproxy to hot-reload config.yml
ps -ef | grep chproxy | grep -v grep | head -1 | awk '{print $2}' | xargs kill -SIGHUP
```
References:
- https://blog.csdn.net/weixin_37692493/article/details/114452689
- chproxy official site: https://www.chproxy.org/cn
- https://blog.csdn.net/Jason_light/article/details/120888679
- https://blog.csdn.net/iceyung/article/details/107524922