
ClickHouse 2-shard 2-replica high-availability cluster setup and chproxy proxy configuration

2022-06-09 06:40:00 shy_snow

A ClickHouse cluster is defined at the table level; at the node level each ClickHouse instance is independent, and it remains independent even after the cluster is formed, so a client always connects to a single node. Like Redis, it is a bit like building blocks: nodes can be assembled freely according to the configuration files. As long as ClickHouse is installed on every node, you only need to configure config.xml and /etc/metrika.xml with the connection information of the other nodes and the shard/replica layout.

1. Install single-node ClickHouse on every machine

See the ClickHouse single-node deployment guide: https://blog.csdn.net/shy_snow/article/details/123477519
After installation, config.xml needs to be modified as described in section 3 below.
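For reference, a minimal install-and-start sketch on a yum-based system, using the official ClickHouse repository (see the linked post for the full procedure):

sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://packages.clickhouse.com/rpm/clickhouse.repo
sudo yum install -y clickhouse-server clickhouse-client
# start the server and enable it on boot
sudo systemctl enable --now clickhouse-server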
2. Install ZooKeeper
See the ZooKeeper cluster installation guide: https://blog.csdn.net/shy_snow/article/details/124547559
The ZooKeeper connection information must be added to ClickHouse's config.xml (see below).
3. /etc/metrika.xml cluster configuration
Since I have one ClickHouse instance on each of the four machines 106-109, I chose to build a cluster with 2 shards and 2 replicas.
Replica: a ClickHouse replicated table synchronizes its data to all replicas. With 2 replicas, if a single node fails the copy on the other node is still there, so the data is safe and nothing is lost.
Shard: sharding splits the records of a table across the shards.
The single-node config.xml can use the incl attribute to include configuration from /etc/metrika.xml.
vi /etc/clickhouse-server/config.xml
Add the following inside the root <clickhouse> (or <yandex>, depending on the version) tag:

<include_from>/etc/metrika.xml</include_from>

<!-- ZooKeeper, used for replicated table synchronization -->
     <zookeeper>
        <node>
            <host>zk1</host>
            <port>2181</port>
        </node>
        <node>
            <host>zk2</host>
            <port>2181</port>
        </node>
        <node>
            <host>zk3</host>
            <port>2181</port>
        </node>    
    </zookeeper>
<remote_servers incl="clickhouse_remote_servers" />
<macros incl="macros" optional="true" />

vi /etc/metrika.xml

<yandex>
	<clickhouse_remote_servers> 
	<!-- The tag name cluster_2shards_2replicas is the cluster name and can be customized; query it with: select * from system.clusters; With 2 shards x 2 replicas, if one node of a replica pair fails, the other node still holds a complete copy of the data. -->
		<cluster_2shards_2replicas>
			<shard> <!-- shard1 -->
				<internal_replication>true</internal_replication>
				<replica>
					<host>192.168.129.106</host>
					<port>9000</port>
          <user>default</user>
          <password></password>
				</replica>
				<replica>
					<host>192.168.129.108</host>
					<port>9000</port>
          <user>default</user>
          <password></password>
				</replica>
			</shard>
			<!-- shard2 -->
			<shard>
				<internal_replication>true</internal_replication>				
				<replica>
					<host>192.168.129.107</host>
					<port>9000</port>
          <user>default</user>
          <password></password>
				</replica>
				<replica>
					<host>192.168.129.109</host>
					<port>9000</port>
          <user>default</user>
          <password></password>
				</replica>
			</shard>
		</cluster_2shards_2replicas>
	</clickhouse_remote_servers> 
	    
  <!-- macros must be configured differently on each node; the values are substituted for {shard}/{replica} when creating replicated tables -->
  <!-- Query the macros to confirm the configuration took effect: SELECT * FROM system.macros; -->
	<macros>
		<shard>02</shard>
		<replica>host107</replica>
	</macros>
	<networks>
		<ip>::/0</ip>
	</networks> 
	<!-- Compression settings -->
	<clickhouse_compression>
		<case>
			<min_part_size>10000000000</min_part_size>
			<min_part_size_ratio>0.01</min_part_size_ratio>
			<method>lz4</method>
		</case>
	</clickhouse_compression>
</yandex>

After copying /etc/metrika.xml to every node, edit the macros section on each node: shard must match the shard the node belongs to, and replica must be unique on every node; the node's host name is a convenient choice.
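For reference, based on the cluster definition above (shard 1 = 106/108, shard 2 = 107/109), the macros on the four nodes would look something like this; the names host106/host108/host109 simply follow the same convention as host107:

<!-- on 192.168.129.106 -->  <macros><shard>01</shard><replica>host106</replica></macros>
<!-- on 192.168.129.108 -->  <macros><shard>01</shard><replica>host108</replica></macros>
<!-- on 192.168.129.107 -->  <macros><shard>02</shard><replica>host107</replica></macros>
<!-- on 192.168.129.109 -->  <macros><shard>02</shard><replica>host109</replica></macros>

After changing metrika.xml, restart clickhouse-server on each node (systemctl restart clickhouse-server) so the new settings are picked up.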

4. Distributed replicated table creation test

-- Query the cluster name, e.g. cluster_2shards_2replicas
select * from system.clusters;
SELECT * FROM system.macros;

drop database if exists test on cluster cluster_2shards_2replicas ;
-- Local table (with on cluster the table is created on every node of the cluster, but an insert only lands on the current node and its replicas)
drop table if exists test.cmtest on cluster cluster_2shards_2replicas;
-- Replicated table
CREATE TABLE test.cmtest on cluster cluster_2shards_2replicas (
`id` String COMMENT 'id', `nginxTime` DateTime COMMENT 'nginxTime' 
) ENGINE = ReplicatedMergeTree() partition by toYYYYMMDD(nginxTime) primary key (id) ORDER BY (id);
-- Alternatively, pass the ZooKeeper path and replica name explicitly via the macros:
-- ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/st_order_mt','{replica}')
drop table if exists test.cmtest_dist on cluster cluster_2shards_2replicas;
-- Distributed table: similar to a view, it stores no data itself; a query merges the data of every local table, and an insert is routed to a node according to the sharding policy
create TABLE test.cmtest_dist on cluster cluster_2shards_2replicas as test.cmtest
ENGINE = Distributed("cluster_2shards_2replicas", "test", "cmtest", rand());
-- Inserting into the local table only writes to the current node
insert into test.cmtest  values ('109',now());
-- Inserting into the distributed table routes the row to a shard according to the sharding key
insert into test.cmtest  values ('1004000',now()+3600*24);
-- The local table only returns the data stored on the current node
select * from test.cmtest;
-- The distributed table queries the data on all nodes
select * from test.cmtest_dist;
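-- Optional check: hostName() returns the node serving each part of a distributed query,
-- so this shows how the rows are spread across the cluster
select hostName() as host, count() from test.cmtest_dist group by host;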
-- Dropping the distributed table does not delete any data; after recreating it the full data set can still be queried
drop table if exists test.cmtest_dist on cluster cluster_2shards_2replicas;
-- Dropping the local table does delete the data
drop table if exists test.cmtest on cluster cluster_2shards_2replicas;
drop database if exists test on cluster cluster_2shards_2replicas ;

5. chproxy as a unified proxy

Download chproxy:
https://github.com/ContentSquare/chproxy/releases

chproxy is a proxy tool recommended for ClickHouse; only config.yml needs to be configured. Official site: https://www.chproxy.org/cn
An example config.yml:

vi config.yml
server:
  http:
      listen_addr: ":9090"
      allowed_networks: ["192.168.0.0/16","2.0.0.0/16"]

users:
  - name: "clickhouse"
    password: "123456"
    to_cluster: "cluster_2shards_2replicas"
    to_user: "default"

clusters:
  - name: "cluster_2shards_2replicas"
    replicas:
    - name: "replica1"
      nodes: ["192.168.129.106:8123","192.168.129.107:8123"]
    - name: "replica2"
      nodes: ["192.168.129.108:8123","192.168.129.109:8123"]

    users:
      - name: "default"
        password: ""
        max_concurrent_queries: 8
        max_execution_time: 2m

If your cluster has no replicas you can omit the replicas section and list the nodes directly; since my cluster does have replicas, I configured the replicas section as above (see the chproxy documentation for the replicas/nodes layout).
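For comparison, a minimal sketch of a cluster section without replicas, listing the nodes directly (the cluster name here is arbitrary; the addresses are my four nodes, adjust as needed):

clusters:
  - name: "cluster_no_replicas"
    nodes: ["192.168.129.106:8123","192.168.129.107:8123","192.168.129.108:8123","192.168.129.109:8123"]
    users:
      - name: "default"
        password: ""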

Unpack chproxy, start/stop it, and hot-reload config.yml:

tar -zxvf chproxy_1.15.1_linux_amd64.tar.gz
chmod +x chproxy
# start
./chproxy -config=/usr/local/chproxy/config.yml
# Hot-reload config.yml without stopping chproxy: send the chproxy process a SIGHUP signal
kill -SIGHUP <pid>
ps -ef | grep chproxy | grep -v grep | head -1 | awk '{print $2}' | xargs kill -SIGHUP
# Stop: simply kill the chproxy process
ps -ef | grep chproxy | grep -v grep | head -1 | awk '{print $2}' | xargs kill
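Once chproxy is running, queries can be sent through it over HTTP. A quick check, assuming chproxy listens on 192.168.129.106:9090 and using the clickhouse/123456 user from config.yml above:

# The query is forwarded to one of the ClickHouse nodes behind chproxy
echo "select * from system.clusters" | curl "http://clickhouse:123456@192.168.129.106:9090/" --data-binary @-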

killChproxy.sh

#!/bin/bash
pid=`ps -ef | grep chproxy | grep -v grep | head -1 | awk '{print $2}'`
if [ -z "$pid" ]; then
        echo "No chproxy pid found."
        exit 1
fi
echo "kill ${pid}"
kill $pid

reloadConfig.sh

#!/bin/bash
ps -ef | grep chproxy | grep -v grep | head -1 | awk '{print $2}' | xargs kill -SIGHUP
6. References
    https://blog.csdn.net/weixin_37692493/article/details/114452689
    chproxy official site: https://www.chproxy.org/cn
    https://blog.csdn.net/Jason_light/article/details/120888679
    https://blog.csdn.net/iceyung/article/details/107524922