GBase 8a MPP Cluster Capacity Expansion in Practice
2022-07-27 09:51:00 【51CTO】
About the author
Wang Hao is a core business system DBA at Xinju Network (新炬网络). He works mainly on Oracle, Greenplum, and GBase database administration and on IT operations management, has extensive hands-on experience with database performance tuning across many business scenarios, and focuses on database performance optimization and IT operations automation.
I. Background
Driven by business growth and data retention requirements, a provincial telecom operator recently planned to expand its GBase 8a cluster from the existing 3 coordinator + 21 data nodes to 3 coordinator + 61 data nodes.
The current GBase 8a cluster version is GBase8a_MPP_Cluster-NoLicense-8.6.2_build33-R12-redhat7.3-x86_64. The 40 newly added nodes are data nodes only. I had the honor of taking part and was responsible for the expansion itself and for the subsequent data redistribution.
II. Environment information
1. Hardware configuration
CPU:
- CPU count: 4*8C (4 physical CPUs, each with 8 logical CPUs)
Memory:
- MemTotal: 512GB
2. Software version
GBase 8a cluster version: GBase8a_MPP_Cluster-NoLicense-8.6.2_build33-R12-redhat7.3-x86_64
3. Expansion machine planning
To keep the IP address group of the application access interface unchanged, the 3 coordinator (management) nodes stay as they are after the expansion, and all 40 added nodes are data nodes. The planned host names are gbase25-gbase64.
III. Preparation before implementation
1. Network environment requirements
On site, the 24 servers of the original cluster and the 40 newly added servers are all on the intranet. The network is gigabit with dual-NIC bonding, and the network test results meet the expansion requirements.
2. Storage space requirements
To keep the expansion safe, every server must have enough space for the temporary data produced by redistribution. On each existing node, the opt directory has 13TB free and the root directory has 439GB free; on each new node, opt has 22TB free and root has 149GB free. This meets the expansion requirements.
The inspection found two servers (IP addresses 190 and 193) with clearly abnormal disk write speed. The host team confirmed a RAID card battery failure, and after the repair the disk read and write speeds returned to normal.
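A check like the one that exposed the slow disks could be sketched as below. This is a minimal illustration, not the procedure used on site: the function name `measure_write_mb_per_s`, the 64 MB test size, and the target directory are all assumptions, and a real check would use a much larger file and a purpose-built I/O benchmark.

```python
import os
import tempfile
import time


def measure_write_mb_per_s(directory: str, total_mb: int = 64, block_mb: int = 4) -> float:
    """Write about total_mb of data to a temp file in `directory` and return MB/s.

    Hypothetical helper for illustration only.
    """
    block = b"\0" * (block_mb * 1024 * 1024)
    blocks = max(1, total_mb // block_mb)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            # Force the data to disk so the page cache does not hide slow writes.
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return (blocks * block_mb) / elapsed


if __name__ == "__main__":
    # Assumed target: the system temp directory; on a cluster node this would
    # be a directory on the data disk.
    print(f"{measure_write_mb_per_s(tempfile.gettempdir()):.1f} MB/s")
```

Running the same measurement on every node and comparing the results makes an outlier, such as a node with a failed RAID cache battery, easy to spot.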

3. Server requirements
The operating system version must be the same on every MPP cluster node. Before the expansion, the operating system on the new nodes was reinstalled uniformly to match the existing cluster nodes (rhel7.3), which meets the expansion requirements.
IV. Expansion implementation
1. Set up mutual trust for the root and gbase users on the new nodes
2. Configure the C3 tool (used to run a command on all GBase nodes at the same time)
3. Use the C3 tool to configure the environment on the expansion nodes
4. Set the cluster to readonly, then back up the cluster information
5. Perform the expansion
6. Run basic availability checks on the cluster after the expansion
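Steps 2 and 3 rely on running the same command on every node at once. The sketch below shows that idea in plain Python. It is not the C3 tool and does not use its syntax; `new_node_hosts`, `ssh_runner`, and `run_on_all` are hypothetical names, and the ssh call assumes the mutual trust from step 1 is already in place.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def new_node_hosts(first: int = 25, last: int = 64, prefix: str = "gbase") -> List[str]:
    """Host names of the expansion nodes, gbase25 through gbase64 by default."""
    return [f"{prefix}{i}" for i in range(first, last + 1)]


def ssh_runner(host: str, command: str) -> str:
    """Run `command` on `host` over ssh (assumes passwordless trust is configured)."""
    try:
        result = subprocess.run(
            ["ssh", "-o", "BatchMode=yes", host, command],
            capture_output=True, text=True, timeout=60,
        )
    except subprocess.TimeoutExpired:
        return "ERROR: timeout"
    if result.returncode != 0:
        return f"ERROR: {result.stderr.strip()}"
    return result.stdout.strip()


def run_on_all(hosts: Iterable[str], command: str,
               runner: Callable[[str, str], str] = ssh_runner,
               max_workers: int = 16) -> Dict[str, str]:
    """Run `command` on all hosts concurrently and return {host: output}."""
    host_list = list(hosts)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outputs = pool.map(lambda h: runner(h, command), host_list)
        return dict(zip(host_list, outputs))
```

For example, `run_on_all(new_node_hosts(), "cat /etc/redhat-release")` would collect the operating system release from all 40 new nodes, which is one way to confirm the version is uniform before the expansion. The `runner` parameter can be swapped for a stub when testing without a cluster.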
V. Data redistribution
In every MPP cluster, data is spread across many data nodes. After the expansion completes, all business table data therefore has to be redistributed across all data nodes (including the new ones) to avoid data skew.
VI. Efficiency analysis
Time spent on each step of the expansion:
- Capacity expansion: 18:30 to 20:20 on the 24th, about 2 hours;
- Redistribution: 8802 tables and 231T of data in total, from 20:25 on the 24th to 10:36 on the 26th, about 38 hours. The original plan was 91 hours (calculated at the rule-of-thumb engineering rate of 35MB/s).
Note: one table was extremely skewed, with all of its data on a single node: 70 fields, 75 billion records, compression 13, and a single shard of 350GB. Redistributing this table alone took 12 hours. Excluding it, the other 8801 tables took 27 hours (20:25 on the 24th to 23:25 on the 25th), a rate of 118MB/s, much faster than expected.
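The planned and measured figures above can be reproduced with a short calculation. The sketch below rests on two assumptions that are not stated in the text: the rates are per node across the 21 original data nodes, and 1T is counted as 1024 * 1024 MB. Under those assumptions the numbers come out close to the 91 hours and 118MB/s quoted.

```python
MB_PER_TB = 1024 * 1024  # assumption: binary units
ORIGINAL_DATA_NODES = 21  # assumption: rates are per original data node


def redistribution_hours(total_tb: float, mb_per_s_per_node: float, nodes: int) -> float:
    """Hours needed to move total_tb at the given per-node rate."""
    total_mb = total_tb * MB_PER_TB
    return total_mb / (mb_per_s_per_node * nodes) / 3600


def per_node_rate(total_tb: float, hours: float, nodes: int) -> float:
    """Per-node MB/s achieved when total_tb is moved in `hours`."""
    return total_tb * MB_PER_TB / (hours * 3600) / nodes


planned = redistribution_hours(231, 35, ORIGINAL_DATA_NODES)
actual = per_node_rate(231, 27, ORIGINAL_DATA_NODES)

print(f"planned: {planned:.1f} h")    # about 91.5 h
print(f"actual: {actual:.1f} MB/s")   # about 118.7 MB/s
```

Subtracting the 350GB skewed table from the 231T before computing the rate changes the result only slightly, so it is left in here for simplicity.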
VII. Lessons learned
1. When an MPP cluster redistributes data, the execution time of business scheduling must be taken into account, because redistribution can lock business tables and disrupt scheduled jobs. Before the expansion, the data synchronization window ran from 2:00 to 15:00, so scheduling occupied a large part of the day. We identified all business tables used by the scheduled jobs before the jobs ran, raised their redistribution priority so that they were redistributed first, and lowered redistribution concurrency while the jobs were running. This let redistribution run 24 hours a day without affecting production scheduling. If the daily scheduling window is short, or there are too many tables to pick out the ones scheduling needs, it is better to redistribute outside the scheduling window.
2. Besides connecting the new nodes to the cluster network, also check network connectivity between the new nodes and the data loaders, and between the new nodes and the hadoop cluster (if data is loaded through hdp).
3. It is best to check table skew before the expansion and to adjust the distribution key of heavily skewed tables. This avoids a repeat of what happened in this expansion, where one extremely skewed table (all data on a single node, 70 fields, 75 billion records, compression 13, a single shard of 350GB) took 12 hours to redistribute by itself.
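The skew check recommended in point 3 can be sketched as a simple ratio over per-node row counts. How those counts are collected from the cluster is not covered here; the input dictionaries, the function names `skew_ratio` and `skewed_tables`, and the threshold of 2.0 are all illustrative assumptions.

```python
from typing import Dict, List, Tuple


def skew_ratio(rows_per_node: Dict[str, int]) -> float:
    """Largest node's row count divided by the mean across nodes.

    1.0 means perfectly even; a value equal to the number of nodes
    means every row sits on a single node.
    """
    counts = list(rows_per_node.values())
    total = sum(counts)
    if not counts or total == 0:
        return 1.0
    return max(counts) / (total / len(counts))


def skewed_tables(tables: Dict[str, Dict[str, int]],
                  threshold: float = 2.0) -> List[Tuple[str, float]]:
    """Return (table, ratio) pairs at or above `threshold`, worst first."""
    ratios = {name: skew_ratio(rows) for name, rows in tables.items()}
    flagged = [(name, ratio) for name, ratio in ratios.items() if ratio >= threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    sample = {
        "even_table": {"n1": 100, "n2": 100, "n3": 100},
        "one_node_table": {"n1": 300, "n2": 0, "n3": 0},
    }
    print(skewed_tables(sample))  # [('one_node_table', 3.0)]
```

Tables flagged this way are candidates for a new distribution key before the expansion, or for a separate redistribution slot so they do not hold up the rest.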
