GBase 8a MPP cluster capacity expansion practice
2022-07-27 09:51:00 【51CTO】
About the author
Wang Hao is a DBA for core business systems at New Torch Network. He works mainly on the administration of Oracle, Greenplum, GBase and other databases and on IT operations management, has extensive hands-on experience in database performance optimization across multiple business scenarios, and focuses on database performance optimization and IT operations automation.
One、Background
Driven by the growth of the business model and by data retention requirements, a provincial operator recently planned to expand its GBase 8a cluster from the existing 3 coordinator + 21 data nodes to 3 coordinator + 61 data nodes.
The current GBase 8a cluster version is GBase8a_MPP_Cluster-NoLicense-8.6.2_build33-R12-redhat7.3-x86_64. The 40 newly added nodes are data nodes only. I had the honor of taking part and was responsible for the expansion itself and for the subsequent data redistribution.
Two、Environment information
1、Hardware configuration
CPU:
- CPU count: 4*8C (4 physical CPUs, each with 8 logical CPUs)
Memory:
- MemTotal: 512GB
2、Software version
GBase 8a cluster version: GBase8a_MPP_Cluster-NoLicense-8.6.2_build33-R12-redhat7.3-x86_64
3、Expansion machine planning
To keep the IP address group of the application access interface unchanged, the 3 coordinator (management) nodes stay as they are after the expansion, and all 40 added nodes are data nodes. The planned host names are gbase25-gbase64.
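The host-name plan can be written down as a small script, for example to feed a hosts file or a batch tool. This is only a minimal sketch: the function name `planned_hostnames` is hypothetical, and only the `gbase25`-`gbase64` range comes from the plan above.

```python
def planned_hostnames(first: int, last: int, prefix: str = "gbase") -> list[str]:
    """Return host names prefix<first> .. prefix<last>, inclusive."""
    if first > last:
        raise ValueError("first must not exceed last")
    return [f"{prefix}{n}" for n in range(first, last + 1)]


# New data nodes from the plan above: gbase25 through gbase64.
new_data_nodes = planned_hostnames(25, 64)
print(len(new_data_nodes), new_data_nodes[0], new_data_nodes[-1])
# prints: 40 gbase25 gbase64
```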
Three、Preparation before implementation
1、Network requirements for the expansion
On site, the 24 servers of the original cluster and the 40 new servers are all on the intranet. The network is gigabit with dual-NIC bonding, and the network test results meet the expansion requirements.
2、Storage space requirements for the expansion
To keep the expansion safe, each server must have enough space for the temporary data produced by redistribution. Each existing node in the cluster has 13TB free in the opt directory and 439GB free on the root filesystem; each new node has 22TB free in opt and 149GB free on root, which meets the expansion requirements.
The inspection found that the disk write speed of two servers (IP addresses 190 and 193) was clearly abnormal. The host team confirmed a RAID card battery failure, and after the repair the disk read and write speed returned to normal.
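A free-space check like the one described above can be scripted and run on each node. The sketch below is an assumption about how one might do it with Python's standard library, not the tooling used in this project; the helper names and the 100 GB threshold in the usage lines are illustrative.

```python
import shutil

GB = 1024 ** 3
TB = 1024 ** 4


def free_bytes(path: str) -> int:
    """Free space, in bytes, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def meets_requirement(free: int, required: int) -> bool:
    """True when the free space covers the required amount."""
    return free >= required


# Hypothetical threshold; choose one that fits the data to be redistributed.
root_free = free_bytes("/")
print("root filesystem ok:", meets_requirement(root_free, 100 * GB))
```

Running the same check against the data directory on every node before the expansion gives a quick pass/fail list instead of a manual inspection.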

3、Server requirements for the expansion
All MPP cluster nodes must run the same operating system version. Before the expansion, the operating system on the new nodes was reinstalled to a uniform version, rhel7.3, consistent with the existing cluster nodes, which meets the expansion requirements.
Four、Expansion implementation
1、Set up mutual trust for the root and gbase users on the new nodes
2、Configure the C3 tool (used to run commands on all GBase nodes at the same time)
3、Use the C3 tool to configure the environment on the expansion nodes
4、Set the cluster to readonly, then back up the cluster information
5、Perform the expansion
6、Run basic checks of cluster availability after the expansion
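Step 1 can be verified before moving on. The sketch below checks passwordless SSH from the current host to a node using OpenSSH's `BatchMode` option, which makes `ssh` fail instead of prompting for a password. The helper names are hypothetical, and it assumes the OpenSSH client is installed; it is not part of GBase or C3.

```python
import subprocess


def ssh_check_command(user: str, host: str, timeout_s: int = 5) -> list[str]:
    """Build an ssh command that succeeds only if key-based login works."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                 # never prompt for a password
        "-o", f"ConnectTimeout={timeout_s}",
        f"{user}@{host}",
        "true",                                # no-op remote command
    ]


def trust_is_configured(user: str, host: str) -> bool:
    """Run the check; exit status 0 means passwordless login worked."""
    try:
        result = subprocess.run(
            ssh_check_command(user, host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        return False  # ssh client not installed on this host
    return result.returncode == 0
```

Calling `trust_is_configured("gbase", "gbase25")` for each new host name, and again for root, would flag any node where the key exchange was missed.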
Five、Data redistribution
Data in an MPP cluster is distributed across its data nodes. Once the expansion is complete, all business table data therefore has to be redistributed across all data nodes (including the new ones) to avoid data skew.
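Why existing rows have to move can be seen with a toy model. The sketch below places rows by hashing a distribution key modulo the number of data nodes and compares 21 nodes with 61. The CRC32 hash and modulo placement are assumptions chosen for illustration only; the article does not describe GBase 8a's actual hash distribution.

```python
import zlib


def node_for(key: str, node_count: int) -> int:
    """Toy placement: CRC32 of the key modulo the number of data nodes."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def rows_per_node(keys, node_count):
    """Count how many keys land on each node."""
    counts = [0] * node_count
    for key in keys:
        counts[node_for(key, node_count)] += 1
    return counts


keys = [f"user-{i}" for i in range(100_000)]

# Right after expansion, the 40 new nodes hold none of the existing rows.
before = rows_per_node(keys, 21) + [0] * 40
# After redistribution, rows are spread over all 61 nodes.
after = rows_per_node(keys, 61)

moved = sum(node_for(k, 21) != node_for(k, 61) for k in keys)
print("empty nodes before redistribution:", before.count(0))
print("max rows per node before/after:", max(before), max(after))
print(f"rows that change node: {moved / len(keys):.1%}")
```

In this toy model most rows change node, which is consistent with redistribution being the long-running part of the project.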
Six、Efficiency analysis
Time taken by each step of the expansion:
- Capacity expansion: 18:30 to 20:20 on the 24th, about 2 hours;
- Redistribution: 8802 tables and 231T of data in total, from 20:25 on the 24th to 10:36 on the 26th, about 38 hours, against an original plan of 91 hours (calculated at 35MB/s, a rate based on engineering experience).
Note: One table was extremely skewed, with all of its data on a single node: 70 fields, 75 billion records, 13 compression, and a single shard of 350GB. This table alone took 12 hours to redistribute. Excluding it, the other 8801 tables actually took 27 hours (20:25 on the 24th to 23:25 on the 25th), reaching 118MB/s, so redistribution was much faster than expected.
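The figures above can be cross-checked with a short calculation. The sketch assumes that the quoted MB/s rates are per node across the 21 original data nodes and that sizes use binary units (1 TB = 1024 * 1024 MB). Neither is stated in the article, but together they reproduce the 91-hour plan and, using the full 231 TB for the 27-hour window, the 118 MB/s result.

```python
MB_PER_TB = 1024 * 1024  # binary units (assumption)
SOURCE_NODES = 21        # original data nodes, assumed to send in parallel


def planned_hours(total_tb: float, mb_per_s_per_node: float,
                  nodes: int = SOURCE_NODES) -> float:
    """Hours needed to move total_tb at the given per-node rate."""
    seconds = total_tb * MB_PER_TB / (mb_per_s_per_node * nodes)
    return seconds / 3600


def achieved_rate(total_tb: float, hours: float,
                  nodes: int = SOURCE_NODES) -> float:
    """Per-node MB/s implied by moving total_tb in the given number of hours."""
    return total_tb * MB_PER_TB / (hours * 3600) / nodes


print(f"planned: {planned_hours(231, 35):.1f} h")                 # planned: 91.5 h
print(f"achieved: {achieved_rate(231, 27):.1f} MB/s per node")    # achieved: 118.7 MB/s per node
```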
Seven、Summary of experience
1、When redistributing data in an MPP cluster, the execution window of business scheduling must be taken into account, because redistribution may lock business tables and interfere with scheduled jobs. Before the expansion, data synchronization ran from 2:00 to 15:00, so scheduling took a long time. The business tables needed by scheduling were therefore given a higher redistribution priority and redistributed before scheduling started, and redistribution concurrency was lowered while scheduling was running. This allowed redistribution to run 24 hours a day without affecting production scheduling. If the daily scheduling window is short, or there are too many tables to pick out the ones scheduling needs, it is recommended to redistribute outside the scheduling window instead.
2、Besides connecting the new nodes to the cluster network, also consider network connectivity with the data loaders and with the hadoop cluster (if data is loaded from hdp).
3、It is best to check table skew before the expansion. For heavily skewed tables, adjusting the distribution key is recommended, to avoid the situation seen in this expansion, where one extremely skewed table with all of its data on a single node (70 fields, 75 billion records, 13 compression, a single 350GB shard) took 12 hours to redistribute by itself.
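A skew check of the kind recommended in point 3 comes down to comparing per-node row counts for each table. The sketch below computes a simple skew ratio (largest per-node count divided by the mean). The function name, the sample counts, and the threshold of 2.0 are illustrative assumptions, and collecting the per-node counts from the cluster is left out.

```python
def skew_ratio(rows_per_node: list[int]) -> float:
    """Largest per-node row count divided by the mean; 1.0 means perfectly even."""
    total = sum(rows_per_node)
    if total == 0:
        return 1.0  # an empty table is not skewed
    mean = total / len(rows_per_node)
    return max(rows_per_node) / mean


# Hypothetical per-node counts on 21 data nodes.
even_table = [1_000] * 21
one_node_table = [21_000] + [0] * 20  # everything on one node

for name, counts in [("even_table", even_table), ("one_node_table", one_node_table)]:
    ratio = skew_ratio(counts)
    flag = "review distribution key" if ratio > 2.0 else "ok"
    print(f"{name}: skew {ratio:.1f} ({flag})")
# even_table: skew 1.0 (ok)
# one_node_table: skew 21.0 (review distribution key)
```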
