Greenplum Database Fault Analysis - Why Does gpstart -a Return Failure After Version Upgrade?
2022-08-05 01:52:00 【Fat Uncle】
Case Background
During an on-site minor-version upgrade of a Greenplum database, the upgrade script reported an error indicating that the database had failed to start. However, when we logged in to the database node through the jump host and started the cluster with gpstart in interactive mode, the cluster came up, although the standby master was unavailable. Why does gpstart -a fail while interactive gpstart succeeds? As a developer who had been on the team for two years, and believing that fault analysis is the fastest way into learning a database, I took on this job, knowing it would inevitably mean some overtime.
Analysis Process
First, we found that, apart from the interactive prompts, both gpstart and gpstart -a try to start the standby master, and skip it if it does not start. Our troubleshooting therefore focused on the standby master. After reproducing the scenario, we used gpstart -m to bring up only the master node, logged in to the master in utility mode, and ran
select * from gp_segment_configuration where content = -1;
to find the records for the master and the standby master. The standby master was marked as normal in the system table, but the data files in the gpseg-1 directory on the standby master node were incomplete; for example, postgresql.conf was missing. We therefore concluded that the gpinitstandby script must have failed. Its log is shown below:
gpinitstandby:xxx:gpadmin-[ERROR]:-Error initializing standby master: Standby master not configured
gpinitstandby:xxx:gpadmin-[ERROR]:-Request mode to remove warm master standby, but no standby located.
gpinitstandby:xxx:gpadmin-[ERROR]:-Error removing standby master: no standby configured
gpinitstandby:xxx:gpadmin-[INFO]:-Validating environment and parameters for standby initialization...
gpinitstandby:xxx:gpadmin-[INFO]:-------------------------------------------
gpinitstandby:xxx:gpadmin-[INFO]:Greenplum standby master initialization parameters
gpinitstandby:xxx:gpadmin-[INFO]:-------------------------------------------
gpinitstandby:xxx:gpadmin-[INFO]:-Greenplum master hostname = xxx
gpinitstandby:xxx:gpadmin-[INFO]:-Greenplum master data directory = /home/gpadmin/data/master/default/gpseg-1
gpinitstandby:xxx:gpadmin-[INFO]:-Greenplum master port = 5432
gpinitstandby:xxx:gpadmin-[INFO]:-Greenplum standby master hostname = xxx
gpinitstandby:xxx:gpadmin-[INFO]:-Greenplum standby master port = 5432
gpinitstandby:xxx:gpadmin-[INFO]:-Greenplum standby master data directory = /home/gpadmin/data/master/default/gpseg-1
gpinitstandby:xxx:gpadmin-[INFO]:-Greenplum update system catalog = On
gpinitstandby:xxx:gpadmin-[INFO]:-Syncing Greenplum Database extensions to standby
gpinitstandby:xxx:gpadmin-[INFO]:-The packages on xxx are consistent
gpinitstandby:xxx:gpadmin-[INFO]:-Adding standby master to catalog...
gpinitstandby:xxx:gpadmin-[INFO]:-Database catalog updated successfully.
gpinitstandby:xxx:gpadmin-[INFO]:-Updating pg_hba.conf file...
gpinitstandby:xxx:gpadmin-[INFO]:-pg_hba.conf files updated successfully.
gpinitstandby:xxx:gpadmin-[ERROR]:-Failed to copy data directory from master to standby.
gpinitstandby:xxx:gpadmin-[ERROR]:-Failed to create standby
gpinitstandby:xxx:gpadmin-[WARNING]-Trying to rollback changes that have been made...
gpinitstandby:xxx:gpadmin-[INFO]:-Rolling back catalog change...
gpinitstandby:xxx:gpadmin-[ERROR]:-Failed to remove standby from master catalog.
gpinitstandby:xxx:gpadmin-[INFO]:-Restoring pg_hba.conf file...
gpinitstandby:xxx:gpadmin-[INFO]:-Cleaning up pg_hba.conf backup files...
gpinitstandby:xxx:gpadmin-[INFO]:-Backup files of pg_hba.conf cleaned up successfully.
The log shows that, before the upgrade, the HA component was repairing the standby master. While gpinitstandby was copying the master data directory to the standby master, the upgrade script shut down the Greenplum cluster, which caused the copy to fail. Because the cluster was down, rolling back the standby record in gp_segment_configuration failed as well. As a result, when the cluster is started with gpstart -a, the script believes the standby master is healthy, tries to start it, and fails.
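The pattern read out of the log above (the copy fails, then the catalog rollback fails) can be checked for mechanically. The following is a minimal sketch, not part of the original troubleshooting: the function name `standby_rollback_failed` is hypothetical, and the log path in the usage comment is an assumption. It only searches for the two ERROR messages quoted in the log.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) when a gpinitstandby log shows that
# the data directory copy failed AND the catalog rollback failed as well,
# which is the combination that leaves a stale standby record behind.
standby_rollback_failed() {
  local log_file="$1"
  grep -q 'Failed to copy data directory from master to standby' "$log_file" &&
    grep -q 'Failed to remove standby from master catalog' "$log_file"
}

# Usage (the log location is an assumption; adjust it to your environment):
#   if standby_rollback_failed "$HOME/gpAdminLogs/gpinitstandby_YYYYMMDD.log"; then
#     echo "a stale standby entry is probably left in gp_segment_configuration"
#   fi
```

Matching on both messages avoids flagging runs where the copy failed but the rollback cleaned up the catalog correctly.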
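Root Cause
Before the upgrade, the HA component was repairing the standby master. While gpinitstandby was copying the master data directory to the standby master, the upgrade script shut down the Greenplum cluster, so the copy failed. Because the cluster was down, the rollback of the standby record in gp_segment_configuration also failed, leaving a standby entry in the catalog that has no usable data directory behind it.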
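The inconsistency at the heart of this fault is that the catalog reports the standby as up while its data directory is incomplete. As a rough sketch, the check below combines the two observations. The function name `standby_is_inconsistent` is hypothetical, the status value is assumed to be read from the status column of gp_segment_configuration beforehand, and the function is assumed to run on the standby host. Only postgresql.conf is tested, because that is the file the analysis found missing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) when the catalog says the standby is
# up ('u') but its data directory has no postgresql.conf.
standby_is_inconsistent() {
  local catalog_status="$1"   # status column of the standby row: 'u' or 'd'
  local data_dir="$2"         # standby data directory, e.g. .../gpseg-1
  [ "$catalog_status" = "u" ] && [ ! -f "$data_dir/postgresql.conf" ]
}

# Usage (run on the standby host; the path comes from the log above):
#   standby_is_inconsistent u /home/gpadmin/data/master/default/gpseg-1 &&
#     echo "catalog and data directory disagree"
```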
Solution
Consider three options:
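- Option 1: start the master node with gpstart -am, run the following command to remove the standby record, then run gpstop -ar:
PGOPTIONS="-c gp_session_role=utility" psql -d postgres -c "select gp_remove_master_standby()"
- Option 2: start the master node with gpstart -am, run the following command to mark the standby as down in the catalog, then run gpstop -ar:
PGOPTIONS="-c gp_session_role=utility" psql -d postgres -c "set allow_system_table_mods=true; update gp_segment_configuration set status = 'd' where content = -1 and role = 'm'; "
- Option 3: run gpstart -aS. Adding the uppercase S parameter skips starting the standby master.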
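The options above can be wrapped in a small script so that the sequence is reviewed before it touches a cluster. This is only a sketch under stated assumptions: the `run` wrapper, the `DRY_RUN` variable, and both function names are hypothetical, and the commands and flags (including -S) are copied from the options as the article gives them. By default nothing is executed; each command is only printed.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run wrapper: print the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Option 1: start the master only, drop the stale standby record, restart.
remove_stale_standby() {
  run gpstart -am &&
    run env PGOPTIONS="-c gp_session_role=utility" \
      psql -d postgres -c "select gp_remove_master_standby()" &&
    run gpstop -ar
}

# Option 3: start the cluster without starting the standby master.
start_skipping_standby() {
  run gpstart -aS
}

# Usage:
#   remove_stale_standby              # prints the three commands
#   DRY_RUN=0 start_skipping_standby  # actually runs gpstart -aS
```

Chaining the steps with && stops the sequence as soon as one command fails, so the restart is never attempted after a failed catalog change.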
We adopted option 3. After the upgrade, the HA component repairs the standby master, which resolves the startup problem.