Greenplum Database Fault Analysis - Why Does gpstart -a Return Failure After Version Upgrade?
2022-08-05 01:52:00 【Fat Uncle】
Case Background
During a minor-version upgrade of a Greenplum database at a customer site, the upgrade script reported an error saying the database had failed to start. When we logged in to the database node through the jump host and started the cluster interactively with gpstart, however, the cluster came up, although the standby master was unavailable. Why does gpstart -a fail while plain gpstart succeeds? As a developer who joined the team two years ago and believes that fault analysis is the fastest way into learning a database, I took on the job, overtime included.
Analysis
We first noted that gpstart and gpstart -a differ only in the interactive prompts. Both try to start the standby master, and the interactive run can skip the standby if it does not come up. That made the standby master the direction to investigate. After reproducing the scenario, we started only the master node with gpstart -m, logged in to the master in utility mode, and ran
select * from gp_segment_configuration where content = -1;
to find the records for the master and the standby master. The standby master was marked as normal in the system table, yet the data files in the gpseg-1 directory on the standby master node were incomplete; postgresql.conf, for example, was missing. We concluded that the gpinitstandby script must have failed. Its log is shown below:
gpinitstandby:xxx:gpadmin-[ERROR]:-Error initializing standby master: Standby master not configured
gpinitstandby:xxx:gpadmin-[ERROR]:-Request mode to remove warm master standby, but no standby located.
gpinitstandby:xxx:gpadmin-[ERROR]:-Error removing standby master: no standby configured
gpinitstandby:xxx:gpadmin-[INFO]:-Validating environment and parameters for standby initialization...
gpinitstandby:xxx:gpadmin-[INFO]:-------------------------------------------
gpinitstandby:xxx:gpadmin-[INFO]:Greenplum standby master initialization parameters
gpinitstandby:xxx:gpadmin-[INFO]:-------------------------------------------
gpinitstandby:xxx:gpadmin-[INFO]:-Greenplum master hostname = xxx
gpinitstandby:xxx:gpadmin-[INFO]:-Greenplum master data directory = /home/gpadmin/data/master/default/gpseg-1
gpinitstandby:xxx:gpadmin-[INFO]:-Greenplum master port = 5432
gpinitstandby:xxx:gpadmin-[INFO]:-Greenplum standby master hostname = xxx
gpinitstandby:xxx:gpadmin-[INFO]:-Greenplum standby master port = 5432
gpinitstandby:xxx:gpadmin-[INFO]:-Greenplum standby master data directory = /home/gpadmin/data/master/default/gpseg-1
gpinitstandby:xxx:gpadmin-[INFO]:-Greenplum update system catalog = On
gpinitstandby:xxx:gpadmin-[INFO]:-Syncing Greenplum Database extensions to standby
gpinitstandby:xxx:gpadmin-[INFO]:-The packages on xxx are consistent
gpinitstandby:xxx:gpadmin-[INFO]:-Adding standby master to catalog...
gpinitstandby:xxx:gpadmin-[INFO]:-Database catalog updated successfully.
gpinitstandby:xxx:gpadmin-[INFO]:-Updating pg_hba.conf file...
gpinitstandby:xxx:gpadmin-[INFO]:-pg_hba.conf files updated successfully.
gpinitstandby:xxx:gpadmin-[ERROR]:-Failed to copy data directory from master to standby.
gpinitstandby:xxx:gpadmin-[ERROR]:-Failed to create standby
gpinitstandby:xxx:gpadmin-[WARNING]-Trying to rollback changes that have been made...
gpinitstandby:xxx:gpadmin-[INFO]:-Rolling back catalog change...
gpinitstandby:xxx:gpadmin-[ERROR]:-Failed to remove standby from master catalog.
gpinitstandby:xxx:gpadmin-[INFO]:-Restoring pg_hba.conf file...
gpinitstandby:xxx:gpadmin-[INFO]:-Cleaning up pg_hba.conf backup files...
gpinitstandby:xxx:gpadmin-[INFO]:-Backup files of pg_hba.conf cleaned up successfully.
The log shows that before the upgrade, the HA component was repairing the standby master. While gpinitstandby was copying the master data directory to the standby master, the upgrade script shut down the Greenplum cluster, so the copy failed. Because the cluster was already down, rolling back the standby record in gp_segment_configuration failed as well. As a result, when the cluster is started with gpstart -a, the script believes the standby master is healthy, tries to start it, and fails.
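The failure pattern described above (the standby is added to the catalog, then the rollback of that change fails) can be spotted mechanically. The following is a minimal sketch, not part of Greenplum: the function names are hypothetical, and the message strings are copied from the log shown above, so they may differ in other versions.

```python
import re

# Message texts copied from the gpinitstandby log above; other versions may word them differently.
CATALOG_ADDED = "Database catalog updated successfully."
ROLLBACK_FAILED = "Failed to remove standby from master catalog."

# Matches "...user-[LEVEL]" followed by an optional ":" and an optional "-" before the message.
LINE_RE = re.compile(r"^[^:]+:[^:]+:[^\[]+-\[(?P<level>[A-Z]+)\]:?-?(?P<msg>.*)$")


def parse_line(line):
    """Return (level, message) for one log line, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.group("level"), match.group("msg").strip()


def stale_standby_suspected(lines):
    """True if the standby was added to the catalog and the later rollback failed."""
    added = False
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _, msg = parsed
        if msg == CATALOG_ADDED:
            added = True
        elif msg == ROLLBACK_FAILED and added:
            return True
    return False
```

Feeding the log lines above to stale_standby_suspected flags this incident; a run in which the rollback succeeds is not flagged.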
Root Cause
Before the upgrade, the HA component was repairing the standby master. The upgrade script shut down the Greenplum cluster while gpinitstandby was copying the master data directory to the standby master, so gpinitstandby failed. With the cluster down, the standby record in gp_segment_configuration could not be rolled back.
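The symptom that exposed this, a catalog entry marked as up while files such as postgresql.conf are missing on disk, can be checked with a short script run on the standby host. This is only a sketch: the function names are hypothetical, and the file list is an assumed, non-exhaustive set of files that an initialized data directory normally contains.

```python
from pathlib import Path

# Assumption: an illustrative, non-exhaustive set of files found in an
# initialized data directory. postgresql.conf is the one named in the analysis.
EXPECTED_FILES = ("PG_VERSION", "postgresql.conf", "pg_hba.conf")


def missing_files(data_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in expected if not (root / name).is_file()]


def catalog_disagrees_with_disk(catalog_status, data_dir):
    """True when the catalog reports the standby as up ('u') but files are missing."""
    return catalog_status == "u" and bool(missing_files(data_dir))
```

Here catalog_status is the status column of the standby row returned by the gp_segment_configuration query, and data_dir would be the standby's gpseg-1 directory.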
Solution
Consider three options:
- Start only the master node with gpstart -am, then run
  PGOPTIONS="-c gp_session_role=utility" psql -d postgres -c "select gp_remove_master_standby()"
  and finally run gpstop -ar.
- Start only the master node with gpstart -am, then run
  PGOPTIONS="-c gp_session_role=utility" psql -d postgres -c "set allow_system_table_mods=true; update gp_segment_configuration set status = 'd' where content = -1 and role = 'm'; "
  and finally run gpstop -ar.
- Run gpstart -aS. The uppercase S option skips starting the standby master.
We chose option 3. After the upgrade, the HA component takes care of repairing and starting the standby master.
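If options 1 or 2 need to be scripted instead, the utility-mode psql call they share could be wrapped roughly as follows. This is a sketch under assumptions: run_utility_sql and its runner parameter are hypothetical names introduced here, psql is assumed to be on the PATH of the master host, and the master must already be running in master-only mode.

```python
import os
import subprocess


def run_utility_sql(sql, dbname="postgres", runner=subprocess.run):
    """Run one SQL string through psql in a utility-mode session."""
    # Same PGOPTIONS value as in the commands above, passed through the environment.
    env = dict(os.environ, PGOPTIONS="-c gp_session_role=utility")
    argv = ["psql", "-d", dbname, "-c", sql]
    # check=True raises CalledProcessError if psql exits with a non-zero status.
    return runner(argv, env=env, check=True, capture_output=True, text=True)
```

For option 1 this would be called as run_utility_sql("select gp_remove_master_standby()"). The runner parameter exists only so the call can be exercised without a live cluster.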