[Server data recovery] A RAID data recovery case on a StorageWorks server
2022-07-07 15:00:00 【North Asia data recovery】
Server data recovery environment:
A StorageWorks server from a certain brand;
8 SAS hard disks configured as RAID 5, with one hot spare.
Server failure:
While the server was running, two hard disks went offline one after another, the server went down, and the LUNs stopped working. The server administrator contacted our data recovery center to recover the data.
The server data recovery engineers at our center ran physical checks and bad-sector scans on every disk in the server and found no problems.
Server data recovery process:
1. Image all hard disks of the failed server, so that the original data cannot suffer secondary damage during the recovery work.
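The imaging step can be illustrated with a minimal sketch. It assumes the source disk (or an existing image) can be opened as an ordinary file and that no read errors occur, which matches the clean bad-sector scan reported above. The function names `image_disk` and `file_sha256` and the paths are hypothetical; the article does not say which imaging tool was used.

```python
import hashlib


def image_disk(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path chunk by chunk.

    Returns (bytes_copied, sha256_hex_of_the_data_read).
    """
    digest = hashlib.sha256()
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # end of the source reached
                break
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()


def file_sha256(path, chunk_size=1024 * 1024):
    """Re-read a file and return its SHA-256, to verify a finished image."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the digest returned by `image_disk` with `file_sha256` of the finished image confirms that the copy is complete before any analysis is done on it; all later steps then work on the images only.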
[Figure: part of the backed-up data]
2. Analyze the cause of the server failure:
Initial findings: the RAID group carries 6 LUNs, all assigned to an HP-Unix minicomputer, with LVM logical volumes built on top of them. The important data are an Oracle database and the OA server side. At the time of the failure, some disks in the server were performing unstably, and the controller in this type of server kicks any disk it considers bad out of the RAID group. Once the number of dropped disks exceeds what the RAID level can tolerate, the RAID group becomes unavailable and the server goes down.
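Why a second dropped disk takes a RAID 5 group offline can be shown with a small sketch of XOR parity. This is the generic RAID 5 principle, not the controller's actual firmware logic; `xor_blocks` and `rebuild_missing` are hypothetical helper names.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def rebuild_missing(stripe):
    """Rebuild one stripe row of a RAID 5 group.

    stripe is a list with one chunk per member disk (data and parity),
    using None for a member that is offline.
    """
    missing = [i for i, chunk in enumerate(stripe) if chunk is None]
    if len(missing) > 1:
        # Two unknowns, one parity equation: the row cannot be solved.
        raise ValueError("RAID 5 cannot rebuild more than one missing member")
    rebuilt = list(stripe)
    if missing:
        present = [chunk for chunk in stripe if chunk is not None]
        # The XOR of all surviving members equals the missing member.
        rebuilt[missing[0]] = xor_blocks(present)
    return rebuilt
```

With one member missing, the row is fully recoverable; with two missing, the function raises, which mirrors the point at which the RAID group in this case became unavailable.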
3. Analyze the structure of the server RAID group:
All of the server's LUNs are built on the RAID group, so recovering the data requires analyzing the underlying RAID group information first and then reconstructing the original RAID group from it. The server data recovery engineers analyzed all the hard disks and found that the data on disk No. 4 differed from that on the other disks, so it was judged to be the hot spare. They then analyzed the remaining data disks, examined how Oracle database pages were distributed across each disk, and from that distribution derived the key parameters of the RAID group: stripe size, disk order, and data direction.
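Once stripe size, disk order and data direction are known, a logical offset in the array can be translated to a position on a member disk. The sketch below assumes a left-symmetric RAID 5 layout purely for illustration; the article does not state which layout the controller used, and `locate` and `read_virtual` are hypothetical names.

```python
def locate(offset, n_disks, chunk_size):
    """Map a logical byte offset to (disk_index, byte_offset_on_disk).

    Assumes a left-symmetric RAID 5 layout: parity starts on the last
    disk and moves one disk to the left on each row, and the data chunks
    of a row start on the disk right after the parity disk.
    """
    chunk_index, within = divmod(offset, chunk_size)
    row, j = divmod(chunk_index, n_disks - 1)  # n_disks - 1 data chunks per row
    parity_disk = (n_disks - 1) - (row % n_disks)
    data_disk = (parity_disk + 1 + j) % n_disks
    return data_disk, row * chunk_size + within


def read_virtual(disks, offset, length, chunk_size):
    """Read logical bytes from member images listed in array order.

    disks is a list of bytes-like objects, one per member disk.
    """
    out = bytearray()
    while length > 0:
        disk, disk_off = locate(offset, len(disks), chunk_size)
        # Never read across a chunk boundary in a single step.
        take = min(length, chunk_size - offset % chunk_size)
        out += disks[disk][disk_off:disk_off + take]
        offset += take
        length -= take
    return bytes(out)
```

A wrong guess for any of the three parameters shows up quickly: structures that span chunk boundaries, such as consecutive Oracle pages, come out in the wrong order or interleaved with parity.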
4. Analyze the dropped disks in the server RAID group:
Using the RAID information obtained above, the original RAID group was virtually reassembled with the RAID virtual reassembly program developed in-house by North Asia. Because two disks in the RAID group had gone offline, the order in which they dropped had to be determined. Careful analysis of each disk's data showed that, within the same stripe, the data on one disk clearly differed from the others, so this disk was tentatively identified as the first to go offline. The stripe was then checked with the RAID verification program developed in-house by North Asia, which confirmed which disk had dropped first.
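A stripe check of the kind described above can be sketched as follows. It is not North Asia's verification program, only the underlying idea: in a consistent RAID 5 row, the XOR of all members (data plus parity) is zero. `inconsistent_rows` is a hypothetical name, and the member images are assumed to be small enough to hold in memory.

```python
def inconsistent_rows(disks, chunk_size):
    """Return the row indices whose members do not XOR to zero.

    disks is a list of bytes-like member images in any order; the XOR
    test does not depend on disk order or parity position.
    """
    rows = min(len(d) for d in disks) // chunk_size
    bad = []
    for row in range(rows):
        start = row * chunk_size
        acc = 0
        for d in disks:
            acc ^= int.from_bytes(d[start:start + chunk_size], "big")
        if acc != 0:  # parity no longer matches the data in this row
            bad.append(row)
    return bad
```

Rows written after the first disk went offline fail this test when the stale disk is included. The XOR result alone does not say which member is stale, so the failing rows still have to be inspected by content, as the engineers did, to decide which disk's data is out of date.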
5. Analyze the LUN information in the RAID group:
Because the LUNs are built on the RAID group, the latest state of the RAID group first had to be virtually reassembled from the information above. The engineers then analyzed how the LUNs were allocated within the RAID group and the data block MAP of each LUN. There are 6 LUNs, so the data block MAP of each LUN was extracted, a program was written to parse the MAPs of all LUNs, and the data of every LUN was exported according to its MAP.
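Exporting a LUN from its block MAP can be sketched like this. The real on-disk MAP format of the storage is not described in the article, so the sketch assumes the MAP has already been parsed into a list of `(lun_offset, raid_offset, length)` extents; that tuple layout and the name `export_lun` are hypothetical.

```python
import io


def export_lun(raid, extents, out):
    """Copy the extents of one LUN out of the reassembled RAID image.

    raid    -- seekable binary file-like object for the virtual RAID group
    extents -- iterable of (lun_offset, raid_offset, length) in bytes
    out     -- seekable, writable binary file-like object for the LUN image
    Returns the number of bytes written.
    """
    total = 0
    for lun_off, raid_off, length in sorted(extents):
        raid.seek(raid_off)
        data = raid.read(length)
        if len(data) != length:
            raise IOError(f"short read at RAID offset {raid_off}")
        out.seek(lun_off)
        out.write(data)
        total += length
    return total


if __name__ == "__main__":
    raid_image = io.BytesIO(b"0123456789ABCDEF")
    lun_image = io.BytesIO()
    export_lun(raid_image, [(4, 0, 4), (0, 8, 4)], lun_image)
    print(lun_image.getvalue())
```

Running the same routine once per MAP produces one image file per LUN, which is the input for the LVM analysis in the next step.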
[Figure: the exported data]
6. Repair the server's LVM logical volumes and VXFS file system:
The server data recovery engineers analyzed all the exported LUNs and found that every LUN contained HP-Unix LVM logical volume information. Parsing the LVM information in each LUN revealed three LVM sets: a 45G LVM with one LV, holding the OA server-side data; a 190G LVM with one LV, holding temporary backup data; and the remaining 4 LUNs, which form an LVM of about 2.1T with a single LV holding the Oracle database files.
The engineers wrote a program to interpret LVM and tried to interpret the LV in each LVM set, but the program reported errors. While the development engineer debugged the failing part of the program, the file system engineer examined the recovered LUNs to check whether the storage crash had corrupted the LVM logical volume information. The check confirmed that the storage crash had indeed corrupted the LVM information.
The damaged areas were repaired manually, the program was modified accordingly, and the LVM logical volumes were interpreted again.
An HP-Unix environment was set up, the interpreted LVs were mapped to HP-Unix, and a mount of the file system was attempted; the mount failed with a file system error. The "fsck -F vxfs" command was then used to repair the vxfs file system, but it still could not be mounted after the repair. The engineers suspected that some of the underlying vxfs metadata was corrupted and needed manual repair.
The LVs were analyzed carefully, and the file system was checked for completeness against the underlying VXFS structure. The analysis confirmed a problem in the underlying VXFS file system: the file system was performing IO when the original storage crashed, so some file system metafiles were never updated and were left damaged. The data recovery engineers repaired these metafiles manually so that the VXFS file system could be parsed normally. The repaired LVs were mounted on the HP-Unix minicomputer again; this time the file system reported no errors and mounted successfully.
7. Check the Oracle database files and start the database:
After the file system was mounted on the HP-Unix machine, all user data was backed up to the designated disk space. The user data totaled about 1.2TB.
[Figure: screenshots of some of the file directories]
An Oracle database file checking tool was used to check whether each database file was complete; no errors were found. A further check with the Oracle database checking tool developed in-house by North Asia found that some database files and log files were inconsistent. The database recovery engineers repaired and re-verified these files until all of them passed.
The recovered Oracle database was attached to the HP-Unix server in the original production environment and started; the Oracle database started successfully.
8. Start the Oracle database and the OA server side, and install the OA client on a local computer. The latest and the historical data records were verified through the OA client, and staff from different departments were arranged to verify the data remotely. The data was confirmed to be correct and complete, and the data recovery was successful.