How to deal with the core task delay caused by insufficient data warehouse resources
2022-07-03 08:34:00 【Impl_ Sunny】
"The company's cluster machines went offline, the data warehouse ran short of resources, and the core tasks were delayed. How should we deal with it?" This kind of failure is very common. Today, let's go over a general approach: nine moves in total, to be tried in order.
1. Repair the fault urgently
If cluster machines go offline, something has gone wrong. The first priority is to find out why as quickly as possible and then apply a targeted fix. If the cluster problem cannot be solved in a short time, escalate to management promptly and make the business impact clear; if senior management takes notice, they may help you bring in more senior technical resources. This work must proceed in parallel with everything else, and the team supporting the cluster must be kept under enough pressure. This is treating the actual cause, and it is the only way to fix the problem at its root; every other method below is a workaround.
Once this step is under way, if time is still pressing, go to the next step.
2. Dynamic expansion of resources
Since it is a cluster, resources should in principle be redundant, so temporary dynamic expansion is the most basic method; this is also the point of cloud computing. If you cannot do this, it at least means the system was not planned well. Could the data warehouse still be running on a single machine? If so, consider moving to a cloud data warehouse; the Hadoop big data platform architecture is very mature by now.
If this step cannot be done, make a note to take it up with the planning department afterwards, and keep going.
3. Resource transfer adjustment
Cluster resource usage also follows the 80/20 rule. Where there are core tasks, there must be many non-core tasks, so transfer resources from the non-core tasks to the core tasks. On a Hadoop cluster, you can adjust the queues to increase resources quickly. The precondition is that you can roughly judge the business impact of the adjustment; even so, grabbing other tasks' resources like this is a rather crude approach.
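As a rough illustration of the queue adjustment described above, the sketch below recomputes capacity percentages so that non-core queues keep only a minimum share and everything freed goes to the core queue. The queue names, the `floor` parameter, and the `shift_capacity` helper are all hypothetical; the code does not call any Hadoop API, and the resulting percentages would still have to be applied through whatever queue configuration the cluster's scheduler uses.

```python
def shift_capacity(queues, core, floor=5.0):
    """Move capacity from non-core queues to the core queue.

    queues: dict of queue name -> capacity percentage (expected to sum to 100).
    core:   name of the queue that runs the core tasks.
    floor:  percentage every non-core queue is allowed to keep.
    """
    if core not in queues:
        raise KeyError(core)
    adjusted = {}
    freed = 0.0
    for name, cap in queues.items():
        if name == core:
            continue
        keep = min(cap, floor)   # never give a queue more than it had
        adjusted[name] = keep
        freed += cap - keep
    adjusted[core] = queues[core] + freed
    return adjusted


# Hypothetical queue layout, in percent of cluster capacity.
before = {"core_etl": 30.0, "adhoc": 40.0, "reporting": 30.0}
after = shift_capacity(before, "core_etl", floor=10.0)
print(after)  # {'adhoc': 10.0, 'reporting': 10.0, 'core_etl': 80.0}
```

Keeping a floor for every non-core queue is a deliberate choice here: it limits the business impact that the paragraph above warns about, instead of starving the other tasks completely.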
If resources cannot be reallocated, keep going.
4. Adjust priority
If transferring resources is unrealistic, for example because fine-grained scheduling at the resource level is hard to set up in a short time, then prioritize all tasks: raise the scheduling priority of the core tasks and lower that of the others, so that the core tasks finish on time. The precondition is a clear understanding of every task's importance, its dependencies, and its impact on the business. That groundwork has to be laid in ordinary times; adjusting task priorities hastily in the middle of an incident may lead to serious consequences.
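To make the interaction between priority and dependencies concrete, here is a minimal sketch of a priority-aware topological ordering: a task becomes runnable only when all its upstream tasks are done, and among runnable tasks the one with the highest priority goes first. The task names, priority numbers, and the `schedule` function are assumptions made for illustration, not any particular scheduler's behavior.

```python
import heapq


def schedule(tasks, deps):
    """Return a run order that respects dependencies and prefers high priority.

    tasks: dict of task name -> priority (a smaller number runs first).
    deps:  dict of task name -> list of upstream task names.
    """
    pending = {t: len(deps.get(t, [])) for t in tasks}
    downstream = {t: [] for t in tasks}
    for task, ups in deps.items():
        for up in ups:
            downstream[up].append(task)

    # Tasks with no unfinished upstream tasks are ready to run.
    ready = [(tasks[t], t) for t, n in pending.items() if n == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, task = heapq.heappop(ready)
        order.append(task)
        for nxt in downstream[task]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                heapq.heappush(ready, (tasks[nxt], nxt))

    if len(order) != len(tasks):
        raise ValueError("dependency cycle detected")
    return order


# Hypothetical tasks: the core KPI job and everything upstream of it
# get a high priority (small number); the rest are pushed back.
tasks = {"ods_load": 1, "adhoc_report": 9, "dwd_orders": 1,
         "core_kpi": 0, "marketing_export": 5}
deps = {"dwd_orders": ["ods_load"],
        "core_kpi": ["dwd_orders"],
        "marketing_export": ["ods_load"]}
print(schedule(tasks, deps))
# ['ods_load', 'dwd_orders', 'core_kpi', 'marketing_export', 'adhoc_report']
```

The example also shows why the dependency picture matters: raising only `core_kpi` would not help if its upstream tasks were left at low priority.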
If there are thousands of tasks with countless relationships among them, and no way to sort them out and act in a short time, keep going.
5. Task code optimization
Core tasks are generally complex and consume more resources, which means there is more room for optimization. When resources were plentiful, nobody may have paid much attention to the quality and efficiency of the code; now it has to be optimized. This depends on the ability of the developers, and strong engineers often show their value at moments like this. We once achieved a good speedup by switching from Hive to Spark.
If the room for code optimization is still limited, keep going.
6. Reduce task dependency
Data warehouse modeling improves the overall efficiency of supporting upper-layer applications through model reuse. For a single application's task, however, having to wait for the warehouse models to be generated slows it down. This is the conflict between local and global optimization. Stripping the core task's dependence on the data warehouse models and building a dedicated data processing pipeline for it can greatly improve its speed. The cost is wasted resources, which worsens the overall resource shortage in the data warehouse. Still, extraordinary times call for extraordinary measures, and sometimes this has to be done to meet the assessment targets.
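The trade-off above can be estimated before any rewrite by comparing the earliest finish time of the core task on the shared model chain with that of a dedicated pipeline. The sketch below does this for a small dependency graph; the task names and durations (in minutes) are invented for illustration, and the calculation assumes an acyclic graph and that independent tasks can run in parallel.

```python
from functools import lru_cache


def finish_time(task, duration, deps):
    """Earliest finish time of `task`, assuming independent tasks run in parallel."""
    @lru_cache(maxsize=None)
    def finish(t):
        # A task starts once its slowest upstream task has finished.
        start = max((finish(u) for u in deps.get(t, [])), default=0)
        return start + duration[t]
    return finish(task)


# Hypothetical durations in minutes.
duration = {"ods_orders": 30, "dwd_orders": 40, "dws_user_wide": 90,
            "core_kpi": 20, "core_kpi_direct": 45}

# Shared path: the core task waits for the full warehouse model chain.
shared_deps = {"dwd_orders": ["ods_orders"],
               "dws_user_wide": ["dwd_orders"],
               "core_kpi": ["dws_user_wide"]}

# Dedicated path: a heavier custom job that reads the raw layer directly.
direct_deps = {"core_kpi_direct": ["ods_orders"]}

print(finish_time("core_kpi", duration, shared_deps))         # 180
print(finish_time("core_kpi_direct", duration, direct_deps))  # 75
```

In this made-up case the dedicated job itself is slower (45 minutes instead of 20) and duplicates work the shared models already do, yet the core result arrives much earlier because it no longer waits for the shared chain.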
If that doesn't work, keep going.
7. Core task disaster recovery
Since the core tasks are so important and a single cluster cannot be fully trusted, do not put all your eggs in one basket. Disaster recovery or emergency backup is standard practice. For example, to make sure certain jobs are safe, a heterogeneous setup is often used, with the core tasks deployed on both a Hadoop cluster and an MPP cluster. The precondition is that the core tasks are not large in scale; otherwise the investment and cost are unbearable. Because data warehouses hold large volumes of data, disaster recovery plans are generally not made for them. A full cluster collapse is a very low-probability event, but cluster performance degradation is a high-probability one: a single Hadoop parameter change, for example, may greatly reduce data processing efficiency.
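A heterogeneous setup like the one described above needs some switching logic on top. The following is a minimal sketch of that logic only: it tries each cluster in order and returns the first successful result. The `run_on_hadoop` and `run_on_mpp` functions are stand-in stubs, not real client calls, and a production version would need timeouts and narrower error handling.

```python
def run_with_fallback(task, runners):
    """Try each (name, run) pair in order; return (name, result) of the first success."""
    errors = []
    for name, run in runners:
        try:
            return name, run(task)
        except Exception as exc:  # sketch only: catch narrower errors in real jobs
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all clusters failed: " + "; ".join(errors))


# Stub runners standing in for real job submission (hypothetical).
def run_on_hadoop(task):
    raise TimeoutError("queue saturated")


def run_on_mpp(task):
    return f"{task} done on mpp"


result = run_with_fallback("core_kpi", [("hadoop", run_on_hadoop),
                                        ("mpp", run_on_mpp)])
print(result)  # ('mpp', 'core_kpi done on mpp')
```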
That is the last technical move. Now let's talk about management methods.
8. Do a good job of explanation
A delay in the core tasks definitely affects the business. In this situation, on the one hand, communicate and report to your superiors in time, work with all parties to analyze the fault, and propose a solution that can actually be implemented. If a subordinate came to me with these 7 options and asked me to make a decision, I would be very satisfied. On the other hand, assess the actual impact of the core task delay on the business so that you have a clear picture, and at the same time explain the situation to the business side and lower its expectations appropriately.
Anyone who can do this has, I think, already surpassed most people, because this is not a purely technical problem; it places high demands on the overall competence of the people handling it.
9. Turn crisis into opportunity
A failure is a challenge for the data warehouse, but also an opportunity. When nothing goes wrong, the business cannot feel the value of the data warehouse, and it is hard to obtain resources. If a fault really does have a large impact on the business, it may make the company reconsider the value of the data warehouse.
I remember once an IT system went down and caused a significant public impact. The analysis afterwards found that the cause was insufficient capacity. The company then held the heads of the planning department, the marketing department, and the IT department equally to blame: the planning department had not planned capacity properly and had invested too little, and the marketing department had raised demands indiscriminately without doing proper business planning. After this incident, IT actually received more attention and more resources to safeguard production, disaster recovery systems of all kinds sprang up, and the system never went down again.
Reference material:
1. WeChat official account "Big fish's data life": "How does the data warehouse cope with the delay of core tasks caused by insufficient resources?"