[Operation and Maintenance Thinking] How to Do Cloud Operation and Maintenance Services Well

2020-11-09 15:20:00 Jiawei Technology

Do you still need operation and maintenance after moving to the cloud? Of course. Moving to the cloud does simplify part of the work, such as day-to-day server maintenance. But because of the nature of cloud computing (ready to use on activation, elastic scaling, and so on) and the fast iteration of cloud products, cloud operation and maintenance is agile where traditional operation and maintenance is steady. How do you do cloud operation and maintenance services well? Beyond making sure the basic service guarantee system is sound, you also need to deliver the value-added core services. At the same time, cloud computing technology changes by the day, and the skills of operations staff need to keep pace. The cloud era is reshaping IT operation and maintenance.

The difference between cloud operation and maintenance and traditional operation and maintenance

Cloud computing is in full swing, spreading from the internet industry into traditional industries such as manufacturing, finance, transportation, and healthcare, and driving their transformation and upgrading. Enterprises are enjoying the large dividend that cloud computing has released. But does moving to the cloud solve everything? Of course not. To enjoy that dividend you need to use the cloud well, and cloud operation and maintenance services are essential.

What is the difference between cloud and traditional operation and maintenance? A cloud and a traditional data center are built for the same goal: both provide IT services to the enterprise. The operation and maintenance responsibility is also the same: to guarantee IT service quality by carrying out activities around the service level agreement (SLA). But because of the nature of cloud computing, cloud operation and maintenance differs from the traditional kind in several ways:

The operation and maintenance objects are different

Traditional operation and maintenance mostly starts from hardware: servers, network devices, storage equipment, and facilities such as ventilation, fire protection, water, and power. In the cloud era, operations staff no longer see any physical equipment, and what they maintain is mostly software.

The operation and maintenance requirements are different

Because of the nature of cloud computing, rapid deployment, fast updates, and real-time monitoring of applications on the cloud platform place higher demands on cloud operation and maintenance. So do services with cloud characteristics, such as elastic scaling and cloud native.

The operation and maintenance form is different

Cloud vendors' products iterate constantly and quickly. Where traditional operation and maintenance is steady, cloud operation and maintenance is agile, and operations staff need to keep updating their knowledge as cloud products are upgraded.

The key points of operation and maintenance services on the cloud

To do cloud operation and maintenance services well, you need to be clear about what the service covers, where its focus lies, and how it is delivered. The cloud operation and maintenance service system is shown in the figure below:

01 Cloud operation and maintenance basic services

The basic services of cloud operation and maintenance cover three areas: monitoring and alerting, security operation and maintenance, and day-to-day problem handling.

Monitoring is the most important part of operation and maintenance, and indeed of the whole product life cycle. Before an incident it gives early warning and detects faults; afterwards it provides detailed data for tracing and locating the problem. Monitoring covers infrastructure, such as host CPU, memory, disk IOPS, and network traffic, as well as application performance monitoring (APM). It is equally important to track whether operations staff have acknowledged an alarm, whether the problem is being handled, and how the handling proceeds and concludes.
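The threshold check and alarm tracking described above can be sketched roughly as follows. This is only an illustration: the threshold values, the `Alarm` fields, and the 15-minute acknowledgment window are assumptions made for the example, not values from any particular cloud monitoring product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical thresholds in percent; real values depend on the workload.
THRESHOLDS = {"cpu": 85.0, "memory": 90.0}


@dataclass
class Alarm:
    metric: str
    value: float
    raised_at: datetime
    acknowledged_by: Optional[str] = None  # who confirmed receipt, if anyone
    resolved: bool = False                 # whether handling has concluded


def check_metrics(sample, now):
    """Return an Alarm for every metric at or above its threshold."""
    return [
        Alarm(metric, value, now)
        for metric, value in sample.items()
        if metric in THRESHOLDS and value >= THRESHOLDS[metric]
    ]


def unacknowledged(alarms, now, max_wait=timedelta(minutes=15)):
    """Alarms nobody has confirmed within max_wait: candidates for escalation."""
    return [
        a for a in alarms
        if a.acknowledged_by is None and now - a.raised_at > max_wait
    ]
```

Keeping the acknowledgment and resolution state on the alarm itself makes it possible to report not only what fired, but also what is still waiting for someone to pick it up.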

Security operation and maintenance includes security hardening, vulnerability scanning, patching, and security architecture optimization. Security hardening checks the security baseline and tightens the baseline configuration. Vulnerability scanning scans applications for security flaws. Patching applies patch updates or vulnerability fixes to operating systems, middleware, and databases. Security architecture optimization improves the security capability of the existing architecture and security products.

Day-to-day problem handling includes installing and configuring the various cloud resource products, scaling their specifications up or down, backup services, and resolving technical problems promptly.

02 Cloud operation and maintenance core services

The basic services of cloud operation and maintenance keep the system running smoothly, but for an enterprise that is not enough. Enterprises want their systems run in a way that is more disciplined, more standardized, lower in cost, lower in risk, and more scalable. Operations staff therefore need to put more effort into the core services.

The core services of cloud operation and maintenance also cover three areas: cloud best-practice standards, cloud cost optimization, and in-depth cloud inspection and optimization.

Cloud best practices mean that deployments on the cloud are configured to the optimal standard. They include architecture best practices, best practices for selecting and configuring cloud resources, and best practices for cloud management. Any enterprise on the cloud wants to use it according to best-practice standards.

For example, the hybrid cloud architecture below is designed to best-practice standards.

The enterprise's on-premises IDC is connected to the public cloud through a dedicated line or VPN. Applications are deployed across availability zones in the public cloud, and load balancing distributes user requests to form a high-availability architecture. Cloud databases are used wherever possible. Cloud security products are deployed at the network, host, application, and database layers to protect applications and data comprehensively. Files and databases are backed up regularly through the cloud-native backup service. A bastion host, access control services, cloud monitoring, and similar products assist operation and maintenance management.

Likewise, the selection and configuration of the five major cloud resource types (compute, storage, network, database, security) and of other resources also need standards. Examples include whether to plan one VPC or several, how to design the subnet ranges of the virtual switches, naming rules for virtual machines, virtual machine selection standards, and principles for opening security group ports. Only when resources are configured to best-practice standards will later operation and maintenance be more efficient and scalability better.
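Standards such as naming rules and security group port principles are easiest to keep when they can be checked automatically. The sketch below is a minimal illustration under assumed rules: the naming pattern, the list of ports allowed to face the internet, and the rule dictionary layout are all hypothetical and would be replaced by an enterprise's own standard.

```python
import re

# Hypothetical naming convention: <env>-<app>-<role>-<two-digit index>
NAME_PATTERN = re.compile(r"(prod|test|dev)-[a-z0-9]+-(web|app|db)-\d{2}")

# Hypothetical policy: only these ports may be open to the whole internet.
PUBLIC_PORTS = {80, 443}


def valid_vm_name(name):
    """True if the virtual machine name follows the assumed convention."""
    return NAME_PATTERN.fullmatch(name) is not None


def risky_rules(rules):
    """Return security group rules that expose a non-public port to 0.0.0.0/0."""
    return [
        rule for rule in rules
        if rule["source"] == "0.0.0.0/0" and rule["port"] not in PUBLIC_PORTS
    ]
```

For instance, `valid_vm_name("prod-shop-web-01")` passes under this assumed pattern, while a rule opening port 22 to `0.0.0.0/0` would be returned by `risky_rules` for review.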

Best practices for cloud management cover backup, disaster recovery, accounts, permissions, and so on.

Another core cloud operation and maintenance service is helping enterprises save costs. According to a research report from the advisory firm RightScale, enterprise users today waste as much as 30% of their cloud spend.

How do you deliver cloud cost optimization well? It requires building a cost optimization system: a continuous cycle that runs from planning beforehand, to analysis during use, to evaluation and improvement afterwards.

What matters most in this system is implementation. For example, how to reduce wasted cloud resources can be examined and analyzed along the 5 dimensions shown in the figure below.
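As a rough illustration of what one such analysis might look like, the sketch below flags instances with very low average CPU use and totals their monthly cost. Idle compute is only one assumed dimension, chosen for the example; the `Instance` fields and the 5% CPU floor are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    monthly_cost: float     # in any currency unit
    avg_cpu_percent: float  # average over the review period


def find_waste(instances, cpu_floor=5.0):
    """Return (idle instances, their combined monthly cost)."""
    idle = [i for i in instances if i.avg_cpu_percent < cpu_floor]
    return idle, sum(i.monthly_cost for i in idle)


def waste_share(instances, cpu_floor=5.0):
    """Fraction of total monthly spend that goes to idle instances."""
    total = sum(i.monthly_cost for i in instances)
    _, wasted = find_waste(instances, cpu_floor)
    return wasted / total if total else 0.0
```

A real review would combine several signals (memory, network, attached storage, reserved versus on-demand pricing) before deciding that a resource is wasted.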

In-depth cloud inspection and optimization means running a comprehensive health check at regular intervals (for example, once a quarter) on architecture, resources, configuration, and cost, finding unreasonable configurations and weak points, and then adjusting and optimizing them to improve resource utilization and reduce the probability of system failure.

An in-depth cloud inspection must cover all cloud resources. The inspection targets include overall architecture, security, cost, network, compute, storage, database, monitoring, accounts, and backup.

Problems found are graded high, medium, or low. High-level problems already affect system operation, availability, or functionality, and immediate action is recommended. For medium- and low-level problems, action or attention is recommended.
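The grading scheme above maps naturally onto a small triage routine. The sketch below is an assumption-laden illustration: the split of "action" for medium and "attention" for low is one possible reading of the recommendation, and any finding descriptions passed in are hypothetical.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Assumed mapping; the article only says medium/low call for action or attention.
RECOMMENDATION = {
    Severity.HIGH: "act immediately",
    Severity.MEDIUM: "plan action",
    Severity.LOW: "keep under attention",
}


def triage(findings):
    """Sort (description, severity) findings most severe first
    and attach the recommendation for each level."""
    ordered = sorted(findings, key=lambda f: f[1], reverse=True)
    return [(desc, sev.name, RECOMMENDATION[sev]) for desc, sev in ordered]
```

Sorting by severity first means the inspection report opens with the items that need immediate action.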

03 Use a tool platform to assist cloud operation and maintenance

Once the workload and complexity of operation and maintenance reach a certain level, an automated tool platform is needed to help, for example with automated inspection, automated release, and automatic handling of alarm events. This improves efficiency and reduces the error rate of manual handling.

In addition, many medium and large enterprises choose a multi-cloud strategy in their cloud planning. How do you manage multiple clouds in a unified way? For example, how do you manage the full life cycle of resources on each cloud platform from a single interface, manage the request and approval workflow for resources, and carry out usage analysis and cost analysis? This is where a multi-cloud management platform is needed.

Further thinking about cloud operation and maintenance services

Cloud operation and maintenance is tedious and demanding work. Beyond routine daily services, special support is sometimes needed. For example, once an enterprise's high-concurrency e-commerce business has moved to the cloud, promotion periods such as 618 and Double 11 often call for dedicated high-concurrency support work, such as full-link stress testing, database tuning, and disaster recovery drills, to keep the system running smoothly during the event.

Cloud computing technology changes by the day. Containers, serverless, microservices, IoT, artificial intelligence, and other emerging technologies bring more advantages and convenience, but they also place higher demands on cloud operations staff, whose skills need to keep pace. The cloud era is reshaping IT operation and maintenance.

Copyright notice
This article was written by [Jiawei Technology]. Please include a link to the original when reposting. Thank you.