Author: Chen Haibo (Boulder)
Midnight. The phone rings. A production emergency. You are jolted awake by a barrage of calls, bleary-eyed and fumbling.
Half in a daze, you finally piece together what happened: an old application crashed without warning, messages piled up, and the system ground to a halt. Like a sewer worker unclogging a toilet, you rely on brute force: restart the machine, restore the service. You watch the backlog of events in the message queue drain away in a rush, and the alerts clear.
The next day, not wanting a repeat of the midnight call, you decide to fix the system "permanently": you add a script that restarts the service periodically.
Time passes. Over many days you have tucked countless scripts into various corners. You have not been woken in the middle of the night for a long time, and you can even enjoy a leisurely afternoon tea. Until today, that is.
Today, your department is told to cut costs and improve efficiency. To avoid becoming one of the costs that gets cut, you decide to take the initiative and operate on the production machines: consolidate services and decommission machines. You know which applications are deployed on each machine, but you no longer remember which pieces of scaffolding keep those applications running.
Rebuilding the environment is next to impossible. You are losing sleep again.
Ever-changing infrastructure
Complex, mysterious, and magical, production environments account for half of all urban legends. At some companies the production environment is an ancient myth handed down by word of mouth.
I have never seen a rainstorm in the desert, and I have never heard anyone explain clearly what the environment actually is. I have never seen the sea kiss a shark, but I have seen machines that never came back after a reboot.
If you can say exactly which applications are deployed on any given machine, and what the dependencies between the services on it look like, you are already ahead of 80% of your peers. The root cause of this situation is the mutability of infrastructure.
Seemingly routine operations changes, documented or not, make every environment one of a kind. Add staff turnover, and you end up with countless "secret ancestral recipes".
What is immutable infrastructure
As containers have become widely accepted, the idea of "immutability" has quietly been accepted along with them.
When an application container dies, any changes made inside it disappear with it. "Do not make changes inside the container" is the simplest form of immutable thinking. When a program hits a problem at startup, few people treat patching the running container as a serious fix; instead they go back to the beginning and solve the problem at the image build stage.
What follows is boundless trust in the container image. The container suddenly misbehaves? Try rebuilding it. It will not run on another machine? It must be the machine's fault.
What do containers bring
It is widely agreed that Docker, or containers in general, started a revolution. But why are containers a revolution, and what exactly did they overturn?
One view is that Docker simplified service management by making it easier to stop and start services. In practice, though, operations tools such as systemd or supervisor are not necessarily harder to use than docker.
I think image technology triggered the revolution: an unmodifiable, frozen, self-contained deliverable; a true "build once, run anywhere"; an entity that can be torn down and deployed quickly. No more "Works on my machine". It is that reassuring and reliable.
Precisely because rebuilding a container is so fast, we get to push "try restarting it" to the extreme. Something is wrong? Rebuild the container and see.
You want to try Docker, but your business is so complex. And what is the essential difference between rebuilding a container on a machine and simply restarting the service on that machine?
Pet vs Cattle
Then Kubernetes arrived. Managing containers at scale became possible, and defining best practices for managing containers at scale became a hot topic.
How to raise a cat
If you got a cat, how would you raise it? I remember that when I brought a cat home, it cost a lot of time and brain cells to settle on a name that both the cat and I could accept.
Then I took my kitten to the vet and set up a vaccination plan to keep it healthy. After that came the daily routine of scooping litter and waiting on His Majesty.
The Little Prince told me: it is the time you have spent on your cat that makes your cat so important.
Classic operations work much like keeping a cat; we can call it pet-style operations. You draw up a detailed plan for each machine and application, and you even give the environment a name, for example production.
You also take good care of this environment: you check the monitoring regularly and carry out upgrades and maintenance. It is the time you have spent on your environment that makes it so important.
It is hard to imagine your cat or your environment suddenly leaving you. You would probably be heartbroken, and possibly bankrupt.
An introduction to ranching
After getting a cat, you unexpectedly come into a ranch as well. But you probably cannot raise thousands of cattle the way you raise a cat, because thousands of names would be hard to remember.
You would probably not take your cattle to the vet one by one for vaccination; why not buy vaccines wholesale and vaccinate the whole herd at once? If a calf has a serious defect, you might stay by its side and nurse it carefully, but more likely you would remove it from the process as soon as possible to avoid wasting more feed.
From then on, you stop worrying about individuals: as long as the herd has enough supporting facilities, what difference does the state of any single cow make to the whole?
Herding-style operations
Once you have built solid facilities for thousands of cattle, you will be surprised to find that adding another 100 head barely increases the cost. Keeping cats is different: when you have two cats, you had better have three litter boxes, or you will get to experience utter chaos first-hand.
You make up your mind to become a herder and stop scooping litter for the masters of the production environment. Embrace Kubernetes, and manage your applications like a herd.
Herding a group of Pods
You put the masters of the production environment into containers and standardize deployment through container images, so that nobody is tempted to perform non-standard operations. You then deploy applications as Pods and no longer care which Node an application starts on; the ranch (Kubernetes) takes care of all that itself.
A Pod misbehaves? No problem. Delete it and see; the next one will be a brand-new instance. The ranch carries on as normal.
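Deleting a Pod is safe because something keeps comparing the desired number of instances with the ones that are actually healthy and fills the gap. The following is only a toy sketch of that idea in Python, not Kubernetes code; `Pod`, `reconcile`, and the `app-N` naming scheme are made up for illustration.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Pod:
    # Hypothetical stand-in for a running application instance.
    name: str
    healthy: bool = True


def reconcile(pods, desired, new_names):
    """Drop unhealthy Pods and create fresh ones until `desired` are running."""
    # Unhealthy Pods are never repaired in place; they are simply discarded.
    survivors = [p for p in pods if p.healthy]
    # Scale in if there are more healthy Pods than desired.
    survivors = survivors[:desired]
    # Replace whatever is missing with brand-new instances.
    while len(survivors) < desired:
        survivors.append(Pod(name=next(new_names)))
    return survivors


names = (f"app-{i}" for i in count(1))
herd = reconcile([], desired=3, new_names=names)    # creates app-1, app-2, app-3
herd[1].healthy = False                             # app-2 misbehaves
herd = reconcile(herd, desired=3, new_names=names)  # app-2 is replaced, not fixed
print([p.name for p in herd])                       # ['app-1', 'app-3', 'app-4']
```

Note that the replacement gets a new name: nobody remembers app-2, which is exactly the point of cattle.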
Everything is silky smooth, and life is peaceful.
Midnight. The phone rings. A production emergency. You are jolted awake by a barrage of calls, bleary-eyed and fumbling.
Half in a daze, you finally piece together what happened: a surge of traffic came out of nowhere and the Nodes collapsed like an avalanche. Like a sewer worker unclogging a toilet, you rely on brute force: add machines, restore the service. The Pending Pods disappear from view, and the alerts clear.
The next day, not wanting a repeat of the midnight call, you sink into thought. As a rancher, you gradually realize you have a blind spot right under your nose.
You can easily make every cow uniform, but you cannot make the whole ranch uniform.
Herding the real infrastructure
After going over it several times, you find the crux of the problem: you may be herding Pods, but you are still operating the machines as pets.
You still take good care of every machine: you plan ahead, pick a name, set specifications individually, and choose the operating system. There are even a few favorites that you fuss over day and night and that have their own internal nicknames.
As a leading rancher in the industry, you decide to overhaul your ranch. What if you could manage Nodes the same way you manage Pods?
If a Node misbehaves, just delete it and wait for a new one? Before you finish the thought, a chill runs down your spine. You have never trusted virtual machines the way you trust containers, even though you know that a reboot fixes 90% of failures and reinstalling the system fixes 99%.
Reason wins out and you still want to try. Thinking it over, you pin down two pain points, the virtual machine and the operating system, and take them one at a time.
Taming the virtual machine
After some research, you find that things are not as unfathomable as you thought. The major cloud vendors have long offered the raw tools; you only need to tame them a little before they can serve your ranch.
Cloud vendors have long offered elastic scaling groups, which maintain a number of virtual machines based on load and a desired count. Alibaba Cloud has its own implementation (ESS), which can scale ECS instances out and in according to configured rules, without human intervention.
You see hope. Isn't that a Deployment for Nodes? But merely adding virtual machines is of no value to you. What you need is a cattle pen, not lumber; you need to tame them.
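To make the "Deployment for Nodes" comparison concrete, here is a small Python sketch of how a scaling group might turn load into a desired machine count. The proportional rule and every name in it (`desired_capacity`, the percentage parameters, the bounds) are assumptions chosen for illustration; this is not how ESS or any particular vendor computes capacity.

```python
def desired_capacity(current_nodes, avg_util_pct, target_util_pct,
                     min_nodes, max_nodes):
    """Proportional scaling rule, clamped to the group's bounds.

    Utilization is given in whole percent so the arithmetic stays in
    integers and rounding up is exact.
    """
    # Ceiling division: how many nodes would bring utilization to the target.
    wanted = (current_nodes * avg_util_pct + target_util_pct - 1) // target_util_pct
    # Never go below the floor or above the ceiling of the group.
    return max(min_nodes, min(max_nodes, wanted))


print(desired_capacity(4, 90, 60, min_nodes=2, max_nodes=10))   # 6
print(desired_capacity(4, 15, 60, min_nodes=2, max_nodes=10))   # 2
print(desired_capacity(8, 100, 50, min_nodes=2, max_nodes=10))  # 10
```

The group then creates or releases machines until the actual count matches this number, much as a Deployment does with Pods.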
At this point you glance back at your Kubernetes cluster and inspiration strikes: why deal with virtual machines directly? As long as newly added machines are brought under the cluster's management, the basic problem is solved.
No sooner said than done. Through the startup command defined in AutoScaling, you standardize the installation and run the kube join action after boot. Shortly after a machine starts, it shows up in the Kubernetes cluster. You feel one step closer to your goal.
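What such a startup command could look like depends on how the cluster was built. As a hedged sketch, assume a kubeadm-based cluster and a machine image that already contains the container runtime, kubelet, and kubeadm; the Python helper below (`render_user_data` is a made-up name) renders a boot script that runs `kubeadm join`. The endpoint, token, and hash are placeholders, not real values.

```python
import shlex


def render_user_data(api_endpoint, token, ca_cert_hash):
    """Render a boot script that joins the machine to the cluster.

    Assumes a kubeadm-based cluster; other installers use other commands.
    """
    join_cmd = " ".join([
        "kubeadm", "join", shlex.quote(api_endpoint),
        "--token", shlex.quote(token),
        "--discovery-token-ca-cert-hash", shlex.quote(ca_cert_hash),
    ])
    # Fail fast so a half-configured machine never lingers unnoticed.
    return "\n".join(["#!/bin/bash", "set -euo pipefail", join_cmd, ""])


script = render_user_data(
    api_endpoint="10.0.0.1:6443",          # placeholder API server address
    token="abcdef.0123456789abcdef",       # placeholder bootstrap token
    ca_cert_hash="sha256:" + "0" * 64,     # placeholder CA certificate hash
)
print(script)
```

The rendered script would be supplied as the scaling group's instance startup data, so every machine the group creates joins the cluster without anyone logging in.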
Taming the OS
Soon you run into a new problem. A conventional OS cannot compete with a container on startup time. All the dependencies are inside the containers, and all the applications already run in containers, yet you still pay for built-in OS services that nobody uses. These services slow down startup and introduce security vulnerabilities.
What is more, there are always people who do not see the benefits of homogeneous management. Every so often someone makes undocumented changes on a machine, so each time you want to release a virtual machine you have to announce it in the group chat to avoid some mysterious bug. Self-inflicted trouble, you think with a wry smile.
You plan to tame the OS by trimming the traditional OS down and removing everything that containers do not need, which should noticeably speed up machine startup. It would also be best to post a notice somewhere: please do not write to the machine, to avoid losing files when it is released.
One day you discover Alibaba Cloud ContainerOS, an operating system tailored and optimized for containers. You do not have to trim anything yourself, and you do not even need the notice: the RootFS is entirely read-only and even SSH is off by default, which rules out non-standard operations at the root.
You try it out. The container-optimized OS boots remarkably fast: instances come up almost as soon as you click, and workloads can be scheduled within a minute. Managing Nodes like Pods: you see hope.
Managed herding
But soon you hit a new problem. You have barely been creating machines for a while when your boss tells you that a batch of your machines has a serious CVE security vulnerability and something must be done quickly. Your head starts to spin, because you know that besides the existing nodes, every new machine you create now carries the same vulnerability.
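The CVE has two halves: new machines must stop being created from the vulnerable image, and existing machines must be swapped out without taking the whole ranch down at once. The Python sketch below covers only the second half, as a planning step; `plan_replacement`, the node names, and the image labels are hypothetical, and it assumes the scaling group has already been switched to the patched image.

```python
def plan_replacement(nodes, patched_image, batch_size):
    """Group nodes that are not on `patched_image` into replacement batches.

    `nodes` maps a node name to the image it was created from.
    """
    # Anything not on the patched image is stale and must be replaced.
    stale = sorted(name for name, image in nodes.items() if image != patched_image)
    # Replace a few nodes at a time so the remaining capacity can carry the load.
    return [stale[i:i + batch_size] for i in range(0, len(stale), batch_size)]


nodes = {
    "node-a": "os-v1",
    "node-b": "os-v1",
    "node-c": "os-v2",  # already created from the patched image
    "node-d": "os-v1",
}
batches = plan_replacement(nodes, patched_image="os-v2", batch_size=2)
print(batches)  # [['node-a', 'node-b'], ['node-d']]
```

Each batch would then be drained and deleted, and the scaling group would bring up replacements from the patched image before the next batch starts.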
You have a vague feeling that you are heading into deep water.
After some searching, you hear that someone has proposed that running a cluster should be like autonomous driving, and you also come across Alibaba Cloud ACK managed node pools. Node scale-out and scale-in, self-recovery from node failures, security hardening, a managed OS: every item strikes a chord. You realize you should solve the problem at its root: let go of self-built Kubernetes and fully embrace managed services.
After embracing managed services, everything becomes clear. It should have been this simple all along. Years of exploration were like a seed, and the moment you saw managed node pools, it sprouted and bore fruit.
From now on, a new set of three go-to moves spreads across the land: wait and see whether it heals itself? Delete the Pod and see? Delete the Node and see? Plain and effective.
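The three moves form an escalation ladder: try the cheapest step first and stop as soon as the system is healthy again. A toy Python sketch, with a simulated fault standing in for a real cluster; `remediate`, the step names, and the `state` dictionary are all made up for illustration.

```python
def remediate(is_healthy, steps):
    """Run remediation steps in order; return the name of the one that worked."""
    for name, action in steps:
        action()
        if is_healthy():
            return name
    return None  # nothing helped; time to wake a human


# Simulated fault that only goes away when the Pod is replaced.
state = {"fault": "pod"}


def wait_for_self_heal():
    pass  # a real version would wait a while before re-checking


def delete_pod():
    if state["fault"] == "pod":
        state["fault"] = None


def delete_node():
    state["fault"] = None


steps = [
    ("wait", wait_for_self_heal),
    ("delete pod", delete_pod),
    ("delete node", delete_node),
]
fixed_by = remediate(lambda: state["fault"] is None, steps)
print(fixed_by)  # delete pod
```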
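Copyright notice from the original site
This article was created by [InfoQ]. Please include a link to the original when reposting. Thank you.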
https://yzsam.com/2022/188/202207070111176425.html