[Cloud Resident Co-creation] Data center PUE optimization model generation service: a money-saving strategy for data centers under the AI wave, just use these tips
2022-06-09 07:43:00 [Huawei Cloud]
Table of contents
- Preface
- 1. Saving energy in the data center saves a lot of money
- 1.1 Worldwide connectivity drives sustained, rapid growth of the data center market
- 1.1.1 User connections proliferate
- 1.1.2 Global data center infrastructure is developing rapidly
- 1.2 Data center growth leads to excessive power consumption
- 1.3 Data center energy saving is an inevitable trend
- 1.4 What is PUE?
- 1.5 How data center cooling works
- 1.6 Traditional energy-saving technologies and their bottlenecks
- 1.6.1 Traditional single-system regulation
- 1.6.2 Traditional whole-system optimization
- 1.6.3 The bottlenecks of traditional energy-saving technology
- 2. The secrets of NAIE data center energy-saving technology
- 2.1 AI has become a new direction for data center energy saving
- 2.2 The secrets of Huawei NAIE data center energy saving
- 2.3 Cooling energy consumption modeling: concentrated firepower
- 2.4 Cooling energy consumption prediction: precision guidance
- 2.5 Predicting equipment operating conditions to keep the equipment safe
- 2.6 How are control parameters decided?
- 2.7 The secret manual: Bayesian optimization
- 2.8 Updating the model at the right time
- 2.9 The trump card: NAIE cloud-ground collaboration
- 2.10 Huawei XX cloud data center: average annual PUE reduced by 8-12%
- 3. The NAIE model generation service gives you a head start
- 3.1 With so many cooling technologies, where do you start?
- 3.2 Does modeling come with a veteran expert?
- 3.3 The data center PUE optimization model generation service gives you a head start
- Summary

How significant is a data center's electricity bill, and can saving energy really save a lot of money?
1.1 Worldwide connectivity drives sustained, rapid growth of the data center market
We now live in a fully connected world. How many connections are there? Let's look at a set of statistics, shown in the figure below:

We can see from the figure above that:
- In 2015 there were 7 billion smart terminals worldwide; by 2025 the number is expected to grow 5.6-fold to 40 billion.
- In 2015 there were 20 billion connections worldwide; by 2025 the number is expected to grow 5-fold to 100 billion.
- In 2015 global data traffic was 9 ZB; by 2025 it is expected to grow 20-fold to 180 ZB.
This massive data growth reflects the consumption of a large number of services, and hosting those services requires a large number of data centers.
Another set of data, from MarketsAndMarkets, on global data center infrastructure shows the rapid development of that infrastructure more clearly; see the figure below:

We can see from the figure above that:
- In 2017 the total value of global data centers was about 130.7 billion US dollars and rising year by year; it is expected to reach 490 billion US dollars in 2022.
Operating and maintaining data centers at this scale inevitably consumes enormous amounts of power.
Let's take a specific case and analyze the composition of a large data center's operating costs over 10 years, as shown in the figure below:

We can see from the figure above that:
- 70% of the data center's operating cost goes to the electricity bill.
- Of that electricity bill, 70% powers the servers, which is unavoidable consumption; only 30% goes to cooling, lighting, offices, and so on.
So how do we quantify data center power consumption? According to statistics:
- Data centers worldwide account for 3% of global electricity consumption, growing by more than 6% a year, equivalent to the output of 30 nuclear power plants (2017).
- In China alone, data centers consume 120 billion kWh of electricity a year, more than the annual output of the Three Gorges power station (100 billion kWh in 2017).
- Three years of a data center's electricity bills would pay for building a new data center.
For an enterprise, saving on electricity is equivalent to increasing profit.
Beyond the internal operational challenge of excessive electricity consumption, government policies and regulations also impose strict energy-efficiency requirements, so data center energy saving has become an inevitable trend. See the table below:
| Institution | Policy | Requirements |
|---|---|---|
| Ministry of Industry and Information Technology, National Government Offices Administration, National Energy Administration | "Guidance on Strengthening the Construction of Green Data Centers" | By 2022, new large and super-large data centers must reach PUE < 1.4 |
| Beijing Municipal Government | "List of Prohibitions and Restrictions on New Industries in Beijing" | Building or expanding data centers in the central urban area is prohibited |
| Shanghai Municipal Government | "Shanghai 13th Five-Year Plan for Energy Conservation and Climate Change" | New data centers: PUE < 1.3; existing data centers: PUE < 1.4 |
| Shenzhen Development and Reform Commission | "Circular of the Shenzhen Municipal Development and Reform Commission on Matters Related to the Energy Conservation Review of Data Centers" | Tiered energy-consumption support for PUE < 1.4; new data centers are encouraged to reach PUE < 1.25 |
| Joint Research Centre of the European Commission | Data center code of conduct (the EU Code of Conduct for Data Centers) | Encourages data center operators to reduce energy consumption and presents annual awards (PUE Best Practice) |
| United States federal government | Data Center Optimization Initiative (DCOI) | Recommends that data centers monitor metrics such as PUE targets, virtualization, and server utilization |
From the table above we can see that:
- The Ministry of Industry and Information Technology's "Guidance on Strengthening the Construction of Green Data Centers" requires new data centers to reach PUE < 1.4. Beijing, Shanghai, and Shenzhen have issued regulations of their own; Shenzhen in particular encourages new data centers to reach PUE < 1.25, which is a very challenging figure.
The policies and regulations above all refer to a PUE value. So what is PUE?
Power Usage Effectiveness (PUE): the data center industry measures energy efficiency by measuring PUE, that is, how effectively electrical power is used.
The power-consuming units of a data center and their composition are shown in the figure below:

Google's PUE measurement standard is:

Note: the more energy measurement points there are, and the closer they are to the IT equipment, the more reliable the final calculated PUE value.
A PUE of 2.0 means that for every watt consumed by the IT equipment, another watt is consumed to cool it and to distribute power to it. A PUE close to 1.0 means that almost all of the energy is used for computing.
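To make the ratio concrete, here is a minimal sketch of the calculation, assuming the usual definition of PUE as total facility energy divided by IT equipment energy. The function name and the sample figures are illustrative only; the second example simply reuses the 70% / 30% electricity split quoted earlier.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy divided by IT energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# PUE = 2.0: each watt of IT load needs one more watt for cooling and distribution.
print(pue(total_facility_kwh=2.0, it_equipment_kwh=1.0))  # 2.0

# Hypothetical site where 70% of the electricity powers the servers
# and 30% goes to cooling, lighting, offices, and so on.
print(round(pue(total_facility_kwh=100.0, it_equipment_kwh=70.0), 2))  # 1.43
```

Under that split, such a site would sit at roughly PUE 1.43, just above the PUE < 1.4 targets in the policy table.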
As mentioned above, cooling accounts for about two-thirds of a data center's non-IT energy consumption, so reducing cooling energy consumption is an excellent starting point for cutting data center costs.
The structure of the water-cooled chiller plant used for data center cooling is shown in the figure below:

How a water-cooled chiller plant works:
- Chiller: compresses the refrigerant and, through the refrigerant's phase change, moves heat from the evaporator to the condenser.
- Cooling water pump: drives cooling water through the cooling tower and the chiller, completing the cooling water loop.
- Cooling tower: its fans drive airflow that dissipates the cooling water's heat into the outside air, cooling the water.
- Chilled water pump: drives chilled water through the LCU terminals and the chiller, completing the chilled water loop.
- Terminal air conditioner: its fans drive airflow, and the chilled water absorbs heat from the air, lowering the ambient temperature.
For such a complex power-consuming system, how has energy traditionally been saved?
The structure of single-system regulation is shown in the figure below:

Its core is:
- Tuning a single device.
- Tuning the efficiency of a single system (for example, the ratio of compressors to water pumps).
The structure of whole-system optimization is shown in the figure below:

Its core is:
- Relying on experience: veteran experts set the best system operating conditions (for example, coordinating cooling towers, chillers, and terminals).
The bottlenecks of traditional energy-saving technology are:
- Product-level energy-saving technology is close to its ceiling.
- The system is complex and has many devices, and the energy consumption of those devices is interrelated in ways that traditional engineering formulas struggle to model. Traditional controls each act on their own, and expert experience has reached its limit.
- Each data center has a unique environment and architecture. Although many engineering practices and rules of thumb apply broadly, a model customized for one system does not guarantee success in another.
According to research data, 70% of users believe AI should be applied in the data center; see the figure below:

Gartner: by 2020, the 30% of data centers that are not ready for AI will no longer be economical to operate.
Gartner also lists three ways in which artificial intelligence can improve the daily operation of a data center:
- Using predictive analytics to optimize workload distribution, balancing storage and computing load in real time.
- Using machine learning algorithms to process transactions in the best way, and using AI to optimize data center energy consumption.
- Relieving staff shortages by automatically performing system updates and applying security patches.
The industry already has considerable experience using AI to save energy in data centers. For example, Jim Gao worked with the DeepMind team, using neural networks to predict PUE, data center temperature, and load pressure, and controlling about 120 data center variables to reduce PUE; see the figure below:

Baidu uses a deep learning neural network prediction model, tested in its K2 intelligent building project. At the Baidu Yangquan cloud data center, AI automatically decides and switches the chillers' operating mode according to outdoor humidity, temperature, and load; see the figure below:

Huawei NAIE data center energy saving covers many aspects; here we only cover energy saving in the cooling system. By adjusting the cooling system purposefully, the system can be brought to a better operating state.
Through raw-data feature engineering, energy consumption prediction and safety models, and control parameter optimization, we arrive at the ultimate "trump card"! The implementation is shown in the figure below:

As for what the "trump card" is, we will keep you in suspense for now.
As mentioned above, by 2020 the 30% of data centers that are not ready for AI will no longer be economical to operate. Many data centers are therefore gradually preparing for AI by storing historical data and samples. When the number of samples is large, deep learning networks can be used to model energy consumption at full firepower, training multiple networks; see the figure below:

The networks are evaluated repeatedly during training. A network whose accuracy falls short can be discarded, or a deep residual network (ResNet) can be used instead, which handles the vanishing gradient problem better than a traditional network. In practice the former approach already solves about 80% of cases, and the remaining 20% can be modeled with ResNet.
If we only have a small number of samples, concentrated firepower is not an option because deep learning networks cannot be trained well. In that case we need precision guidance on the original system, using algorithms such as K-nearest neighbors and Gaussian process regression; see the figure below:
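As an illustration of the small-sample case, the following is a minimal K-nearest-neighbor regression sketch using only the Python standard library. The feature choice (outdoor temperature, IT load), the sample values, and the function name are all hypothetical; it is not the NAIE implementation, which the article does not show.

```python
import math


def knn_predict(samples, query, k=3):
    """Predict energy use as the mean of the k nearest historical samples.

    samples: list of (features, energy_kw) pairs, features being a tuple of floats.
    query:   feature tuple to predict for.
    """
    if not 1 <= k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")
    # Sort the history by Euclidean distance to the query and keep the closest k.
    by_distance = sorted(samples, key=lambda s: math.dist(s[0], query))
    nearest = by_distance[:k]
    return sum(energy for _, energy in nearest) / k


# Hypothetical history: (outdoor temperature in C, IT load in kW) -> cooling power in kW.
history = [
    ((10.0, 400.0), 90.0),
    ((15.0, 400.0), 100.0),
    ((20.0, 420.0), 120.0),
    ((25.0, 450.0), 150.0),
    ((30.0, 450.0), 170.0),
]
print(knn_predict(history, (18.0, 410.0), k=2))  # 110.0
```

In a real setting the features would normally be scaled to comparable ranges first, since otherwise the feature with the largest numeric range dominates the distance.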

The cooling system is a safety-critical system, and safety comes first. Some readers may have questions:
Q: Wouldn't turning off all the equipment save the most energy?
A: The cooling capacity of the cooling equipment must exceed the heat generated by the IT equipment, that is, 4.2 × M × ΔT > safety factor × 3.6 × IT_energy, so the equipment must never all be turned off. (M and ΔT are the chiller's flow and temperature difference as inferred by the safety assurance model, and they are never 0.)
Q: Using fewer devices always saves more power than using more, right?
A: If you run only one pump, its operating frequency may exceed 56 Hz. Whether that saves power is unclear, but the pump may well be damaged. As the saying goes, "without the skin, the hair has nothing to cling to."
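The two answers above can be read as safety checks that a candidate set of control parameters must pass. The sketch below is one possible reading, not the article's safety model. It assumes M is the chilled water flow in t/h, ΔT is in degrees Celsius, and IT_energy is the IT load in kW, so that both sides of the inequality are in MJ/h. The safety factor of 1.2 and the frequency limit are placeholders that a site would set from its own operations experience.

```python
def cooling_is_sufficient(flow_m, delta_t, it_energy, safety_factor=1.2):
    """Check 4.2 * M * dT > safety_factor * 3.6 * IT_energy (units assumed, see lead-in)."""
    return 4.2 * flow_m * delta_t > safety_factor * 3.6 * it_energy


def pump_frequency_ok(predicted_hz, max_hz):
    """Reject parameters whose predicted pump frequency exceeds the site limit."""
    return predicted_hz <= max_hz


def parameters_are_safe(flow_m, delta_t, it_energy, predicted_hz, max_hz,
                        safety_factor=1.2):
    """A parameter set is safe only if it passes both checks."""
    return (cooling_is_sufficient(flow_m, delta_t, it_energy, safety_factor)
            and pump_frequency_ok(predicted_hz, max_hz))


# Hypothetical operating point: 500 t/h flow, 5 C temperature difference, 2000 kW IT load.
print(cooling_is_sufficient(500.0, 5.0, 2000.0))            # True  (10500 > 8640)
print(cooling_is_sufficient(0.0, 5.0, 2000.0))              # False (everything off)
print(parameters_are_safe(500.0, 5.0, 2000.0, 56.0, 50.0))  # False (pump overdriven)
```

With zero flow the left-hand side is 0, which is why "turn everything off" can never satisfy the inequality.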
Besides predicting energy consumption, we also predict operating conditions such as the pump frequency. If the prediction exceeds what operations experience allows, the control parameters are considered unreasonable; see the figure below:

Once the energy consumption prediction model and the safety model are in place, a decision has to be made about the current control parameters.
We treat the control parameters as the independent variables and energy consumption as the function value, which yields an energy consumption hypersurface in N-dimensional space. Each point on the surface represents the energy consumption for one set of control parameters. Because not every set of control parameters is safe, the surface has some holes; see the figure below:

So how do we find, on the surface above, a relatively optimal set of control parameters that reduces energy consumption while keeping the parameters safe?
This is where we draw on the NAIE training platform SDK, using heuristics and Bayesian optimization.
The Bayesian optimization process is shown in the figure below:

Note: in the figure above, the black line is the true function and the two black points are observations that have already been sampled. A Gaussian process regression model is built from them: the black dashed line is its predicted value, the purple region shows the uncertainty at each point, and the green curve is the acquisition function. The process iterates over different sampling points to find the optimum.
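The loop described above can be sketched in a few dozen lines of standard-library Python. This is a toy, one-dimensional illustration and not the NAIE SDK: it assumes a zero-mean Gaussian process with an RBF kernel on mean-centered observations, a lower-confidence-bound acquisition function, a hypothetical normalized energy curve, and a hypothetical unsafe "hole" in the parameter range that is never sampled.

```python
import math


def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def gp_posterior(xs, ys, x_new, noise=1e-6):
    """Posterior mean and standard deviation of the GP at x_new."""
    mean_y = sum(ys) / len(ys)
    k_matrix = [[rbf(a, b) + (noise if i == j else 0.0)
                 for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    weights = solve(k_matrix, k_star)  # K^-1 k*, valid because K is symmetric
    mean = mean_y + sum(w * (y - mean_y) for w, y in zip(weights, ys))
    var = rbf(x_new, x_new) - sum(w * k for w, k in zip(weights, k_star))
    return mean, math.sqrt(max(var, 0.0))


def bayes_opt(objective, is_safe, candidates, initial, n_iter=12, kappa=2.0):
    """Minimize objective over the safe candidates; returns (best_x, best_y, tried)."""
    xs = list(initial)
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        pool = [c for c in candidates if is_safe(c) and c not in xs]
        if not pool:
            break

        def lower_confidence_bound(c):
            mean, std = gp_posterior(xs, ys, c)
            return mean - kappa * std  # low predicted energy or high uncertainty

        nxt = min(pool, key=lower_confidence_bound)
        xs.append(nxt)
        ys.append(objective(nxt))
    best = min(range(len(xs)), key=ys.__getitem__)
    return xs[best], ys[best], xs


def energy(x):
    """Hypothetical normalized energy curve with its minimum at x = 0.3."""
    return (x - 0.3) ** 2


def safe(x):
    """Hypothetical safety model: settings between 0.6 and 0.7 are unsafe."""
    return not (0.6 <= x <= 0.7)


grid = [i / 50 for i in range(51)]
best_x, best_y, tried = bayes_opt(energy, safe, grid, initial=[0.0, 1.0])
print(best_x, best_y)
```

Filtering the candidate pool with `is_safe` is a simple way to respect the holes in the hypersurface; a production system would more likely fold the safety model into the acquisition step or use one of the libraries listed below.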
Several open-source implementations of Bayesian optimization are worth knowing:
- SMAC: Bayesian optimization using a random forest as the performance prediction model, https://github.com/automl/SMAC3
- Hyperopt: Bayesian optimization using TPE as the performance prediction model, https://jaberg.github.io/hyperopt/
- Spearmint: Bayesian optimization using a GP (Gaussian process) as the performance prediction model, https://github.com/HIPS/Spearmint
Q: As time goes on and more and more samples are collected, won't the model become "obsolete"?
A: Yes, it will. As more samples are collected, we need to update not just one model but a dozen or so; see the figure below:

This raises further questions:
- Should the model be updated? When should an update be triggered, and how should it be done?
- As mentioned, there are a dozen or so models. Should they all be updated at the same time?
NAIE cloud-ground collaboration is the "trump card" we mentioned above!
NAIE cloud-ground collaboration connects the cloud side with the ground (on-site) side and automates the whole process: uploading collected data to the cloud, daily model evaluation, retraining, and model updating. The structure is shown in the figure below:
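The article does not describe how the daily evaluation decides which models to retrain. As one possible sketch, each model's recent prediction error could be compared against a threshold and only the models that drift past it retrained. The MAPE metric, the 5% threshold, and the model names below are assumptions made for illustration.

```python
def mape(actual, predicted):
    """Mean absolute percentage error between measured and predicted values."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def models_to_retrain(daily_eval, threshold=0.05):
    """Return the names of models whose latest daily error exceeds the threshold.

    daily_eval maps a model name to a pair (actual values, predicted values).
    """
    return sorted(
        name
        for name, (actual, predicted) in daily_eval.items()
        if mape(actual, predicted) > threshold
    )


# Hypothetical daily evaluation for two of the "dozen or so" models.
todays_eval = {
    "chiller_power": ([100.0, 110.0, 120.0], [101.0, 109.0, 121.0]),
    "pump_frequency": ([40.0, 42.0, 45.0], [46.0, 47.0, 50.0]),
}
print(models_to_retrain(todays_eval))  # ['pump_frequency']
```

Evaluating each model separately means the models do not all have to be updated at the same time; only those whose accuracy has degraded are sent for retraining.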

With the scheme above in place, the Huawei XX cloud data center reduced its average annual PUE by 8-12%; see the figure below:

Cooling technologies come in many varieties and piping layouts differ widely, so where do you start? Data centers are likely to differ in cooling mode (water-cooled, air-cooled, AHU, and so on) and in piping type (header pipe, single pipe, mixed pipe). The classification is shown in the figure below:

But don't worry, the NAIE model generation service has this covered for you.
We all know that modeling is specialized technical work. Does a veteran expert come bundled with it?
For the developers of a single data center, going from energy-saving modeling to model deployment takes a development team of 4 people about 6 months, so even having a veteran expert on hand is not enough. The process for one data center is shown in the figure below:

The data center PUE optimization model generation service provides a detailed solution to these problems. Interested readers are welcome to take a look at the service: https://www.hwtelcloud.com/products/dpo, as shown in the figure below:

You can also scan the QR code to get there, as shown in the figure below:


This article is compiled from the Huawei Cloud community [Content Co-creation] activity, issue 13.
https://bbs.huaweicloud.com/blogs/330939
Task 13: A money-saving strategy for data centers under the AI wave, just use these tips