Full-link stress testing guide for e-commerce promotions
2022-07-07 09:01:00 【bboyzqh】
For the overall guide to e-commerce promotions, see the previous article: Battle guide for e-commerce promotion. Because space was limited there, full-link stress testing is described separately here.
Contents
1. Confirming the stress test personnel
2. Sorting out core link services
3. Traffic estimation and stress test model output
3.3 Stress test model output
4. Stress test data preparation and joint debugging
5.2 Long-duration stress testing across scenarios
6. Stress test recovery
7. Stress test retrospective
Full-link stress testing is time-consuming and labor-intensive, so it needs a detailed and thorough plan. The key milestones involved are shown in the figure below:
In addition, before the full-link stress test, the applications on the core link must first eliminate their known performance problems; otherwise the test schedule will be affected. In the author's experience, no full-link stress test has gone through completely and smoothly the first time; it usually takes 2 to 3 rounds to reach the expected result. The following sections describe the process and the points to watch in each stage.
1. Confirming the stress test personnel
The stress test team needs to be both broad and sharp. "Broad" means the team should cover the owners of the core link applications, operations, DBAs, and the owners of important middleware. "Sharp" means the participants need to be familiar with the upstream and downstream of their applications and be able to resolve performance problems quickly during the test.
Application | Owner |
Shopping guide | Fang Chen |
Transaction | ** |
Marketing | ** |
...... | ...... |
2. Sorting out core link services
Sorting out core link services covers two main items: the strong and weak dependencies between applications (which drive rate limiting, degradation, and so on) and high-risk businesses (for example, applications that have had a major version upgrade since the last promotion, or businesses that have never been through a major promotion). It is best to produce an application map at this point and mark the core link on it. An example follows:
Image source: Geek Time, "Full-Link Stress Testing in 30 Lectures"
After the core links are sorted out, each application owner needs to sort out the core interfaces. The result is used in three places: performance metric monitoring, configuration of stability measures such as rate limiting, and the stress test itself. An example follows:
Image source: Geek Time, "Full-Link Stress Testing in 30 Lectures"
In addition, once the high-risk businesses are identified, each needs a clear stability plan. For example, if the business can be degraded, a degradation plan is required; if it cannot, an alternative or emergency plan must be provided. Try to think through every possibility before the promotion.
3. Traffic estimation and stress test model output
Traffic estimation tests the skill of the application owner, who must be familiar with the application's upstream and downstream dependencies and with the performance metrics of the important core interfaces. Only then can traffic be estimated reasonably. At this stage the application owner has several tasks:
- Sort out the core interfaces on the critical path, including their upstream and downstream applications and their performance metrics
- Estimate interface capacity, for example the interface QPS, from historical load levels and interface performance metrics
- Output the stress test model for the application
3.1 Entry traffic estimation
A system traffic entry is the starting end of a call link. For example, the calls for the home page, the product detail page, and the list page start at applications such as mall and detail. The estimate usually refers to historical GMV, order volume, or DAU (at the author's company, the number of stores online at the same time). Other influencing factors also need to be considered, such as the marketing activities and the similarity of the promoted goods. Take the traffic estimate for the 2021-05-18 promotion as an example: it referred to the 2020-12-15 promotion, and because the marketing scenarios were similar, that promotion was a strong reference. The general formula for estimating traffic from GMV is:
Estimated promotion QPS = (Estimated promotion GMV / Historical promotion GMV) × Historical promotion QPS
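As a minimal sketch of the formula above, the function below scales a historical peak QPS by the ratio of estimated to historical GMV. The function name and all numbers are hypothetical and only illustrate the arithmetic.

```python
def estimate_promo_qps(estimated_gmv: float, historical_gmv: float,
                       historical_qps: float) -> float:
    """Scale the historical peak QPS by the GMV ratio."""
    if historical_gmv <= 0:
        raise ValueError("historical_gmv must be positive")
    return estimated_gmv / historical_gmv * historical_qps


# Hypothetical numbers: GMV is expected to grow from 100 to 150 (same unit),
# and the reference promotion peaked at 4000 QPS on the entry interface.
print(estimate_promo_qps(150, 100, 4000))  # 6000.0
```

The result is only a starting point; it still has to be adjusted for the factors discussed in this section.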
In addition, the following factors should be considered when estimating (there are many such factors, and they depend on the specific business scenario):
- Differences in marketing activities: when estimating from a historical promotion, note that the same GMV under different marketing scenarios produces different QPS peaks, so the differences between marketing activities must be taken into account. For example, the 1215 and 518 promotions had similar campaign mechanics and similar GMV, so their on-the-hour QPS peaks were similar, which makes the comparison highly valuable.
- Differences in promoted goods: the goods on promotion affect order volume, which indirectly affects transaction GMV and ultimately the accuracy of the traffic estimate. For example, suppose the 327 and 518 promotions have the same GMV but target different goods: 327 targets category A and 518 targets category B, and the average order value of category B is higher than that of category A. In theory the 327 promotion then has more orders than the 518 promotion, so its QPS should also be higher.
- Differences in timing: in fact, the actual QPS of the 518 promotion was lower than the values given by the application owners, because seasonality was ignored: "the first half of the year is the off-season for orders". Traffic is lower than at year-end, when there are major holidays such as the Spring Festival and New Year's Day.
- ......
3.2 Node traffic estimation
Applications that sit as nodes on a traffic entry link, such as the core marketing applications smc and mrc, also need a traffic estimate for the promotion. Node traffic is derived from the entry traffic in proportion, according to a traffic branching model. The branching model is based on the system links and follows these principles:
- For the same entry, the traffic of each link is calculated independently according to its proportion
- If the same node is called multiple times on the same link, its traffic must be multiplied accordingly
- Pay particular attention to DB write traffic
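The first two principles can be sketched as follows. This is only an illustration under assumed names and numbers: the `Branch` class, the ratios, and the QPS values are hypothetical; `mall` and `detail` are used only as example entry names.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    """One link from a traffic entry to the node being estimated."""
    entry: str                  # name of the traffic entry
    entry_qps: float            # estimated QPS at the entry
    ratio: float                # share of entry requests that follow this link (0..1)
    calls_per_request: int = 1  # times the node is called per request on this link


def node_qps(branches: list[Branch]) -> float:
    """Compute each link independently, apply the call multiplier, then sum."""
    return sum(b.entry_qps * b.ratio * b.calls_per_request for b in branches)


# Hypothetical branching model for one node reached from two entries.
branches = [
    Branch(entry="mall", entry_qps=1000, ratio=0.5),
    Branch(entry="detail", entry_qps=2000, ratio=0.25, calls_per_request=2),
]
print(node_qps(branches))  # 1500.0
```

DB write traffic can be modeled the same way by treating the database as the node and counting only the write calls on each link.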
As an example, the multiMatchForKeyDataId interface of the mrc application is a core marketing interface. When estimating its traffic, first break down the upstream traffic, as shown in the figure below:
Besides the application owner's own evaluation, the upstream traffic of multiMatchForKeyDataId must be confirmed with each upstream: can the estimate cover that upstream's calls? (For example, does icm's 1.085% share change in this promotion's estimate, and if so, by what coefficient? This value is generally confirmed by the upstream application owner.) See the table below:
Combining the figures in the table above, the estimate for the multiMatchForKeyDataId interface of the mrc application is 2010 QPS; rate limiting must also be configured on both the client and the server. The value obtained this way was very close to the actual value of the 1222 promotion. Ideally there should be an interface call map in which the core interface of each application is a node, each edge represents a call relationship, and the value on the edge is the QPS.
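A call map of this kind can be sketched as a dictionary of edges. The sketch below is an assumption about one possible representation, not the author's tooling: the upstream names other than icm, the per-edge QPS values, and the change coefficients are all hypothetical.

```python
from collections import defaultdict

# Each edge is (caller, callee) -> estimated QPS on that call relationship.
# All numbers are hypothetical; upstream_a is a placeholder upstream.
call_map = {
    ("icm", "mrc.multiMatchForKeyDataId"): 400.0,
    ("upstream_a", "mrc.multiMatchForKeyDataId"): 300.0,
}


def inbound_qps(edges: dict[tuple[str, str], float],
                change_coefficients: dict[str, float]) -> dict[str, float]:
    """Sum the QPS arriving at each callee, scaling each upstream by the
    change coefficient confirmed by its owner (1.0 when unchanged)."""
    totals: dict[str, float] = defaultdict(float)
    for (caller, callee), qps in edges.items():
        totals[callee] += qps * change_coefficients.get(caller, 1.0)
    return dict(totals)


# Hypothetical: the icm owner expects 1.5x the previous traffic.
print(inbound_qps(call_map, {"icm": 1.5}))
# {'mrc.multiMatchForKeyDataId': 900.0}
```

Keeping the estimate per edge rather than per node makes it easy to re-run the sum when one upstream owner revises a coefficient.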
3.3 Stress test model output
After traffic has been estimated for each application, a preliminary stress test model can be produced, as shown in the figure below. Later, this estimated model is compared with the traffic model of the actual stress test and with the real promotion traffic model, to verify the deviation of the estimate and to accumulate experience for supporting future promotions.
The values in the figure are examples, for reference only.
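Comparing the estimated model with the real traffic can be as simple as a relative deviation per node. The following is a minimal sketch with hypothetical node values; the function names are illustrative.

```python
def deviation(estimated: float, actual: float) -> float:
    """Relative deviation of the actual value from the estimate."""
    return (actual - estimated) / estimated


def compare_models(estimated: dict[str, float],
                   actual: dict[str, float]) -> dict[str, float]:
    """Deviation for every node that appears in both models."""
    return {name: deviation(estimated[name], actual[name])
            for name in estimated if name in actual}


# Hypothetical peak QPS per node: estimated before, observed during the promotion.
estimated_model = {"mall": 1000.0, "detail": 2000.0}
actual_model = {"mall": 1100.0, "detail": 1500.0}

for name, dev in compare_models(estimated_model, actual_model).items():
    print(f"{name}: {dev:+.1%}")
# mall: +10.0%
# detail: -25.0%
```

A positive deviation means the node received more traffic than estimated, which is the case that matters most for capacity planning.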
4. Stress test data preparation and joint debugging
Stress test data preparation includes applying for shadow databases, recording traffic, and initializing the stress test data in the shadow databases. Initialization means loading a snapshot of the data taken at the traffic recording time into the shadow database. The stress test platform at the author's company records traffic with a traffic platform (GoReplay). During recording and replay, a stress test marker is passed transparently along the whole link, and at the DB layer the data is isolated in shadow databases. The technical design of full-link stress testing is not covered in detail here. Pay attention to the following when preparing the data:
- Record traffic in a scenario as similar as possible to the major promotion
- Save the online data as of the traffic recording time and initialize it into the shadow database. The two points in time should not be far apart; otherwise the replayed requests will not get normal responses
- Pay particular attention to the timeliness of shopping guide and marketing data. The recording time and the actual stress test time are often in different periods, so activity data has often expired by the time of the test
Once the data is prepared, run joint debugging with a small proportion of traffic, without affecting online business, to make sure that the request success rate of the traffic entry interfaces is 100%, that response data is written to the shadow database correctly, and that all monitoring is in place.
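The marker propagation and shadow-database isolation mentioned in this section can be illustrated with a small sketch. Everything here is an assumption made for illustration: the header name `x-stress-test`, the `shadow_` prefix, and the function names are hypothetical and do not describe the author's platform or GoReplay.

```python
STRESS_HEADER = "x-stress-test"  # hypothetical marker name


def is_stress_request(headers: dict[str, str]) -> bool:
    """A request is stress traffic if it carries the marker."""
    return headers.get(STRESS_HEADER) == "1"


def downstream_headers(incoming: dict[str, str]) -> dict[str, str]:
    """Copy the marker onto outgoing calls so it travels the whole link."""
    return {STRESS_HEADER: "1"} if is_stress_request(incoming) else {}


def select_database(headers: dict[str, str], online_db: str,
                    shadow_prefix: str = "shadow_") -> str:
    """Route stress traffic to the shadow database, everything else online."""
    return shadow_prefix + online_db if is_stress_request(headers) else online_db


print(select_database({STRESS_HEADER: "1"}, "orders"))  # shadow_orders
print(select_database({}, "orders"))                    # orders
```

During joint debugging, a check like `select_database` is the point to verify: if any hop drops the marker, stress writes land in the online database.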
5. Stress test execution
Execution is the most critical stage of the stress test. The goal is to find the system's "maximum point" and "optimal point", as shown in the figure below (quoted from "Some key points about performance testing"):
In the figure above, the boundary between the light load zone and the heavy load zone is the "optimal point". The boundary between the heavy load zone and the failure zone is the "maximum point".
- When the system load equals the "optimal point", overall efficiency is highest, resource utilization is moderate, and user requests get a quick response
- When the system load is between the "optimal point" and the "maximum point", the system keeps working, but response time starts to grow and resource utilization is high. If the load stays at this level, a small number of users will eventually give up because they cannot tolerate the wait
- When the system load exceeds the "maximum point", more users give up because of the long wait, and the system may even crash and stop responding to requests
Therefore the duration must also be considered: what are the system metrics after 180s, 300s, and 600s? Only then can the system's "optimal" and "maximum" performance points be found. A vivid way to put it: you can lift 100 kg and even walk with it, but can you carry 100 kg for a month? The author once fell into a trap when stress testing a single interface: as soon as one round of one scenario reached what the author personally took to be the highest point, the test was stopped. In fact there was no way to tell whether that point was the "optimal point" or the "maximum point", so the real capacity of the system (or interface) remained unknown.
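One way to locate the two points from step-load results is sketched below. The thresholds are assumptions chosen for illustration, not definitions from the referenced article: the optimal point is taken as the highest load whose average RT stays within a limit, and the maximum point as the highest load before the success rate drops below a floor. All data is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    qps: int             # load applied in this round
    avg_rt_ms: float     # average response time observed
    success_rate: float  # fraction of successful requests (0..1)


def find_points(steps: list[Step], rt_limit_ms: float,
                min_success_rate: float) -> tuple[Optional[int], Optional[int]]:
    """Return (optimal_qps, maximum_qps) under the assumed thresholds."""
    optimal: Optional[int] = None
    maximum: Optional[int] = None
    within_rt = True
    for step in sorted(steps, key=lambda s: s.qps):
        if step.success_rate < min_success_rate:
            break  # failure zone reached
        maximum = step.qps
        if within_rt and step.avg_rt_ms <= rt_limit_ms:
            optimal = step.qps
        else:
            within_rt = False  # heavy load zone: RT has started to grow
    return optimal, maximum


steps = [
    Step(500, 40.0, 1.0),
    Step(1000, 45.0, 1.0),
    Step(1500, 120.0, 0.999),
    Step(2000, 800.0, 0.90),
]
print(find_points(steps, rt_limit_ms=100.0, min_success_rate=0.995))  # (1000, 1500)
```

Each `Step` should come from a round that ran long enough (for example 180s, 300s, or 600s) for the metrics to be stable; a short burst can make a heavy-load round look healthy.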
5.1 Preparatory work
Preparation includes cache warm-up, turning off interface rate limits, and so on. In addition, decide the regional distribution of the load generators before the test and match the real user distribution as closely as possible so that the results are realistic. For a regional online business, placing the load generators in the same local data center is acceptable. For a nationwide online business, the load generators should be distributed like the users and deployed in regions across the country.
5.2 Long-duration stress testing across scenarios
Stress test scenarios need to be tailored to the business scenarios of the promotion. For example, if this promotion includes flash-sale coupons, that scenario must be considered when building the base data and the traffic entries. Once the scenarios are determined, divide them by system capacity into baseline, high capacity, stability, and abnormal scenarios, and finally observe the robustness of the system through long-duration tests of different lengths. The scenarios, the number of rounds, and the duration of each round therefore need to be planned before execution, which is a complicated job.
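A plan of this kind can be written down as data before the test starts. The sketch below is a hypothetical plan: the target QPS values, the pause between rounds, and the class and function names are assumptions; only the four scenario types and the 180s/300s/600s durations come from the text.

```python
from dataclasses import dataclass


@dataclass
class Round:
    scenario: str    # baseline / high capacity / stability / abnormal
    target_qps: int  # load to apply in this round (hypothetical values)
    duration_s: int  # how long the load is held


plan = [
    Round("baseline", 1000, 180),
    Round("high capacity", 1500, 300),
    Round("stability", 1000, 600),
    Round("abnormal", 1000, 300),
]


def total_duration_s(rounds: list[Round], pause_s: int = 0) -> int:
    """Total wall-clock time: all rounds plus a pause between consecutive rounds."""
    pauses = pause_s * max(len(rounds) - 1, 0)
    return sum(r.duration_s for r in rounds) + pauses


print(total_duration_s(plan))              # 1380
print(total_duration_s(plan, pause_s=60))  # 1560
```

Having the plan as data makes it easy to tell participants how long the whole session will take and to record which round a problem occurred in.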
5.3 Monitoring metrics
During the test, system metrics must be watched at all times, including but not limited to:
- Server metrics: machine CPU, memory, network inbound and outbound traffic, JVM metrics, etc.
- Database metrics: number of slow SQL queries, QPS, index hit rate, lock wait time, etc.
- Business metrics: order trend, order success rate, etc.
- System metrics: core interface QPS, average RT, failure rate, etc.
- ......
Stop the test as soon as a metric becomes abnormal. The criteria for starting, stopping, and ending the test therefore need to be defined in advance (not detailed here). After the test, the report focuses on three core metrics: request success rate, interface (system) QPS, and interface (system) average RT. These need statistics at second-level granularity, because peak behavior often shows up only on a per-second curve.
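A stop criterion over per-second statistics could look like the following sketch. The thresholds and the rule of requiring several consecutive bad seconds are assumptions added to avoid stopping on a single noisy sample; the text itself only says to stop once a metric is abnormal.

```python
from dataclasses import dataclass


@dataclass
class SecondStats:
    """Aggregated metrics for one second of the test."""
    requests: int       # requests completed in this second (the QPS)
    failures: int       # failed requests in this second
    total_rt_ms: float  # sum of response times in this second

    @property
    def success_rate(self) -> float:
        return 1.0 if self.requests == 0 else 1 - self.failures / self.requests

    @property
    def avg_rt_ms(self) -> float:
        return 0.0 if self.requests == 0 else self.total_rt_ms / self.requests


def should_stop(window: list[SecondStats], min_success_rate: float,
                max_avg_rt_ms: float, consecutive: int = 3) -> bool:
    """Stop when the last `consecutive` seconds all violate a threshold."""
    recent = window[-consecutive:]
    if len(recent) < consecutive:
        return False
    return all(s.success_rate < min_success_rate or s.avg_rt_ms > max_avg_rt_ms
               for s in recent)


healthy = SecondStats(requests=1000, failures=0, total_rt_ms=50_000.0)
degraded = SecondStats(requests=1000, failures=50, total_rt_ms=300_000.0)
window = [healthy, degraded, degraded, degraded]
print(should_stop(window, min_success_rate=0.99, max_avg_rt_ms=200.0))  # True
```

The same `SecondStats` records give the three report metrics directly: success rate, QPS, and average RT per second.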
5.4 Stress test records
During the test, also record the metrics, the problems found, and the state of the system at the time. Because many people take part and test time is precious, screenshots combined with summary notes work well. As long as no important information is missed, this makes it easy to analyze the results afterwards.
5.5 End of the stress test
After the test is completed, the full-link stress test result generally falls into one of two cases:
- Below the performance requirements of the promotion: resolve the system bottleneck and run the full-link stress test again, until the system meets the target capacity
- Meets the performance requirements (generally, at least 1.5 times the estimated promotion capacity): you can try to raise the load further to expose the system's bottleneck
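The two cases reduce to a comparison against the target capacity. A minimal sketch, assuming the 1.5x headroom mentioned above and using the 2010 QPS estimate from section 3.2 as the example; the measured values and the function name are hypothetical.

```python
def evaluate(measured_max_qps: float, estimated_promo_qps: float,
             headroom: float = 1.5) -> tuple[float, bool]:
    """Return (target_qps, target_met) for one interface or system."""
    target = estimated_promo_qps * headroom
    return target, measured_max_qps >= target


# Estimate of 2010 QPS, so the target is 2010 * 1.5 = 3015 QPS.
print(evaluate(3200, 2010))  # (3015.0, True): try raising the load further
print(evaluate(2500, 2010))  # (3015.0, False): fix the bottleneck and rerun
```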
The following figure shows an example of the author's stress test execution process (for reference only):
6. Stress test recovery
Stress test recovery covers the resource reclamation and data cleanup done after the test, such as reclaiming load generators, scaling applications back in, reclaiming shadow databases, and cleaning up stress test data. To avoid losing monitoring information that is still needed for troubleshooting (for example, shadow database monitoring disappears when the shadow database is reclaimed), announce resource reclamation in the stress test technical group. The author once investigated a slow SQL problem caused by a stress test and, because the shadow database had already been reclaimed, had to run another test to reproduce the scene.
7. Stress test retrospective
The retrospective analyzes and summarizes the stress test process and decides, based on the following points, whether another full-link stress test is needed:
- Whether the performance requirements of the promotion are met
- Whether there are performance problems with a significant impact. If so, analyze each problem and give a solution
If the conclusion meets expectations, the performance metrics of the system are now known, and the rate limiting and degradation strategy can be configured accordingly to keep the system stable. The following example, for reference, is the author's summary of the performance problems encountered during one full-link stress test:
If you repost this article, please credit the source. You are welcome to follow the WeChat official account: Fang Chen's blog