
Redis cache preheating, avalanche, penetration

2022-07-06 09:14:00 【~Pompeii】

Catalog

Redis Cache preheating
Redis Cache avalanche
Redis Cache penetration

Redis Cache preheating

1. Problem

The database server goes down shortly after startup

2. Troubleshooting

1. The number of requests is high
2. Data throughput between master and slave is large, and data synchronization runs frequently
In plain terms: right after startup the cache holds no data. A large number of requests arrive the instant the server comes up, they all fall through to the database, and the resulting pressure brings the server down

3. Cache preheating introduction

Cache preheating means loading the relevant data into the cache system ahead of time, before the system starts serving requests. This avoids the pattern where a user request first queries the database and only then caches the result. Users query the preheated cache data directly.

4. Solution

Preliminary work:
1. Routinely collect data access records and identify hot data with high access frequency (manual approach)
2. Use an LRU data eviction strategy to build a data retention queue, for example with Storm working together with Kafka (automatic approach)
Preparation:
1. Classify the data in the statistical results by level; Redis loads the higher-level hot data first
2. Use multiple distributed servers to read data at the same time to speed up the loading process
3. Preheat hot data on master and slave at the same time
Implementation:
1. Doing this by hand is not practical; use a script to trigger the preheating process on a fixed schedule
2. If conditions permit, using a CDN (content delivery network) works even better
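
The script-driven preheating described above could look roughly like the sketch below. It is only an illustration, not the author's implementation: `InMemoryCache`, `preheat`, the `(key, level)` statistics format and the fake database are all hypothetical, and a real deployment would pass a Redis client in place of the in-memory stand-in.

```python
from typing import Callable, Dict, Iterable, Tuple


class InMemoryCache:
    """Hypothetical stand-in for a Redis client; stores values in a dict."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self.data[key] = value


def preheat(cache, hot_keys: Iterable[Tuple[str, int]],
            load_from_db: Callable[[str], str]) -> int:
    """Load hot keys into the cache, highest level first.

    hot_keys holds (key, level) pairs taken from the access statistics;
    a larger level means hotter data. Returns the number of keys loaded.
    """
    loaded = 0
    for key, _level in sorted(hot_keys, key=lambda kv: kv[1], reverse=True):
        cache.set(key, load_from_db(key))
        loaded += 1
    return loaded


# Example run with made-up statistics and a fake database lookup.
fake_db = {"item:1": "phone", "item:2": "laptop", "item:3": "cable"}
stats = [("item:3", 1), ("item:1", 3), ("item:2", 2)]
cache = InMemoryCache()
count = preheat(cache, stats, fake_db.__getitem__)
```

Run from a scheduler before traffic is admitted, a script like this fills the cache in priority order, so the hottest keys are available first even if loading is interrupted.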

Redis Cache avalanche

1. Database server crash

1. While the system is running smoothly, the number of database connections suddenly surges
2. The application server cannot process requests in time
3. A large number of 408 and 500 error pages appear
4. Customers repeatedly refresh the page to get data
5. The database crashes
6. The application server crashes
7. Restarting the application server does not help
8. The Redis servers crash (one after another)
9. The Redis cluster crashes
10. After the database is restarted, it is knocked down again by the instantaneous traffic
Note: the events above occur in chronological order

2. Troubleshooting

1. Within a short time, a large number of keys in the cache expire together (in real development many keys carry an expiration time, because memory is limited)
2. Requests for the expired data arrive during this window, Redis misses, and the data is fetched from the database
3. The database receives a large number of requests at once and cannot process them in time
4. Redis builds up a large backlog of requests, and timeouts begin
5. Database traffic surges and the database crashes
6. After the restart there is still no data available in the cache
7. Redis server resources are heavily occupied and the Redis server crashes
8. The Redis cluster falls apart and the cluster crashes
9. The application server cannot get data in time to respond, client requests keep piling up, and the application server crashes
10. Restarting the application server, Redis and the database all together does little good, because the cache is still empty; even RDB recovery does not help, because the keys have already expired

3. Problem analysis

1. Within a short period of time
2. A large number of keys expire together

4. Solution (strategy)

1. Serve more content as static pages
2. Build a multi-level cache architecture
Nginx cache + Redis cache + Ehcache cache
3. Detect and optimize seriously time-consuming MySQL operations
Check the database for bottlenecks, for example queries that time out and long-running, heavy transactions
4. Disaster warning mechanism
Monitor Redis server performance metrics
1) CPU occupancy and CPU usage
2) Memory capacity
3) Average query response time
4) Number of threads
5. Rate limiting and degradation
Sacrifice some customer experience for a short period and restrict access for some requests to reduce pressure on the application server; once the business is running stably at low speed, gradually reopen access
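
As one possible illustration of rate limiting with degradation, the sketch below uses a token bucket, which is a common technique but not one the article names. `TokenBucket`, `handle`, the rate and the capacity are all hypothetical choices made for the example.

```python
import time
from typing import Callable


class TokenBucket:
    """Admit at most `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        # Refill tokens for the elapsed time, capped at the bucket capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def handle(bucket: TokenBucket, serve: Callable[[], str],
           fallback: Callable[[], str]) -> str:
    """Serve the request if admitted, otherwise return a degraded response."""
    return serve() if bucket.allow() else fallback()
```

Raising `rate` step by step once the backend has recovered corresponds to gradually reopening access.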

5. Solution (techniques)

1. Switch between LRU and LFU
2. Adjust the data expiration policy
1) Stagger expiration by business data category, for example class A 90 minutes, class B 80 minutes, class C 70 minutes
2) Set the expiration time as a fixed time plus a random value, to thin out the number of keys that expire together
3. Use permanent keys for super-hot data
4. Regular maintenance (automatic + manual)
Using automated scripts or manual maintenance, analyze the traffic of data that is about to expire and decide whether to extend it; combined with access statistics, extend the expiration of hot data
5. Locking (use with caution!):
Whoever gets the lock may do the work; whoever does not get it may not
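
The locking idea can be sketched as follows: on a cache miss, only the caller that obtains the lock rebuilds the entry from the database, and everyone else backs off. This is a single-process illustration using `threading.Lock`; the names `get_with_lock`, `cache` and `rebuild_lock` are hypothetical. Across several application servers a distributed lock would be needed instead, for example one built on the Redis `SET` command with its `NX` and expiry options.

```python
import threading
from typing import Callable, Dict, Optional

cache: Dict[str, str] = {}          # stand-in for the real cache
rebuild_lock = threading.Lock()     # one lock for all keys, for simplicity


def get_with_lock(key: str,
                  load_from_db: Callable[[str], str]) -> Optional[str]:
    """Return the cached value, rebuilding it under a lock on a miss.

    Returns None when another caller holds the lock; the caller is then
    expected to retry shortly or serve a degraded response.
    """
    value = cache.get(key)
    if value is not None:
        return value
    # Non-blocking acquire: whoever gets the lock does the work.
    if rebuild_lock.acquire(blocking=False):
        try:
            value = cache.get(key)  # re-check: it may have just been rebuilt
            if value is None:
                value = load_from_db(key)
                cache[key] = value
            return value
        finally:
            rebuild_lock.release()
    return None
```

The caution in the text applies here too: a single lock serializes every rebuild, so a slow database query holds back all other misses. A lock per key would be finer grained.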

6. Cache avalanche introduction

A cache avalanche occurs when too much data expires at once, putting pressure on the database server. Avoiding a concentration of expiration times effectively prevents the avalanche phenomenon (about 40%). Combine this with the other strategies, monitor the server's running data, and adjust quickly based on the operation records.
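
Spreading out expiration times can be sketched as a base TTL per data class plus a random offset. The class durations follow the 90, 80 and 70 minute example given earlier; the function name `ttl_with_jitter` and the 300-second jitter window are assumptions made for illustration.

```python
import random

# Base TTL per business data class, in seconds (classes A, B and C).
BASE_TTL_SECONDS = {"A": 90 * 60, "B": 80 * 60, "C": 70 * 60}


def ttl_with_jitter(category: str, max_jitter_seconds: int = 300,
                    rng=random) -> int:
    """Return the base TTL for the class plus a random offset.

    The random part keeps keys written at the same moment from
    all expiring at the same moment.
    """
    return BASE_TTL_SECONDS[category] + rng.randint(0, max_jitter_seconds)
```

The returned value would be passed as the expiry when the key is written, so keys loaded in one batch expire over a five-minute window instead of in the same second.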


Redis Cache penetration

1. Database server crash

1. The system is running smoothly
2. Application server traffic grows substantially over time
3. The Redis server hit rate decreases over time
4. Redis memory is stable, with no memory pressure
5. Redis server CPU usage surges
6. Pressure on the database server surges
7. The database crashes

2. Troubleshooting

1. Redis shows misses across a large area
2. Abnormal URL accesses appear

3. Problem analysis

1. The requested data does not exist in the database, so the database query returns nothing
2. The null result that Redis obtains is not persisted; it is returned directly
3. The next time a request for such data arrives, the process above repeats
4. The server is under attack by hackers

4. Solution (techniques)

1. Cache null
Cache the data whose query result is null (for long-term use, clear it regularly) with a short expiration, for example 30-60 seconds and at most 5 minutes (the time must not be too long, otherwise Redis memory fills up)
2. Whitelist strategy
1) Preheat bitmaps for the ids of each data category in advance, using the id as the bitmap offset; this amounts to a data whitelist. Let requests through when they load normal data and intercept them when they load abnormal data (low efficiency)
2) Use a Bloom filter (a Bloom filter can report false positives, but for this scenario the hit problem can be ignored)
3. Implement monitoring
Monitor in real time the Redis hit rate (which normally fluctuates within a range during business) and the proportion of null data
1) Fluctuation outside event periods: a multiple of 3-5 times is usually observed; anything above 5 times is added to the key investigation list (no festivals)
2) Fluctuation during event periods: a multiple of 10-50 times is usually observed; anything above 50 times is added to the key investigation list (festival periods)
Start different troubleshooting processes depending on the multiple, then use a blacklist for prevention and control (operations)
4. Key encryption
When a problem occurs, temporarily enable disaster-prevention business keys: apply business-layer encryption to keys in transit, set up a verification procedure, and check every incoming key
For example, randomly allocate 60 encrypted strings per day, pick 2 to 3 of them, and mix them into the page data ids; if an accessed key does not match the rules, deny the data access
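
The first technique in this list, caching null results with a short expiration, can be sketched as below. `TtlCache` is a hypothetical in-memory stand-in for Redis, and `NULL_MARKER`, `lookup` and the TTL constants are illustrative names; the 60-second null TTL follows the 30-60 second range suggested above.

```python
import time
from typing import Callable, Dict, Optional, Tuple

NULL_MARKER = "__NULL__"     # hypothetical sentinel meaning "not in the database"
NULL_TTL_SECONDS = 60        # short expiration for null entries
DATA_TTL_SECONDS = 3600      # assumed expiration for normal entries


class TtlCache:
    """Minimal in-memory cache with per-key expiration."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}
        self._clock = clock

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]
            return None
        return value

    def set(self, key: str, value: str, ttl: float) -> None:
        self._data[key] = (value, self._clock() + ttl)


def lookup(cache: TtlCache, key: str,
           load_from_db: Callable[[str], Optional[str]]) -> Optional[str]:
    """Read through the cache, remembering database misses for a short time."""
    cached = cache.get(key)
    if cached == NULL_MARKER:
        return None              # known miss: do not touch the database
    if cached is not None:
        return cached
    value = load_from_db(key)
    if value is None:
        cache.set(key, NULL_MARKER, NULL_TTL_SECONDS)
    else:
        cache.set(key, value, DATA_TTL_SECONDS)
    return value
```

Repeated requests for the same nonexistent id then reach the database at most once per null TTL instead of once per request.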

5. Cache penetration introduction

Cache penetration means accessing data that does not exist: the requests skip the Redis caching stage that legitimate data goes through and hit the database every time, putting pressure on the database server. The volume of such data is normally low. When it happens, fight fire with fire and raise an alarm promptly. The coping strategy should lean toward temporary contingency plans.
Whether a blacklist or a whitelist is used, it adds pressure to the whole system, so remove it as soon as the alarm is cleared
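
A whitelist of valid ids can be approximated with a Bloom filter. The sketch below is a toy in-process version, not what a production setup would use: the class `BloomFilter`, its default sizes and the SHA-256 based hashing scheme are assumptions chosen for illustration.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Toy Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive several bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


# Preheat the filter with the ids known to exist (made-up ids here).
valid_ids = BloomFilter()
for known_id in ("1001", "1002", "1003"):
    valid_ids.add(known_id)
```

A request whose id fails `might_contain` can be rejected before it reaches Redis or the database. An id that passes may still be absent, so the normal cache and database path must handle that case.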

Original site

Copyright notice
This article was written by [~Pompeii]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/187/202207060859158749.html