What is hot data detection?
2022-06-24 16:36:00 【Programmer fish skin】
If data had to be sorted like garbage, what would hot data be?
Hello everyone, I'm Fish Skin. Today I'll share a bit of technical knowledge.
As we all know, websites and applications cannot run without data, and for enterprises in particular, business data is their lifeblood.
But sometimes, lumping all the data together and processing it uniformly cannot meet our requirements for performance and storage space. So we need to classify the data to suit different business needs and application scenarios.
One way to classify data is to split it into "hot data", "cold data", and even "warm data"!
Just like garbage sorting~
Let's start with what hot data is!
What is hot data?
As the name suggests, hot data is data that is very popular and frequently accessed.
For example, a news item on a trending list may get thousands of visits per second.
By its characteristics, hot data falls into two categories:
- Expected: the data is known in advance to become popular, for example hot products endorsed by internet celebrities in a pre-announced sales promotion. The Double 11 shopping festival of a certain shopping site is the best example.
- Unexpected: access to the data suddenly soars! The cause may be a malicious attack, a web crawler, or content that goes viral without anyone planning it. For example, when big news suddenly breaks, a certain Weibo that hasn't had time to prepare its defenses may well go down.
To cope with hot data, we usually turn to caching, storing the data in memory in advance as K/V (key-value) pairs.
When we need to access cached data, we look up the corresponding value by a key string.
A frequently accessed key is also called a hot key. "Hot key" is a broad concept that is not limited to caching systems. For example, the following are all hot keys:
- A primary key that is frequently accessed in a database, such as the appId of a popular application
- A frequently accessed key in a K/V caching system
- Request information from malicious attacks or bot traffic, such as a user's userId or a machine's IP
- A frequently accessed interface address, such as the app information service /app/query
- The frequency with which a single user accesses an interface, such as userId + /app/query
- The frequency with which a machine accesses an interface, such as IP + /app/query
- The frequency with which a user accesses specific content through an interface, such as userId + /app/query + appId
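The composite keys in the list above can be built by simply joining whichever dimensions you want to count. Here is a minimal sketch in Python; the function name `build_hot_key`, the `user:` / `ip:` / `path:` / `app:` prefixes, and the `|` separator are all assumptions made up for illustration, not part of any particular framework.

```python
from typing import Optional


def build_hot_key(
    path: Optional[str] = None,
    user_id: Optional[str] = None,
    ip: Optional[str] = None,
    app_id: Optional[str] = None,
) -> str:
    """Join the dimensions that are present into one key string.

    The prefixes and the "|" separator are hypothetical; any scheme works
    as long as every reporter builds keys the same way.
    """
    parts = []
    if user_id is not None:
        parts.append(f"user:{user_id}")
    if ip is not None:
        parts.append(f"ip:{ip}")
    if path is not None:
        parts.append(f"path:{path}")
    if app_id is not None:
        parts.append(f"app:{app_id}")
    return "|".join(parts)


# userId + /app/query
per_user_key = build_hot_key(path="/app/query", user_id="42")
# userId + /app/query + appId
per_content_key = build_hot_key(path="/app/query", user_id="42", app_id="1001")
```

The more dimensions a key combines, the finer the statistics, but also the more distinct keys there are to count.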
Now that we know what hot data is, let's talk about hot data detection, that is, the technique of "finding the hot data".
Why detect hot data?
The reasons for detecting hot data are simple:
1. Improve performance
With a distributed cache, reads still require network communication, which adds time overhead. If hot data can be cached locally in advance, that is, preheated, reads on that machine become much faster and the pressure on the lower-level cache cluster drops.
Of course, this does not mean that all data should be stored locally. The more cache levels there are, the more complex updates become, and the greater the risk of data inconsistency!
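To make the trade-off concrete, here is a minimal Python sketch of a two-level read path: only keys marked as hot are kept in a local in-process cache, and they carry a short TTL to limit how long stale data can live. The class `TwoLevelCache`, the `remote_get` callable (a stand-in for a real distributed cache client), and the 1-second default TTL are all assumptions for illustration.

```python
import time
from typing import Any, Callable, Dict, Set, Tuple


class TwoLevelCache:
    """Local cache for hot keys in front of a remote (distributed) cache."""

    def __init__(
        self,
        remote_get: Callable[[str], Any],
        local_ttl: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._remote_get = remote_get  # hypothetical remote cache lookup
        self._local_ttl = local_ttl
        self._clock = clock
        self._local: Dict[str, Tuple[float, Any]] = {}  # key -> (expiry, value)
        self._hot_keys: Set[str] = set()

    def mark_hot(self, key: str) -> None:
        """Called once a key has been detected as hot."""
        self._hot_keys.add(key)

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._local.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # local hit: no network round trip
        value = self._remote_get(key)
        if key in self._hot_keys:
            # Only hot keys are cached locally, and only for a short time.
            self._local[key] = (now + self._local_ttl, value)
        return value
```

A short TTL is only one way to bound inconsistency; a real system might instead invalidate local entries when the underlying data changes.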
2. Avoid risk
Unexpected hot data (hot keys) can pose great risks to the business, at two levels:
Risks to the data layer
Under normal circumstances, a single Redis instance can support roughly 100,000 QPS (queries per second), and concurrency can be increased with a cluster. For systems with ordinary concurrency, a Redis cache is enough. But if some product data suddenly becomes a hit, or malicious requests arrive, the QPS for that one key may soar to millions or even tens of millions! In the single-threaded working mode of older Redis versions, this makes normal requests queue up without a timely response, and in severe cases the entire sharded cluster is paralyzed.
There is another situation: when a hot key suddenly expires, a large number of requests hit the fragile database directly and bring it down!
Risks to application services
Each application can accept and process only a limited number of requests per unit time. If it is attacked with malicious requests and a malicious user hogs a large share of the request-processing resources, other perfectly harmless normal users will not get timely responses.
So we need a dynamic hot key detection mechanism that discovers unexpected hot data the moment it appears and handles it specially, for example by caching it locally, rejecting malicious users, or rate limiting / degrading interfaces. This avoids possible risks while improving data access performance.
So how do we detect hot data?
How to detect hot data?
First, we need to define a threshold or rule for "hot": how hot counts as hot?
It can be set from experience, or from the average popularity of the system's data. For example, data accessed 1,000 times within 1 second counts as hot data.
For a single-machine application, detecting hot data is simple: create a sliding window counter locally for each key, count the total number of accesses per unit time (the frequency), and store the detected hot keys in a collection.
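The single-machine approach can be sketched in a few lines of Python. This is a minimal illustration rather than a production implementation: the class name `SlidingWindowDetector` is made up, the defaults follow the "1,000 accesses within 1 second" example, and the injectable `clock` parameter is an assumption added so the window can be tested without waiting.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Set


class SlidingWindowDetector:
    """Per-key sliding window counter for a single process."""

    def __init__(
        self,
        threshold: int = 1000,
        window_seconds: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = threshold
        self._window = window_seconds
        self._clock = clock
        # key -> timestamps of accesses still inside the window
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)
        # collection of every key detected as hot so far (never expired here)
        self.hot_keys: Set[str] = set()

    def record(self, key: str) -> bool:
        """Record one access; return True if the key is hot right now."""
        now = self._clock()
        hits = self._hits[key]
        hits.append(now)
        # Drop accesses that have slid out of the window.
        cutoff = now - self._window
        while hits and hits[0] <= cutoff:
            hits.popleft()
        if len(hits) >= self._threshold:
            self.hot_keys.add(key)
            return True
        return False


detector = SlidingWindowDetector(threshold=3, window_seconds=1.0)
for _ in range(3):
    is_hot = detector.record("/app/query")
```

Storing one timestamp per access is the simplest form of a sliding window, but it costs memory at high QPS. A common alternative is to split the window into a fixed number of time buckets and keep only a count per bucket.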
For distributed applications, accesses to a hot key are spread across different machines and cannot be counted independently on any one of them. So we need an independent, centralized hot key computing unit.
Hot data detection can therefore be divided into four steps: rule configuration, hot key reporting, hot key statistics, and hot key push:
- Rule configuration: specify the conditions for reporting hot keys, marking out which keys need to be monitored
- Hot key reporting: each machine reports the access status of its own keys to the centralized computing unit
- Hot key statistics: collect the information reported by every application instance and use a sliding window algorithm to compute each key's popularity
- Hot key push: when a key's popularity reaches the configured value, push the hot key information to all application instances, and each instance caches the key's value locally
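The four steps above can be simulated inside a single process to see how they fit together. The Python sketch below is only an illustration under several assumptions: `HotKeyWorker` stands in for the centralized computing unit, `AppInstance` for an application instance, the "rule" is reduced to a key prefix, and network transport is replaced by direct method calls. For brevity each instance records only the pushed key, where a real one would also cache its value.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Set, Tuple


class HotKeyWorker:
    """Stand-in for the centralized hot key computing unit."""

    def __init__(self, threshold: int, window_seconds: float, rule_prefix: str = "") -> None:
        self.threshold = threshold
        self.window = window_seconds
        self.rule_prefix = rule_prefix  # step 1: which keys are monitored
        self._reports: Dict[str, Deque[Tuple[float, int]]] = defaultdict(deque)
        self._totals: Dict[str, int] = defaultdict(int)
        self._subscribers: List[Callable[[str], None]] = []
        self._pushed: Set[str] = set()

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._subscribers.append(callback)

    def report(self, key: str, count: int, now: float) -> None:
        """Steps 2 and 3: receive a report and update the sliding window."""
        if not key.startswith(self.rule_prefix):
            return  # not covered by the rule
        window = self._reports[key]
        window.append((now, count))
        self._totals[key] += count
        while window and window[0][0] <= now - self.window:
            _, expired = window.popleft()
            self._totals[key] -= expired
        # Step 4: push each hot key once to every subscribed instance.
        if self._totals[key] >= self.threshold and key not in self._pushed:
            self._pushed.add(key)
            for callback in self._subscribers:
                callback(key)


class AppInstance:
    """Stand-in for one application instance."""

    def __init__(self, worker: HotKeyWorker) -> None:
        self.worker = worker
        self.pending: Dict[str, int] = defaultdict(int)
        self.local_hot_keys: Set[str] = set()
        worker.subscribe(self.local_hot_keys.add)

    def access(self, key: str) -> None:
        self.pending[key] += 1  # count locally, report in batches

    def flush(self, now: float) -> None:
        for key, count in self.pending.items():
            self.worker.report(key, count, now)
        self.pending.clear()


worker = HotKeyWorker(threshold=5, window_seconds=1.0, rule_prefix="sku:")
a, b = AppInstance(worker), AppInstance(worker)
for _ in range(3):
    a.access("sku:1")
    b.access("sku:1")
a.flush(now=0.1)
b.flush(now=0.2)
```

Neither instance alone reaches the threshold of 5 here; the key only becomes hot once the worker adds up the reports from both, which is exactly why the counting has to be centralized.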
After these steps, a basic hot key detection mechanism is complete. However, hot data detection systems often face complex business scenarios, and there are other issues to consider, such as handling key expiration.
To handle high-concurrency scenarios, the design of a hot key detection framework should also focus on the following criteria:
- Real-time: given how suddenly a hot key can appear (perhaps within 1 millisecond), hot keys must be detected and pushed in real time
- High performance: the framework should stay lightweight and fast, effectively reducing cost
- Accuracy: accurately detect the hot keys that match the rules, with no misses and no false positives
- Consistency: keep the hot keys consistent between application instances and their local caches, with no data errors
- Scalability: when the number of keys to count is very large, the centralized computing cluster can be scaled horizontally
In addition, an excellent hot key detection framework should also be easy to integrate, non-intrusive to business code, and dynamically configurable, with hot rule updates, visual management, and similar features.
Finally, readers who want to learn more can take a look at JD's open-source hot key detection framework JD-hotkey and Youzan's TMC. Their designs are very clever.
I have written analyses of these two frameworks before, and I'll tidy them up and share them when I get the chance.