
Architecture Practice Camp: Module 5 Assignment

2022-06-29 23:45:00 InfoQ

High-performance, high-availability computing architecture design for Weibo comments


1. Computing performance estimation

1.1  User volume

As of September 2020, Weibo had 511 million monthly active users and 224 million daily active users (see the Weibo 2020 User Development Report).

1.2  Key behaviors in the Weibo comment scenario

  • Posting a comment
  • Viewing comments

1.3  Behavior modeling and performance estimation

  • Posting a comment: Assume each user posts one microblog per day on average (text posts only), so about 250 million microblogs are posted daily. Most posting happens at 8:00~9:00 in the morning, 12:00~13:00 at noon, and 20:00~22:00 in the evening; assume these four hours account for 60% of the daily volume. Comment times largely coincide with posting times, and most comments concentrate on posts by celebrities and "big V" accounts. Assume each microblog is read by 100 people on average and one reader in ten comments, so each microblog receives 10 comments. The estimate is: 250 million posts * 10 comments * 60% / (4 * 3600) ≈ 100K/s

  • Viewing comments: Comment viewing times largely coincide with commenting and posting times. Assume about 30% of the people who read a microblog open its comments. The estimate is: 250 million posts * 100 readers * 30% * 60% / (4 * 3600) ≈ 300K/s
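The two estimates above can be reproduced with a few lines of arithmetic. This is a minimal sketch; the constant and function names are illustrative, and every number is one of the assumptions stated in the text.

```python
# Back-of-the-envelope peak-load estimate using the figures assumed in the text.
DAILY_POSTS = 250_000_000      # ~250 million microblogs posted per day
READERS_PER_POST = 100         # assumed average readers per microblog
COMMENT_RATIO = 0.10           # one reader in ten posts a comment
VIEW_COMMENT_RATIO = 0.30      # 30% of readers open the comments
PEAK_SHARE = 0.60              # share of daily traffic inside the peak windows
PEAK_SECONDS = 4 * 3600        # 8:00~9:00, 12:00~13:00, 20:00~22:00


def peak_rate(daily_events: float) -> float:
    """Average events per second during the peak windows."""
    return daily_events * PEAK_SHARE / PEAK_SECONDS


comments_per_day = DAILY_POSTS * READERS_PER_POST * COMMENT_RATIO    # ~2.5 billion
views_per_day = DAILY_POSTS * READERS_PER_POST * VIEW_COMMENT_RATIO  # ~7.5 billion

post_tps = peak_rate(comments_per_day)
view_qps = peak_rate(views_per_day)

print(f"posting comments: {post_tps:,.0f}/s")
print(f"viewing comments: {view_qps:,.0f}/s")
```

The exact results are slightly above the rounded 100K/s and 300K/s used in the rest of the design, which is acceptable for capacity planning at this granularity.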

2. High-performance computing architecture design for Weibo comments

2.1  Posting comments

2.1.1  Architecture design for posting comments
  • Business characteristics: posting a comment is a write operation, so caching generally cannot be used, but load balancing can.
  • Architecture analysis: with over 100 million users and a very high TPS requirement, a multi-level load balancing architecture should be used, covering DNS -> F5 -> Nginx -> gateway.
  • Architecture design:
(1) Load balancing algorithm:
Posting a comment depends on login state, which is generally kept in a distributed cache, so a request can be sent to any server. Choose the "round-robin" or "random" algorithm here.
(2) Number of business servers:
Posting a comment involves several key steps: content review (depends on the review system), writing to storage (depends on the storage system), and writing to the cache (depends on the cache system). Comment content is generally much simpler than a microblog, so assume each server handles 1000 requests per second. Reaching a TPS of 100K/s then takes 100 servers; with some capacity in reserve, about 150 servers should be enough.
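Because login state lives in a shared distributed cache, any server can handle any request, which is what makes the two stateless algorithms suitable. The following is a minimal sketch of both; the class names and server names are hypothetical, and a real deployment would configure this in Nginx or the gateway rather than in application code.

```python
import itertools
import random
from typing import List, Optional


class RoundRobinBalancer:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers: List[str]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(list(servers))

    def pick(self) -> str:
        return next(self._cycle)


class RandomBalancer:
    """Pick a server uniformly at random for each request."""

    def __init__(self, servers: List[str], seed: Optional[int] = None) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._rng = random.Random(seed)

    def pick(self) -> str:
        return self._rng.choice(self._servers)


# Hypothetical server names, for illustration only.
servers = [f"comment-svc-{i}" for i in range(3)]
rr = RoundRobinBalancer(servers)
rr_picks = [rr.pick() for _ in range(6)]
print(rr_picks)
```

Neither algorithm needs session affinity, so servers can be added or removed without migrating user state.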
2.1.2  Multi-level load balancing architecture for posting comments

2.2  Viewing comments

2.2.1  Architecture design for viewing comments
  • Business characteristics: viewing comments is a typical read scenario. Because a comment cannot be modified after it is posted, a cache architecture fits well; and because the request volume is large, a load balancing architecture is also needed.
  • Architecture analysis:
(1) With over 100 million users, a multi-level load balancing architecture should be used;
(2) With about 7.5 billion requests per day (250 million posts * 100 readers * 30%), a multi-level cache architecture should be used; the CDN cache in particular is the core of the cache design.
  • Architecture design:
(1) Load balancing algorithm:
Visitors can read comments directly, so a request can be sent to any server. Choose the "round-robin" or "random" algorithm here.
(2) Number of business servers:
Assume the CDN can absorb 90% of user traffic, so the remaining 10% of comment-view requests reach the system, giving a QPS of 300K/s * 10% = 30K/s. The processing logic for viewing comments is relatively simple and mainly reads from the cache system, so assume a single business server handles 1000 requests per second. That gives 30 servers; with a 20% reserve, the final count is 36 servers.
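The server-count reasoning for both paths follows the same pattern: subtract what the CDN absorbs, divide by per-server capacity, then add headroom. A minimal sketch follows, using integer percentages to avoid floating-point rounding; the function names are illustrative, and the 50% reserve for the write path is inferred from the text's step from 100 to about 150 servers.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def servers_needed(peak_rate: int, per_server_rate: int,
                   reserve_pct: int, offload_pct: int = 0) -> int:
    """Servers required after CDN offload, including reserve capacity."""
    # Requests per second that actually reach the business servers.
    origin_rate = ceil_div(peak_rate * (100 - offload_pct), 100)
    base = ceil_div(origin_rate, per_server_rate)
    return ceil_div(base * (100 + reserve_pct), 100)


# Posting comments: 100K/s, no CDN offload, reserve inferred as 50%.
write_servers = servers_needed(100_000, 1000, reserve_pct=50)
# Viewing comments: 300K/s, CDN absorbs 90%, 20% reserve.
read_servers = servers_needed(300_000, 1000, reserve_pct=20, offload_pct=90)

print(write_servers, read_servers)
```

The read path needs far fewer servers than the write path despite three times the traffic, which shows how much the design depends on the assumed 90% CDN hit rate.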
2.2.2  Multi-level load balancing architecture for viewing comments
2.2.3  Multi-level cache architecture for viewing comments
Comments are generally less important than the microblog itself. Judging from the hot and featured comments under a post, the top-ranked comments are the ones most likely to be seen, while lower-ranked comments may rarely be viewed. Comments can therefore be split into two kinds: hot comments are stored in the CDN and refreshed periodically, while the list of non-hot comments is cached in the distributed cache system, which saves CDN resource costs.
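The split between hot and non-hot comments can be sketched as a two-tier reader. This is a simplified illustration under stated assumptions: plain dictionaries stand in for the CDN, the distributed cache, and the storage system; the class and method names are hypothetical; and the stored comment list is assumed to be already ranked by popularity.

```python
from typing import Dict, List


class TieredCommentReader:
    """Serve top-ranked comments from a CDN tier and the rest from a cache tier."""

    def __init__(self, storage: Dict[str, List[str]], hot_top_n: int = 2) -> None:
        self.storage = storage                              # stand-in for the storage system
        self.hot_top_n = hot_top_n                          # how many comments count as "hot"
        self.cdn: Dict[str, List[str]] = {}                 # stand-in for the CDN
        self.distributed_cache: Dict[str, List[str]] = {}   # stand-in for the cache cluster
        self.storage_reads = 0

    def _load(self, post_id: str) -> List[str]:
        self.storage_reads += 1
        return self.storage[post_id]

    def refresh_hot(self, post_id: str) -> None:
        """Periodic job: push the current top comments of a post to the CDN tier."""
        self.cdn[post_id] = self._load(post_id)[: self.hot_top_n]

    def read_hot(self, post_id: str) -> List[str]:
        if post_id not in self.cdn:
            self.refresh_hot(post_id)
        return self.cdn[post_id]

    def read_rest(self, post_id: str) -> List[str]:
        """Non-hot comments are cached as one list in the distributed cache."""
        if post_id not in self.distributed_cache:
            self.distributed_cache[post_id] = self._load(post_id)[self.hot_top_n:]
        return self.distributed_cache[post_id]


reader = TieredCommentReader({"post-1": ["c1", "c2", "c3", "c4"]})
print(reader.read_hot("post-1"), reader.read_rest("post-1"))
```

Repeated reads hit only the two cache tiers, so storage is touched once per tier per post until the next refresh.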

2.3  Overall architecture scheme

2.3.1  Multi-level load balancing design for Weibo comments
2.3.2  Multi-level cache architecture design for Weibo comments
Only the comment-viewing path uses the cache architecture.

3. High-availability architecture design for Weibo comments

3.1  User behavior modeling and performance estimation for hot events

Hot events are usually revelations from big V accounts or celebrities, official announcements, or trending social events. They usually center on one or two microblogs but cause a large number of users to visit within a short time, putting heavy pressure on the system.
  • Posting a comment:
Comment volume under a hot event is usually very large; a hot microblog has at least hundreds or even thousands of comments.
  • Viewing comments:
Hot events attract crowds of onlookers, so the number of comment views will be significantly higher than usual. The volume is hard to predict, though; it depends mainly on the influence and reach of the event.

3.2  Business characteristic analysis

  • Posting a comment: comments are generally less important than microblogs, and many comments are pushed down the list soon after they are posted and will not be seen immediately. When a hot event occurs, write buffering can reduce the load on the system.
  • Viewing comments: during a hot event, users mostly read the popular comments under the hot microblog. These comments are mainly cached in the CDN, so they can be served quickly.

3.3  High availability architecture analysis

Core design idea: since the load cannot be predicted, prepare for it in advance!
  • Posting a comment: posting a comment is less important than posting a microblog, and many comments do not need to be visible immediately, so consider rate limiting to protect the system. At the same time, comments drive discussion, liven up the community, and improve reach, so they should not be discarded. Consider the leaky bucket algorithm, with a Kafka message queue implementing the write buffer.
  • Viewing comments: the hot-event microblog creates a cache hotspot, which a "multi-copy cache" can address. Because the original cache architecture already uses an in-application cache, the cache hotspot problem is usually not severe.
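Both ideas above can be illustrated in a few lines. This is a sketch under stated assumptions, not the article's implementation: an in-memory deque stands in for the Kafka topic, a dictionary stands in for the cache cluster, and all class names are hypothetical. Unlike a textbook leaky bucket, the buffer here never drops overflow, matching the requirement that comments should not be discarded.

```python
import random
from collections import deque
from typing import Deque, Dict, List, Optional


class LeakyBucketWriteBuffer:
    """Queue incoming comments and release them downstream at a fixed rate."""

    def __init__(self, leak_per_tick: int) -> None:
        self.leak_per_tick = leak_per_tick
        self.queue: Deque[str] = deque()   # stand-in for a Kafka topic

    def submit(self, comment: str) -> None:
        # Bursts are buffered rather than rejected.
        self.queue.append(comment)

    def drain(self) -> List[str]:
        """Called once per tick by the consumer; returns at most leak_per_tick items."""
        n = min(self.leak_per_tick, len(self.queue))
        return [self.queue.popleft() for _ in range(n)]


class MultiCopyCache:
    """Store a hot key under several replica keys so reads spread across nodes."""

    def __init__(self, copies: int, seed: Optional[int] = None) -> None:
        self.copies = copies
        self.store: Dict[str, str] = {}    # stand-in for the cache cluster
        self._rng = random.Random(seed)

    def put(self, key: str, value: str) -> None:
        for i in range(self.copies):
            self.store[f"{key}#{i}"] = value

    def get(self, key: str) -> Optional[str]:
        # Each read picks one replica at random.
        return self.store.get(f"{key}#{self._rng.randrange(self.copies)}")


buffer = LeakyBucketWriteBuffer(leak_per_tick=2)
for i in range(5):
    buffer.submit(f"comment-{i}")
first_batch = buffer.drain()
print(first_batch, len(buffer.queue))
```

In a real cluster the replica suffix changes which cache node a key hashes to, so a single hot microblog no longer concentrates all reads on one node.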

3.4  High-availability architecture diagram

