Architecture Practice Camp: Module 5

2022-06-29 23:45:00 【InfoQ】

Designing a high-performance, high-availability computing architecture for Weibo comments

1. Computing performance estimation

1.1 User volume

As of September 2020, Weibo had 511 million monthly active users and 224 million daily active users (see the Weibo 2020 User Development Report).

1.2 Key behaviors in the Weibo comment scenario:

Posting comments

Viewing comments

1.3 Behavior modeling and performance estimation

Posting comments: Assume each user sends one Weibo post per day on average (text posts only), so the daily post volume is about 250 million. Most people post at 8:00-9:00 in the morning, 12:00-13:00 at noon, and 20:00-22:00 in the evening; assume these periods account for 60% of the daily post volume. The times at which users post comments basically coincide with the times at which they post, and most comments concentrate on posts from celebrities and big-V accounts. Assume each post is read by 100 people on average and one reader in ten comments, so each post receives 10 comments. The performance estimate is: 250 million posts * 10 comments * 60% / (4 * 3600) ≈ 100K/s

Viewing comments: The times at which users view comments basically coincide with the times at which they post comments and posts. Assume about 30% of the people who read a post click through to view its comments. The performance estimate is: 250 million posts * 100 readers * 30% * 60% / (4 * 3600) ≈ 300K/s
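The two estimates above can be reproduced with a few lines of arithmetic. This is only a sketch of the calculation; the constant and function names are illustrative, and every input is one of the assumptions stated in the text.

```python
# All inputs are the assumptions stated in section 1.3.
DAILY_POSTS = 250_000_000    # about one text post per daily active user
READERS_PER_POST = 100       # average readers per post
COMMENT_RATE = 0.10          # one reader in ten posts a comment
VIEW_RATE = 0.30             # 30% of readers open the comments
PEAK_SHARE = 0.60            # share of daily traffic in the peak periods
PEAK_SECONDS = 4 * 3600      # 8-9, 12-13 and 20-22 add up to 4 hours


def peak_rate(daily_events: float) -> float:
    """Average events per second during the peak periods."""
    return daily_events * PEAK_SHARE / PEAK_SECONDS


daily_comments = DAILY_POSTS * READERS_PER_POST * COMMENT_RATE
daily_views = DAILY_POSTS * READERS_PER_POST * VIEW_RATE

post_tps = peak_rate(daily_comments)
view_qps = peak_rate(daily_views)

print(f"posting comments: {post_tps:,.0f}/s")  # 104,167/s, rounded to 100K/s
print(f"viewing comments: {view_qps:,.0f}/s")  # 312,500/s, rounded to 300K/s
```

Note that this is the average rate across the four peak hours, so short bursts inside those hours can be higher.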

2. High-performance computing architecture design for Weibo comments

2.1 Posting comments

2.1.1 Architecture design for posting comments

From the business characteristics, posting a comment is a write operation, so caching generally cannot be used, but load balancing can.

Architecture analysis: With hundreds of millions of users and a very high TPS requirement, a multi-level load balancing architecture should be used, covering DNS -> F5 -> Nginx -> gateway.

Architecture design:

(1) Load balancing algorithm design:

Posting a comment depends on login state, which is generally stored in a distributed cache, so a comment request can be sent to any server. Choose the "round-robin" or "random" algorithm here.
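Because no request has to stick to a particular server, both algorithms are only a few lines. The sketch below is illustrative: the class names and server names are hypothetical, and in the architecture described here this selection would be done by Nginx or the gateway rather than by application code.

```python
import itertools
import random


class RoundRobinBalancer:
    """Hands out servers in a fixed rotating order."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class RandomBalancer:
    """Picks a server uniformly at random for every request."""

    def __init__(self, servers, rng=None):
        self._servers = list(servers)
        if not self._servers:
            raise ValueError("at least one server is required")
        self._rng = rng or random.Random()

    def pick(self):
        return self._rng.choice(self._servers)


rr = RoundRobinBalancer(["app-1", "app-2", "app-3"])
print([rr.pick() for _ in range(5)])
# ['app-1', 'app-2', 'app-3', 'app-1', 'app-2']
```

Both spread load evenly over identical stateless servers; round-robin is deterministic, while random needs no shared counter between balancer instances.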

(2) Estimating the number of business servers:

Posting a comment involves several key steps: content review (depends on the review system), writing data to storage (depends on the storage system), and writing data to the cache (depends on the cache system). Comment content is generally much simpler than a post, so estimate 1,000 requests per second per server. Handling a TPS of 100K/s then takes 100 servers; with some headroom, about 150 servers should be enough.

2.1.2 Multi-level load balancing architecture for posting comments

2.2 Viewing comments

2.2.1 Architecture design for viewing comments

Business characteristics: viewing comments is a typical read scenario. Because comments cannot be modified after they are posted, a cache architecture fits very well; and because the request volume is large, a load balancing architecture is also needed.

Architecture analysis:

(1) With hundreds of millions of users, a multi-level load balancing architecture should be used;

(2) The request volume reaches 7.5 billion per day (250 million posts * 100 readers * 30%), so a multi-level cache architecture should be used; the CDN cache in particular is the core of the cache design.

Architecture design:

(1) Load balancing algorithm selection: visitors can read comments directly, so a request can be sent to any server. Choose the "round-robin" or "random" algorithm here.

(2) Estimating the number of business servers: assume the CDN can absorb 90% of user traffic, so the remaining 10% of comment-viewing requests enter the system, giving a QPS of 300K/s * 10% = 30K/s. The logic for viewing comments is relatively simple and mainly reads the cache system, so assume a single business server can handle 1,000 requests per second. That takes 30 machines; with a 20% reserve, the final count is 36 machines.
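The server counts for posting and for viewing comments follow the same pattern: divide the peak rate by the per-server capacity, then add a reserve. The sketch below is illustrative; the helper name is hypothetical, and the 50% reserve on the write path is an assumed reading of "about 150 servers", since the text only says "some headroom".

```python
import math


def servers_needed(peak_per_sec: int, per_server_per_sec: int, reserve_pct: int) -> int:
    """Servers for the peak load, then padded by a percentage reserve."""
    base = math.ceil(peak_per_sec / per_server_per_sec)
    return math.ceil(base * (100 + reserve_pct) / 100)


# Posting comments: every request reaches the business servers.
# The 50% reserve is an assumption chosen to match "about 150 servers".
write_servers = servers_needed(100_000, 1_000, reserve_pct=50)

# Viewing comments: only the share the CDN does not absorb reaches the origin.
CDN_OFFLOAD_PCT = 90
origin_qps = 300_000 * (100 - CDN_OFFLOAD_PCT) // 100
read_servers = servers_needed(origin_qps, 1_000, reserve_pct=20)

print(write_servers, origin_qps, read_servers)  # 150 30000 36
```

The read-path result is very sensitive to the CDN hit ratio: at 80% instead of 90%, the same formula gives twice as many origin servers.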

2.2.2 Multi-level load balancing architecture for viewing comments

2.2.3 Multi-level cache architecture for viewing comments

Weibo comments are generally less important than the posts themselves. Looking at how popular and featured comments work on Weibo, the top-ranked comments are the most likely to be seen, while lower-ranked comments may rarely be viewed. We can therefore distinguish two kinds of comments: store popular comments in the CDN and refresh them periodically, and cache the list of non-popular comments in the distributed cache system. This saves CDN server resource costs.
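The split between popular and non-popular comments can be sketched as a two-tier read path. This is a minimal illustration under stated assumptions: plain dictionaries stand in for the CDN, the distributed cache, and the storage system, and the class name, method names, and `hot_top_n` threshold are all hypothetical.

```python
class CommentReader:
    """Two-tier read path: hot comments from the CDN, the rest from a distributed cache."""

    def __init__(self, storage, hot_top_n=20):
        self.cdn = {}            # stand-in for the CDN: post_id -> top-ranked comments
        self.dist_cache = {}     # stand-in for the distributed cache: post_id -> full list
        self.storage = storage   # stand-in for storage: post_id -> comments sorted by rank
        self.hot_top_n = hot_top_n

    def refresh_cdn(self, post_id):
        """Periodic job: publish the current top-ranked comments to the CDN."""
        self.cdn[post_id] = self.storage.get(post_id, [])[: self.hot_top_n]

    def read_hot(self, post_id):
        """Popular comments: served from the CDN when present."""
        if post_id in self.cdn:
            return self.cdn[post_id], "cdn"
        comments, source = self.read_all(post_id)
        return comments[: self.hot_top_n], source

    def read_all(self, post_id):
        """Full comment list: distributed cache first, storage on a miss."""
        if post_id in self.dist_cache:
            return self.dist_cache[post_id], "distributed-cache"
        comments = self.storage.get(post_id, [])
        self.dist_cache[post_id] = comments
        return comments, "storage"


reader = CommentReader({"post-1": [f"comment-{i}" for i in range(50)]}, hot_top_n=3)
reader.refresh_cdn("post-1")
print(reader.read_hot("post-1"))
# (['comment-0', 'comment-1', 'comment-2'], 'cdn')
```

Only the small, frequently read top slice is pushed to the CDN, which is what keeps CDN cost down; the long tail is served by the cheaper distributed cache.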

2.3 Overall architecture scheme

2.3.1 Multi-level load balancing design for Weibo comments

2.3.2 Multi-level cache architecture design for Weibo comments

Only viewing comments uses the cache architecture.

3. High-availability architecture design for Weibo comments

3.1 User behavior modeling and performance estimation for Weibo hot events

Hot events are usually revelations from big-V accounts or celebrities, official announcements, or hot social events. They usually center on one or two posts but cause a large number of users to visit within a short time, putting great pressure on the system.

Posting comments:

The number of comments under a hot event is usually very large; a hot post has at least hundreds or even thousands of comments.

Viewing comments:

Hot events tend to attract a large crowd of onlookers, so the number of comment views is significantly higher than usual. The exact volume is hard to predict, since it depends mainly on the influence and reach of the event.

3.2 Business characteristic analysis

Posting comments: Posting a comment is generally less important than posting on Weibo, and many comments are pushed down the list soon after they are sent and are not seen immediately. When a hot event occurs, write buffering can reduce the load on the system.

Viewing comments: After a hot event occurs, users mostly read the popular comments under the hot post. These comments are mainly cached in the CDN and can be accessed quickly.

3.3 High availability architecture analysis

Core architecture design idea: since the load cannot be predicted, take precautions!

Posting comments: Posting a comment is less important than posting on Weibo, and many comments do not need to be visible immediately, so rate limiting can be used to protect the system. At the same time, comments increase discussion, liven up the community, and improve reach, so they should not be discarded. Consider the leaky bucket algorithm, with write buffering implemented through a Kafka message queue.

Viewing comments: The post at the center of a hot event has a cache hotspot problem, for which a "multi-copy cache" can be considered. Because the original cache architecture already uses an in-application cache, the cache hotspot problem is generally not severe.
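The leaky bucket idea can be sketched as a buffer that accepts bursts and drains at a fixed rate. This is a minimal illustration, not the production design: an in-memory deque stands in for the Kafka topic, the class and parameter names are hypothetical, and the bounded capacity is an added assumption (a real Kafka topic would hold far more before anything has to be refused).

```python
from collections import deque


class LeakyBucketBuffer:
    """Accepts bursts of comments and releases them to storage at a fixed rate."""

    def __init__(self, leak_rate_per_sec: int, capacity: int):
        self.leak_rate = leak_rate_per_sec
        self.capacity = capacity
        self.queue = deque()  # stand-in for the Kafka topic

    def submit(self, comment) -> bool:
        """Buffer a comment; False means the bucket is full and the client should retry."""
        if len(self.queue) >= self.capacity:
            return False
        self.queue.append(comment)
        return True

    def drain(self, elapsed_seconds: float, write) -> int:
        """Write out at most leak_rate * elapsed_seconds comments, oldest first."""
        budget = int(self.leak_rate * elapsed_seconds)
        count = min(budget, len(self.queue))
        for _ in range(count):
            write(self.queue.popleft())
        return count


bucket = LeakyBucketBuffer(leak_rate_per_sec=2, capacity=5)
accepted = [bucket.submit(f"comment-{i}") for i in range(7)]
written = []
bucket.drain(1.0, written.append)
print(accepted.count(True), written, len(bucket.queue))
# 5 ['comment-0', 'comment-1'] 3
```

The downstream write rate never exceeds `leak_rate_per_sec` regardless of how bursty the incoming traffic is; the cost is that comments become visible with a delay, which the text argues is acceptable for comments.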
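One common way to realize a multi-copy cache is to store a hot entry under several suffixed keys so that the copies land on different cache nodes, and to read from a randomly chosen copy. The sketch below assumes that approach; dictionaries stand in for cache nodes, and the class name, key suffix format, and CRC-based node selection are all illustrative choices rather than anything specified in the text.

```python
import random
import zlib


class MultiCopyCache:
    """Stores each hot entry under several keys so reads spread across nodes."""

    def __init__(self, nodes, copies=3, rng=None):
        self.nodes = nodes      # list of dicts, each standing in for one cache node
        self.copies = copies
        self.rng = rng or random.Random()

    def _replica_key(self, key: str, index: int) -> str:
        return f"{key}#copy{index}"

    def _node_for(self, replica_key: str) -> dict:
        # Simple hash-based sharding; a real cluster would use its own routing.
        return self.nodes[zlib.crc32(replica_key.encode()) % len(self.nodes)]

    def put(self, key: str, value) -> None:
        """Write every copy of the entry."""
        for i in range(self.copies):
            replica_key = self._replica_key(key, i)
            self._node_for(replica_key)[replica_key] = value

    def get(self, key: str):
        """Read one randomly chosen copy; None on a miss."""
        replica_key = self._replica_key(key, self.rng.randrange(self.copies))
        return self._node_for(replica_key).get(replica_key)


nodes = [{}, {}, {}, {}]
cache = MultiCopyCache(nodes, copies=3)
cache.put("hot-post:comments", ["top comment"])
print(cache.get("hot-post:comments"))  # ['top comment']
```

Reads for a single hot post are then divided among the copies instead of hitting one key on one node. The trade-off is extra memory and the need to update or expire all copies together, which is manageable here because comments are not modified after posting.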