6000+ words to help you understand the evolution of Internet architecture
2022-06-09 13:45:00 【Java technology stack】
Author: Small M
Source: https://cnblogs.com/xiaoMzjm/p/5223799.html
Preface
We take javaweb as an example and build a simple e-commerce system, to see how such a system can evolve step by step.
The functions of the system:
User module: user registration and management
Commodity module: commodity display and management
Trading module: creating and managing transactions
Stage 1: A website on a single machine
When a site starts out, we often run all programs and software on a single machine. We pick a container such as tomcat, jetty or jboss, then either use JSP/servlet directly or use an open-source stack such as maven+spring+struts+hibernate or maven+spring+springmvc+mybatis. Finally we choose a database management system such as mysql, sqlserver or oracle to store data, and connect to and operate on it through JDBC.
With all of this software installed on one machine and the application running, we have a small system. At this point the system structure is as follows:

Stage 2: Separating the application server from the database
Once the website is live, the number of visitors rises and the load on the server slowly increases. We should improve the load capacity of the website before the server becomes overloaded. If the code is already hard to optimize further and we cannot raise the performance of the single machine, adding machines is a good option: it effectively improves the load capacity of the system and is cost-effective.
What should the additional machine be used for? At this point we can split the database and the web server onto separate machines. This improves the load capacity of each machine and also improves disaster tolerance.
The architecture after the application server is separated from the database is shown below:

Stage 3: Application server cluster
As traffic keeps growing, a single application server can no longer keep up. Assuming the database server is not yet under pressure, we can go from one application server to two or more and distribute user requests across them, which raises the load capacity.
The application servers do not interact with each other directly; they all rely on the database to serve requests. A well-known failover tool is keepalived, which works along the lines of a layer 3, 4 and 7 switching mechanism. It is not tied to failover for one particular product but can be used with many kinds of software. Combined with ipvsadm, keepalived can also do load balancing, which makes it a very handy tool.
Taking the addition of one application server as an example, the resulting system structure is as follows:

Once the system has evolved this far, four questions arise:
Who forwards the user's request to a specific application server?
What is the forwarding algorithm?
How does the application server return the response to the user?
If a user hits a different server on each request, how is session consistency maintained?
Let's look at the solutions:
1. The first problem is load balancing. Generally speaking there are 5 solutions:
(1) HTTP redirect. HTTP redirection is request forwarding at the application layer. The user's request first reaches the HTTP redirect load balancing server, which picks a target according to its algorithm and asks the user to redirect. On receiving the redirect, the user sends the request again to the real cluster.
Advantage: simple.
Disadvantage: poor performance, since every request needs two round trips.
(2) DNS load balancing. When the user asks the DNS server for the IP address of the domain name, the DNS server directly returns the IP of a server chosen by load balancing.
Advantage: the work is handed to DNS, so we do not need to maintain a load balancing server.
Disadvantage: when an application server goes down, DNS cannot be notified in time. Control of the load balancing also lies with the domain name service provider, so the website cannot tune or manage it much.
(3) Reverse proxy. When the user's request reaches the reverse proxy server (it has already reached the website's machine room), the reverse proxy forwards it to a specific server according to its algorithm. The commonly used apache and nginx can both act as reverse proxy servers.
Advantage: simple deployment.
Disadvantage: the proxy server can become a performance bottleneck, especially when large files are uploaded.
(4) IP layer load balancing. After the request reaches the load balancer, the balancer rewrites the IP address of the request to forward it, and so achieves load balancing.
Advantage: better performance.
Disadvantage: the bandwidth of the load balancer becomes the bottleneck.
(5) Data link layer load balancing. After the request reaches the load balancer, the balancer rewrites the MAC address of the request to achieve load balancing. Unlike IP layer load balancing, once the server has handled the request the response goes directly back to the client without passing through the load balancer.
2. The second problem is the cluster scheduling algorithm. There are 10 common scheduling algorithms.
(1) rr, round-robin scheduling. As the name suggests, requests are distributed in turn.
Advantage: simple to implement.
Disadvantage: ignores the processing power of each server.
(2) wrr, weighted round-robin scheduling. We set a weight for each server and the load balancer schedules servers according to it. The number of times a server is chosen is proportional to its weight.
Advantage: takes the different processing power of the servers into account.
(3) sh, source address hash. Take the user's IP, run it through a hash function to get a key, then look up the corresponding value in a static mapping table; that value is the target server IP. If the target machine is overloaded, null is returned.
(4) dh, destination address hash. Same as above, except that the hash is computed from the destination IP address.
Advantage: both hash algorithms make the same user land on the same server.
(5) lc, least connections. Requests are forwarded preferentially to the server with the fewest connections.
Advantage: makes the load across the servers in the cluster more even.
(6) wlc, weighted least connections. On top of lc, each server is given a weight. The formula is: (number of active connections * 256 + number of inactive connections) ÷ weight. The server with the smallest value is preferred.
Advantage: requests can be allocated according to the capacity of each server.
(7) sed, shortest expected delay. sed is similar to wlc, except that it ignores the number of inactive connections. The formula is: (number of active connections + 1) * 256 ÷ weight. The server with the smallest value is preferred.
(8) nq, never queue. An improved sed algorithm. A request never has to queue when a server has 0 connections, so if any server has 0 connections the balancer forwards the request to it directly, without going through the sed calculation.
(9) LBLC, locality-based least connections. Based on the destination IP address of the request, the balancer finds the server most recently used for that IP address and forwards the request to it. If that server is overloaded, it falls back to the least-connections algorithm.
(10) LBLCR, locality-based least connections with replication. Based on the destination IP address of the request, the balancer finds the "server group" most recently used for that IP address. Note that this is a group, not a specific server. It then picks the server with the fewest connections from the group and forwards the request. If that server is overloaded, the balancer uses the least-connections algorithm to pick a server from the cluster that is not yet in the group, adds it to the group, and forwards the request to it.
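The selection rules above can be sketched in Java. This is a minimal illustration under stated assumptions, not the code of any real balancer: the class `SchedulerSketch`, the `Server` type and its fields are hypothetical names, and only the rr rule and the wlc, sed and nq formulas quoted in the text are shown.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of the rr, wlc, sed and nq rules described above.
final class SchedulerSketch {

    // A backend server as seen by the balancer (field names are illustrative).
    static final class Server {
        final String name;
        final int weight;   // configured weight, assumed to be > 0
        final int active;   // number of active connections
        final int inactive; // number of inactive connections

        Server(String name, int weight, int active, int inactive) {
            this.name = name;
            this.weight = weight;
            this.active = active;
            this.inactive = inactive;
        }
    }

    private final AtomicInteger counter = new AtomicInteger();

    // rr: hand out servers in turn, ignoring their capacity.
    Server roundRobin(List<Server> servers) {
        int i = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(i);
    }

    // wlc score from the text: (active * 256 + inactive) / weight.
    static double wlcScore(Server s) {
        return (s.active * 256.0 + s.inactive) / s.weight;
    }

    // sed score from the text: (active + 1) * 256 / weight.
    static double sedScore(Server s) {
        return (s.active + 1) * 256.0 / s.weight;
    }

    // wlc: pick the server with the smallest wlc score.
    static Server pickWlc(List<Server> servers) {
        Server best = servers.get(0);
        for (Server s : servers) {
            if (wlcScore(s) < wlcScore(best)) {
                best = s;
            }
        }
        return best;
    }

    // nq: use an idle server if one exists, otherwise fall back to sed.
    static Server pickNq(List<Server> servers) {
        Server best = servers.get(0);
        for (Server s : servers) {
            if (s.active == 0) {
                return s;
            }
            if (sedScore(s) < sedScore(best)) {
                best = s;
            }
        }
        return best;
    }
}
```

In this sketch the connection counts are fixed values passed in by the caller; a real balancer would update them as connections open and close.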
3. The third problem is the cluster mode. There are generally 3 solutions:
(1) NAT: the load balancer receives the user's request and forwards it to a specific server. The server processes the request and returns the response to the balancer, which returns it to the user.
(2) DR: the load balancer receives the user's request and forwards it to a specific server by rewriting the MAC address. After processing the request, the server returns the response directly to the user. The balancer and the servers need to be on the same physical network.
(3) TUN: as with DR, the server returns the response directly to the user, but the request is forwarded through an IP tunnel, so the servers can sit on a different network. The server operating system must support the IP tunneling protocol, which makes it harder to use across platforms.
4. The fourth problem is the session problem. Generally speaking there are 4 solutions:
(1) Session sticky. Session sticky assigns all requests of the same user within a session to one fixed server, so we do not have to deal with cross-server sessions. A common algorithm is ip_hash, i.e. the two hash algorithms mentioned above.
Advantage: simple to implement.
Disadvantage: the session is lost when the application server restarts.
(2) Session replication. Session replication copies sessions across the cluster so that every server holds the session data of all users.
Advantage: reduces the pressure on the load balancing server, which no longer needs an ip_hash algorithm to forward requests.
Disadvantage: replication costs a lot of bandwidth, and with many visitors the sessions take up a lot of memory, most of it wasted.
(3) Centralized session storage. The session data is stored in a database, which decouples sessions from the application servers.
Advantage: compared with session replication, much less pressure on bandwidth and memory within the cluster.
Disadvantage: the database that stores the sessions has to be maintained.
(4) Cookie based. Cookie based means the session is stored in a cookie, and the browser tells the application server what its session is. This also decouples sessions from the application servers.
Advantage: simple to implement and basically maintenance free.
Disadvantage: cookies are limited in length, offer low security, and consume bandwidth.
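The session sticky approach can be illustrated with a small source-address hash. This is a sketch under stated assumptions, not nginx's ip_hash implementation: the class name `StickyRouter` and the server names are hypothetical, and the hash is simply Java's `String.hashCode`.

```java
import java.util.List;

// Hypothetical sketch of source-address hashing for sticky sessions.
final class StickyRouter {
    private final List<String> servers;

    StickyRouter(List<String> servers) {
        this.servers = servers;
    }

    // The same client IP always maps to the same server
    // as long as the server list does not change.
    String route(String clientIp) {
        int index = Math.floorMod(clientIp.hashCode(), servers.size());
        return servers.get(index);
    }
}
```

With this scheme, changing the number of servers remaps most clients, and a restarted server still loses its sessions, which is the disadvantage listed for session sticky above.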
It is worth mentioning that:
The load balancing algorithms nginx currently supports include wrr, sh (with consistent hashing support) and fair (which I think boils down to lc). Besides acting as a balancer, nginx can also serve static resources.
keepalived+ipvsadm is more powerful. The algorithms currently supported are: rr, wrr, lc, wlc, lblc, sh, dh
The cluster modes keepalived supports are: NAT, DR, TUN
nginx itself does not provide a session synchronization solution, while apache provides session sharing support.
After solving the problems above, the structure of the system is as follows:

Stage 4: Database read-write separation
So far we have assumed that the database load is normal, but as traffic grows the load on the database grows too. Some people may immediately think of treating it like the application server: turn one database into two and load-balance between them. For databases it is not that simple.
If we simply split the database in two and spread the database requests across machine A and machine B, the data in the two databases will obviously become inconsistent. So in this situation we can first consider read-write separation.
The structure of the database system after read-write separation is as follows:

This structural change brings two problems of its own:
Data synchronization between the master and slave databases
How the application selects a data source
Solutions:
We can use the master+slave setup built into MYSQL to get master-slave replication.
We can adopt third-party database middleware such as mycat. mycat grew out of cobar, Alibaba's open-source database middleware whose development later stopped. mycat is one of the better open-source middlewares in China for splitting mysql into multiple databases and tables.
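The data source selection problem can be sketched as follows. This is a minimal illustration, not how mycat works internally: the class `ReadWriteRouter` and the data source names are hypothetical, and the SQL is classified by its first keyword only.

```java
import java.util.List;
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: send writes to the master, spread reads over the slaves.
final class ReadWriteRouter {
    private final String master;
    private final List<String> slaves;
    private final AtomicInteger next = new AtomicInteger();

    ReadWriteRouter(String master, List<String> slaves) {
        this.master = master;
        this.slaves = slaves;
    }

    // Returns the name of the data source that should execute the statement.
    String dataSourceFor(String sql) {
        String head = sql.trim().toLowerCase(Locale.ROOT);
        boolean isRead = head.startsWith("select");
        if (!isRead || slaves.isEmpty()) {
            return master; // writes, or no slave available
        }
        // Simple round-robin over the slaves for read statements.
        int i = Math.floorMod(next.getAndIncrement(), slaves.size());
        return slaves.get(i);
    }
}
```

Because replication is not instantaneous, reads that must see a write that has just happened (or that run inside a transaction) would still need to go to the master; the sketch leaves that decision out.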
Stage 5: Using a search engine to relieve read pressure on the database
When the database is used as the read store, it often handles fuzzy search poorly, and read-write separation does not solve this. Take our trading website as an example: published items are stored in the database, and the feature users rely on most is finding products, especially finding a product by its title. We usually implement this requirement with like, but that approach is very expensive. At this point we can use the inverted index of a search engine instead.
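To show why an inverted index avoids the cost of a like scan, here is a minimal sketch. It is an illustration only, not how any particular search engine is implemented: the class `InvertedIndex` is a hypothetical name, and titles are split on whitespace, which would not be enough for Chinese product titles.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: map each title word to the ids of products containing it.
final class InvertedIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index one product title, word by word.
    void add(int productId, String title) {
        for (String term : title.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(productId);
            }
        }
    }

    // Look up a single word; no scan over all titles is needed.
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT),
                Collections.<Integer>emptySet());
    }
}
```

A lookup is a single map access per search word, whereas a like query with a leading wildcard generally has to examine every row.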
A search engine has the following advantage:
It can greatly improve query speed.
Introducing a search engine also brings the following costs:
It brings a lot of maintenance work. We need to implement the index building process ourselves and design full and incremental build methods to meet both non-real-time and real-time query needs.
The search engine cluster has to be maintained.
A search engine cannot replace the database. It solves part of the "read" problem, and whether to introduce one depends on the needs of the whole system. The system structure after introducing a search engine is as follows:

Stage 6: Using caches to relieve read pressure
1. Caching at the application layer and the database layer
As traffic grows, more and more users access the same content. For this popular content there is no need to read from the database every time. We can use caching: for example, google's open-source guava or memcached as an application-layer cache, or redis as a database-layer cache.
In addition, a relational database is not a good fit for some scenarios. Say I want to build a feature that limits the number of wrong password attempts per day. The idea is roughly that when a login fails, we record the user's IP and the number of errors. Where should this data be stored?
If it is kept in the application's memory, it will obviously take up too much memory. If it is kept in a relational database, we have to create a table, create the corresponding java bean, write SQL, and so on. Yet the data we want to store is just key:value data of the form {ip:errorNumber}. For this kind of data we can use a NOSQL database in place of a traditional relational database.
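The {ip:errorNumber} idea can be sketched as below. This is an illustration under stated assumptions: the class `LoginErrorLimiter` is hypothetical, and an in-process map stands in for the external key-value store so that the example is self-contained.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a per-day login error counter keyed by IP.
final class LoginErrorLimiter {
    private final int maxErrorsPerDay;
    // Key is "day|ip"; the map stands in for an external key-value store.
    private final Map<String, Integer> errors = new ConcurrentHashMap<>();

    LoginErrorLimiter(int maxErrorsPerDay) {
        this.maxErrorsPerDay = maxErrorsPerDay;
    }

    // Record one failed login and return the new count for that IP and day.
    int recordError(String ip, String day) {
        return errors.merge(day + "|" + ip, 1, Integer::sum);
    }

    // True once the IP has reached the daily limit.
    boolean isBlocked(String ip, String day) {
        return errors.getOrDefault(day + "|" + ip, 0) >= maxErrorsPerDay;
    }
}
```

Putting the day into the key means a new day starts from zero without any cleanup logic; with a store such as redis the old keys could additionally be given an expiry so they do not accumulate.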
2. Page caching
Besides data caching there is also page caching, for example using HTML5 localStorage or cookies.
Advantages:
Reduces the pressure on the database
Greatly improves access speed
Disadvantages:
The cache servers have to be maintained
Coding becomes more complex
It is worth mentioning that:
The scheduling algorithm for a cache cluster differs from that of the application servers and databases mentioned above. It is best to use a "consistent hash algorithm", which improves the hit rate. We will not go into it here; if you are interested, please refer to the relevant material.
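For readers who want a starting point, a consistent hash ring can be sketched as follows. This is a minimal illustration, not the implementation used by any particular cache client: the class `ConsistentHashRing`, the node names and the choice of CRC32 as the hash function are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical sketch of a consistent hash ring with virtual nodes.
final class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    // CRC32 is used only to keep the example short; any well-spread hash works.
    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // Place several points per node on the ring to even out the distribution.
    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.putIfAbsent(hash(node + "#" + i), node);
        }
    }

    // Remove only the points that belong to this node.
    void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(node + "#" + i), node);
        }
    }

    // Walk clockwise from the key's position to the first node point.
    String nodeFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no cache nodes");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }
}
```

When a cache node is removed, only the keys that lived on that node move elsewhere; keys on the other nodes keep their placement, which is why the hit rate suffers less than with a plain modulo hash.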
The structure after adding the cache:

Stage 7: Horizontal and vertical database splitting
Up to this point, the transaction, commodity and user data of our website still live in the same database. Even with caching and read-write separation in place, the bottleneck becomes more and more prominent as pressure on the database keeps growing. At this point we can choose to split the data vertically or horizontally.
7.1 Vertical data split
A vertical split moves the data of different businesses into different databases. In our example, the transaction, commodity and user data are separated.
Advantages:
It solves the problem of all businesses competing for one database.
More optimization can be done according to the characteristics of each business.
Disadvantage:
Multiple databases have to be maintained.
Problems:
Transactions that used to span businesses have to be reconsidered.
Cross-database joins.
Solutions:
We should try to avoid cross-database transactions at the application layer. If a transaction has to span databases, try to control it in code.
We can use a third-party tool such as the mycat mentioned above. mycat provides a rich set of cross-database join options; please refer to the official mycat documentation.
The structure after the vertical split is as follows:

7.2 Horizontal data split
A horizontal split spreads the data of one table across two or more databases. It is needed when the data volume or update volume of one business reaches the bottleneck of a single database; at that point the table can be split across two or more databases.
Advantage:
If we can solve the problems listed below, we will be able to cope well with growth in data volume and write volume.
Problems:
The application that accesses user information has to solve the SQL routing problem. Because the user information is now spread over two databases, it needs to know where the data for each operation lives.
Primary keys have to be handled differently. For example, the original auto-increment field can no longer simply be used.
Pagination becomes troublesome.
Solutions:
We can again use third-party middleware such as mycat. mycat parses our SQL with its SQL parsing module and forwards the request to a specific database according to our configuration.
We can use UUIDs to guarantee uniqueness, or design a custom ID scheme.
mycat also provides a rich set of paging query options, for example running the paging query on each database first and then merging the results and paging again.
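The routing and paging ideas above can be sketched in a few lines. This is an illustration only and does not describe mycat's internals: the class `ShardSketch`, the modulo routing rule and the in-memory lists standing in for per-database query results are all assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of shard routing and merge-style paging.
final class ShardSketch {

    // Route a row to one of shardCount databases by its numeric key.
    static int shardFor(long userId, int shardCount) {
        return (int) Math.floorMod(userId, (long) shardCount);
    }

    // Each shard is assumed to return its first (offset + limit) ids in
    // ascending order. The results are merged, sorted again, and the
    // requested page is cut from the combined list.
    static List<Long> mergePage(List<List<Long>> perShardRows, int offset, int limit) {
        List<Long> all = new ArrayList<>();
        for (List<Long> rows : perShardRows) {
            all.addAll(rows);
        }
        Collections.sort(all);
        int from = Math.min(offset, all.size());
        int to = Math.min(offset + limit, all.size());
        return new ArrayList<>(all.subList(from, to));
    }
}
```

The paging step shows why pagination is called troublesome: to serve one page, every shard has to return offset + limit rows, so deep pages become increasingly expensive.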
The structure after the horizontal data split:

Stage 8: Splitting the application
8.1 Splitting the application
As the business grows there are more and more features, and the application gets bigger and bigger. We need to think about how to keep it from becoming bloated. This means taking the application apart, going from one application to two or more. Staying with our example, we can split users, commodities and transactions into two subsystems: "user, commodity" and "user, transaction".
The structure after the split:

Problem:
After this split there may be duplicated code, for example user-related code. Both commodities and transactions need user information, so both systems keep the same code for operating on user information. How to make sure this code can be reused is a problem we have to solve.
Solution:
Take the service-oriented route.
8.2 Going service-oriented
To solve the problem that appears after splitting the application, we extract the shared services and form a service-oriented architecture, SOA for short.
The system structure after going service-oriented:

Advantages:
The same code is no longer scattered across different applications. The implementations live in the individual service centers, which makes the code easier to maintain.
We put the database interaction into the service centers, so that the "front-end" web applications can focus on interacting with the browser.
Problem:
How to make remote service calls
Solution:
We can solve this by introducing message middleware.
Stage 9: Introducing message middleware
As the website keeps growing, our system may contain sub-modules written in different languages and subsystems deployed on different platforms. At this point we need a platform that delivers data reliably and independently of platform and language, makes load balancing transparent, and can collect and analyze call data during calls in order to estimate traffic growth and predict how the website should scale. An open-source option is Alibaba's dubbo (strictly speaking an RPC framework), combined with the open-source distributed coordination service zookeeper (an Apache project) for service registration and discovery.
The structure after introducing message middleware:

Summary
The evolution described above is only an example and does not apply to every site. In practice, how a website evolves is closely tied to its own business and the problems it runs into, and there is no fixed pattern. Only through careful analysis and continued exploration can you find the right architecture for your site.

