Data intensive application system design - Application System Overview
2022-07-25 23:34:00 【Adong lazy】
Designing Data-Intensive Applications: Application System Overview
Preface
The overview of application systems is pure theory, and although it is simple, after reading it I found that my own understanding of many terms was quite narrow. The author uses more rigorous explanations to discuss common problems in software and system design.
Some of the experiments and real-world cases mentioned in the book are quite interesting. Because this is the first chapter, the content is neither difficult nor dull, and it makes for an interesting read.
Introduction
Modern application design tends toward single-purpose, modular components, while the amount of data in modern information systems is growing rapidly. To cope with complex data and frequently changing modules, an application system usually needs to include the following:
- Database: stores the data.
- Cache: reduces the cost of expensive operations, for example CPU caches and disk caches.
- Index: enables fast searching and filtering of data.
- Stream processing: sends messages to another process to be handled asynchronously.
- Batch processing: processes large amounts of accumulated data.
Rethinking data systems
When evaluating the architecture of a data system, we usually look at three properties of the application: reliability, scalability, and maintainability.
Reliability
Reliability does not only mean that the system keeps working when something goes wrong. It covers more:
- The application performs the functions the user expects.
- It tolerates bad data and incorrect operation by users.
- Its performance is good enough under the expected load.
- It prevents unauthorized access (permission management).
In short, reliability means that both the program and the system architecture can be relied on, and that the data stays safe.
In addition, some common terms closely related to reliability need stricter definitions:
Fault tolerance: fault tolerance does not mean tolerating every kind of error. It means tolerating certain specific types of faults that have been anticipated in advance.
Faults and failures: a fault is a component deviating from its specification, from which the system may still recover. A failure means the whole business system is down and can no longer provide service.
Faults are hard to eliminate completely. More often the issue is not that a problem cannot be solved, but that the fix introduces new problems into the program, that is, bugs stacked on top of bugs.
One way to check whether the system's fault handling actually works is to inject faults deliberately. Netflix's Chaos Monkey (literally, a monkey that causes chaos) does this: it exposes problems in the system by simulating common faults. It is a niche tool, but an interesting one.
Further reading: for many small teams and projects, Simian Army may not make much sense, but the idea behind Chaos Monkey is worth learning from and using. Chaos Monkey mainly covers the following:
1. Exception Assault (throw exceptions)
2. Kill Assault (kill the process)
3. Latency Assault (inject delays and stalls)
4. Memory Assault (exhaust memory)
See the Simian Army project: Netflix/SimianArmy · GitHub. If you can access SlideShare, you can also look at these slides: RE: invent: Chaos Monkey.
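The fault-injection idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how Chaos Monkey or Simian Army is implemented; the names `chaos`, `InjectedFault`, and `call_with_retry` are made up for this example. It shows the exception and latency cases: a wrapper makes some calls fail or slow down on purpose, so the surrounding fault handling can be exercised.

```python
import random
import time
from functools import wraps


class InjectedFault(RuntimeError):
    """Raised by the wrapper to simulate a failing dependency (hypothetical name)."""


def chaos(exception_rate=0.0, latency_rate=0.0, latency_seconds=0.0, rng=None):
    """Wrap a function so that a fraction of calls fail or slow down on purpose."""
    rng = rng or random.Random()

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Latency injection: stall the call before doing any work.
            if rng.random() < latency_rate:
                time.sleep(latency_seconds)
            # Exception injection: fail instead of calling the real function.
            if rng.random() < exception_rate:
                raise InjectedFault(f"injected fault in {func.__name__}")
            return func(*args, **kwargs)

        return wrapper

    return decorator


def call_with_retry(func, attempts=3):
    """The fault handling under test: retry a few times, then give up."""
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except InjectedFault as error:
            last_error = error
    raise last_error


# Roughly 30% of calls fail; the seed only makes the demo repeatable.
@chaos(exception_rate=0.3, rng=random.Random(42))
def load_profile():
    return {"user": "demo"}
```

Calling `call_with_retry(load_profile)` in a test environment then shows whether the retry logic really copes with the injected faults, which is the point of the exercise: the faults are anticipated and triggered on purpose rather than discovered in production.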
Hardware faults: hardware faults are usually handled by adding redundant components for when they are needed, such as RAID disks, backup batteries, and standby machines, to achieve reliability and high availability. In recent years software fault tolerance has also become a common tool. For example, in a multi-node deployment, patches can be applied node by node (a rolling upgrade) without taking the cluster down.
Software errors: software errors are mostly bugs that stay hidden for a long time without being noticed. The probability of hitting them is small, but once they trigger, troubleshooting is very complicated. Guarantees about software reliability are never fully reliable. Even for code that seems like it can never go wrong, monitoring and defensive measures are still needed so that problems are detected as soon as they occur. (This sentence is very important.)
Human error: the more complex the process, the more likely someone is to make a mistake. Having a single person in charge of the whole release process basically only happens at badly run companies; any company with a formal process has a release procedure of some strictness. Even so, production configuration changes are where human error is most likely to occur during a release.
How much reliability you guarantee determines how high the development and operations costs are, so it is the trade-off that deserves the most attention.
Scalability
How to describe performance
To discuss scalability, the book uses Twitter's solution to its huge fan-out structure as an example. The typical form of this structure is a user with a very large number of followers: when that user publishes new content, a single post fans out into a huge number of requests, and the system has to support this kind of mass delivery.
This is a typical scenario of one node publishing and broadcasting to a massive number of subscribers. For accounts ranging from niche streamers to bloggers with millions of followers, Twitter has two delivery approaches:
- Relational database approach: a new tweet is simply written to a global collection of tweets. When a user opens their home timeline, the system looks up everyone they follow, fetches those users' tweets, and merges them in chronological order.
- Cache push approach: each user has a home timeline cache. When someone posts a tweet, it is pushed into the cache of every follower, so reading a timeline is just a cache lookup, which saves a lot of system overhead.
Both approaches have problems. The first puts heavy load on the read path. The second clearly solves that problem, but it is wasteful on the write path: one tweet from a user with millions of followers turns into millions of cache writes. The final design combines the two. Tweets from users with few followers are pushed to follower caches when they are posted (the second approach), while tweets from users with a very large number of followers are fetched at read time and merged into the timeline (the first approach).
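The combination of the two approaches can be sketched as follows. This is a toy in-memory model under stated assumptions, not Twitter's implementation: the class `Timelines`, the constant `CELEBRITY_THRESHOLD`, and all method names are hypothetical, and the threshold of 3 followers is chosen only to keep the example small.

```python
from collections import defaultdict

# Hypothetical cutoff: authors with at least this many followers are not fanned out.
CELEBRITY_THRESHOLD = 3


class Timelines:
    def __init__(self):
        self.followers = defaultdict(set)  # author -> readers following them
        self.following = defaultdict(set)  # reader -> authors they follow
        self.tweets = defaultdict(list)    # author -> [(timestamp, text)]
        self.cache = defaultdict(list)     # reader -> [(timestamp, author, text)]

    def follow(self, reader, author):
        self.followers[author].add(reader)
        self.following[reader].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def post(self, author, timestamp, text):
        # Every tweet is stored once with its author (the "global collection").
        self.tweets[author].append((timestamp, text))
        if not self.is_celebrity(author):
            # Fan-out on write: push into each follower's timeline cache.
            for reader in self.followers[author]:
                self.cache[reader].append((timestamp, author, text))

    def home_timeline(self, reader):
        # Start from the precomputed cache (cheap read).
        entries = set(self.cache[reader])
        # Fan-out on read: pull celebrity tweets and merge them in.
        for author in self.following[reader]:
            if self.is_celebrity(author):
                entries.update((ts, author, text) for ts, text in self.tweets[author])
        # Newest first; the set removes tweets seen through both paths.
        return sorted(entries, reverse=True)
```

A post by an ordinary author costs one cache write per follower, while a post by a celebrity costs a single write and is paid for at read time instead. The sketch ignores details such as backfilling a cache when a user follows someone new.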
Scalability can therefore be understood as finding a balance between different solutions.
To reach that balance point, consider two questions:
1. When load increases, how many machines must be added to keep performance at its original level?
2. When system resources stay unchanged, how does performance hold up as load increases?
What is the difference between latency and response time? Response time is what the client sees: the whole time from sending the request to receiving the result, so it includes network overhead in addition to the time the server spends processing. Latency is the time a request spends waiting to be handled. For example, when uploading a file, the total time from clicking the upload button to getting back a successful result is the response time, while latency is how long the upload waits before it is actually processed.
So how should we measure performance? The average response time is commonly used as a reference, but an average does not reflect what users actually experience.
The conclusion is to sort the response times and judge performance by the median together with higher percentiles, which shows how long users actually wait and how many of them are affected.
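A small sketch of why the sorted view is more informative than the average. The sample data is invented for illustration, and `percentile` uses the simple nearest-rank definition (monitoring tools may interpolate differently).

```python
import math
import statistics


def percentile(sorted_times, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    rank = max(1, math.ceil(p * len(sorted_times) / 100))
    return sorted_times[rank - 1]


def summarize(response_times_ms):
    ordered = sorted(response_times_ms)
    return {
        "mean": statistics.mean(ordered),
        "p50": percentile(ordered, 50),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
    }


# Hypothetical measurements: 95 fast requests and 5 very slow ones.
samples = [20] * 95 + [1000] * 5
report = summarize(samples)
print(report)
```

For this data the mean is 69 ms, a value no request actually experienced, while the median (20 ms) describes the typical request and p99 (1000 ms) exposes the slow tail.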
The book also gives Amazon as an example: the response time users see while shopping on the site is directly related to sales.
When optimizing a request, reducing its completion time from 2-3 s to 1 s early on is very effective. Below 1 s the situation changes: if 99% of requests are already satisfactory and 1% are not, optimizing that last 1% costs far more than it returns. At that point we need to change our approach rather than stick to the old methods.
So optimizing for scalability is not about pushing optimization to the limit. Optimization has diminishing returns: each further gain costs more than the last, so when a target cannot be reached you have to weigh the cost and decide whether continuing is worth it.
When measuring the effect of optimization with load tests, the load generator must send requests concurrently rather than blocking until the previous request has returned. Otherwise the test results are skewed. In other words, before running any test, make sure the test itself is sound and reliable.
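A minimal sketch of a non-blocking load generator using `asyncio`. The function `fake_service` is a stand-in that only sleeps; a real test would issue a network request there. All names and the numbers (20 requests, 1 ms apart, 100 ms service time) are assumptions for illustration.

```python
import asyncio
import time


async def fake_service(delay_seconds):
    """Stand-in for the system under test; a real test would send a request here."""
    await asyncio.sleep(delay_seconds)


async def timed_request(delay_seconds):
    started = time.perf_counter()
    await fake_service(delay_seconds)
    return time.perf_counter() - started


async def open_loop_load(total_requests, interval_seconds, delay_seconds):
    """Start one request per interval without waiting for earlier ones to finish."""
    tasks = []
    for _ in range(total_requests):
        tasks.append(asyncio.create_task(timed_request(delay_seconds)))
        await asyncio.sleep(interval_seconds)
    return await asyncio.gather(*tasks)


def run_demo():
    started = time.perf_counter()
    response_times = asyncio.run(open_loop_load(20, 0.001, 0.1))
    return response_times, time.perf_counter() - started


response_times, elapsed = run_demo()
print(f"{len(response_times)} requests finished in {elapsed:.2f}s")
```

A blocking generator would send the next request only after the previous one returned, taking about 20 x 0.1 s here and, more importantly, sending fewer requests whenever the server slows down. The concurrent version keeps the request rate independent of response time, so slow responses show up in the measurements instead of being hidden.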
Coping with increased load
For coping with growing load, the two options discussed most are vertical scaling and horizontal scaling. Vertical scaling means upgrading the existing machine to a more powerful configuration; horizontal scaling means deploying more machines to share the load.
It is often assumed that many machines of average performance are better than a few powerful ones. In fact, if the architecture is good enough, a few high-performance servers can match the effect of many servers, and horizontal scaling has its limits beyond a certain point.
Both vertical and horizontal scaling divide further into scaling stateful nodes and scaling stateless nodes.
For stateful nodes, the more common approach is to serve requests from a single high-performance server (note that the service here means only the application service), and to consider horizontal scaling only when a single node can no longer cope.
Stateless nodes lend themselves to horizontal scaling, so they usually run on multiple machines or use primary/standby pairs for disaster recovery.
Future applications are likely to be built on distributed architectures, and modern distributed programming interfaces and frameworks keep improving.
Finally, systems with the same throughput can have completely different architectures depending on the business scenario. A scalable architecture usually means components that are independent of each other and each scalable on its own, much like the layers of the TCP/IP model.
But once the business architecture is settled, the cost of changing it later keeps rising.
Maintainability
Maintainability includes operability, simplicity, and evolvability. Operability means the operations team can keep the system running stably in day-to-day work. Simplicity means fulfilling requirements with the simplest possible logic while also keeping the system easy for operators to use, that is, functionally complete and coherent as a whole. Evolvability means the system can be changed easily as requirements change.
Summary
These three properties really point at one goal: letting operations staff maintain the system better, because no matter how customizable a system is, in the end it is the operations staff who carry out the maintenance.
In addition, to keep a system simple we have to introduce abstraction. High-level languages likewise use abstraction to hide the complexity of CPU registers, assembly code, and system calls.
The larger the system, the more abstract thinking it requires. Agile development, test-driven development, and refactoring were established in modern software practice for this purpose. In the domestic (Chinese) environment the first two practices tend to be misused, so refactoring deserves more attention.
For details on refactoring, see the book "Refactoring".
Closing
The first chapter discusses the theoretical concepts of reliability, scalability, and maintainability, and lays out the challenges that application systems face today. Looking ahead, as distributed-architecture code matures and the technology improves, even small projects will be able to go distributed, and that may happen in the near future...
Related
System speedup: Amdahl's law
Core idea: assume a program consists of two parts, a part that cannot be parallelized and a part that can.
Explanation:
Suppose a program processes files from disk: it scans a directory, builds a list of files in memory, and then processes each file. Scanning the directory and building the file list cannot be parallelized, but processing the files can be done in parallel.
Based on this description, we can define the following variables:
T = total time of serial execution
B = total time of the part that cannot be parallelized
T - B = total time of the parallelizable part
From these definitions, T - B is the only part of the time that is actually parallel and can be reduced by adding CPUs or threads. When N CPUs or threads execute the parallel part, the formula is:
T(N) = B + (T - B) / N (N is the number of processors or threads)
There is no need to memorize complicated formulas here. What Amdahl's law tells us is that when we optimize a program, the performance gain is often smaller than we expect, because the serial part B puts a floor on the total time. Software, hardware, device I/O, and memory can each affect a program's performance, and a single optimization on its own may not have a significant effect, so optimization needs to be considered from several angles based on the actual situation.
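To make the formula concrete, here is a small sketch that evaluates T(N) = B + (T - B) / N for a few worker counts. The function names and the numbers T = 100 and B = 20 are assumptions chosen only for illustration.

```python
def parallel_time(total, serial, workers):
    """T(N) = B + (T - B) / N, with T=total, B=serial, N=workers."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return serial + (total - serial) / workers


def speedup(total, serial, workers):
    """How many times faster the program runs compared with one worker."""
    return total / parallel_time(total, serial, workers)


# Hypothetical program: 100 time units in total, 20 of which cannot be parallelized.
T, B = 100.0, 20.0
for n in (1, 2, 4, 8, 1000):
    print(n, round(parallel_time(T, B, n), 2), round(speedup(T, B, n), 2))
```

With these numbers, 4 workers bring the run time down to 40 units, a speedup of 2.5 rather than 4, and no number of workers can push the speedup past T / B = 5. That ceiling is why shrinking the serial part often matters more than adding processors.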
More references
English original: http://tutorials.jenkov.com/java-concurrency/amdahls-law.html
Chinese translation: http://ifeve.com/amdahls-law/
In addition, Amdahl's law ties in with the following optimization techniques:
Thread-level concurrency (hyperthreading)
Instruction-level parallelism (pipelining)
Single-instruction, multiple-data (SIMD) parallelism
hyperthreading