Service architecture and refactoring of the e-commerce trading platform at Mogujie (including PPT)
2020-11-08 10:35:00 【osc_x8s7voop】
Editor's note: On July 30, High Availability Architecture held a special salon in Shanghai titled "The Cornerstones of Internet Architecture", with a closed-door private-board discussion and four talks open to the public, in the hope of advancing the construction and development of internet infrastructure across the industry. This article is Pan Fujiang's talk on the architecture of Mogujie's e-commerce trading system.
Pan Fujiang is a senior R&D engineer at Mogujie. Before 2014 he worked at Alibaba, where he built vertical e-commerce business platforms and also worked on middleware R&D. He joined Mogujie (now Meili United Group) in 2015 and is responsible for the service-oriented construction of base e-commerce platforms such as trading, funds, and the shopping cart.
I'm from Mogujie, an e-commerce platform mainly for female users, so men probably use it less. But there are a lot of good-looking models on Mogujie, so I suggest you give it a try. When you are tired of writing code, you can quietly open Mogujie and look at the models. I think it's quite nice.
My topic today is the service architecture of Mogujie's trading platform, along with some of the refactoring we went through while building out services.
Business architecture in Mogujie's shopping-guide period
Mogujie started out as a shopping guide, and at that time the whole business revolved around users and content. The front-end business was mainly social shopping guidance, and the back-office business was mainly content management. In short, it was small and elegant, and the business was not very complicated.
The technical architecture back then was typical of a startup. The whole site was built in PHP, the system was only simply layered, and the infrastructure relied on off-the-shelf open source products. In 2013 Mogujie changed direction, mainly because a large number of shopping-guide sites were being blocked during that period, so we transformed into a social e-commerce platform.
A social e-commerce platform has two parts. The social part we had some experience with from the shopping-guide days. E-commerce was something we had never touched, so it was basically built from scratch. The first step in becoming an e-commerce platform is to build a trading platform. At first we kept it simple: we rewrote a system, but its structure did not change in any essential way. All business logic lived in one huge project, and we talked to the infrastructure through a proxy layer.
Problems faced during the e-commerce transformation
- The business is growing fast, sustaining more than 3x growth every year (in 2015, more than 100 million users and PV exceeding 10 billion)
- Peak traffic on the user purchase path is 100 times the daily level (at the start of 2015 the system supported only 400 orders per second)
- The business is extremely complex, and business forms are expanding rapidly
- The historical burden is heavy, and system coupling is severe
After Mogujie became an e-commerce platform, the business grew more than threefold every year, and that is when problems began to surface. Some of the problems an e-commerce platform runs into as it grows, especially in the middle stage, are not unique to Mogujie; other platforms are likely to hit them too. Examples are bloated code, tightly coupled modules, complex dependencies, and poor business extensibility.
At that time Mogujie faced several problems:
First, the business was growing quickly and system capacity could not keep up. The trading system could only support 400 orders per second (and traffic during big promotions is more than 100 times the normal level).
Second, the e-commerce business changed quickly, and business support was neither flexible enough nor fast enough.
Third, there was the historical burden: system coupling was severe.
The key to solving this whole series of problems comes down to one word: "split".
The system splitting process
- Vertical DB split
- Vertical split of business systems (shopping cart, order placement, funds, ...)
- Unified data and business models; service interfaces designed with clear logic and appropriate granularity
- Basic business logic sinks into services; the Web layer focuses on presentation logic and orchestration
- Service governance
System splitting: the shopping cart as an example
Let's use the shopping cart to illustrate our refactoring. We used to have a single project with all the code in it, and each terminal or business line maintained its own module code. Data access was also rather casual: each one maintained its own data access code. This caused two big headaches:
First, because trading shared a single pool with everything in it, any of these ad hoc SQL statements could hit you with a slow query out of nowhere, and instability in one piece of business code affected the others. It was hard to trace where such "wild SQL" came from, which made our DB very unstable and was a serious obstacle to later refactoring.
Second, business support. When a product requirement arrived, it had to be implemented separately on every terminal. Reusability was very poor, business support was inflexible, and the system had no extensibility. Developers had a hard time, often working overtime, and there were a lot of bugs.
So we set out to split the system. How? There is actually some method to it.
Priorities in system splitting
Think of the DB as a barrel and the various businesses as water being poured into it. At first there may not be much water, and the barrel is fine. But as the business grows, one day the barrel will no longer hold it.
So first the barrel must be big enough and easy to expand, so that there are no worries later. Business volume is sometimes hard to predict, and you cannot be sure when it will surge. If you do not make the barrel at the bottom strong enough and instead prioritize splitting and optimizing the business layer, the whole system will go down as soon as volume rises.
So the DB is the foundation of system splitting and needs to be split first.
Once the DB is split out, stability needs attention. As mentioned earlier, the SQL at that time was messy and could easily destabilize the DB, so unifying data access and the data model was also key. We built a unified data access layer. With this layer in place, later DB changes and expansion could be kept under control.
With the foundations in place, we could tackle the difficulty of supporting the business. Business models need to be unified and abstracted, with support for custom extensions. The refactoring also gave rise to basic business frameworks such as an SPI business framework, a process engine, and a rule engine, which made business support flexible and extensible. The system was also layered sensibly, so each layer only needs to focus on its own capabilities.
Results of the system split
Once the overall split of the trading system was complete, the rudiments of the company's SOA transformation had basically taken shape: the basic service framework, message middleware, data middleware, and configuration center were all in place. A series of infrastructure tools was also incubated, including the monitoring system, scheduling system, log collection, and link tracing system.
There was one more piece of background to the split: the company's overall strategy was to move to Java. This was a decision made on balance. There are many Java engineers, especially in Hangzhou, the technology ecosystem is relatively mature, and there are experts who can handle hard problems, whereas PHP resources were scarcer at the time.
Capacity improvement
After the system was split and refactored, more attention went to the capacity, performance, and stability of the applications themselves. We made some improvements and experiments in these areas.
- Vertical DB split by business
- Read/write separation, so that reads can be scaled out freely
- Sharding (sub-database and sub-table) to increase the write capacity of central services
When we split the system, we had already split the DB vertically and also separated reads from writes (based on MySQL).
The rest of this section focuses on sharding. The main goal at the time was to raise the write capacity of central services, because the read/write-separated DB still had a single-master structure, which creates a write bottleneck.
Take transaction creation as an example. It is probably one of the most complex business scenarios in trading: creating an order writes a lot of other data at the same time. System capacity then was about 1,000 orders per second. The single DB node was a write bottleneck, and heavy writes caused serious master-slave replication lag. On top of that, DB disk usage had passed 80%, so it was very unstable and could have gone down at any time.
So we decided to shard. The background was that the middleware team had not yet been set up and there were no sharding components, so we decided to start within our own team.
We compared some popular solutions in the industry, such as Alibaba's TDDL and Cobar and Google's Vitess. By comparison these components are heavyweight, and their integration and usage costs are relatively high. Our principle was to fit our own business scenario and pick a component that is relatively simple to integrate and use. So we took the last approach and implemented sharding as a MyBatis plugin. The component is now open source: https://github.com/baihui212/tsharding
[Figure: comparison of industry sharding solutions]
Sharding completed with TSharding, a self-developed sharding component
- Simple enough, requiring few resources
- Supports sharding across databases and tables
- Supports data source routing
- Supports transactions
- Supports result set merging
- Supports read/write separation
This component is called TSharding. It is simple enough, which matched our expectations, and it supports sharding across databases and tables, data source routing, transactions, result set merging, and read/write separation, which covered all our requirements.
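As a rough illustration of what routing across sub-databases and sub-tables involves, the Python sketch below maps a sharding key to a physical database and table. It is not TSharding's actual API (TSharding is a MyBatis plugin); the function name `route`, the modulo scheme, the shard counts, and the `trade_db_N` naming are all assumptions made for illustration.

```python
from typing import NamedTuple

# Hypothetical shard layout: 4 databases, each holding 8 tables (32 shards in total).
DB_COUNT = 4
TABLES_PER_DB = 8


class ShardLocation(NamedTuple):
    db_name: str
    table_name: str


def route(logical_table: str, shard_key: int) -> ShardLocation:
    """Map a numeric sharding key (e.g. a buyer ID) to a physical database and table."""
    if shard_key < 0:
        raise ValueError("shard_key must be non-negative")
    slot = shard_key % (DB_COUNT * TABLES_PER_DB)   # global shard slot, 0..31
    db_index = slot // TABLES_PER_DB                # which database holds the slot
    table_index = slot % TABLES_PER_DB              # which table inside that database
    return ShardLocation(f"trade_db_{db_index}", f"{logical_table}_{table_index}")


if __name__ == "__main__":
    print(route("trade_order", 1234567))
```

Because every row for a given key lands in one shard, single-key reads and writes touch only one database. Queries that span keys have to fan out and combine results, which is why result set merging appears in the feature list.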
Performance optimization
We also made some attempts at performance optimization. Here are three scenarios:
- Distributed transactions
- Single-machine asynchronous parallelism
- Preprocessing & caching
Distributed transactions: transaction creation as an example
Optimization idea: decoupling with asynchronous messages
- During transaction creation, the states of the order, coupons, and inventory must stay consistent
- The marketing coupon service and the inventory center's inventory service are deployed separately from the order service
- When a call to the coupon or inventory service times out or fails, a message is sent asynchronously to trigger rollback; the complexity is manageable
- Retrying failed MQ sends plus the consumer ACK mechanism ensures consistency
- This avoids the intrusiveness of distributed transaction frameworks such as two-phase commit
Let's start with distributed transactions, again using transaction creation as the example. Transaction creation interacts with multiple services, and some of them are strong dependencies, for example deducting inventory and locking coupons, so consistency must be maintained. Two-phase or multi-phase protocols are very heavy, so we did not adopt them at the time.
We came up with another way, using asynchronous messages for decoupling. The process is as follows:
When an order is placed, we do not expose it right away. We first create an invisible order (you can think of it as pre-creating the order), and then deduct inventory and lock coupons. If any of these operations throws an exception or fails, the order system sends out an order-voided message. Downstream systems (such as promotions and inventory) roll back on our behalf when they receive it. That is how we solve this distributed transaction problem.
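The flow just described can be sketched in a few lines of Python. This is a minimal in-memory model, not Mogujie's code: the class names, the `ORDER_VOIDED` topic, the order states, and the use of `queue.Queue` as a stand-in for the message middleware are all assumptions for illustration.

```python
import queue
from dataclasses import dataclass, field


class ServiceError(Exception):
    """Raised when a downstream call fails or times out."""


@dataclass
class Inventory:
    stock: int
    reserved: dict = field(default_factory=dict)   # order_id -> quantity held

    def reserve(self, order_id: str, qty: int) -> None:
        if qty > self.stock:
            raise ServiceError("insufficient stock")
        self.stock -= qty
        self.reserved[order_id] = qty

    def on_order_voided(self, order_id: str) -> None:
        # Idempotent rollback: a redelivered message finds nothing left to release.
        self.stock += self.reserved.pop(order_id, 0)


@dataclass
class Coupons:
    locked: set = field(default_factory=set)
    fail_next: bool = False                         # hook to simulate a timeout

    def lock(self, order_id: str) -> None:
        if self.fail_next:
            raise ServiceError("coupon service timeout")
        self.locked.add(order_id)

    def on_order_voided(self, order_id: str) -> None:
        self.locked.discard(order_id)


def create_order(order_id, qty, orders, inventory, coupons, mq) -> bool:
    orders[order_id] = "INVISIBLE"                  # step 1: order exists but is hidden
    try:
        inventory.reserve(order_id, qty)            # step 2: strongly dependent calls
        coupons.lock(order_id)
    except ServiceError:
        orders[order_id] = "VOIDED"
        mq.put(("ORDER_VOIDED", order_id))          # step 3: ask downstream to roll back
        return False
    orders[order_id] = "VISIBLE"                    # step 4: expose the order
    return True


def drain(mq, subscribers) -> None:
    """Deliver every queued message to every subscriber (stand-in for MQ consumers)."""
    while not mq.empty():
        _topic, order_id = mq.get()
        for sub in subscribers:
            sub.on_order_voided(order_id)
```

The rollback handlers are written to be idempotent because a message queue with retries and ACKs may deliver the same void message more than once.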
Distributed transactions: payment callback as an example
- In the payment callback flow, after the funds system calls back to trading, the order status is updated, inventory is deducted, and so on
- The funds system, as the initiator, guarantees retries so that the message gets delivered; trading and its downstream systems handle repeated calls idempotently
- Failed operations go into a task retry table and are retried through asynchronous compensation
- This avoids the intrusiveness of distributed transaction frameworks such as two-phase commit
Another scenario is the payment callback. After an order is paid, the payment system notifies the trading system, which then performs a series of operations such as updating the order status, deducting inventory, and issuing coupons. This is also a distributed transaction problem.
Our strategy is that when an operation fails, the request goes into a failure compensation table and is retried by continuous asynchronous compensation (at stepped intervals), which ensures eventual consistency.
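A failure compensation table with stepped retries could look roughly like the following Python sketch. The schedule in `RETRY_STEPS`, the field names, and the in-memory list standing in for a database table are assumptions for illustration; the talk does not give the real intervals or schema.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumed "stepped" schedule: delay in seconds before the 1st, 2nd, ... retry.
RETRY_STEPS = [5, 30, 120, 600]


@dataclass
class RetryTask:
    payload: str
    attempts: int = 0
    next_run_at: float = 0.0
    status: str = "PENDING"   # PENDING -> DONE, or FAILED once the steps run out


def record_failure(table: List[RetryTask], payload: str, now: float) -> RetryTask:
    """Insert a failed callback into the retry table, due after the first step."""
    task = RetryTask(payload=payload, next_run_at=now + RETRY_STEPS[0])
    table.append(task)
    return task


def run_due_tasks(table: List[RetryTask], handler: Callable[[str], bool], now: float) -> None:
    """One pass of the compensation job: retry every pending task that is due."""
    for task in table:
        if task.status != "PENDING" or task.next_run_at > now:
            continue
        task.attempts += 1
        if handler(task.payload):             # handler must be idempotent
            task.status = "DONE"
        elif task.attempts >= len(RETRY_STEPS):
            task.status = "FAILED"            # give up; leave for manual handling
        else:
            task.next_run_at = now + RETRY_STEPS[task.attempts]
```

In a real deployment a scheduler would call `run_due_tasks` periodically with the current time. Passing `now` in explicitly keeps the sketch deterministic and easy to test.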
Single-machine asynchronous parallelism: the shopping cart as an example
Analysis
- The shopping cart is a typical I/O-intensive application
- Code executes serially, with long synchronous wait times
- CPU utilization is low
The shopping cart is a typical I/O-intensive application, and there are many others like it that issue a lot of network I/O requests. The other point is that we are used to writing code serially, so a lot of time is spent waiting synchronously.
Every shopping cart query passes through many nodes. If two nodes do not depend on each other, can they run in parallel? In fact, every query corresponds to a dependency tree, and nodes in the same layer do not depend on each other, so within a layer we can run operations in parallel. We optimized along these lines, and the results were quite good.
Concretely, instead of each query waiting for the previous one to finish, independent queries are issued together and their results are merged at the end to produce the final response. The effect was good: overall RT dropped by more than half.
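The layer-by-layer idea can be shown with a small Python sketch. It uses `asyncio.gather` rather than whatever mechanism the Mogujie services actually use, and the service names (`items`, `shops`, `promotions`, `stock`), the dependency shape, and the 50 ms delay are invented for illustration.

```python
import asyncio
import time


async def fetch(name: str, delay: float = 0.05) -> str:
    """Stand-in for a remote call (item, shop, promotion, stock service...)."""
    await asyncio.sleep(delay)          # simulates waiting on network I/O
    return f"{name}-data"


async def load_cart_serial(item_ids):
    # Each call waits for the previous one, even where there is no dependency.
    items = await fetch(f"items{item_ids}")
    shops = await fetch("shops")
    promos = await fetch("promotions")
    stock = await fetch("stock")
    return {"items": items, "shops": shops, "promotions": promos, "stock": stock}


async def load_cart_parallel(item_ids):
    # Layer 1: everything else depends on the item data, so fetch it first.
    items = await fetch(f"items{item_ids}")
    # Layer 2: these three calls do not depend on each other, so run them concurrently.
    shops, promos, stock = await asyncio.gather(
        fetch("shops"), fetch("promotions"), fetch("stock")
    )
    return {"items": items, "shops": shops, "promotions": promos, "stock": stock}


def timed(coro_fn, *args):
    """Run a coroutine function to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn(*args))
    return result, time.perf_counter() - start
```

With four equal-latency calls, the serial version waits for roughly four delays and the layered version for roughly two, which is in line with the "more than half" RT reduction reported above when a layer contains several independent calls.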
Preprocessing & caching: the marketing pricing service as an example
- Use caching to reduce read pressure on the DB
- Cache as much data as is needed
- When DB data changes, actively invalidate the cache (asynchronously, with low latency) to reduce inconsistency
- Turn on the local cache before peak periods and warm up the cache to improve the hit rate
- Preprocessing takes part of the load off the pricing interface: discounts for products enrolled in promotions are synchronized to the product list in advance
Preprocessing and caching are also common optimization techniques. We use a multi-level cache strategy: local cache plus distributed cache. Reads go to the local cache first; on a miss they go to the distributed cache; and if the distributed cache also misses, they go to the DB. When data changes, a separate system refreshes the cache asynchronously so that cached data is updated promptly.
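The read path and invalidation just described can be modelled as below. This is a sketch under simplifying assumptions: plain dictionaries stand in for the local cache, the distributed cache, and the DB, and the class and method names are hypothetical.

```python
class TwoLevelCache:
    """Read path: local cache -> distributed cache -> DB. Dicts stand in for all three."""

    def __init__(self, remote: dict, db: dict):
        self.local = {}          # per-process cache
        self.remote = remote     # stand-in for a shared distributed cache
        self.db = db             # stand-in for the database
        self.db_reads = 0        # counts how often a read fell through to the DB

    def get(self, key):
        if key in self.local:
            return self.local[key]
        if key in self.remote:
            value = self.remote[key]
        else:
            self.db_reads += 1
            value = self.db[key]
            self.remote[key] = value     # back-fill the distributed cache
        self.local[key] = value          # back-fill the local cache
        return value

    def on_db_change(self, key):
        """Invalidate both levels; a real system would trigger this asynchronously."""
        self.remote.pop(key, None)
        self.local.pop(key, None)

    def warm_up(self, keys):
        """Pre-load hot keys before a peak to raise the hit rate."""
        for key in keys:
            self.get(key)
```

One thing the sketch leaves out: with several application nodes, each node has its own local cache, so an invalidation has to reach every node (or local entries need a short expiry) for the change to become visible everywhere.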
Service SLA guarantees
SLA: Service Level Agreement, a set of requirements on the service provider. An SLA is expressed as constraints on capacity (QPS), performance (RT), and degree (distribution, availability, error rate). Some of the ways to improve SLA are:
- Basic monitoring comes first: monitor the key indicators
- Dependency governance and logic optimization: remove unnecessary dependencies
- Load balancing, service grouping, rate limiting
- Degradation plans, disaster recovery, load testing, online drills
This is our internal monitoring system. We monitor key metrics for each application and look at the whole call chain.
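Of the tools listed above, rate limiting is the easiest to show in a few lines. The talk does not say which algorithm Mogujie uses; the sketch below picks a token bucket as one common choice, and the class name, parameters, and injectable clock are assumptions for illustration.

```python
import time


class TokenBucket:
    """Token-bucket limiter: allows bursts up to `capacity`, sustained `rate` per second."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate              # tokens added per second (sustained QPS)
        self.capacity = capacity      # maximum burst size
        self.clock = clock            # injectable so tests can control time
        self.tokens = capacity
        self.updated_at = clock()

    def allow(self) -> bool:
        now = self.clock()
        elapsed = now - self.updated_at
        self.updated_at = now
        # Refill for the time that has passed, never beyond the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                  # caller should reject or degrade the request
```

A service would typically call `allow()` at the entry point of an interface and return a degraded response when it yields `False`, which ties rate limiting to the degradation plans in the same list.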
Summary and next steps
Summary
- Service architecture is not static; it evolves as the business evolves
- There is no best solution, only the right solution for the right scenario
What we are working on now
- Service governance and systematic SLA assurance
Next step
- Same-city and cross-region active-active deployment
Q&A
Question: Suppose a consumer receives a message and tells the system the message can be deleted, and then, while my backend is still processing it (say, doing a warehouse-entry operation or something else), the service dies. Have you run into this, and how do you handle it?
Pan Fujiang: Not so far, because this really requires cooperation. The downstream systems need to cooperate: they ACK the message only after their own business operation has succeeded. It mainly comes down to coordination among the systems.
Question: Most e-commerce distributed transaction solutions use message queues. Is there a more general solution? For example, developing a set of distributed components, such as a two-phase protocol, to solve this kind of distributed transaction more efficiently.
Pan Fujiang: Distributed transactions depend on the scenario. Alipay (I used to work there) has a framework of that kind, but it is heavier, has a certain integration cost, and requires some cooperation. It is better to analyze the problem case by case. In some scenarios the consistency requirements are not that strict, and there is no need for a two-phase protocol. It mainly depends on the business scenario.
Question: When you migrate a database, can you do it smoothly? I saw that the middleware you used was switched twice, so there must have been some smooth online database migration. How did you do it?
Pan Fujiang: Internally we have a set of data synchronization tools, plus a switch system for gradual (gray) cutover. The switch system can push values dynamically into your application, so you can change them at runtime. Our data synchronization tool also supports backtracking, so in an emergency you can switch back quickly and trace the data back.
Question: After sharding, you still have to keep the old database alongside the new one, because the rollout is gradual. The old database is still running and people are still using it, but since you are in the middle of the release, half of the traffic has already been switched to the new database. What if someone modifies data at that point?
Pan Fujiang: As I said, there is a channel between the old and new databases (a data synchronization channel), and the synchronization tool runs continuously. Data from the old database is synchronized to the new one in real time, and the gray percentage is pushed dynamically through the switch system, also in real time.
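A dynamically pushed gray switch of the kind described in these answers might be modelled as follows. The sketch covers only the traffic-routing side; the data synchronization channel and its backtracking are not modelled. The class name, the percentage semantics, and the CRC32 bucketing are assumptions, not details from the talk.

```python
import zlib


class GraySwitch:
    """Holds a percentage an operator can change at runtime (0 = all old, 100 = all new)."""

    def __init__(self, percent_new: int = 0):
        self.percent_new = percent_new

    def push(self, percent_new: int) -> None:
        """Stand-in for the switch system pushing a new value into the application."""
        if not 0 <= percent_new <= 100:
            raise ValueError("percent_new must be between 0 and 100")
        self.percent_new = percent_new

    def target(self, user_id: str) -> str:
        # A stable hash keeps each user on the same side for a given percentage.
        bucket = zlib.crc32(user_id.encode("utf-8")) % 100
        return "new_db" if bucket < self.percent_new else "old_db"
```

Rolling back in this model is just `push(0)`, which sends all traffic to the old database again. It is only safe because, as described above, the synchronization tool keeps the two databases in step and can trace data back.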
Question: After sharding, how do you handle queries that join tables?
Pan Fujiang: We do not really have join queries, and we do not recommend them. Joins are a serious obstacle to later horizontal splitting and make it hard to scale the DB. You can query the tables separately and do the association in the application layer.
Question: If we query separately, will performance be affected?
Pan Fujiang: Performance may be affected to some extent, since you make a few more trips to the database, but DB scalability improves a great deal. The internet deals in large volumes of data, and by comparison DB scalability matters more. There are many ways to optimize at the application layer (such as caching), whereas if you use JOIN it is very hard to split horizontally later.
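Doing the association in the application layer, as suggested in these answers, usually means two single-table queries followed by an in-memory lookup. The Python sketch below uses lists of dictionaries in place of real tables; the table and column names (`trade_order`, `item`, `buyer_id`, `item_id`, `title`) are hypothetical.

```python
def fetch_orders(buyer_id, order_table):
    """Stand-in for: SELECT ... FROM trade_order WHERE buyer_id = ?"""
    return [o for o in order_table if o["buyer_id"] == buyer_id]


def fetch_items_by_ids(item_ids, item_table):
    """Stand-in for one batched query: SELECT ... FROM item WHERE item_id IN (...)"""
    return {i["item_id"]: i for i in item_table if i["item_id"] in item_ids}


def orders_with_items(buyer_id, order_table, item_table):
    orders = fetch_orders(buyer_id, order_table)          # query 1
    item_ids = {o["item_id"] for o in orders}
    items = fetch_items_by_ids(item_ids, item_table)      # query 2, batched by ID
    # The "join" happens here, in application memory; missing items yield None.
    return [
        {**o, "item_title": items.get(o["item_id"], {}).get("title")}
        for o in orders
    ]
```

Each query touches a single table by its own key, so the two tables can live in different databases or be sharded independently, at the cost of one extra round trip.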
Question: Before you split the database, did you consider a good rollback plan?
Pan Fujiang: Our data synchronization tool supports backtracking. If something goes wrong, the switch system can switch back immediately and the data can be traced back, which keeps the impact within a controllable range.
The slides (PPT) for this salon talk are available at the following link:
http://pan.baidu.com/s/1nvnOEBf
Copyright notice
This article was created by [osc_x8s7voop]. Please include a link to the original when reposting. Thank you.