Six years of technology iteration, challenges and exploration of Alibaba's globalization and compliance
2022-07-01 13:44:00 【InfoQ】
1. Business development history
1.1 Business background

- The figure above shows the countries/regions currently covered by Alibaba's global business. The key countries/regions span three continents: Asia, Europe, and the Americas. Differences in business demands lead to clear differences in technical solutions. A single end-to-end technical solution cannot perfectly support every country/region, but layered combination with differentiation/customization has proven feasible in practice. This places a demand on the【standardization of the system】;
- The era of extensive growth has passed. In the era of refined operations, handling user experience and compliance regulation requires deploying technical solutions closer to users, which is the foundation of a localized experience. This places a demand on the【lightweight design of the system】;
- As the digital era deepens, digitalization and intelligence are influencing and changing every aspect of human society ever more profoundly. As a global business, whether our users come from developed or developing countries, our goal is always to let digitalization and intelligence help them live better lives. This places a demand on the【intelligence of the system】.
1.2 Iteration history of the technical system

- Stage one: based on the systems of the domestic Taobao, Tmall, search and recommendation, and other teams, a complete new e-commerce kernel system supporting Lazada was built within 6 months.
- Stage two: the e-commerce kernel system was customized accordingly, and a complete new e-commerce system supporting Daraz was built.
- Stage three: this e-commerce kernel was deeply integrated with the AE system, and excellent system solutions from the Taobao, Tmall, and other teams were introduced. The result could support internationalized local + cross-border transaction modes and was the embryonic form of the middle platform.
- Stage four: based on the integrated version above, Lazada, Daraz, and Tmall Taobao Overseas were merged, completing the four-in-one merge of the international middle platform's technology branches and finally forming the present global architecture of 1 middle platform supporting N sites.
- Stage five: the open-source strategy of the international middle platform began to land. After more than a year, full-link open source of the middle platform was completed in November 2021, and the global businesses and the middle platform each formed their own closed-loop iteration.
- Stage six: the future is already arriving; coming soon.
2. Challenges at the infrastructure level of globalization
- Global deployment: whether for user experience or for regulatory compliance, globally deployed infrastructure is a basic capability that a global business must build. The globally deployed infrastructure also directly determines many specific architectural forms of the global technology system, and building and maintaining it is itself a huge challenge.
- Performance: performance here refers to the latency of processing user requests; the shorter the delay between a user initiating a request and receiving the response, the better the performance. Global Internet services face an inherent latency challenge because physical distances are longer: the data center may be in the United States while the user is in Australia. Our test data shows that the typical network RTT for a US user requesting a US Internet service is within 10 ms, while the RTT for a Russian user requesting a US West data center ranges from 150 ms to 300 ms. This directly makes the user's full-screen load time about 1 second longer, and 1 second is enough to reduce the conversion rate or even lose users.
- Availability: serving global users also brings a cost challenge, which in turn brings a challenge for system availability. If availability is guaranteed only from a local perspective, we need to build dual data centers in each locality to ensure high availability, but then the idle resources of data centers in other regions cannot be used and the overall cost is very high. Our 7×24 availability requires a global perspective: if we can achieve remote disaster recovery across the globe, we can better safeguard availability for users at an acceptable cost.
- Data consistency: the data consistency challenge arises when data is shared by many users worldwide and many of them read and write it. For example, in the buy-globally, sell-globally scenario, buyers create orders in their local data center and sellers maintain orders in theirs. If it is the same order and the buyer and seller are in different data centers, how is read/write consistency ensured? When the global data centers serve as disaster recovery for each other, multi-site reads and writes also occur, and data consistency must again be ensured.
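The latency figure in the performance item can be sanity-checked with simple arithmetic. The sketch below is only an illustration: the RTT values come from the text, but the number of sequential round trips needed to load a screen (DNS, connection setup, requests) is an assumed value, not something the article states.

```python
def added_load_time_ms(rtt_near_ms: float, rtt_far_ms: float, round_trips: int) -> float:
    """Extra load time when each of `round_trips` sequential exchanges pays the larger RTT."""
    return (rtt_far_ms - rtt_near_ms) * round_trips


# RTT figures are from the text; 5 sequential round trips is an assumed value.
ROUND_TRIPS = 5
low = added_load_time_ms(10, 150, ROUND_TRIPS)
high = added_load_time_ms(10, 300, ROUND_TRIPS)
print(f"added load time: {low:.0f} ms to {high:.0f} ms")
```

Under this assumption the extra delay lands between 0.7 and 1.45 seconds, which is the same order of magnitude as the roughly 1 second quoted above.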
3. Cloud-based overseas expansion practice
3.1 Overseas deployment and disaster recovery
3.1.1 Alibaba Cloud infrastructure
- IaaS layer: relying on Alibaba Cloud's globally consistent infrastructure, we have built an overseas digital commerce infrastructure spanning 6 regions, 13 physical data centers, and 17 logical data centers (AZs) worldwide, enjoying elastic resource capacity without having to deploy and maintain data centers in multiple countries/regions ourselves.
- PaaS layer: relying on the global deployment of Alibaba Cloud's middleware and cloud products, we solve a series of technical challenges of globalization from the top down.

3.1.2 Global deployment architecture

- Network layer: DNS resolves the request to the nearest data center (IDC), and the request arrives at that data center's unified access layer.
- Access layer: a unified routing layer must be bridged in to correct requests so that user ownership is strongly consistent. The access layer calls the routing service, looks up the user's home region, and schedules the request across data centers so that the user is redirected to the data center they belong to.
- Service layer: for strongly consistent data such as payment and trading, the user ownership decided by the unified routing layer must be guaranteed. If the unified routing layer gets it wrong, the MSE layer must still call the service across data centers so that it is consumed in the data center the user correctly belongs to. For the consistency of shared data, cross-data-center service calls must also be extended to support reading and writing at the central data center. In short, the MSE layer must implement cross-data-center calls based on either user ownership or consumption at the central data center.
- Database layer: we extended its plug-in to implement a write-prohibition function, which is the last line of defense against user-ownership errors and for strong data consistency. If the user's home region does not match the region actually handling the call, the write is blocked to prevent dirty data from being written across regions.
- Data synchronization layer: data is synchronized bidirectionally between the central data center and the regional data centers, ensuring data consistency for remote disaster recovery and avoiding data loss after a user's region changes.
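The interplay between the access layer's ownership-based routing and the database layer's write prohibition can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: the ownership table, region names, and the `RegionalStore` class are all hypothetical.

```python
# Hypothetical ownership table: user id -> home region (data center).
USER_HOME_REGION = {"buyer-1": "sg", "seller-9": "de"}


class WriteForbiddenError(Exception):
    """Raised when a write is attempted outside the user's home region."""


def route(user_id: str, entry_region: str) -> str:
    """Access layer: pick the region that should serve this user.

    Unknown users stay in the region DNS sent them to.
    """
    return USER_HOME_REGION.get(user_id, entry_region)


class RegionalStore:
    """Database layer for one region, with a write-prohibition guard."""

    def __init__(self, region: str) -> None:
        self.region = region
        self.rows = {}

    def write(self, user_id: str, key: str, value: str) -> None:
        home = USER_HOME_REGION.get(user_id, self.region)
        if home != self.region:
            # Ownership mismatch: block the write instead of creating dirty data.
            raise WriteForbiddenError(f"{user_id} belongs to {home}, not {self.region}")
        self.rows[key] = value


stores = {"sg": RegionalStore("sg"), "de": RegionalStore("de")}

# Normal path: the request enters in "de" but is routed to the user's home region.
target = route("buyer-1", entry_region="de")
stores[target].write("buyer-1", "order-1", "created")

# Misrouted path: the routing step was skipped, so the guard rejects the write.
try:
    stores["de"].write("buyer-1", "order-2", "created")
    misrouted_write_blocked = False
except WriteForbiddenError:
    misrouted_write_blocked = True
```

The guard is deliberately redundant with the routing step: routing handles the common case, and the write prohibition only matters when an upstream layer has made a mistake.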

- 【Traffic coloring】Identify the request on the client side, determine which tenant the traffic belongs to, and color the traffic accordingly
- 【Precise location】Based on traffic coloring and the service routing capability of the access gateway layer, precisely locate the physical cluster where the tenant resides
- 【Link pass-through】Within a single service instance in the cluster, the tenant object must be passed through transparently, and tenant information must be preserved during synchronous and asynchronous interactions with downstream services
- 【Resource isolation】During execution of internal business logic, any operation on a resource must consider isolation, for example of configuration, data, and traffic
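The four steps above can be sketched in a few lines. This is a hedged illustration only: the host names, the `x-tenant` header, the cluster names, and the configuration values are hypothetical, and Python's `contextvars` stands in for whatever pass-through mechanism the real system uses.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tenant tables; names and values are illustrative only.
TENANT_BY_HOST = {"www.lazada.example": "lazada", "www.daraz.example": "daraz"}
CLUSTER_BY_TENANT = {"lazada": "cluster-a", "daraz": "cluster-b"}
CONFIG_BY_TENANT = {"lazada": {"page_size": 40}, "daraz": {"page_size": 20}}

_tenant = contextvars.ContextVar("tenant")


def color(headers: dict) -> dict:
    """Traffic coloring: tag the request with its tenant."""
    return {**headers, "x-tenant": TENANT_BY_HOST[headers["host"]]}


def locate_cluster(headers: dict) -> str:
    """Precise location: map the colored request to the tenant's cluster."""
    return CLUSTER_BY_TENANT[headers["x-tenant"]]


def tenant_config(key: str):
    """Resource isolation: read configuration only from the current tenant's scope."""
    return CONFIG_BY_TENANT[_tenant.get()][key]


def handle(headers: dict) -> dict:
    """Link pass-through: bind the tenant once, then carry it into sync and async work."""
    token = _tenant.set(headers["x-tenant"])
    try:
        sync_value = tenant_config("page_size")
        # Copy the context explicitly so the worker thread sees the same tenant.
        ctx = contextvars.copy_context()
        with ThreadPoolExecutor(max_workers=1) as pool:
            async_value = pool.submit(ctx.run, tenant_config, "page_size").result()
        return {"sync": sync_value, "async": async_value}
    finally:
        _tenant.reset(token)


request = color({"host": "www.daraz.example"})
cluster = locate_cluster(request)
result = handle(request)
```

Business code calls `tenant_config` without ever naming a tenant, which is the point of pass-through: isolation is enforced by the context rather than by each caller.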
3.1.3 Global disaster recovery solutions
- Region-level and network unavailability: the data center as a whole is unavailable; the public network entrance cannot reach the physical data center, or the physical data centers cannot interconnect.
- Service-level unavailability: external/internal network connectivity is normal, but the service is unavailable.
- Data-layer unavailability: the DB/cache is unavailable.

- Network disaster recovery: the user's first hop is an off-network route (if the residential network is abnormal, we basically have no room to act). For hops 2 through N, we can build the ability to switch network operators (multiple CDN vendors backing each other up), the ability to switch data center links (Region-level switchover), the ability to switch operators at the data center entrance (interconnection by the IDC network team), and other means to attempt disaster recovery.
- Access-layer disaster recovery: once traffic reaches the Alibaba Cloud data center and enters the internal gateway routing layer, traffic can be corrected in real time along dimensions such as user granularity and API granularity, taking effect within seconds. When the network and gateway products are normal, access-layer disaster recovery is the plan used most often in daily operation and rehearsed most often.
- Service-level disaster recovery: for some strongly centralized services, such as inventory and sales services that deduct in a single region, disaster recovery capability also needs to be built.
- Data-layer disaster recovery: for a multi-active architecture, on the basis of guaranteeing a single Master for the data, ensure that data is not dirtied during disaster recovery. For compliance scenarios, take into account that some Regions must not hold sensitive data, and provide compliant disaster recovery capability in limited scenarios.
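Access-layer traffic correction at user and API granularity might look something like the sketch below. The rule structure, the region names, and in particular the precedence order (user rule, then API rule, then region failover) are assumptions made for illustration; the article does not specify them.

```python
class TrafficCorrector:
    """Holds override rules that redirect traffic away from a user's home region."""

    def __init__(self) -> None:
        self.user_rules = {}    # user id -> region
        self.api_rules = {}     # API path -> region
        self.region_rules = {}  # failed region -> fallback region

    def resolve(self, user_id: str, api: str, home_region: str) -> str:
        """Return the region that should serve the request.

        Assumed precedence: user rule, then API rule, then region failover.
        """
        if user_id in self.user_rules:
            return self.user_rules[user_id]
        if api in self.api_rules:
            return self.api_rules[api]
        return self.region_rules.get(home_region, home_region)


corrector = TrafficCorrector()
before = corrector.resolve("u1", "/order/create", "sg")

# Region "sg" fails: move its traffic to "de", but pin one API and one user elsewhere.
corrector.region_rules["sg"] = "de"
corrector.api_rules["/search"] = "us"
corrector.user_rules["u2"] = "id"

after_region = corrector.resolve("u1", "/order/create", "sg")
after_api = corrector.resolve("u1", "/search", "sg")
after_user = corrector.resolve("u2", "/search", "sg")
```

Because the rules are plain lookups evaluated per request, a rule change applies to the very next request, which is consistent with the second-level effect described above.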
3.2 Global data compliance practice
3.2.1 Introduction to global compliance

3.2.2 Data compliance requirements and deployment architecture

3.2.3 Local storage solutions

3.3 Making the application architecture cloud native
3.1 Challenges faced by traditional application architecture

- The application architecture is unsustainable: under the rich-application delivery model, there is always a single point in the software production process, namely the application. As the content an application carries grows larger and more complex, it becomes the key factor limiting R&D efficiency and the biggest challenge to the sustainability of the whole international platform architecture.
- R&D delivery is uncertain: under the layered R&D model, the global platform and the businesses differ in purpose and in rhythm of change. Resolving the differences between the two makes the application itself bloated and eroded, which brings great uncertainty and unpredictability to daily R&D iteration.
- Operations capabilities lack standards: as application complexity increases, the matching operations capabilities also grow. The currently advocated DevOps philosophy has given rise to many related products and tools, but their standards are not unified, so they are scattered and miscellaneous with no unified product entry point, and the cost of operations and of understanding keeps rising.

- Container orchestration technology: cloud-native container orchestration evolves the traditional software delivery process into the combined delivery of multiple orchestrated containers. A single application delivery is split into multiple modules that are orchestrated and delivered flexibly, advancing the global application delivery system.
- Image-based deliverables: the application is no longer the only object of R&D. Instead, an image-based R&D system is built in which the immutability of images guarantees the determinism of what is delivered. Platform capabilities are also packaged as images, with their own independent and stable R&D system.
- Unified operations standards: drawing on cloud-native GitOps ideas such as IaC and OAM, a unified model converges and defines the operations standards for cloud applications. The business organization's SRE role is also redefined, so that the current state of application operations capabilities and resource usage can be queried, analyzed, and measured from a unified perspective.
3.2 Global cloud native architecture practice
3.2.1 Cloud-native application architecture
- IaC: provides a unified paradigm for declaring R&D infrastructure. To better decouple the platform from business dependencies and reduce the cost of understanding the platform, we defined a layered abstraction standard for the IaC of site applications. It defines infrastructure standards around globalization scenarios and converges specifications, log collection, probes, hooks, release strategies, and so on, which reduces the cost for services to adopt IaC.
- OAM: provides a unified application model definition. Relying on OAM's separation of development and operations concerns, its platform independence and high extensibility, and its modular application deployment and operations, we defined application-oriented standards for businesses and the platform. This better links application developers, operations staff, and application infrastructure, and makes the delivery and management of cloud-native applications more consistent.
- GitOps: provides continuous delivery capability for business R&D. Under the cloud-native GitOps declarative approach, external dependent components, from capability integration to operations control, are declared uniformly in the project. Businesses then only need to declare and define the capabilities they depend on according to the unified GitOps standard, and the delivery and control of component capabilities are handed to the underlying GitOps engine, improving the integrity and sustainability of the whole software system.
- ACK: provides a unified resource scheduling engine. Based on Alibaba Cloud's ACK container service, we use its container orchestration, resource scheduling, and automated operations to deliver different business module functions to different environments and, together with upper-layer traffic scheduling, achieve on-demand deployment and on-demand scheduling of businesses.
- Container orchestration: using ACK's flexible container orchestration, we upgraded the global application architecture. In the R&D state, business logic is completely isolated from infrastructure, platform capabilities, and common rich clients. In the running state, business processes and operations processes are relatively completely isolated through lightweight containers, improving overall R&D delivery efficiency and the stability of the business.

- Infrastructure container (Base Container): includes the infrastructure capabilities the application depends on, such as the operations container and the gateway container;
- Temporary container (Temporary Container): this container has no lifecycle of its own. Its role is to integrate its R&D artifacts into the main application container and the business containers through a shared directory under the Pod, so that the capability is integrated and used. It mainly consists of platform containers;
- Business container (Business Container): like the main application container, this container has a complete lifecycle and communicates with the main application over gRPC. It mainly consists of rich-client containers such as category and multilingual.
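One way to picture these three container types is as a single Kubernetes Pod, sketched below as a Python dictionary in the shape of a Pod manifest. Mapping the temporary container to a Kubernetes init container is an assumption, since the article does not say how it is implemented; the container names, image references, port, and paths are all hypothetical.

```python
import json

# Shared directory through which artifacts reach the other containers.
SHARED = {"name": "shared-artifacts", "mountPath": "/shared"}

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "site-app"},
    "spec": {
        "volumes": [{"name": SHARED["name"], "emptyDir": {}}],
        # Temporary container (assumed to be an init container): runs once,
        # copies its artifacts into the shared directory, then exits.
        "initContainers": [
            {
                "name": "platform-artifacts",
                "image": "registry.example/platform-artifacts:1.0",
                "command": ["sh", "-c", "cp -r /artifacts/. /shared/"],
                "volumeMounts": [SHARED],
            }
        ],
        "containers": [
            # Main application container: consumes the shared artifacts.
            {
                "name": "main-app",
                "image": "registry.example/main-app:1.0",
                "volumeMounts": [SHARED],
            },
            # Business container: full lifecycle, reached by the main app over gRPC.
            {
                "name": "business-category",
                "image": "registry.example/category:1.0",
                "ports": [{"containerPort": 50051}],
                "volumeMounts": [SHARED],
            },
            # Infrastructure (base) container: operations tooling next to the app.
            {"name": "base-ops", "image": "registry.example/ops-agent:1.0"},
        ],
    },
}

print(json.dumps(pod, indent=2))
```

Splitting the Pod this way lets each container image be built and released on its own schedule while the shared volume and the gRPC port remain the only coupling points.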
3.2.2 Cloud-native operation and maintenance system

- Application release: intelligent release decisions, in-place upgrade, rolling upgrade, batched release
- Elastic capacity: automatic scaling, scheduled scaling, CPUShare
- Batch operations: in-place restart of application containers, container replacement, log cleanup, JavaDump
- Lightweight containers: independent operations container, Sidecar orchestration
- Multi-container delivery and deployment: port conflicts, process conflicts, shared file directories
- Observability and stability: application lifecycle, startup failure diagnosis, console-based ("white screen") operation, container-level monitoring


4. Summary and outlook