Patching feels good for a while, but decisive refactoring is what takes responsibility: on CRM distributed cache optimization

2022-06-21 13:04:00 InfoQ

01 

Current status of distributed cache usage


As everyone knows, distributed caching is a silver bullet for improving system performance. It is used heavily in large distributed systems and plays an indispensable role in their performance. In recent years, distributed and center-based architectures have swept the IT world. Guided by the principle of high cohesion and low coupling, CRM has gradually evolved into a multi-center distributed deployment architecture. As the business grows more complex, interactions between centers multiply, and the system places higher demands on PaaS platform components. Unsurprisingly, distributed caching also faces many challenges in production.

02 

The conflict between ideal and reality


The original intention of introducing a distributed cache into a business system is to improve performance and give users a convenient, simple, and responsive experience. To reap the benefits quickly and speed up adoption during system construction, many systems start with lazy loading (load on read): when the data is not in the cache, it is first read from the DB and then loaded into the distributed cache. This pattern is common in the early stages of many systems.
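The load-on-read pattern described above can be sketched roughly as follows. This is a minimal illustration, not the CRM implementation: the class name `LazyCache`, the `load_from_db` callback, and the in-memory dictionary standing in for the distributed cache are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional


class LazyCache:
    """Load-on-read wrapper around a key/value store (hypothetical sketch)."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]):
        # A plain dict stands in for the remote distributed cache.
        self._store: Dict[str, str] = {}
        self._load_from_db = load_from_db
        self.db_reads = 0  # counts how often the DB had to be consulted

    def get(self, key: str) -> Optional[str]:
        # Cache hit: return without touching the DB.
        if key in self._store:
            return self._store[key]
        # Cache miss: read from the DB first, then load into the cache.
        self.db_reads += 1
        value = self._load_from_db(key)
        if value is not None:
            self._store[key] = value
        return value
```

Nothing is loaded until the first read, which is why this mode is quick to introduce; the trade-off is that the first reader of each key pays the DB cost.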

As time passes and business demands keep changing, teams find that a distributed cache alone is not enough for complex business requirements. The application server is therefore added as a second-level (L2) cache to reduce the number of round trips between the application and the remote distributed cache. After this adjustment, the system's cache architecture becomes a two-level model.
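A two-level read path of this kind might look like the sketch below. The names (`TwoLevelCache`, `evict_local`), the dictionary standing in for the remote cache, and the local time-to-live are assumptions for illustration; the article does not say how local entries expire.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TwoLevelCache:
    """In-process cache in front of a remote cache (hypothetical sketch)."""

    def __init__(
        self,
        remote: Dict[str, Any],
        load_from_db: Callable[[str], Optional[Any]],
        local_ttl_seconds: float = 30.0,  # assumed expiry policy
        clock: Callable[[], float] = time.monotonic,
    ):
        self._local: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)
        self._remote = remote  # a dict stands in for the distributed cache
        self._load_from_db = load_from_db
        self._ttl = local_ttl_seconds
        self._clock = clock
        self.remote_reads = 0  # counts round trips to the remote cache

    def get(self, key: str) -> Optional[Any]:
        # 1. Local in-process cache: no network round trip.
        entry = self._local.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]
        # 2. Remote distributed cache.
        self.remote_reads += 1
        value = self._remote.get(key)
        # 3. Fall back to the DB and backfill the remote cache.
        if value is None:
            value = self._load_from_db(key)
            if value is not None:
                self._remote[key] = value
        # Backfill the local cache so the next read stays in-process.
        if value is not None:
            self._local[key] = (self._clock() + self._ttl, value)
        return value

    def evict_local(self, key: str) -> None:
        # Drop the in-process copy so the next read goes to the remote cache.
        self._local.pop(key, None)
```

The sketch also hints at the problems listed next: every application node holds its own copy, so a refresh has to reach all of them.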

With this architecture, system performance improved greatly in many business scenarios, such as sales-product queries and sales-product relationship queries. But it also brought some cache usage problems:
  • Cache data is opaque: both the remote distributed cache and the local L2 cache are huge black boxes. There is currently no reliable visualization tool for viewing cache contents clearly. In particular, once the application layer has processed the keys and values, operations staff cannot look up the value for a cache key using a simple identifier.
  • Multi-center cache refresh: under the multi-center architecture, several centers may need the same set of configuration data. When a refresh is performed, how do we make sure every center's cache is refreshed? How do we detect missed refreshes promptly and handle them flexibly?
  • Cache data consistency: how do we keep the data in the remote distributed cache and the local L2 cache consistent with the DB? How do we detect inconsistencies promptly and raise alerts?

Set against the ideal performance gains that distributed caching brings to business systems, there are still pain points where technology meets business in day-to-day use, as well as practical problems in operations and maintenance. Resolving this conflict between ideal and reality is a big challenge. What do we do?

03 

Facing the challenge, giving an answer


In everyday life, as time passes and personal needs or household size change, housing facilities age and fall out of step with what we need, so a home has to be cleaned up or renovated from time to time. Interestingly, optimizing the CRM cache is much like renovating an old house.

Around "one vision, two goals, four improvements, seven measures", the refactoring and optimization of the cache modules in each CRM center is divided into two parts: the main-structure renovation, and the finishing and polishing.

04 

Main-structure renovation: cache refresh refactoring


Before the renovation -- disassembling existing functions


Sorting out the original functional logic before the renovation requires a certain understanding of how the existing business is supported. Take apart every component of the original pages, including buttons and forms, and record what each button does. Disassembly is a mechanical process; no judgment is needed at this point. Which parts need optimizing is decided in the next step. The inventory makes it easy later on to decide which business functions to keep and which logic to reorganize.

Design phase -- determining the overall style and overall design


After disassembling and sorting out the existing functions, we knew what already existed in the system, and found that the functions of the old pages were muddled, some logic was redundant, and some functions were defective. We then organized and summarized the inventory by business dimension and settled on an overall optimization plan. The overall items and plan for unified refresh were established as follows:
  • Build a unified cache management function across the CRM centers that operations staff find easy to understand and clear in purpose, supporting single-value refresh, batch refresh, and viewing cached values;
  • One-click refresh for CRM: cached data with associations, including associated specifications and instance templates, is refreshed together according to business requirements. With one-click refresh, operators no longer need to work out the logical order; they can run a full refresh by sales-product dimension or product dimension;
  • Traceable CRM cache refresh operations: cache information across the nodes of a distributed application can be inspected and compared, and refresh operation logs are retained, so that operations staff and developers can locate and trace problems;
  • A visual interface for CRM: processed KV values are displayed by structure, giving an intuitive, clear view of the specific values of cache objects in three places: the distributed cache, the local cache, and the database.
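To illustrate the one-click refresh item in the list above, the sketch below expands one refresh request into an ordered plan that includes associated cache data, so the operator does not have to work out the order. The association table and the cache-kind names (`sales_offer`, `product`, `spec`, and so on) are invented for the example and are not taken from the CRM system.

```python
from collections import deque
from typing import Dict, List

# Hypothetical association table: refreshing a kind also refreshes its children.
ASSOCIATIONS: Dict[str, List[str]] = {
    "sales_offer": ["product", "offer_rel"],
    "product": ["spec", "instance_template"],
}


def refresh_plan(root: str) -> List[str]:
    """Return the cache kinds to refresh, parents before children, no repeats."""
    order: List[str] = []
    seen = {root}
    queue = deque([root])
    while queue:
        kind = queue.popleft()
        order.append(kind)
        for child in ASSOCIATIONS.get(kind, []):
            if child not in seen:  # guards against cycles and duplicates
                seen.add(child)
                queue.append(child)
    return order
```

Under these assumptions, a refresh by sales-product dimension would pull in products, relationships, specifications, and instance templates in a single request.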

The redesigned unified refresh architecture is shown in the figure

Optimization, reconstruction and transformation


Step one: demolition -- eliminating unnecessary redundant logic
After the disassembly and design phases, we had identified more than 10 cache refresh paths scattered across the centers. With so many paths and pages, operations staff are bound to get confused. We therefore had to keep the best of the scattered cache management functions in each center and discard the rest: eliminate scattered pages and redundant business logic, and consolidate the page functions each center retains into a unified management page in the portal.

Step two: concealed works -- unifying the cache refresh mechanism with ZK-based command publishing and listening
In a distributed architecture, caches for data that needs higher performance and better availability are often designed as multi-level structures. When data is updated, you have to consider how to keep the in-process cache data on every host node consistent. To keep cache refreshes consistent, we adopted a new refresh mode. A cache refresh request is initiated from the refresh page. When the application receives the request, it generates a cache refresh command and writes it to a ZK node. Each center's back-end application watches that node in real time through the API provided by ZooKeeper. When a watcher detects that the ZK node data has changed, each application fetches the command from the node, parses it, and calls the cache refresh API the command specifies, thereby refreshing the in-process cache data of the local application node.
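The publish-and-listen flow can be sketched as below. To keep the example self-contained, `CommandNode` is an in-memory stand-in for a ZooKeeper node rather than a real client, and the JSON command format (`action`, `keys`) is an assumption; the article does not describe the actual command layout.

```python
import json
from typing import Any, Callable, Dict, List, Optional


class CommandNode:
    """In-memory stand-in for a ZK node that notifies watchers on data change."""

    def __init__(self) -> None:
        self._data = b""
        self._watchers: List[Callable[[bytes], None]] = []

    def watch(self, callback: Callable[[bytes], None]) -> None:
        self._watchers.append(callback)

    def set(self, data: bytes) -> None:
        self._data = data
        for callback in list(self._watchers):
            callback(data)


class AppNode:
    """One application process with its own in-process cache."""

    def __init__(self, name: str, node: CommandNode,
                 load_from_db: Callable[[str], Optional[Any]]):
        self.name = name
        self.local_cache: Dict[str, Any] = {}
        self._load_from_db = load_from_db
        # Map command actions to the local cache API they trigger.
        self._handlers = {"refresh": self._refresh, "evict": self._evict}
        node.watch(self._on_command)

    def _on_command(self, data: bytes) -> None:
        # Fetch and parse the command, then call the matching cache API.
        command = json.loads(data.decode("utf-8"))
        handler = self._handlers.get(command["action"])
        if handler is None:
            return  # unknown action: ignore rather than fail the watcher
        for key in command["keys"]:
            handler(key)

    def _refresh(self, key: str) -> None:
        self.local_cache[key] = self._load_from_db(key)

    def _evict(self, key: str) -> None:
        self.local_cache.pop(key, None)


def publish(node: CommandNode, keys: List[str], action: str = "refresh") -> None:
    """Write a cache command to the node; every watching application reacts."""
    node.set(json.dumps({"action": action, "keys": keys}).encode("utf-8"))
```

With a real ZooKeeper client the shape is similar, but classic watches fire only once and have to be re-registered after each notification, which this stand-in does not model.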

Step three: foundation works -- providing refresh interfaces by user role
The roles involved in cache operations include release staff, fault-handling operations staff, and specification data operators. Different roles have different operating habits. To make cache management more capable, we provide tailored refresh interfaces and logic for each role.
1. For operations or data-operation staff, we provide an interface for viewing and refreshing cached data by key.
2. For release staff, we only need to provide an interface that refreshes the KV entries corresponding to a table and field.
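For the release-staff interface, one way to picture the table-and-field entry point is a registry that translates changed rows into cache keys. The table names, field names, and key templates below are hypothetical; the article does not give the real mapping.

```python
from typing import Dict, List, Tuple

# Hypothetical registry: (table, field) -> cache key template.
KEY_TEMPLATES: Dict[Tuple[str, str], str] = {
    ("PROD_OFFER", "OFFER_ID"): "offer:{value}",
    ("PROD_OFFER_REL", "OFFER_ID"): "offer_rel:{value}",
}


def keys_for_rows(table: str, field: str, values: List[object]) -> List[str]:
    """Translate changed rows of a table into the cache keys to refresh."""
    template = KEY_TEMPLATES.get((table, field))
    if template is None:
        # Fail loudly so a release does not silently skip a cache refresh.
        raise KeyError(f"no cache mapping registered for {table}.{field}")
    return [template.format(value=v) for v in values]
```

Release staff then work in terms of the tables and fields they changed, while the key-based interface for operations staff can reuse the same keys directly.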

Step four: standard fit-out -- cache visualization, cache inspection, and cache operation logs
To solve cache visualization, cached content must be presented in a structured way. In a business system, the cached data may be a single KV mapping or a complex business object. To give operations staff a visual display, we therefore wrap the key and value internally before exposing them. Take the cache structure for a product's configuration data as an example: it is composed of cache objects from several sub-tables.

Once the structures were sorted out, we added structured presentation of cached data to the page, making the results clearer and more intuitive.
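As a rough illustration of structured presentation, the sketch below turns a nested cache object into indented lines that a page could display. The `render` function and the sample product layout (`base`, `specs`) are assumptions; the real sub-table structure is not given in the article.

```python
from typing import Any, List


def render(value: Any, indent: int = 0) -> List[str]:
    """Flatten a nested cache object into indented display lines."""
    pad = "  " * indent
    lines: List[str] = []
    if isinstance(value, dict):
        for key, item in value.items():
            if isinstance(item, (dict, list)):
                lines.append(f"{pad}{key}:")
                lines.extend(render(item, indent + 1))
            else:
                lines.append(f"{pad}{key}: {item}")
    elif isinstance(value, list):
        for index, item in enumerate(value):
            if isinstance(item, (dict, list)):
                lines.append(f"{pad}[{index}]")
                lines.extend(render(item, indent + 1))
            else:
                lines.append(f"{pad}[{index}] {item}")
    else:
        lines.append(f"{pad}{value}")
    return lines


# Hypothetical product configuration object built from two sub-tables.
product = {
    "base": {"id": 1001, "name": "Broadband"},
    "specs": [{"code": "speed", "value": "100M"}],
}
```

Calling `"\n".join(render(product))` gives a tree-like view in which each sub-table appears as its own indented group.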

After a cache refresh completes, the operation may report success while the cached data itself still cannot be seen. To address this, a visual inspection function is also needed. After the operation, a request can be initiated from the page; the back end calls each application node in turn by key, fetches the cache value for that key for auditing, compares whether the local cache matches the database, flags application nodes whose data is inconsistent, and assembles the data values of the local cache objects.
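The inspection loop might be sketched as follows. Here each application node is represented by a plain dictionary of its local cache, which is an assumption standing in for a remote call to that node; the function names are also invented for the example.

```python
from typing import Any, Dict, List


def inspect_key(key: str, db_value: Any,
                node_caches: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Compare one key on every node against the database value."""
    report = []
    for node_name, cache in sorted(node_caches.items()):
        cached = cache.get(key)  # None means the node holds no value
        report.append({
            "node": node_name,
            "cached": cached,
            "consistent": cached == db_value,
        })
    return report


def inconsistent_nodes(report: List[Dict[str, Any]]) -> List[str]:
    """Names of the nodes whose local cache differs from the database."""
    return [row["node"] for row in report if not row["consistent"]]
```

Each row carries both the flag and the cached value, so a page can show which nodes differ and what they currently hold.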

The inspection results and the difference comparison are visualized as shown below:

On-Site Inspection

Data difference comparison

05 

Finishing and polishing: improving how the cache engine is used


Once the main structure of the old house has been renovated, it basically meets the standard for moving in. But the furniture and belongings the owner kept from before the renovation still need to be organized and arranged properly, so that the house is truly move-in ready. As the saying goes, "three parts carving, seven parts polishing". After optimizing and refactoring the main functions, the common core functions also need refining and polishing, which improves the efficiency of cache usage:

Refining and encapsulating atomic capabilities

Wrap part of the cache API as atomic operations and document each API in detail, refining them continuously, to standardize and simplify use by later developers.

Cache compensation and cache interception mechanisms

To keep the system from producing dirty data, abnormal data must be both prevented and cleaned up. For missing cache values, a compensation mechanism is needed; for abnormal values, a cache verification and interception mechanism was added.
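One possible shape for the two mechanisms is sketched below: writes pass through a validation check (interception), and reads that miss are reloaded from the DB and backfilled (compensation). The class `GuardedCache`, the `is_valid` callback, and the validation rule used in the usage note are assumptions for illustration.

```python
from typing import Any, Callable, Dict, List, Optional


class GuardedCache:
    """Cache with write-side interception and read-side compensation (sketch)."""

    def __init__(self, load_from_db: Callable[[str], Optional[Any]],
                 is_valid: Callable[[Any], bool]):
        self._store: Dict[str, Any] = {}
        self._load_from_db = load_from_db
        self._is_valid = is_valid
        self.rejected: List[str] = []  # keys whose values were intercepted

    def put(self, key: str, value: Any) -> bool:
        # Interception: refuse to cache values that fail verification.
        if not self._is_valid(value):
            self.rejected.append(key)
            return False
        self._store[key] = value
        return True

    def get(self, key: str) -> Optional[Any]:
        if key in self._store:
            return self._store[key]
        # Compensation: the value is missing, so reload it from the DB.
        value = self._load_from_db(key)
        if value is not None:
            # An invalid DB value is still returned, just not cached.
            self.put(key, value)
        return value
```

For example, `GuardedCache(db.get, lambda v: isinstance(v, dict) and "id" in v)` would reject any cached object that lacks an `id` field, and the `rejected` list gives operations staff something to review.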

A distinctive feature -- seamless cache switching between grayscale and production

CRM uses a grayscale environment so that some users can trial a version upgrade internally, which largely avoids the risk of production failures caused by version problems. In the architecture design, the grayscale cache is isolated from the production cache. But this approach brings a problem: when business started during grayscale verification has not been completed in time, its data cannot be read in production after the grayscale version is switched to production. We therefore reworked the application's read strategy so that production and grayscale share one set of instance data caches (users, orders, fees, and so on). Orders placed in the grayscale environment can then be queried and continued in the production environment. After the system is switched, there is no need to place a new order, which achieves seamless switching between grayscale and production.
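One way to express this read strategy is through cache key namespaces: instance data goes to a namespace shared by both environments, while everything else stays isolated per environment. The namespace names, the `INSTANCE_KINDS` set, and the key format below are assumptions; the article does not say how the sharing is implemented.

```python
# Hypothetical set of instance data kinds shared by grayscale and production.
INSTANCE_KINDS = {"user", "order", "fee"}


def cache_key(env: str, kind: str, ident: str) -> str:
    """Build a cache key; instance data ignores the environment."""
    namespace = "shared" if kind in INSTANCE_KINDS else env
    return f"{namespace}:{kind}:{ident}"
```

Under this scheme, `cache_key("gray", "order", "42")` and `cache_key("prod", "order", "42")` resolve to the same entry, so an order started in grayscale stays readable after the switch, while configuration data such as `cache_key("gray", "offer_config", "1001")` remains separate.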

06 

Closing remarks


Different businesses have different performance requirements at different times, and when those requirements cannot be met, the demands on system design differ too. That is a mouthful; in short, as a system continuously evolves to support the business, problems arise from inadequate early design or inadequate conditions. Such problems can be handled cautiously, treating each symptom as it appears, or drastically, cutting the problem out at the root. For problems that will keep lingering if left uncured, patching may feel good for a while, but decisive refactoring is the only way out.