
Customer case | China Law Network greatly shortens fault-location time with Observation Cloud

2022-07-07 22:05:00 InfoQ

Case overview
 
 
Chengdu Hualv Network Service Co., Ltd. (China Law Network for short, www.hualv.com) was founded in 2004 and has spent 18 years working at the intersection of the Internet and the legal profession. It is the largest one-stop online legal service platform in China. As of January 2022, the platform had 160 million registered users, 400,000 registered lawyers (two-thirds of all lawyers in the country), 15,000 registered law firms, and more than 20 million new-media followers. It receives 11 million daily unique visitors and 165,000 consultations per day, and has handled 400 million consultations in total. It hosts more than 900 million pieces of high-quality legal knowledge content, and its law-popularization content reaches 5 billion views a year. Since its founding, the cumulative number of registered enterprises has exceeded 4 million, with more than 300,000 registered annually.
 
Case highlights
· Observability best practices for a large portal site
· RUM + container monitoring + APM + log analysis + visualization dashboards: a full-featured, integrated observability experience
· Real-time, fast fault location
· SaaS delivery with pay-as-you-go billing for cost optimization
 
 
1. Please briefly introduce your company
 
Chengdu Hualv Network Service Co., Ltd. (China Law Network for short) was founded in 2004 and has spent 18 years working at the intersection of the Internet and the legal profession. Building on four core functions, 「law popularization services」, 「consulting services」, 「non-litigation legal services」, and 「offline lawyer matching services」, it provides one-stop legal services for individuals, enterprises, and governments. With the mission to 「let everyone enjoy inclusive, high-quality, and efficient legal services」, we are committed to becoming a world-renowned one-stop legal service platform.
 
As of January 2022, the platform had 160 million registered users, 400,000 registered lawyers (two-thirds of all lawyers in the country), 15,000 registered law firms, and more than 20 million new-media followers. It receives 11 million daily unique visitors and 165,000 consultations per day, and has handled 400 million consultations in total. It hosts more than 900 million pieces of high-quality legal knowledge content, and its law-popularization content reaches 5 billion views a year. Since its founding, the cumulative number of registered enterprises has exceeded 4 million, with more than 300,000 registered annually. The company owns three nationally certified high-tech enterprises. Its main honors include certification in 2021 as one of the first seed-stage fledgling eagle enterprises, provincial gazelle enterprise certification in 2021, recognition in 2019 as a Chengdu above-scale enterprise, selection in 2019 as one of the first gazelle enterprises in the high-tech zone, the award for most influential company in China's e-commerce industry, and the award for best investment value in the mobile Internet industry. It is currently a member of the Sichuan Internet Industry Federation and of the Chengdu Internet Industry Federation.
 
2. What was the thinking behind building your monitoring platform?
 
Ever since China Law Network began building its IT business systems, we have been exploring how to observe the whole system more efficiently and keep accurate control over the running state of software and hardware components at every level. At that time the concept of observability was not yet mature in China. Weighing theoretical foundations, technology maturity, implementation cost, and the team's technical growth, we mainly used open source components such as Prometheus + Grafana to monitor and analyze the existing systems. The monitoring dimensions were designed around our entire business platform, but, limited by the capabilities of the tools themselves, in practice we only monitored a subset of indicators: basic VM-level and Pod-level metrics.
 


 
 
Monitoring of this kind is fairly basic. In the early days, when the system was small and the applications were not very complex, it was sufficient. As China Law Network's business kept expanding and user traffic kept growing, the way we built applications also had to keep pace with the business, and we gradually migrated to containers and microservices. Troubleshooting with basic monitoring alone then became a stretch. For example, when we received a fault report from a customer, all we could do with our tools was check infrastructure resource usage to see whether hard indicators such as CPU or memory looked wrong. If nothing was abnormal, we checked whether the cloud vendor's services had raised any alarms. If there was still no clue, we could only fall back on the most primitive fault analysis method: logs plus reproduction.
 
In a distributed microservice environment, set aside for the moment whether the fault can be reproduced, whether newly added logs can capture it accurately, and whether the log output happens to contain enough diagnostic information. The first problem we need to solve is identifying the fault point: which service failed? To 「whom」 should we add locating logs?
 
When a failure occurred, being unable to locate the fault point effectively was always the biggest pain point affecting how quickly we could restore online business and repair faults. The whole team kept exploring new solutions to this problem.
 
3. How did Observation Cloud come to your attention?
 
While exploring an overall monitoring solution for the new business platform, we settled on several key technical characteristics that the monitoring tool had to have:
· Comprehensive monitoring dimensions: the ability to collect data at every level of the system, especially application trace data, which fills in the information needed when a fault occurs;
· Support for distributed microservice tracing, since that is the business architecture we use today;
· Highly real-time data (to discover problems promptly) and low intrusion into business systems (to reduce adoption cost);
· If possible, automatic integration of metrics, self-built logs, and cloud-native service data, displayed together on one interface, or at least viewable within a single tool through simple operations. That way we would not have to switch back and forth between Grafana and the cloud monitoring console, which would improve efficiency.
Of these, distributed tracing (APM) is the most complex part, and it was also the starting point of our technology selection. APM is not a new concept, and many mature products are available on the market. Because our back-end services run on a mixed Java + .NET technology stack, the level and friendliness of .NET support was an important consideration.
 
We evaluated many solutions, including the open source SkyWalking, Jaeger, and Zipkin, as well as some commercial products. Apart from DataDog's dd-trace-agent, the other APM probes did not support .NET comprehensively and would have required a certain amount of development work. Our ideal tool therefore needed probe capabilities no weaker than dd-trace-agent (or the ability to reuse it directly) to achieve distributed end-to-end tracing, along with the other technical characteristics listed above, to improve our overall monitoring effectiveness. It was against this background that Observation Cloud came into view.
 
4. How does Observation Cloud help you locate faults quickly?
 
After settling on the direction for the new monitoring platform, we began trialing and evaluating Observation Cloud. First, in terms of product features, beyond traditional monitoring metrics and log data, Observation Cloud also supports access via dd-trace-agent, which provides application tracing. Its front-end plug-ins can also collect visit data from websites and mini programs. Second, through its data tagging system, all of the collected data can be displayed on a single set of interfaces.
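To illustrate why a shared tagging system makes a single view possible, here is a minimal Python sketch. It is not Observation Cloud's data model or API. The record shapes, the tag keys `service` and `host`, and the service names are all assumptions made up for this example. The idea is only that when metrics, logs, and traces carry the same tags, they can be grouped and read side by side.

```python
from collections import defaultdict

# Hypothetical records: each has a "kind" plus tags shared across data types.
records = [
    {"kind": "metric", "service": "consult-api", "host": "node-1",
     "name": "cpu_pct", "value": 91.0},
    {"kind": "log", "service": "consult-api", "host": "node-1",
     "message": "timeout calling redis"},
    {"kind": "trace", "service": "consult-api", "host": "node-1",
     "resource": "GET /ask", "duration_ms": 2300},
    {"kind": "log", "service": "lawyer-match", "host": "node-2",
     "message": "ok"},
]


def group_by_tags(records, tag_keys=("service", "host")):
    """Group heterogeneous records by the values of the shared tag keys."""
    grouped = defaultdict(lambda: defaultdict(list))
    for rec in records:
        key = tuple(rec.get(k) for k in tag_keys)
        grouped[key][rec["kind"]].append(rec)
    return grouped


view = group_by_tags(records)
for key, kinds in view.items():
    # One line per (service, host) showing which data types are available.
    print(key, sorted(kinds))
```

With this grouping, everything known about one service on one host (its metrics, its logs, its traces) sits under a single key, which is roughly what a unified interface needs in order to show them together.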
 
Now, when we run into a problem, for example when we find that some users' requests are slow, we can pinpoint the exact location in the trace. That location may be an interface, a cloud service such as Redis, or a database service. In the past, adding logs and going back and forth to find a problem could take several hours; now we can locate it in a few minutes.
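The idea of pinpointing the slow spot in a trace can be sketched as follows. This is an illustrative Python example, not the algorithm Observation Cloud or dd-trace-agent uses. The `Span` fields and the sample trace are hypothetical, and the sketch assumes that a span's children run one after another, so a span's own ("self") time is its duration minus the durations of its direct children.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]
    service: str
    resource: str
    duration_ms: int


def self_times(spans):
    """Return {span_id: self time in ms}: duration minus direct children."""
    child_total = {}
    for s in spans:
        if s.parent_id is not None:
            child_total[s.parent_id] = child_total.get(s.parent_id, 0) + s.duration_ms
    return {
        s.span_id: max(s.duration_ms - child_total.get(s.span_id, 0), 0)
        for s in spans
    }


def slowest_span(spans):
    """Return the span that spent the most time in its own work."""
    times = self_times(spans)
    return max(spans, key=lambda s: times[s.span_id])


# Hypothetical trace of one slow request: web -> API -> Redis and MySQL.
trace = [
    Span("a", None, "web", "GET /ask", 2400),
    Span("b", "a", "consult-api", "POST /consult", 2350),
    Span("c", "b", "redis", "GET session", 50),
    Span("d", "b", "mysql", "SELECT answers", 2200),
]

bottleneck = slowest_span(trace)
print(bottleneck.service, bottleneck.resource)
```

In this made-up trace the request takes 2400 ms end to end, but almost all of that is the database query, so the database span is reported rather than the entry interface that merely waited on it.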
 


 
 
Observation Cloud also provides a variety of prebuilt, out-of-the-box dashboards, so we can quickly see which services have problems, which interfaces have high response times, and which interfaces fail often. Our efficiency in handling problems has improved greatly.
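The questions such dashboards answer (which interfaces are slow, which fail often) come down to simple aggregations over request records. The Python sketch below shows one way to compute them. It is an assumption-laden illustration, not how the prebuilt dashboards are implemented: the record fields, the sample data, the nearest-rank percentile method, and the rule that HTTP status 500 and above counts as a failure are all choices made for this example.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(math.ceil(pct / 100 * len(ordered)), 1)
    return ordered[rank - 1]


def endpoint_stats(requests):
    """Aggregate request count, p95 latency, and error rate per endpoint."""
    by_endpoint = defaultdict(list)
    for r in requests:
        by_endpoint[(r["service"], r["resource"])].append(r)
    stats = {}
    for key, rows in by_endpoint.items():
        durations = [r["duration_ms"] for r in rows]
        # Assumption: a status code of 500 or above counts as a failure.
        errors = sum(1 for r in rows if r["status"] >= 500)
        stats[key] = {
            "count": len(rows),
            "p95_ms": percentile(durations, 95),
            "error_rate": errors / len(rows),
        }
    return stats


# Hypothetical request records for two endpoints.
sample = [
    {"service": "consult-api", "resource": "GET /ask", "duration_ms": 100, "status": 200},
    {"service": "consult-api", "resource": "GET /ask", "duration_ms": 120, "status": 200},
    {"service": "consult-api", "resource": "GET /ask", "duration_ms": 110, "status": 200},
    {"service": "consult-api", "resource": "GET /ask", "duration_ms": 2000, "status": 500},
    {"service": "lawyer-match", "resource": "GET /match", "duration_ms": 80, "status": 200},
    {"service": "lawyer-match", "resource": "GET /match", "duration_ms": 90, "status": 200},
]

stats = endpoint_stats(sample)
# List endpoints from slowest to fastest by p95 latency.
for key, s in sorted(stats.items(), key=lambda kv: kv[1]["p95_ms"], reverse=True):
    print(key, s)
```

Sorting by `p95_ms` or `error_rate` gives the "slowest interfaces" and "most frequently failing interfaces" views described above.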

 
This efficiency gain rests on the ability to collect real-time data at every level of the business system. Observation Cloud shows the data from the moment the fault occurred, helping us understand what happened 「at the time」 rather than trying to piece it together and reproduce it 「afterwards」. Reproducing a problem purely at the technical level is sometimes very difficult. The ability to collect data at every level of the business solves this well: all state information is recorded in an orderly, correlated way, which gives us something like footage of the accident scene and avoids wasted R&D effort on fault location.
 
In addition, I was very impressed by how quickly the Observation Cloud product team iterates. We were in fact quite early users of Observation Cloud, so early that the product was not yet called Observation Cloud and did not yet support K8S access (laughs). Deploying data sources was also somewhat inconvenient. However, the Observation Cloud product development team responded very quickly: through continuous iteration it fixed the problems found during the trial, implemented the feedback from users, and developed new features to support more complex business systems. By the time we officially went live, it had grown into a full-featured platform, and it continues to release updates to meet more needs.
 
5. What other suggestions do you have for Observation Cloud?
 
Observation Cloud is now in regular use in our production environment. For a product we will use long term, we also care about the balance between functionality and cost. Observation Cloud recently adjusted its billing strategy, which is very welcome. Observation Cloud may be the only observability tool we have seen in China that is delivered as SaaS and billed on a pay-as-you-go basis. In addition, the product keeps optimizing its collection strategy: we can now see more data, yet the overall cost of use appears to be lower. We hope more strategies like this will help us use the various observability features at the optimal cost.
 
Recently, our development and operations teams have been gradually exploring how to introduce service governance into China Law Network's microservice system. We hope Observation Cloud can offer more product features and best practices in the direction of service governance.
 
 
 
 
 
 
Authors | Fang Zhipeng, operations and maintenance technology expert, China Law Network
Zhang Tian, product technology expert, Observation Cloud
 
 
About Observation Cloud
 
Observation Cloud (www.guance.com) is a new-generation SaaS platform for full-link data observability. It is among the first products in China to receive the highest-level 「Advanced」 certification for 「Observability platform technology capability」 from the China Academy of Information and Communications Technology. It provides unified collection, unified tagging, unified storage, and a unified interface, delivering a full-featured, integrated observability experience. Observation Cloud can collect high-cardinality data across all environments, supports intelligent multi-dimensional retrieval and analysis, and offers powerful user-defined programmability, so that system operation stays under control and the root causes of faults have nowhere to hide.
 
To help technology enthusiasts better understand global technology trends, observability best practices, and cutting-edge material such as Observation Cloud product features, we have set up an official Observation Cloud community group as a platform for communication and interaction. If you have not joined yet, you can scan the QR code to add us on WeChat and join our technology community.
 


 
 
Original site

Copyright notice
This article was created by [InfoQ]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/188/202207071844216988.html