Optimization of network request success rate in IM instant messaging software development
2022-07-28 16:35:00 【wecloud1314】
Because mobile networks are complex, building a high-quality mobile application that communicates over the network and still delivers a good experience is a big challenge, especially for instant messaging, which is highly sensitive to network quality.

Mobile networks typically show three characteristics:
1) While the user is moving, the network signal is unstable: high latency, jitter, packet loss, and narrow bandwidth;
2) While the user is moving, the network access type and the access point change frequently;
3) Mobile users use apps frequently and in fragmented sessions, and they are sensitive to data usage when not on Wi-Fi.
Because of these characteristics, mobile applications face a variety of complex and changing problems in network data communication.
However complex the technology behind it, ordinary users take it for granted that an app completes its network requests smoothly. In other words, the network request success rate directly determines the availability of an app's services, and it directly affects the quality of services such as data communication, video playback, ad display, and payment.
Factors that cause mobile network requests to fail
To optimize the success rate of mobile network requests, we first need to understand which links in the request chain can cause a request to fail.
The first category: factors that cannot be improved:
1) iOS network access control for the app, airplane mode, or no network connection. The app should detect and identify these three situations and prompt the user in an appropriate way;
2) Router failure.
The second category: factors that can be improved:
1) Weak cellular/Wi-Fi signal, and "false connection" states caused by connection congestion;
2) DNS failure (such as DNS hijacking);
3) Failure of the carrier's local node;
4) Failure of our own load balancing;
5) Business server failure: HTTP response errors, corresponding to the HTTP response error rate in APM;
6) Business logic errors: detected from the parsing results of subclasses, corresponding to the parse error rate monitored in APM.
For the factors that cannot be improved, at present we can only identify the fault type through network diagnosis and then guide the user to manually authorize network access or to connect to an available network.
If the router fails, the app can guide the user to restart the router or switch to 4G. Data monitoring from the iQIYI app shows that users spend roughly 3.8% of their time with no network connection, which shows how important a good no-network prompt is. Users also spend about 9% of their time on a weak cellular signal (0 or 1 bars), which again reflects the complexity of the mobile network environment.

For the factors that can be improved, the failures fall roughly into three categories:
1) Network-layer errors, corresponding to factors 1 to 4. These surface mainly as timeout errors;
2) HTTP response errors, corresponding to factor 5. The HTTP status code is 400 or above;
3) Parse errors, corresponding to factor 6. These are monitored through an overridable interface defined by the baseline network library.
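The three categories above can be told apart from the outcome of a single request. The following Python sketch is only an illustration of that rule, not the baseline network library's code: the names `FailureKind` and `classify` and the three input parameters are assumptions made for this example.

```python
from enum import Enum
from typing import Optional


class FailureKind(Enum):
    NETWORK = "network-layer error"        # factors 1 to 4
    HTTP_RESPONSE = "HTTP response error"  # factor 5
    PARSE = "parse error"                  # factor 6


def classify(transport_error: Optional[int],
             status_code: Optional[int],
             parsed_ok: bool) -> Optional[FailureKind]:
    """Return the failure category of one finished request, or None on success.

    transport_error: a system error code such as -1001 (NSURLErrorTimedOut),
                     or None if an HTTP response arrived.
    status_code:     the HTTP status code, or None if no response arrived.
    parsed_ok:       whether the business layer could parse the response body.
    """
    if transport_error is not None or status_code is None:
        # No usable HTTP response: timeout, DNS, TLS, connection failure, ...
        return FailureKind.NETWORK
    if status_code >= 400:
        return FailureKind.HTTP_RESPONSE
    if not parsed_ok:
        return FailureKind.PARSE
    return None


if __name__ == "__main__":
    print(classify(-1001, None, False))  # timed out before any response
    print(classify(None, 503, False))    # server answered with an error
    print(classify(None, 200, False))    # response arrived but could not be parsed
```

Checking the categories in this order means each failed request is counted once, under the earliest stage at which it went wrong.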
To improve the success rate of network requests, we first need to establish a monitoring system in which the baseline network library delivers network request data of various dimensions to APM. Only with APM statistics can we effectively discover the causes of network failures on the client and then solve them.
In addition, because the volume of network request data on the client is huge and storage space is limited, APM samples only 2% of users. Network requests for key services (for example, the home page) are therefore collected in full, so that the success rate optimization can be evaluated more objectively and comprehensively.
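One common way to implement "sample 2% of users, but collect key services in full" is to hash the user ID into a stable bucket. The sketch below assumes that approach; the article does not say how APM actually chooses its sample, and the service name `home_page` and the function names are made up for the example.

```python
import hashlib

SAMPLE_PERCENT = 2               # from the text: APM samples 2% of users
FULL_COLLECTION = {"home_page"}  # hypothetical identifiers of key services


def user_bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range [0, 100)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def should_report(user_id: str, service: str) -> bool:
    """Key services always report; other services report only for sampled users."""
    if service in FULL_COLLECTION:
        return True
    return user_bucket(user_id) < SAMPLE_PERCENT


if __name__ == "__main__":
    print(should_report("user-42", "home_page"))      # always True for a key service
    print(should_report("user-42", "play_record"))    # depends on the user's bucket
```

Hashing keeps the decision stable: the same user is either always in the sample or never in it, so per-user success rates are not distorted by partial data.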
The baseline network library provides different compensation strategies for different services
Before optimization, APM's classified analysis showed that the main cause of request failure was timeout (-1001), which accounted for 90% of failures, followed by SSL errors and DNS resolution errors. Based on this data, retrying became the most important measure for optimizing the request success rate.
After continued exploration and practice, the baseline network library now provides four retry methods for different business needs.
1) Direct IP retry. The number of retries is controlled by the number of configured direct-connection IPs:
The scheme is unchanged and the host is replaced with an IP address (which eliminates the domain name resolution risk). Because this requires each business line to provide direct-connection IPs, the services that currently use it are mainly those that must use HTTPS, such as login.
2) Super pipeline retry, configurable for 1 to 3 retries:
This is an HTTP-based gateway proxy service (similar to a remote Charles proxy). The host is replaced with the proxy IP (which eliminates the domain name resolution risk) and the scheme is changed to HTTP (which eliminates the SSL risk; h2 is downgraded to HTTP/1.1). Because this measure incurs traffic costs, the services that currently use it are all key core services, such as the home page.
3) HTTP retry, configurable for 1 to 3 retries:
The scheme is changed to HTTP (which eliminates the SSL risk; h2 is downgraded to HTTP/1.1) and everything else is unchanged. Because it is broadly applicable, it is currently used by general services that are not key core services.
4) Original URL retry, configurable for 1 to 3 retries:
The scheme, host, and everything else are unchanged. This is retry as it is commonly understood: simply send the request again. It is not currently recommended, and its use is left to the discretion of the business team.
Besides retrying several times with a single method, the baseline network library also supports combining multiple retry methods. The priority of the four methods is: direct IP retry > super pipeline retry > HTTP retry > original URL retry. Excluding no-network cases, the success rate of the home page recommendation CARD interface reached 99.76% with retries by the end of the first quarter of 2020.
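The four retry methods and their priority can be sketched as a function that turns one URL into an ordered list of attempts. This is a simplified Python illustration, not the baseline network library's implementation: `RetryConfig`, `rewrite`, `build_attempts`, and all addresses are hypothetical, and how the super pipeline proxy is actually addressed is an assumption. A real client would also have to carry the original host name along with IP-based attempts, which this sketch omits.

```python
from dataclasses import dataclass, field
from typing import Dict, List
from urllib.parse import urlsplit, urlunsplit


@dataclass
class RetryConfig:
    direct_ips: List[str] = field(default_factory=list)  # one retry per configured IP
    pipeline_proxy_ip: str = ""   # hypothetical super pipeline gateway address
    pipeline_retries: int = 0     # 0 disables; the text allows 1 to 3
    http_retries: int = 0         # 0 disables; the text allows 1 to 3
    original_retries: int = 0     # 0 disables; the text allows 1 to 3


def rewrite(url: str, scheme: str = "", host: str = "") -> str:
    """Return url with its scheme and/or host replaced; other parts are kept."""
    parts = urlsplit(url)
    return urlunsplit((scheme or parts.scheme, host or parts.netloc,
                       parts.path, parts.query, parts.fragment))


def build_attempts(url: str, cfg: RetryConfig) -> List[Dict[str, str]]:
    """The first request, followed by retries in the priority order from the text."""
    attempts = [{"kind": "first", "url": url}]
    # 1) Direct IP retry: scheme unchanged, host replaced by an IP.
    for ip in cfg.direct_ips:
        attempts.append({"kind": "ip_direct", "url": rewrite(url, host=ip)})
    # 2) Super pipeline retry: host replaced by the proxy IP, scheme forced to HTTP.
    if cfg.pipeline_proxy_ip:
        for _ in range(cfg.pipeline_retries):
            proxied = rewrite(url, scheme="http", host=cfg.pipeline_proxy_ip)
            attempts.append({"kind": "super_pipeline", "url": proxied})
    # 3) HTTP retry: only the scheme changes.
    for _ in range(cfg.http_retries):
        attempts.append({"kind": "http", "url": rewrite(url, scheme="http")})
    # 4) Original URL retry: nothing changes.
    for _ in range(cfg.original_retries):
        attempts.append({"kind": "original", "url": url})
    return attempts


if __name__ == "__main__":
    cfg = RetryConfig(direct_ips=["203.0.113.10"],
                      pipeline_proxy_ip="198.51.100.20",
                      pipeline_retries=1, http_retries=1)
    for attempt in build_attempts("https://api.example.com/v1/card?page=1", cfg):
        print(attempt["kind"], attempt["url"])
```

Building the whole plan up front keeps the priority order in one place, so a business line only has to choose which methods to enable and how many times.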
Other factors that affect the success rate of mobile network requests
Besides retrying, the following factors also directly affect the success rate of network requests.
1) HTTP/2 vs HTTP/1.1. The recommended strategy is to use H2 for the first request and HTTP/1.1 when retrying after a failure:
The biggest improvement of HTTP/2 over HTTP/1.1 is that requests share a single TCP connection. When the network is good, this gives HTTP/2 a speed advantage. But when the network deteriorates, TCP's strict in-order delivery means that one lost packet blocks all the requests behind it. The single TCP connection therefore makes congestion worse and raises the chance that requests fail.
NSURLSession negotiates the HTTP protocol version automatically, so the app cannot directly control whether a request uses HTTP/2 or HTTP/1.1. However, because in practice the industry requires HTTP/2 to run over HTTPS, changing the URL scheme to HTTP indirectly downgrades the request to HTTP/1.1.
2) A proper timeout setting is an important factor:
The NSURLSession request timeout is effectively a TCP-level timeout; it is not a limit on the total time a request takes.
The recommended timeout policy is: use a shorter timeout for the first request and a longer timeout for retries.
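The policy of "short timeout first, longer timeout on retry" can be written as a small function. The following Python sketch is only an illustration: the function name and the default values (5 seconds, growing by 5 seconds per retry, capped at 20 seconds) are assumptions for this example, not values from the article.

```python
def timeout_for_attempt(attempt: int, first: float = 5.0,
                        step: float = 5.0, ceiling: float = 20.0) -> float:
    """Timeout in seconds for a given attempt; attempt 0 is the first request.

    The first request fails fast, and each retry gets more time, up to a ceiling.
    """
    if attempt < 0:
        raise ValueError("attempt must be >= 0")
    return min(first + step * attempt, ceiling)


if __name__ == "__main__":
    for attempt in range(5):
        print(attempt, timeout_for_attempt(attempt))
```

A short first timeout lets the client move on to a retry path quickly when the network is bad, while the longer retry timeout gives a slow but working path a chance to finish.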
3) Requests that are too dense and too concurrent can reduce the success rate:
For example, even after multiple retries were added to the playback record upload interface, its success rate was still only 98.2%. APM monitoring showed that the failure rate of this interface was only 0.47% in IPv4 environments but as high as 7.07% in IPv6 environments. After the client optimized its request policy and reduced the concurrency density of this interface, the success rate rose to 99.85% in both IPv6 and IPv4 environments.
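One way to reduce the concurrency density of an interface is to cap the number of requests in flight at any moment. The article does not describe how the client actually changed its request policy, so the Python sketch below shows only the general technique, using `asyncio.Semaphore`; `run_limited` and `fake_upload` are hypothetical names, and the limit of 2 is an arbitrary example value.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def run_limited(jobs: Iterable[Callable[[], Awaitable[object]]],
                      max_in_flight: int = 2) -> List[object]:
    """Run every job, but never more than max_in_flight at the same time.

    Results are returned in the same order as the jobs were given.
    """
    gate = asyncio.Semaphore(max_in_flight)

    async def guarded(job: Callable[[], Awaitable[object]]) -> object:
        async with gate:  # waits here while max_in_flight jobs are already running
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))


async def fake_upload() -> str:
    """Stand-in for one upload request (hypothetical)."""
    await asyncio.sleep(0.01)
    return "ok"


if __name__ == "__main__":
    print(asyncio.run(run_limited([fake_upload] * 6, max_in_flight=2)))
```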
4) The smaller the interface payload, the higher the request success rate:
Both HTTP/2 and HTTP/1.1 run over TCP connections. The smaller the payload, the fewer TCP packets have to be transmitted, and the lower the probability that the transmission fails.
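The link between payload size and failure probability can be made concrete with a rough model. The Python sketch below assumes that each TCP segment is lost independently with the same probability and that a segment carries up to 1460 bytes (a typical maximum segment size on Ethernet over IPv4); both are simplifying assumptions, and the loss rate used is illustrative, not measured data. A lost segment is normally retransmitted rather than fatal, but every loss adds delay and makes a timeout more likely.

```python
import math


def packets_needed(payload_bytes: int, mss: int = 1460) -> int:
    """Number of TCP segments needed to carry the payload (at least one)."""
    return max(1, math.ceil(payload_bytes / mss))


def p_any_loss(payload_bytes: int, loss_rate: float, mss: int = 1460) -> float:
    """Probability that at least one segment is lost on its first transmission."""
    return 1.0 - (1.0 - loss_rate) ** packets_needed(payload_bytes, mss)


if __name__ == "__main__":
    for size in (2_000, 20_000, 200_000):
        print(size, packets_needed(size), round(p_any_loss(size, 0.01), 4))
```

Under this model the chance of hitting at least one loss grows quickly with the number of segments, which is why trimming response fields or compressing the body helps on weak networks.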
Optimization measures to improve robustness and guard against failures
After the various optimizations that raise the network success rate, we also take the following measures to prevent sudden drops in success rate caused by online faults and to make network requests more robust.
1) Robustness of the super pipeline itself:
In one case, the error rate of interfaces with super pipeline retry was only 3.95%, while the error rate of interfaces without it was as high as 28.96%. At the time of this case the remote disaster recovery IPs were not yet in use; with disaster recovery IPs added on top, the error rate curve can be almost flattened.
2) HTTP retry and original URL retry prefer IPv4 in v4/v6 dual-stack environments:
Because IPv6 is still being built out, some interfaces have a lower success rate over IPv6 than over IPv4, so retries can be directed to IPv4.
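Preferring IPv4 on retry can be sketched as a reordering of the candidate addresses for a host. The article does not say how the client selects the address family on iOS, so the Python function below shows only the idea; `order_for_retry` is a hypothetical name and the addresses come from the documentation-reserved ranges.

```python
import ipaddress
from typing import List


def order_for_retry(addresses: List[str], prefer_ipv4: bool = True) -> List[str]:
    """Put IPv4 addresses first for a retry, keeping the order within each family."""
    if not prefer_ipv4:
        return list(addresses)
    v4 = [a for a in addresses if ipaddress.ip_address(a).version == 4]
    v6 = [a for a in addresses if ipaddress.ip_address(a).version == 6]
    return v4 + v6


if __name__ == "__main__":
    resolved = ["2001:db8::1", "192.0.2.1", "2001:db8::2", "198.51.100.7"]
    print(order_for_retry(resolved))
```

Keeping the IPv6 addresses at the end, rather than dropping them, still leaves a fallback on networks where only IPv6 works.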
3) TLS 1.3 saves 1 RTT:
TLS 1.3 reduces the SSL handshake from 2 RTTs to 1 RTT, which lowers the probability that the handshake fails. NSURLSession supports TLS 1.3 starting with iOS 12.2. Only the server needs to be upgraded to support TLS 1.3; no change is needed on the client.
4) Composite IP connection racing:
TCP connection speed is measured for the candidate IPs in order to eliminate bad IPs and choose the best one, which improves the probability that a request succeeds.
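Measuring TCP connection speed to pick the best IP can be sketched as follows. This Python example is a minimal illustration under stated assumptions, not the article's implementation: the function names, port 443, and the 2-second probe timeout are choices made for this sketch. Each candidate is probed in parallel, IPs that fail to connect are dropped, and the fastest remaining IP wins.

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Optional


def tcp_connect_time(ip: str, port: int = 443, timeout: float = 2.0) -> Optional[float]:
    """Seconds taken to open a TCP connection to ip, or None if it fails."""
    start = time.monotonic()
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:  # refused, unreachable, or timed out
        return None


def pick_best_ip(ips: Iterable[str],
                 probe: Callable[[str], Optional[float]] = tcp_connect_time
                 ) -> Optional[str]:
    """Probe all IPs in parallel, drop the ones that failed, return the fastest."""
    candidates = list(ips)
    with ThreadPoolExecutor(max_workers=max(1, len(candidates))) as pool:
        timings = list(pool.map(probe, candidates))
    reachable = [(t, ip) for t, ip in zip(timings, candidates) if t is not None]
    return min(reachable)[1] if reachable else None


if __name__ == "__main__":
    # A fake probe keeps the demo offline; real use would rely on tcp_connect_time.
    fake_timings = {"192.0.2.1": 0.050, "192.0.2.2": None, "192.0.2.3": 0.012}
    print(pick_best_ip(fake_timings, probe=fake_timings.get))
```

Passing the probe in as a parameter makes the selection logic testable without touching the network, and lets the measurement be swapped for one that also accounts for TLS handshake time.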