Another problem has just shown up in production...
2022-07-07 22:53:00 【Yes' level training strategy】
Hello everyone, I'm yes.
I'm back with another production troubleshooting story!
Here's what happened: today a colleague reported a problem to me.
Our application needs to synchronize order information from a third party. If a user has not opened the order page for a while, re-entering it automatically pulls their orders from the third party, so the order information is updated in time and users do not act on expired orders.
Recently, this colleague noticed that every click on the order list triggered a full pull. That is obviously unreasonable and wastes a lot of back-end resources.
At first I thought it had nothing to do with me and that the front-end code probably had a bug (ha ha, I thought the same thing last time).
So I asked a front-end colleague. After checking, he told me with certainty that the code was fine: a pull is triggered only when a user whose orders have not been synchronized for more than an hour enters the order page again.
He sounded so sure that I believed him. No way around it, I had to dig into it myself.
And this time I really did find the problem. Tracing it back to the source, it turned out to be caused by an issue I had run into before. One link leads to the next!
Start troubleshooting
I logged in with a test account first and found that I could not reproduce what my colleague described, where every click on the order list triggers a full order pull.
Great, I lost the first round.
Then I talked to him and found out that it only affected some users. So I tracked down the individual users who had this problem.
When I simulated it, the full order pull task actually failed with an error saying the accessToken had expired.
Our authorization with the third party uses OAuth2.
In other words, the token the third party had granted us had expired, so our order pull call failed, and so did the task.
That made me suspect the token code, because we have a scheduled task that uses the refreshToken to obtain a new token ahead of the token's expiration time.
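To make the idea of refreshing ahead of expiry concrete, here is a minimal sketch of how such a scheduled task might pick the tokens that are due. It is not the actual task from this story: the `Authorization` record, the `REFRESH_MARGIN` value, and the function names are all assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical safety margin: refresh this long before the token expires.
REFRESH_MARGIN = timedelta(minutes=10)


@dataclass
class Authorization:
    """Hypothetical record of one user's third-party authorization."""
    user_id: str
    access_token: str
    refresh_token: str
    expires_at: datetime


def needs_refresh(auth: Authorization, now: datetime) -> bool:
    """True once the token is inside the margin before expiry, or already expired."""
    return now >= auth.expires_at - REFRESH_MARGIN


def select_due_for_refresh(auths, now):
    """Return the authorizations the scheduled task should refresh on this run."""
    return [auth for auth in auths if needs_refresh(auth, now)]
```

With a margin like this, a token should be replaced before any order pull ever sees it expire, which is why an expired-token error pointed at the refresh task.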
So in theory a token-expired error should never happen. My guess was that something was wrong with the token refresh task, which let the token expire and made the order pull task fail. The front end does not record a sync time for a failed task, so the next time the user enters the order page it sees that nothing has been synchronized for more than an hour and triggers a full pull again.
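The effect of a failing pull on the one-hour rule can be sketched as follows. This is only an illustration of the reasoning, not the real front-end code: `OrderPage`, `should_pull`, and the use of `RuntimeError` for a failed pull are assumed names.

```python
from datetime import datetime, timedelta
from typing import Callable, Optional

SYNC_INTERVAL = timedelta(hours=1)  # the "more than an hour" rule from the article


def should_pull(last_synced_at: Optional[datetime], now: datetime) -> bool:
    """Pull if there is no recorded sync, or the last one is more than an hour old."""
    return last_synced_at is None or now - last_synced_at > SYNC_INTERVAL


class OrderPage:
    """Hypothetical model of the order page's sync bookkeeping."""

    def __init__(self, pull_orders: Callable[[], None]):
        self.pull_orders = pull_orders
        self.last_synced_at: Optional[datetime] = None
        self.pull_count = 0

    def open(self, now: datetime) -> None:
        if not should_pull(self.last_synced_at, now):
            return
        self.pull_count += 1
        try:
            self.pull_orders()
        except RuntimeError:
            # A failed pull records no sync time, so the next visit pulls again.
            return
        self.last_synced_at = now
```

For a user whose pull always fails, `last_synced_at` stays empty, so every visit to the order page counts as overdue. That matches what my colleague saw: a full pull on every click, but only for some users.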
At that point I went looking for the colleague responsible for the token refresh task. After some searching, I found out that I had written it…
I checked that the scheduled refresh task really was running, so the problem had to be in the refresh request itself. I checked the log, and there was an error!
I had seen this error before.
It comes from calling the third party's token refresh interface, and the error is returned by the third party. It gave me no clue at all: a missing code? What code?
As you can see from the code above, the token refresh interface only needs these two parameters, and nothing else is involved.
Also, when I first saw this error, I took a valid refreshToken and made a test call. It did not fail at all and returned an accessToken successfully.
After observing for several days, I found that the refresh succeeded for some users and failed for others.
Because the refresh interface is so simple, the error came from the other side, and the message seemed to have nothing to do with me, I took it for granted that the other party's interface was at fault. How could there be any room for a mistake on my side? (Remember this sentence.)
So back then I said I could not handle this problem and passed the blame straight to the third party (they had had problems many times before). Who knew it would come back now.
No way around it: the problem had happened again, so this time I could only take this user's refreshToken and try it again locally.
As it happened, I used to look up the refreshToken directly in the database, but this time I fetched it with the company's internal tool, and that is when I found the key clue!
The refreshToken it showed was empty??? I immediately logged in to the database to check, and the data was there!!
I was numb, numb again. So what had happened??
I immediately checked the code of the token task and confirmed that my SQL does fetch the refreshToken. Since the database has a value, I could "conclude" that the refreshToken must not be empty when my refresh task runs!
And then I suddenly noticed that this lookup goes through a cache!
In a flash of insight, I checked the cache right away and found that the refreshToken in the cache was empty. I wondered which bastard had deleted the refreshToken from the cache.
I dismissed that idea immediately: we should not have any such requirement or implementation…
Out of ideas, I went to look at the code the company's internal tool calls to get the token. It turned out to call an RPC interface. I did not have the code for that service, so I asked a veteran colleague. He vaguely remembered something and said:
Great, I had my lead. I went straight to the colleague who made the change, and who would have thought, the reply was only three words:
I answered right back with:
With that, the case was solved…
The colleague's reasoning was this: he believed that getting the token does not require the refreshToken, so, following the rule of selecting only what you need, he chose not to fetch the refreshToken. As a result, the value his service put into the cache did not include the refreshToken.
The authorization service was written first, before this colleague's service A had been split out, so the authorization service read and wrote tokens by operating on the database itself. That is why I was so sure my code fetched the refreshToken from the database and never imagined it could be empty.
The problem is that the two services share one cache key. To save space, service A does not put the refreshToken into the cached user authorization information. When the authorization service then looks up the user authorization information, it hits the cache and takes the value straight from it, and the cached value has no refreshToken. So when it calls the third party's token refresh interface, the refreshToken it passes is empty!
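The shared-key conflict can be reproduced with a few lines. This is a simplified sketch using plain dictionaries in place of the real database and cache; the key format, field names, and function names are assumptions, not the actual services' code.

```python
CACHE_KEY = "user_auth:{user_id}"  # hypothetical key format shared by both services

# Stand-ins for the real database table and cache.
database = {"u1": {"access_token": "at-1", "refresh_token": "rt-1"}}
cache = {}


def service_a_get_auth(user_id):
    """Service A: caches only the fields it needs, so refresh_token is left out."""
    key = CACHE_KEY.format(user_id=user_id)
    if key not in cache:
        row = database[user_id]
        cache[key] = {"access_token": row["access_token"]}
    return cache[key]


def auth_service_get_auth(user_id):
    """Authorization service: caches the full row, but trusts any existing entry."""
    key = CACHE_KEY.format(user_id=user_id)
    if key not in cache:
        cache[key] = dict(database[user_id])
    return cache[key]


def refresh_token_for(user_id):
    """What the refresh task would send to the third party for this user."""
    return auth_service_get_auth(user_id).get("refresh_token")
```

If service A fills the cache first, `refresh_token_for("u1")` returns `None` even though the database row has a value. If the authorization service fills it first, both services work, which explains why only some users failed.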
So the third party returned an error:
Only then did I understand what "missing code" meant… Wouldn't it have been nicer for the error message to say that the refreshToken parameter is empty? It just says code, and I had no idea which code it meant!
As for users whose authorization information was put into the cache by the authorization service before service A got to it, their refresh works normally, because the authorization service does put the refreshToken into the cache.
Okay, the investigation is complete. The final fix is for service A to put the refreshToken into the cache as well.
Finally
As you can see, this investigation did not involve any advanced technology. It was a mistake caused by several parties interacting and by not thinking things through. In fact, most errors in production come down to details, such as a wrong parameter configuration or one extra condition check.
Let's summarize this experience:
- Consider the correctness of the cache when reading data. You can't look only at the database; don't forget the cache.
- Converge service operations. That is, keep service boundaries clear and independent, and try not to implement another service's functions internally. This avoids having to change several places, and missing one, when requirements change, and the problem above would not have happened. One unified set of constraints is the most comfortable.
- Make error messages clear. If the error above had said that the refreshToken parameter was empty rather than that code was missing, I might have finished the investigation the first time I saw it instead of waiting until now. (Trust matters too: after many errors, you gradually stop trusting the other party's service.)
- A global view is key. Even if you are responsible for only one service, take the chance to learn about other people's services, especially your own upstream and downstream. When something goes wrong, you can scan the whole picture in your head and quickly locate where the problem might be. This is the difference between an expert and an ordinary person (what you can't handle, I finish in two minutes).
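The point about clear error messages also applies on our own side. As a sketch, the refresh task could check the value before it ever calls the third party. The names here are hypothetical, and the request fields follow the standard OAuth 2.0 refresh parameters, which may not match this third party's actual interface.

```python
class MissingRefreshTokenError(ValueError):
    """Raised before calling the third party when no refreshToken is available."""


def build_refresh_request(user_id, auth):
    """Build the refresh request body, failing fast with a message that names the field.

    `auth` is the authorization record as read by the caller (for example from a cache).
    """
    refresh_token = (auth or {}).get("refresh_token")
    if not refresh_token:
        raise MissingRefreshTokenError(
            f"refreshToken is empty for user {user_id}; "
            "check the cached authorization record"
        )
    # Standard OAuth 2.0 field names; the real interface's parameters are an assumption.
    return {"grant_type": "refresh_token", "refresh_token": refresh_token}
```

A check like this would have put "refreshToken is empty" in our own log on the first failure, instead of leaving me to decode the third party's message.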
That's about it. If you need to, you can use this experience in an interview. Ha ha, no need to thank me!
I'm yes, from a little bit to a billion. See you at the next production troubleshooting story!