Another problem right after going live...
2022-07-07 22:53:00 【Yes' level training strategy】
Hello everyone, I'm yes.
Here I am again with another production troubleshooting story!
Here's what happened: today a colleague reported a problem to me.
Our application needs to sync order information from a third party. If a user has not opened the order page for a while, then the next time they open it we automatically pull their orders from the third party. This keeps the order information up to date and stops users from acting on expired orders.
Recently, this colleague noticed that every click on the order list triggered a full pull. That is clearly wrong, and it wastes a lot of resources on the back-end tasks.
At first I thought it had nothing to do with me; it was probably a bug in the front-end code (ha ha, I thought the same last time).
So I told my front-end colleague. After checking, he told me firmly that the code was fine: the pull is triggered only when a user who has not synced orders for more than an hour opens the order page again.
He sounded so sure that I believed him. No way around it, I had to dig into it myself.
And dig I did: I really found the problem, and tracing it back to the source, it was caused by a problem I had run into before. One link hooked into the next!
Start troubleshooting
I logged in with the test account first, and could not reproduce what my colleague described: clicking the order list did not trigger a full pull every time.
Great, off to a bad start.
I talked to him again and found out it was an isolated case. So we tracked down the individual users who were affected.
Simulating one of them, I saw that the full order pull task was in fact failing, and the error was that the accessToken had expired.
Our authorization with the third party goes through OAuth2.
In other words, the token the third party had issued to us had expired, which made our order pull call fail, so the task failed.
So now I suspected the token code. We have a job that looks at each token's expiration time and uses the refreshToken ahead of time to exchange it for a fresh token.
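To make the idea of refreshing ahead of expiry concrete, here is a minimal sketch. It is not the actual job from this story: the `Authorization` fields, the function names, and the ten-minute lead time are all assumptions for illustration.

```python
from dataclasses import dataclass

# Assumed lead time; the real value used by the job is not given in this post.
REFRESH_LEAD_SECONDS = 10 * 60


@dataclass
class Authorization:
    """Hypothetical shape of one user's stored authorization."""
    user_id: str
    access_token: str
    refresh_token: str
    expires_at: float  # Unix timestamp at which access_token expires


def needs_refresh(auth: Authorization, now: float) -> bool:
    """True once the token is inside the lead window before expiry, or already expired."""
    return now >= auth.expires_at - REFRESH_LEAD_SECONDS


def due_for_refresh(auths: list[Authorization], now: float) -> list[str]:
    """User ids that the scheduled job should refresh on this run."""
    return [a.user_id for a in auths if needs_refresh(a, now)]
```

As long as the job runs more often than the lead time and every refresh succeeds, no accessToken should ever be used after it expires.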
So in theory a token-expired error should be impossible. My guess was that the token refresh job had a problem, which let the token expire and made the order pull task fail. The front end does not record a sync time for a failed task, so the next time the user opens the order page it sees that there has been no sync for more than an hour and triggers a full pull again.
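The loop described here can be sketched roughly as follows. This is only an illustration under assumptions: the names `should_full_pull` and `enter_order_page` and the `state` dictionary are made up, and where the last sync time is really stored is not specified in this post.

```python
SYNC_INTERVAL_SECONDS = 60 * 60  # "more than an hour", per the front-end colleague


def should_full_pull(last_sync_at, now):
    """last_sync_at is None if the user has never synced successfully."""
    return last_sync_at is None or now - last_sync_at > SYNC_INTERVAL_SECONDS


def enter_order_page(state, now, pull_orders):
    """Return True if this visit triggered a full pull."""
    if not should_full_pull(state.get("last_sync_at"), now):
        return False
    try:
        pull_orders()
    except Exception:
        # A failed pull leaves last_sync_at untouched,
        # so the very next visit triggers another full pull.
        return True
    state["last_sync_at"] = now
    return True
```

With a pull that always fails (for example because the accessToken has expired), every visit returns True, which matches what my colleague saw for the affected users.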
At this point I went looking for the colleague responsible for the token refresh job. After some searching, I found out that I wrote it…
I checked that the scheduled refresh job really was running, so it had to be the refresh request itself. I looked at the logs, and sure enough, there was an error!
I had seen this error before.
It comes from calling the third party's refresh token interface, and the error is returned by the third party. It gave me no clue at all: "missing code"? What code?
The refresh code only passes two parameters to the interface; there are no other operations.
Also, when I first saw this error, I took a refreshToken and made a test call myself. It worked without any error and returned an accessToken.
And after watching for several days, I found that some users could refresh successfully and some could not.
The refresh token interface is that simple, the error was returned by the other side, and the message seemed to have nothing to do with me. So I took it for granted that the other side's interface was at fault. How could there be any room for a mistake on my side? (Remember this sentence.)
So back then I said I couldn't fix this problem and passed the blame straight to the third party (they had had problems many times before). Who knew it would come back to me now.
No way around it: the problem was back, so all I could do was take this user's refreshToken and try again locally.
As luck would have it, I used to look up the refreshToken in the database, but this time I fetched it with one of the company's internal tools, and that's when I spotted the key clue!
The refreshToken was empty??? I immediately checked the database, and the data was there!!
I was numb, numb all over again. So what was going on??
I immediately reviewed the code of the token job and confirmed that my SQL does select the refreshToken. Since the database has a value, I could "conclude" that the refreshToken must not be empty when my refresh job runs!
Then all of a sudden, I noticed that this read goes through a cache!
In a flash of insight I checked the cache right away and found that the refreshToken in the cache was empty. I wondered which bastard had deleted the refreshToken from the cache.
I dismissed that idea immediately: we should never have had such a requirement or implementation…
Out of ideas, I went to look at the code the internal tool calls to get the token, and found that it calls an RPC interface. I don't have the code for that service, so I asked a veteran colleague. He vaguely remembered something and said:
Great, I had my suspect. I went straight to the colleague who made the change to confirm, and the reply was only three words:
I answered with just one:
So the case was solved…
The colleague's reasoning went like this: fetching the token does not need the refreshToken, so, following the rule of selecting only the columns you need, he chose not to select refreshToken. As a result, the value written to the cache did not contain the refreshToken.
The authorization service was written at the very beginning, before this colleague's service (call it service A) had been split out. Back then, reading and writing tokens was done by the authorization service operating on the database directly. That is why I was so sure my code gets the refreshToken from the database, and never imagined it could be empty.
The problem is that the two services share one cache key. To save space, service A does not put the refreshToken into the cached user authorization info. When the authorization service then reads the user authorization info, it hits the cache and takes the value straight from it, and the cached value has no refreshToken. So when it calls the third party's refresh token interface, the refreshToken it passes is empty!
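The interaction between the two services can be reproduced in a few lines. This is a toy model, not the real code: the in-memory `DB` and `CACHE` dictionaries, the key format, and the function names are all hypothetical.

```python
# Hypothetical stand-ins for the database table and the shared cache.
DB = {"user-1": {"accessToken": "at-1", "refreshToken": "rt-1"}}
CACHE = {}


def cache_key(user_id):
    # Hypothetical key format; the point is that both services use the same one.
    return f"auth:{user_id}"


def service_a_get_auth(user_id):
    """Service A: selects only what it needs, so its cache entry has no refreshToken."""
    key = cache_key(user_id)
    if key not in CACHE:
        CACHE[key] = {"accessToken": DB[user_id]["accessToken"]}
    return CACHE[key]


def auth_service_get_auth(user_id):
    """Authorization service: caches the full row, but trusts whatever is already cached."""
    key = cache_key(user_id)
    if key not in CACHE:
        CACHE[key] = dict(DB[user_id])
    return CACHE[key]


def refresh_token_for(user_id):
    """The value the refresh job would send to the third party."""
    return auth_service_get_auth(user_id).get("refreshToken")
```

Whichever service populates the cache first decides what the other one sees: if service A gets there first, `refresh_token_for` returns nothing even though the database row is complete.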
So the third party returned an error:
Only now did I understand what "missing code" meant… Wouldn't it have been nicer for the error message to say that the refreshToken parameter is empty? Just telling me about a code, how was I supposed to know which code!
This also explains why some users were fine: for users whose authorization info had been put into the cache by the authorization service before service A got to it, refreshing works normally, because the authorization service does put the refreshToken in the cache.
Okay, investigation complete. The final fix is for service A to put the refreshToken into the cache as well.
Finally
As you can see, this investigation did not involve any advanced technology. It was several parties interacting, plus mistakes caused by not thinking things through. Most errors in production really do come down to details, such as a wrong parameter setting or one extra condition.
Let's summarize this experience:
- When reading data, consider whether the cache is correct. You can't look only at the database; don't forget the cache.
- Keep each operation in one service. Divide services clearly and keep them independent, and try not to reimplement another service's functions internally. That way a requirement change does not have to be made in several places, with one of them forgotten, and problems like the one above don't happen. One place, one set of rules, is the most comfortable.
- Make error messages clear. If the error above had said that the refreshToken parameter was empty instead of that a code was missing, I might have finished the investigation the first time I saw it, instead of waiting until now. (Trust matters too: after enough mistakes, you gradually stop trusting the other side's service.)
- Seeing the whole picture is key. Even if you are responsible for only one service, take the chance to learn about other people's services, especially your own upstream and downstream. Then when something goes wrong, you can scan the whole system in your head and quickly locate where the problem might be. This is the difference between an expert and everyone else (what you can't handle, I finish in two minutes).
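As one possible way to apply the first and third lessons together, here is a small sketch. It is not what we actually shipped (the real fix was in service A); the function names and the exception class are made up, and `grant_type` / `refresh_token` are the standard OAuth 2.0 refresh parameters, which the third party in this story may or may not name the same way.

```python
class MissingRefreshToken(ValueError):
    """Raised before any third-party call when the refreshToken is empty."""


def load_auth(user_id, cache, load_from_db):
    """Read through the cache, but treat an entry without refreshToken as a miss."""
    entry = cache.get(user_id)
    if entry and entry.get("refreshToken"):
        return entry
    entry = load_from_db(user_id)
    if entry is not None:
        cache[user_id] = entry  # repair the incomplete cache entry
    return entry


def build_refresh_request(user_id, cache, load_from_db):
    """Build the refresh parameters, failing loudly if refreshToken is missing."""
    entry = load_auth(user_id, cache, load_from_db)
    if not entry or not entry.get("refreshToken"):
        raise MissingRefreshToken(f"refreshToken is empty for user {user_id}")
    return {"grant_type": "refresh_token", "refresh_token": entry["refreshToken"]}
```

With a check like this, an empty refreshToken shows up in our own logs under its real name instead of coming back from the third party as a puzzling message about a code.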
That's about it. If you need to, feel free to use this story in an interview, ha ha, no need to be polite with me!
I'm yes, going from a little bit to a billion bits. See you at the next production investigation!