Another problem right after going live... I'm numb
2022-07-07 22:53:00 【Yes' level training strategy】
Hello everyone, I'm yes.
Here I am again with another production troubleshooting story!
Here's what happened: today a colleague reported a problem to me.

Our application needs to synchronize order information from a third party. If a user has not opened the order page for a while, then the next time they open it we automatically pull their orders from the third party. This keeps the order information up to date and prevents users from acting on expired orders.
Recently, this colleague noticed that every click on the order list triggered a full pull. That is obviously unreasonable and very expensive for the back-end tasks.
At first I thought it had nothing to do with me; maybe the front-end code had a bug (ha ha, that's what I thought last time too).
So I told my front-end colleague. After investigating, he told me with certainty that the code was fine: the pull is triggered only when a user who has not synchronized orders for more than an hour opens the order page again.
He sounded so sure that I believed him. No way around it, I had to look into it myself.
And this time I really did find the problem. Tracing it back to the source, it turned out to be caused by an issue I had run into before. One link really does lead to the next!
Start troubleshooting
I logged in with a test account first and could not reproduce what my colleague described, that every click on the order list triggers a full order pull.
Great, first round lost.
Then I talked to him again and found out it only happened in isolated cases. So I tracked down the individual users who were affected.
Simulating one of them, I found that the full order pull task actually failed with an error: the accessToken had expired.
Our authorization with the third party goes through OAuth2.
In other words, the token the third party granted us had expired, so our call to the order pull interface failed, and therefore the task failed.
So next I suspected the token code. We have a scheduled task that looks at each token's expiration time and uses the refreshToken ahead of time to obtain a fresh token.
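To make the idea of refreshing ahead of expiry concrete, here is a minimal sketch of how such a task might pick the tokens to refresh. The `Authorization` fields, the `due_for_refresh` function and the 10-minute lead time are all assumptions for illustration; the article does not show the real task.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical lead time: refresh anything that expires within 10 minutes.
REFRESH_LEAD = timedelta(minutes=10)


@dataclass
class Authorization:
    # Hypothetical shape of one user's stored authorization.
    user_id: str
    access_token: str
    refresh_token: str
    expires_at: datetime


def due_for_refresh(auths, now, lead=REFRESH_LEAD):
    """Return the authorizations whose access token expires within `lead` of `now`."""
    return [a for a in auths if a.expires_at - now <= lead]
```

A scheduled job would call `due_for_refresh` periodically and exchange each returned refreshToken for a new token, so that in normal operation an access token never actually expires.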
So in theory a token-expired error should be impossible. My guess was that something was wrong with the token refresh task: the token expired, so the order pull task failed. The front end does not record the time of a failed task, so on the next visit to the order page it sees that no sync has happened for more than an hour and triggers a full pull again.
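The chain of cause and effect can be sketched roughly as follows. This is only an illustration under assumed names (`needs_full_pull`, `open_order_page`, `TokenExpiredError`, a `state` dict holding the last sync time); the one-hour threshold is the only detail taken from the article.

```python
from datetime import datetime, timedelta

SYNC_INTERVAL = timedelta(hours=1)  # "more than an hour" in the article


class TokenExpiredError(Exception):
    """Hypothetical error for an expired accessToken."""


def needs_full_pull(last_synced_at, now):
    """A full pull is due if we never synced, or the last sync is over an hour old."""
    return last_synced_at is None or now - last_synced_at > SYNC_INTERVAL


def open_order_page(state, now, pull_orders):
    """Return True if opening the order page triggered a full pull."""
    if not needs_full_pull(state.get("last_synced_at"), now):
        return False
    try:
        pull_orders()
    except TokenExpiredError:
        # The failed pull is not recorded, so the timestamp stays stale
        # and the next visit triggers another full pull.
        return True
    state["last_synced_at"] = now
    return True
```

With a pull that always fails, every call to `open_order_page` returns True, which matches what the affected users saw: a full pull on every click.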
At this point I wanted to find the colleague responsible for the token refresh task. After searching around, I found out that I wrote it…

I checked, and the scheduled refresh task was indeed running. So the token refresh request itself had to be the problem. I looked at the logs, and sure enough there was an error!

I had seen this error before.
It is the error the third party returns when we call its token refresh interface. I had no clue what it meant: code is missing? What code?

As you can see from the code above, the token refresh interface only needs these two parameters; there is nothing else to it.
Also, when I first saw this error, I took the refreshToken and made a test call myself. There was no error at all, and an accessToken came back successfully.
And after observing for several days, I found that the refresh succeeded for some users and failed for others.
Because the token refresh interface is so simple, the error is returned by the other side, and the message seemed to have nothing to do with me, I took it for granted that their interface was at fault. How could there be any room for a mistake on my side? (Remember this sentence.)
So back then I said I could not fix this problem and passed the blame straight to the third party (they had had problems many times before). Who knew it would come back to me now.
No way around it. Now that the problem had shown up again, all I could do was take this user's refreshToken and try again locally.
As it happened, I used to look up the refreshToken in the database, but this time I fetched it with the company's internal tool. And that is when I spotted the key clue!

You can see the refreshToken is empty??? I immediately logged in to the database and checked, and the data was there!!

Numb. I was numb again. So what was going on??
I immediately checked the code of the token refresh task and confirmed that my SQL does select the refreshToken. Since the database has a value, I could "conclude" that the refreshToken must not be empty when my refresh task runs!
And then, all of a sudden, I noticed that this lookup goes through a cache!

In a flash of insight, I checked the cache right away and found that the refreshToken in the cache was empty. I wondered which bastard had deleted the refreshToken from the cache.

I dismissed that idea immediately. We could not possibly have such a requirement or implementation…
Out of ideas, I went to look at the code the company's internal tool calls to get the token, and found that it calls an RPC interface. I don't have the code for that service, so I asked a veteran colleague. He vaguely remembered something and said:

Great, caught red-handed. I went straight to the colleague who made the change to confront him. Who knew he would reply with only three words:

I fired straight back with:
And with that, the case was solved…
My colleague's reasoning was this: he thought his use of the token never needs the refreshToken. So, following the rule of selecting only what you need, he chose not to fetch the refreshToken, and as a result the value he wrote to the cache contains no refreshToken.
The authorization service was written first. At that time this colleague's service A had not yet been split out, so reading and writing tokens was implemented by the authorization service operating on the database directly. That is why I was so sure my code gets the refreshToken from the database, and never imagined the refreshToken could be empty.
The problem is that the two services share one cache key. Service A, to save space, does not include the refreshToken when it puts the user authorization information into the cache. So when the authorization service looks up the user authorization information, it hits the cache and takes the value straight from there. The cached value has no refreshToken, so when we call the third party's token refresh interface, the refreshToken we pass is empty!
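The shared-key problem is easy to reproduce in miniature. The sketch below uses plain dicts for the cache and the database; the key format and the function names (`service_a_get_auth`, `auth_service_get_auth`) are made up for illustration and are not the real services' code.

```python
def cache_key(user_id):
    # Hypothetical key format; both services use the same one.
    return f"user_auth:{user_id}"


def service_a_get_auth(user_id, cache, database):
    """Service A: on a cache miss, caches only the fields it needs (no refresh_token)."""
    key = cache_key(user_id)
    if key not in cache:
        row = database[user_id]
        cache[key] = {"access_token": row["access_token"]}
    return cache[key]


def auth_service_get_auth(user_id, cache, database):
    """Authorization service: on a cache miss, caches the full row."""
    key = cache_key(user_id)
    if key not in cache:
        cache[key] = dict(database[user_id])
    return cache[key]


def refresh_token_for(user_id, cache, database):
    """What the refresh task ends up sending as the refreshToken."""
    return auth_service_get_auth(user_id, cache, database).get("refresh_token")
```

Whichever service populates the key first decides what the other one sees. If service A gets there first, `refresh_token_for` returns `None` even though the database row is complete, which would also explain why only some users were affected.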
So the third party returned an error :

At last I understood what the missing code meant… I have to say, wouldn't an error message saying the refreshToken parameter is empty have been so much nicer? Just throwing "code" at me, how am I supposed to know which code!
As for users whose authorization was put into the cache by the authorization service before service A got to it, their refresh works fine, because the authorization service does put the refreshToken into the cache.
Okay, investigation complete. The final fix is to have service A put the refreshToken into the cache as well.
Finally
As you can see, this investigation did not involve any advanced technology. It was a mistake caused by several parties interacting and by insufficient consideration. In fact, most errors in production come down to details, such as an incorrect parameter configuration or one extra condition check.
Let's summarize this experience :
- Consider the correctness of the cache when reading data. Don't reason from the database alone; don't forget the cache.
- Converge operations within services. That is, keep service boundaries clear and independent, and try not to implement another service's functionality internally. That way a requirement change does not need edits in several places, nothing gets missed, and the problem above would not happen. Unified constraints are the most comfortable.
- Error messages should be clear. If the error above had said that the refreshToken parameter was empty instead of that code was missing, I might have finished the investigation the first time I saw it, rather than waiting until now. (Trust also matters: after many failures you gradually stop trusting the other side's service.)
- Global awareness is key. Even if you are responsible for only one service, take every opportunity to learn about other people's services, especially your own upstream and downstream. When something goes wrong, you can scan the whole picture in your head and quickly locate where the problem might be. That is the difference between an expert and an ordinary engineer (what you can't handle, I finish in two minutes).
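The first and third points can be combined into one defensive habit on the caller's side. This is only a sketch, not the fix that was actually shipped (that was service A caching the refreshToken too): treat a cached entry without a refreshToken as a miss, and fail locally with a clear message before calling the third party. All names here are hypothetical, and the request fields follow the standard OAuth2 refresh grant, since the article does not show the third party's actual parameters.

```python
class MissingRefreshTokenError(ValueError):
    """Raised locally, before any request is sent to the third party."""


def load_auth(user_id, cache, database):
    """Read through the cache, but treat an entry without refresh_token as a miss."""
    key = f"user_auth:{user_id}"  # hypothetical key format
    entry = cache.get(key)
    if not entry or not entry.get("refresh_token"):
        entry = dict(database[user_id])
        cache[key] = entry
    return entry


def build_refresh_request(user_id, cache, database):
    """Build the refresh request body, failing with a clear message if the token is empty."""
    token = load_auth(user_id, cache, database).get("refresh_token")
    if not token:
        raise MissingRefreshTokenError(f"refresh_token is empty for user {user_id}")
    return {"grant_type": "refresh_token", "refresh_token": token}
```

With a check like this, an incomplete cache entry is repaired from the database, and a genuinely missing refreshToken shows up in our own logs by name rather than as a puzzling error from the other side.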
That's about it. If you need it, feel free to use this experience in an interview. Ha ha, no need to thank me!
I'm yes, from a little bit to a billion. See you at the next production investigation!