A post-mortem of a hidden production bug
2022-07-27 04:04:00 【Wu_ Candy】
Preface
A project that had been in production for a long time recently had a bug surface without warning. After the scope of impact was assessed, the bug was escalated to an incident, because roughly 10,000 records were affected and the business side felt the impact.
Because no financial loss was involved and the data could be repaired once the bug was fixed, the incident was ultimately rated at a low severity level.
Today I am sharing this hidden production bug; it may also serve as a reference in your own projects.
Requirement background
Topic: B&B post-stay follow-up questionnaire
Description:
For customers who stay at a B&B, a follow-up questionnaire about that stay must be sent on the day of check-out or the following day, in order to collect their comments and suggestions on the experience.
Note:
Because of the large data volume, the data is stored and the logic is processed in Hive.
Problem analysis
1. Business logic
**Table t2:** stores the list of customers who checked out (yesterday plus today up to the current time)
**Table t3:** stores the list of customers to whom the follow-up questionnaire has already been sent
The core business logic in which the bug appeared works as follows.
The equi-join condition is:
t2.id = t3.primary_id
To filter out of t2 the rows that already exist in t3, t2 is used as the main table and left-joined to t3.
Rows where t3.primary_id is null after the join are the customers who have not yet been sent a questionnaire.
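The "left join, then keep the rows where the right-hand key is null" pattern can be sketched in plain Python. This is only an illustration of the logic, not the original Hive query: the helper names `left_join` and `not_yet_sent`, the table contents, and the ids are all made up, and both keys are plain integers here so that no type conversion is involved.

```python
# Sketch of a LEFT JOIN followed by an "IS NULL" filter (an anti-join).
# Table contents and ids are hypothetical.

def left_join(left_rows, right_rows, left_key, right_key):
    """Return (left_row, right_row_or_None) pairs, like a SQL LEFT JOIN."""
    index = {}
    for row in right_rows:
        index.setdefault(row[right_key], []).append(row)
    joined = []
    for row in left_rows:
        matches = index.get(row[left_key])
        if matches:
            joined.extend((row, match) for match in matches)
        else:
            joined.append((row, None))  # no match: the right side is NULL
    return joined


def not_yet_sent(t2_rows, t3_rows):
    """Ids in t2 with no matching row in t3 (t3.primary_id IS NULL)."""
    joined = left_join(t2_rows, t3_rows, "id", "primary_id")
    return [left["id"] for left, right in joined if right is None]


t2 = [{"id": 101}, {"id": 102}, {"id": 103}]  # checked-out customers
t3 = [{"primary_id": 102}]                    # questionnaire already sent

print(not_yet_sent(t2, t3))  # [101, 103]
```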
2. How the problem arises
t2.id has type string
t3.primary_id has type bigint
When two fields of different types are compared in an equi-join, Hive implicitly converts both sides to a common type while the MapReduce job runs, and for string and bigint that common type is double. A double represents every integer exactly only up to 2^53 (9007199254740992, a 16-digit number), so once the actual value of t3.primary_id runs past 16 digits, precision is lost in the conversion.
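The precision limit can be checked outside Hive. Python's `float` is an IEEE 754 double, the same 64-bit format as Hive's double type, so the short sketch below shows where integers stop surviving the conversion. The helper name `survives_double` is made up for this illustration.

```python
# Python's float is a 64-bit IEEE 754 double, like Hive's DOUBLE.
LIMIT = 2 ** 53  # largest range in which every integer is exactly representable


def survives_double(n: int) -> bool:
    """True if the integer n is unchanged by a round trip through a double."""
    return int(float(n)) == n


print(LIMIT)                             # 9007199254740992
print(survives_double(LIMIT))            # True
print(survives_double(LIMIT + 1))        # False
print(float(LIMIT + 1) == float(LIMIT))  # True: two different ids, one double
```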

For example:
# This is only an illustration and does not reflect the real values
t2.id = '10001100000000008'
t3.primary_id = 10001100000000009
In this example, t3 already contains a record with primary_id = 10001100000000009, meaning that customer has been sent a follow-up questionnaire.
The value 10001100000000009 has 17 digits and cannot be represented exactly as a double. When it is converted for the comparison, it is rounded to the nearest representable value, 10001100000000008.
The rounded value then matches t2.id = '10001100000000008', even though the two ids are different.
Because the join finds a match, t3.primary_id is not null for this t2 row, so the logic concludes that the questionnaire has already been sent and filters the row out.
In reality this customer is not in the sent-questionnaire table at all; two unrelated records were joined only because precision was lost in the conversion.
Result: customers who should have received a follow-up questionnaire did not receive one.
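The false match can be reproduced with a small simulation. This is a sketch that mimics the implicit conversion by calling `float` on both keys; it is not Hive itself, and the function names `to_double` and `not_yet_sent_buggy` and the ids are illustrative assumptions.

```python
def to_double(value):
    """Mimic the implicit conversion: string and bigint keys both become double."""
    return float(value)


def not_yet_sent_buggy(t2_ids, t3_primary_ids):
    """Anti-join on keys compared as doubles, as in the buggy query."""
    sent = {to_double(p) for p in t3_primary_ids}
    return [i for i in t2_ids if to_double(i) not in sent]


t2_ids = ["10001100000000008"]         # string column, questionnaire NOT sent
t3_primary_ids = [10001100000000009]   # bigint column, a different customer

print(int(to_double(t3_primary_ids[0])))         # 10001100000000008
print(not_yet_sent_buggy(t2_ids, t3_primary_ids))  # []: the customer is wrongly dropped
```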
Solution
1. Fixing the bug
The first fix applied when the bug appeared in production was to convert both join fields to string and then perform the equi-join.
The conversion is done with the cast function, for example cast(t3.primary_id as string), which avoids the loss of precision.
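The effect of the fix can be sketched the same way: compare the keys as strings instead of letting them become doubles. This simulation stands in for the cast in the Hive query and assumes that t2.id holds plain decimal digits with no padding or leading zeros; the function name `not_yet_sent_fixed` and the ids are hypothetical.

```python
def not_yet_sent_fixed(t2_ids, t3_primary_ids):
    """Anti-join on keys compared as strings, mirroring a cast to string."""
    sent = {str(p) for p in t3_primary_ids}  # bigint -> string, no rounding
    return [i for i in t2_ids if i not in sent]


t2_ids = ["10001100000000008", "10001100000000009"]
t3_primary_ids = [10001100000000009]  # only ...009 has been sent

# ...008 is kept (not sent yet), ...009 is filtered out (already sent).
print(not_yet_sent_fixed(t2_ids, t3_primary_ids))  # ['10001100000000008']
```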

2. Handling the missed data
Using the corrected join logic, rerun the job over the time range of the affected business data to find the customers who should have received a follow-up questionnaire but did not because of the bug, and send the questionnaire to them.
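One way to picture the backfill is to look for ids that the double comparison treated as "already sent" while the string comparison does not. The sketch below is an assumption about how such a check could look, not the author's actual repair job; `missed_customers` and the sample ids are made up, and it presumes the t3 snapshot covers only the affected time range.

```python
def missed_customers(t2_ids, t3_primary_ids):
    """Ids the buggy join matched by rounding but that were never really sent."""
    sent_as_double = {float(p) for p in t3_primary_ids}
    sent_as_string = {str(p) for p in t3_primary_ids}
    return [
        i for i in t2_ids
        if float(i) in sent_as_double  # the buggy join found a match
        and i not in sent_as_string    # but no record with this exact id exists
    ]


t2_ids = ["10001100000000008", "10001100000000009", "42"]
t3_primary_ids = [10001100000000009]

print(missed_customers(t2_ids, t3_primary_ids))  # ['10001100000000008']
```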
Summary
As the analysis above shows, this production bug is well hidden. If you are not familiar with the implicit conversion rules, it is easy to run into.
I hope this write-up of an implicit-conversion bug offers something to draw on when similar problems come up in future projects.
Beyond fixing the bug after the fact, what can be done in advance in future projects?
- Whenever an equi-join is used, check whether the types of the join fields are consistent;
- When defining tables, consider the range of the largest possible value in light of the specific business scenario;
- If in doubt, the no-thinking-required approach is to convert the join fields to string with the cast function whenever an equi-join is used.
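The first check in the list above could be automated before a query ships. The sketch below assumes the column types are available as simple name-to-type dictionaries (for instance, copied from the table definitions); the function `mismatched_join_keys` and the schema dictionaries are hypothetical and not part of any Hive tooling.

```python
def mismatched_join_keys(left_schema, right_schema, key_pairs):
    """Return the join key pairs whose declared column types differ."""
    problems = []
    for left_col, right_col in key_pairs:
        left_type = left_schema[left_col].lower()
        right_type = right_schema[right_col].lower()
        if left_type != right_type:
            problems.append((left_col, left_type, right_col, right_type))
    return problems


# Hypothetical schemas matching the types described in this post.
t2_schema = {"id": "string"}
t3_schema = {"primary_id": "bigint"}

print(mismatched_join_keys(t2_schema, t3_schema, [("id", "primary_id")]))
# [('id', 'string', 'primary_id', 'bigint')]
```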