A postmortem of a hidden production bug
2022-07-26 11:47:00 【Wu_ Candy】
Preface
A project of ours had been running in production for a long time when a bug suddenly surfaced. After the scope of impact was assessed, the bug was upgraded to an incident, because about 10,000 records were affected and the business side felt the impact.
Because no financial loss was involved and the data could be repaired once the bug was fixed, the incident was ultimately rated low severity.
Today I am sharing this hidden production bug so that you can use it as a reference in your own projects.
Requirement background
Topic: post-stay follow-up questionnaire for B&B guests
Description:
For guests who stayed at a B&B, a follow-up questionnaire about that stay must be sent to the guest on the day of checkout or the following day, in order to collect comments and suggestions about the stay.
Note:
Because the data volume is large, Hive tables are used for both data storage and logic processing.
Problem analysis
1. Business logic
t2 table: stores the list of customers who checked out (yesterday + today up to the current time)
t3 table: stores the list of customers who have already been sent the follow-up questionnaire
The core business logic where the bug appeared is as follows:

The equi-join condition is:
t2.id = t3.primary_id
To exclude from t2 the rows already present in t3, t2 is used as the main table and left joined to t3.
Rows where t3.primary_id is null are then treated as customers who have not yet been sent the questionnaire.
2. How the problem arises
t2.id field type: string
t3.primary_id field type: bigint
When fields of these two different types are compared in an equi-join, the values are implicitly converted to double during the MapReduce job. A double can represent integers exactly only up to 2^53 (9007199254740992), so once the actual value of t3.primary_id has more than 16 digits, precision can be lost.

For example:
# This is just an example; these are not the actual production values
t2.id = 10001100000000008
t3.primary_id = 10001100000000009
In this example, t3 already contains a record with primary_id = 10001100000000009, meaning that customer has been sent a follow-up questionnaire.
The value t3.primary_id = 10001100000000009 has more than 16 digits, so converting it to double rounds it before the equality comparison.
As a result, t3.primary_id is compared as 10001100000000008 (the trailing 9 is lost), which equals the id in t2.
The join therefore concludes that a questionnaire has already been sent for t2.id = 10001100000000008, and the row is filtered out.
In reality, this customer is not in the sent-questionnaire table at all; the two records were matched only because of the precision loss.
Result: customers who should have received a follow-up questionnaire did not receive one.
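The precision loss can be reproduced outside Hive. The following is a minimal Python sketch, not the Hive job itself: Python's float is the same 64-bit double, and the two values are illustrative ones chosen here to collide, not production data.

```python
# Illustrative values only; they are not taken from the real tables.
t2_id = "10001100000000008"        # string column
t3_primary_id = 10001100000000009  # bigint column

# Above 2**53, not every integer has an exact double representation.
largest_safe_integer = 2 ** 53

# Simulate the implicit conversion: both sides become double.
as_double_left = float(t2_id)
as_double_right = float(t3_primary_id)

collide_as_double = as_double_left == as_double_right
collide_as_string = t2_id == str(t3_primary_id)

print(largest_safe_integer)   # 9007199254740992
print(int(as_double_right))   # 10001100000000008
print(collide_as_double)      # True: two different ids look equal
print(collide_as_string)      # False: the ids really are different
```

Both ids are larger than 2**53, which is why the odd value cannot be stored exactly and is rounded to its even neighbor.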
Solution
1. Fixing the bug
The first fix applied in production was to convert both join fields to string before the equi-join.
The cast function is used for the type conversion, which avoids the precision loss.
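The effect of the fix can be sketched in Python by running the same "left join, keep unmatched rows" logic twice, once with keys converted to double and once with keys converted to string. The function name `pending_customers` and the sample ids are hypothetical; in Hive the conversion would be done with cast on the join fields.

```python
def pending_customers(t2_ids, t3_primary_ids, normalize):
    """Mimic t2 LEFT JOIN t3 ... WHERE t3.primary_id IS NULL.

    `normalize` is applied to both join keys before they are compared.
    """
    sent = {normalize(pid) for pid in t3_primary_ids}
    return [cid for cid in t2_ids if normalize(cid) not in sent]


# Illustrative data: neither customer has actually been sent a questionnaire.
t2_ids = ["10001100000000008", "10001100000000020"]  # string ids
t3_primary_ids = [10001100000000009]                 # bigint ids

buggy = pending_customers(t2_ids, t3_primary_ids, float)  # keys compared as double
fixed = pending_customers(t2_ids, t3_primary_ids, str)    # keys compared as string

print(buggy)  # ['10001100000000020']
print(fixed)  # ['10001100000000008', '10001100000000020']
```

With double keys the first customer is wrongly treated as already contacted; with string keys both customers remain in the pending list.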

2. Handling the missed data
Using the corrected join logic, rerun the processing over the time range in which business data was affected, find the customers who should have received a follow-up questionnaire but did not because of the bug, and send them the questionnaire.
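A backfill along these lines might look like the sketch below. Everything in it is assumed for illustration: the record layout, the dates of the affected window, and the helper name `customers_to_backfill`. The real rerun would be a Hive query over the affected partitions.

```python
from datetime import date

# Hypothetical checkout records: (customer id as string, checkout date).
checkouts = [
    ("10001100000000008", date(2022, 7, 20)),
    ("10001100000000020", date(2022, 7, 21)),
    ("10001100000000030", date(2022, 7, 1)),  # outside the affected window
]
# Hypothetical bigint ids of questionnaires that were actually sent.
sent_primary_ids = [10001100000000009, 10001100000000020]


def customers_to_backfill(checkouts, sent_primary_ids, start, end):
    """Return ids checked out within [start, end] that have no sent record.

    Keys are compared as strings, matching the corrected join logic.
    """
    sent = {str(pid) for pid in sent_primary_ids}
    return [
        cid for cid, day in checkouts
        if start <= day <= end and cid not in sent
    ]


missed = customers_to_backfill(
    checkouts, sent_primary_ids, date(2022, 7, 15), date(2022, 7, 25)
)
print(missed)  # ['10001100000000008']
```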
Summary
As the analysis above shows, this production bug was well hidden. If you are not clear on the implicit conversion rules, you are likely to run into it.
I hope this implicit conversion bug can serve as a reference when similar problems come up in your future projects.
Besides remediation after the bug, what can we do in advance in future projects?
Whenever there is an equi-join, check whether the types of the joined fields are consistent. When defining tables, consider the range of the largest possible value given the specific business conditions. Do not overcorrect out of fear of this problem and mechanically cast every equi-join key to string without thinking.
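One way to make the type check routine is a small review-time script that compares the declared types of each join key pair. The sketch below is only an assumption about how such a check could look: the schema dictionaries are written by hand here, and the helper name `join_key_type_mismatches` is hypothetical.

```python
# Hand-written column types for illustration; in practice they would come
# from the table definitions.
t2_schema = {"id": "string"}
t3_schema = {"primary_id": "bigint"}


def join_key_type_mismatches(left_schema, right_schema, key_pairs):
    """Return (left_col, left_type, right_col, right_type) for mismatched keys."""
    mismatches = []
    for left_col, right_col in key_pairs:
        left_type = left_schema[left_col]
        right_type = right_schema[right_col]
        if left_type != right_type:
            mismatches.append((left_col, left_type, right_col, right_type))
    return mismatches


problems = join_key_type_mismatches(t2_schema, t3_schema, [("id", "primary_id")])
print(problems)  # [('id', 'string', 'primary_id', 'bigint')]
```

A mismatch reported here is a prompt to decide deliberately how the keys should be compared, not a signal to cast everything to string automatically.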