What is the most challenging issue in BI development?
2022-06-21 22:38:00 【Bi visualization of Parker data】
When I chat with colleagues who work on business intelligence (BI) projects, I sometimes ask them what they find most challenging about building a BI project. Some say they are most afraid of complicated business logic. Some say it is unclear user requirements. Others say the hardest part is not knowing how to implement a complex business scenario technically. All of these are reasonable answers. In my own experience, though, the most challenging problem in BI project development is data quality.
Why is data quality the biggest challenge
Five or six years ago, while I was still doing BI development, I ran into a project whose business was not complicated: compute the time differences between certain points in a process, work out the time each user spent, and build BI statistical analysis on top of that. In actual development, however, even with a thorough understanding of the business rules, some figures never matched once the finished code ran against the real production environment. I checked the business logic again and again in the development and test environment and found nothing wrong. I was deeply frustrated.

Business intelligence (BI) - Parker data BI visual analysis platform
What was wrong with this BI project? Was ETL losing data during its runs? Was there a bug in my code? Or did I not understand the business well enough? I began to doubt my own ability. But I did not believe that was it: I had never slipped up on a BI project before. After repeated self-examination, I was fairly sure that the problem lay in the production data.
On some BI projects the development/test and production environments are completely isolated, and the data in the development and test environment is limited and incomplete compared with production. So I applied for access to analyze the project's actual production data. As soon as I looked, the problem was obvious: the production data was flawed, and badly so.
How to deal with data quality problems
For what should have been one ordinary piece of data logic in the BI project, repeated combing through the production data turned up 24 kinds of abnormal data. Where did they come from? The business system has a process with nodes such as A, B, C, and D, which should normally be a linear, irreversible sequence of operations.

The business process - Parker data BI visual analysis platform
In practice, some new users behaved differently. Normally, after the N actions under node A are processed, the flow moves to node B, and once node B is processed it moves to node C. In principle, once the flow has reached node C, nobody should be able to go back and operate on node A's business again. Yet the system allowed exactly that. As a result, in some scenarios the sequence of nodes recorded in the back-end database was completely out of order, and the large volume of abnormal operation data created a data quality problem for the BI project.
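A minimal sketch of how such out-of-order records might be spotted, written in Python for illustration. The `(case_id, node, timestamp)` event layout, the `NODE_ORDER` table, and the function name are assumptions; the article does not describe the real schema or the real rules.

```python
from collections import defaultdict

# Hypothetical linear process order; the article only names nodes A, B, C, D.
NODE_ORDER = {"A": 0, "B": 1, "C": 2, "D": 3}


def find_backward_steps(events):
    """Return {case_id: [(furthest_node, node, timestamp), ...]} for every
    event that touches a node earlier than the furthest node already reached.

    `events` is an iterable of (case_id, node, timestamp) tuples (assumed layout).
    """
    by_case = defaultdict(list)
    for case_id, node, ts in events:
        by_case[case_id].append((ts, node))

    anomalies = {}
    for case_id, rows in by_case.items():
        rows.sort()  # order each case by timestamp
        furthest = None
        bad = []
        for ts, node in rows:
            if furthest is not None and NODE_ORDER[node] < NODE_ORDER[furthest]:
                bad.append((furthest, node, ts))  # went back to an earlier node
            else:
                furthest = node  # same node again, or a forward move
        if bad:
            anomalies[case_id] = bad
    return anomalies


events = [
    (1, "A", 10), (1, "B", 20), (1, "C", 30), (1, "D", 40),  # clean case
    (2, "A", 10), (2, "B", 20), (2, "C", 30), (2, "A", 35),  # goes back to A
    (2, "D", 50),
]
print(find_backward_steps(events))
```

Repeated actions inside the same node are not flagged here; only a return to an earlier node counts as abnormal under this assumed rule.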
So in the BI statistical analysis, the time sequences produced by this abnormal operation data must not be counted. The real situation is, of course, more complicated than what I have described. Here is a rough picture: imagine a row of rooms, numbered from left to right, with no end.
Each room holds one piece of data. With every room you move forward, you have to remember what was in each room you have already passed. When you reach the Nth room and see its data, that data may form a valid time sequence with a piece of data from a room you walked through earlier. So you need to remember what each earlier room held, compute the time difference between the two pieces of data, and write it down.

Data visualization - Parker data business intelligence BI Visual analysis platform
As you go further, you find another piece of data that also pairs with something in an earlier room. Now the calculation you completed last time no longer holds and has to be assembled again. The process is very complicated. We sorted out every scenario and found 24 of them. When we took these scenarios to the project's business staff for confirmation, they largely did not recognize them and could not confirm them: the data was too confusing and went beyond their understanding of the business.
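The room-by-room walk can be sketched roughly as below. This is an illustration under one assumed rule, not the project's actual logic, which covered 24 scenarios that the article does not spell out. The assumed rule: when a case returns to an earlier node, everything remembered for later nodes is discarded and recomputed when those nodes are reached again. The function name and the `(timestamp, node)` row layout are hypothetical.

```python
def stage_durations(rows, order=("A", "B", "C", "D")):
    """Compute time differences between consecutive nodes for one case.

    `rows` is a list of (timestamp, node) tuples for a single case.
    Assumed rule: a return to an earlier node invalidates every later node
    that was remembered, so those intervals are reassembled later.
    """
    rank = {node: i for i, node in enumerate(order)}
    entered = {}  # the "rooms" remembered so far: node -> entry timestamp

    for ts, node in sorted(rows):
        # Going back: forget every later room that was already visited.
        for later in [n for n in entered if rank[n] > rank[node]]:
            del entered[later]
        # Keep the first entry time; repeated actions do not reset it.
        entered.setdefault(node, ts)

    path = sorted(entered, key=rank.get)
    return {(a, b): entered[b] - entered[a] for a, b in zip(path, path[1:])}


clean = [(10, "A"), (20, "B"), (30, "C"), (40, "D")]
messy = [(10, "A"), (20, "B"), (30, "C"), (35, "A"),
         (45, "B"), (50, "C"), (60, "D")]
print(stage_durations(clean))
print(stage_durations(messy))
```

In the second case the intervals first computed for B and C are thrown away once the return to A is seen, which is the "reassemble again" step from the analogy. A different business rule (for example, restarting the clock at the return) would change only the two lines inside the loop.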
In the end, by going through the data over and over and identifying each scenario, we got the business rules confirmed. The development for this one task took me two weeks of BI development time. It could not be implemented directly in pure SQL and stored procedures, so I wrote a program that combined ETL and SQL to get it done. I also simulated 100 million rows of data and tested every scenario repeatedly without finding a problem. The project went live many years ago and has had no problems since.
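As a small-scale illustration of testing against simulated data, the sketch below generates synthetic cases, injects a return to an earlier node into a known fraction of them, and checks that a detector flags exactly those cases. The generator, the node names, and the 10% injection rate are all assumptions made for the example; the original test data and program are not shown in the article.

```python
import random

NODES = ("A", "B", "C", "D")  # hypothetical linear process


def make_case(rng, inject_backstep):
    """Build one synthetic case as a time-ordered list of (timestamp, node)."""
    ts, rows = 0, []
    for node in NODES:
        ts += rng.randint(1, 60)  # at least 1 time unit between nodes
        rows.append((ts, node))
    if inject_backstep:
        i = rng.randint(1, len(NODES) - 1)      # a later node
        earlier = NODES[rng.randint(0, i - 1)]  # some node before it
        # Place the return half a unit after the later node was reached.
        rows.append((rows[i][0] + 0.5, earlier))
        rows.sort()
    return rows


def has_backstep(rows):
    """True if any event touches a node earlier than the furthest reached."""
    rank = {node: i for i, node in enumerate(NODES)}
    furthest = -1
    for _, node in rows:
        if rank[node] < furthest:
            return True
        furthest = max(furthest, rank[node])
    return False


rng = random.Random(42)  # fixed seed so the run is repeatable
cases = [make_case(rng, inject_backstep=(k % 10 == 0)) for k in range(1000)]
flagged = sum(has_backstep(case) for case in cases)
print(flagged, "of", len(cases), "cases flagged")
```

Because the number of injected anomalies is known in advance, the count of flagged cases can be compared against it directly; the same idea scales to one generator per known scenario.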
Ideas for handling data quality problems
The root cause was actually a logic flaw in the business system, and one that is easy to fix there: once a user has moved on to a given node, do not allow operations on nodes that have already been completed. It is simply a matter of controlling the process.
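A minimal sketch of what such a source-side control might look like, assuming a linear A, B, C, D flow. The class and method names are hypothetical and are not taken from the business system in question.

```python
class LinearProcess:
    """Hypothetical guard for a linear, irreversible A -> B -> C -> D flow."""

    def __init__(self, order=("A", "B", "C", "D")):
        self._rank = {node: i for i, node in enumerate(order)}
        self._current = None  # furthest node reached so far

    def operate(self, node):
        """Record an operation on `node`, rejecting completed nodes."""
        if node not in self._rank:
            raise KeyError(f"unknown node: {node}")
        if (self._current is not None
                and self._rank[node] < self._rank[self._current]):
            raise ValueError(
                f"node {node} is already completed; "
                f"current node is {self._current}"
            )
        self._current = node  # same node again, or a forward move


process = LinearProcess()
for step in ("A", "A", "B", "C"):
    process.operate(step)

try:
    process.operate("A")  # going back from C to A is rejected
except ValueError as err:
    print("rejected:", err)
```

With a check like this at the point of entry, the out-of-order records never reach the database, so nothing downstream has to compensate for them.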
Users had been repeating these operations all along without causing any major problem in the business process itself, so the data problems they created for the BI project were overlooked. For BI statistical analysis, however, these issues have to be taken into account. When we reported the problem, the supplier, which was based abroad, said it could not be fixed for another half a year. The issue therefore could not be pushed back to the business system and had to be solved at the BI level, at a very high cost.
So when a business system is being built, many potential problems stay hidden because nobody is looking at them from the data level. Users sometimes take shortcuts to save time, and since the system still works, they usually do not notice these problems: their daily work is barely affected. At the BI level it is different. Because the data has to be statistically analyzed, every business rule maps to a processing rule that must be defined during development. If one business rule has N special data scenarios, you need N corresponding pieces of processing logic. They cannot simply be ignored the way business staff ignore them, and the workload is huge.
Put simply, fixing this problem in the business system might have taken only half a day. For data logic and data quality control, the closer to the source the control is applied, the more obvious the effect. That is what it means to move the problem forward and handle it up front in the program. The longer it is left unhandled and pushed downstream, the harder it becomes once BI and other data projects are involved.
A small data quality problem in a business system can therefore cost a BI project a great deal of time and energy. This means we need to take the standardization of how business systems are used and operated, and of their processes, seriously. Doing so greatly reduces the time cost of BI development and implementation. The same applies to the inconsistent data file information across multiple systems that I ran into before, which came from a lack of planning when the business systems were first designed.
Can these problems be completely avoided from the start? Not absolutely, but in most cases they certainly can. That requires an enterprise's IT department to exercise real foresight: to pay attention not only to the construction and quality of the current system, but also to anticipate the problems that may arise when a BI system is deployed later, when systems are extended, and when systems are connected to one another. Clear planning of this kind depends on solid, all-round awareness and ability in IT project development, technology, project management, and data.
That is all for today's sharing. We focus on big data and business intelligence (BI). If you like our content, you are welcome to follow us and give us a like.