Application practice | Building a data middle platform at Shuhai Supply Chain with Apache Doris
2022-07-04 21:38:00 【SelectDB】
Reading guide: Shuhai Supply Chain is a catering supply chain service company that integrates sales, R&D, procurement, production, quality assurance, warehousing, transportation, information, and finance. Because its business is relatively complex, it upgraded its core architecture with Apache Doris at the end of 2020 and, in 2021, began building a data middle platform with Apache Doris at its core. This article covers data access, data service orchestration, data security, and Doris applications.
Author | Wang Yongxu, head of the big data team at Shuhai Supply Chain
Business background
Shuhai Supply Chain is a catering supply chain service company that integrates sales, R&D, procurement, production, quality assurance, warehousing, transportation, information, and finance. It provides catering chains and retail customers with end-to-end food supply chain solutions. Because its business is relatively complex, it upgraded its core architecture with Apache Doris at the end of 2020 and, in 2021, began building a data middle platform with Apache Doris at its core.
Before adopting Doris, we used the CDH data platform. It involved many components, the data pipeline was too long, development and maintenance costs were relatively high, and in the end we still had no good OLAP system.
Because we carried relatively little legacy data, after researching and testing Apache Doris we decided to build our data platform with Apache Doris at its core. It has the following advantages:
- Supports both high-concurrency point queries and high-throughput ad-hoc queries.
- Supports both offline batch import and real-time data import.
- Supports both detail queries and aggregate queries.
- Compatible with the MySQL protocol and standard SQL.
- Supports Rollup tables and intelligent query routing to Rollup tables.
- Supports good multi-table Join strategies and flexible expression queries.
- Supports online schema changes.
- Supports two-level partitioning with Range and Hash.
- High availability: it tolerates the failure of some nodes.
- Simple operations: deployment, maintenance, and upgrades are relatively easy, with no dependency on external components.
The architecture is as follows:
Because we have previously described how we built metadata management, data services, data access quality, and data lineage, this article covers data access, data service orchestration, data security, and Doris applications.
Data access
Data access is an important part of data development. We developed a data access system, operated from a web UI, that brings data into Doris with zero code. Its main functions are:
- Subscribe to MySQL Binlog and load the data into Doris tables.
- Subscribe to Kafka topics and load the data into Doris tables.
- Dynamic data cleaning: code written on the page transforms the data before it is loaded.
- Access task merging: to save resources, a single task can ingest sharded databases and tables, or multiple topics.
- Dynamic data quality verification: configure field-level quality rules to check incoming data.
- Encryption on ingestion: sensitive data can be encrypted during ingestion before it enters Doris tables.
- Error data management: data that failed to load because of network or data errors can be re-ingested from the page.
- Data access pipeline monitoring: for example, error data monitoring, anomaly monitoring on the data production and consumption links, per-task data access trend charts, and cluster-wide data access trend charts.
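The cleaning, quality verification, and error data steps in the list above can be sketched as a small pipeline. This is only an illustration, not the team's actual system: the `ingest` function, the `IngestResult` type, the field names, and the rules are all hypothetical, and the rows that pass would still need to be written to Doris by a separate step.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

Row = dict[str, Any]


@dataclass
class IngestResult:
    loaded: list[Row] = field(default_factory=list)              # rows ready to load
    errors: list[tuple[Row, str]] = field(default_factory=list)  # rows kept for re-ingestion


def ingest(
    rows: list[Row],
    clean: Callable[[Row], Row],
    rules: dict[str, Callable[[Any], bool]],
) -> IngestResult:
    """Clean each row, check field quality rules, and set failures aside."""
    result = IngestResult()
    for raw in rows:
        try:
            row = clean(dict(raw))  # work on a copy so the raw row is preserved
        except Exception as exc:
            result.errors.append((raw, f"clean failed: {exc}"))
            continue
        failed = [name for name, check in rules.items() if not check(row.get(name))]
        if failed:
            result.errors.append((raw, "quality check failed: " + ", ".join(failed)))
        else:
            result.loaded.append(row)
    return result


def clean_sale(row: Row) -> Row:
    """Hypothetical cleaning code, standing in for what a user writes on the page."""
    row["city"] = row["city"].strip()
    row["amount"] = float(row["amount"])
    return row


# Hypothetical field quality rules: city must be non-empty, amount non-negative.
SALE_RULES = {
    "city": lambda v: bool(v),
    "amount": lambda v: v is not None and v >= 0,
}
```

Keeping the raw row next to the failure reason is what makes re-ingestion possible later: the stored input can be replayed once the cause is fixed.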
Data access task list:
Data access task configuration:
Data access dynamic code processing:
Data service arrangement
The data service is the system through which business systems obtain data via APIs. On its pages you can create, edit, debug online, rate-limit, publish, and unpublish APIs. Because APIs may depend on one another through business logic that cannot be configured within a single API, we developed a data service orchestration feature: by dragging and dropping, APIs can be chained and pass data to each other, while what is exposed externally is still a single API.
For example: users and the cities they belong to are stored in a MySQL data source, and each city's sales are stored in a Doris data source. The API to be developed must let users view only the sales of their own city. With service orchestration, Node1 looks up the city by user ID, and Node2 takes the upstream node's output (the city) as input and returns the city's sales as the API output.
The input and output of each node can be customized. Input can come from API request parameters, from the output of an upstream node, or from global parameters such as the user ID and paging parameters.
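The Node1 and Node2 example can be sketched as a chain of functions that share a context. This is a simplified illustration under stated assumptions: the two dictionaries stand in for the MySQL and Doris data sources, and `run_pipeline`, the node names, and the sample data are hypothetical.

```python
from typing import Any, Callable

Context = dict[str, Any]
Node = Callable[[Context], Any]

# Hypothetical stand-ins for the two data sources in the example.
USER_CITY = {"u001": "Xi'an", "u002": "Beijing"}  # MySQL: user ID -> city
CITY_SALES = {"Xi'an": 12800, "Beijing": 20500}   # Doris: city -> sales


def node1_city(ctx: Context) -> str:
    """Node1: read user_id from the request parameters and return the user's city."""
    return USER_CITY[ctx["request"]["user_id"]]


def node2_sales(ctx: Context) -> int:
    """Node2: take Node1's output (the city) as input and return that city's sales."""
    return CITY_SALES[ctx["node1"]]


def run_pipeline(nodes: list[tuple[str, Node]], request: dict[str, Any]) -> Any:
    """Run nodes in order; each node's output is stored under its name for later nodes."""
    ctx: Context = {"request": request}
    output = None
    for name, node in nodes:
        output = node(ctx)
        ctx[name] = output
    return output  # the last node's output is the single API response


PIPELINE = [("node1", node1_city), ("node2", node2_sales)]
print(run_pipeline(PIPELINE, {"user_id": "u001"}))  # 12800
```

The caller only ever sees `run_pipeline`, which mirrors the point that orchestration still exposes one API externally.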
Data security construction
Data security is a broad topic that touches everything. Here we briefly cover data encryption, data permissions, and data warehouse backup.
Data encryption on ingestion
During data access, you can choose which fields to encrypt. By the time the data lands in a Doris table it is already encrypted, and later analysis can decrypt it with the key.
Data access encryption configuration:
Data permissions
The company has a wide range of people who view reports. For the same data model, sales, operations, and factory staff in each city and region, as well as managers, see different data, so row and column permissions must be controlled precisely. We therefore developed a data permission system on top of Doris. Permissions are set up through configuration and can be precise down to rows and columns. The BI reporting system, as an accessing party, integrates the data permission client and implements the corresponding abstract methods.
Example 1: for a given report model, Zhang San can only view data for North China or Northwest China; Li Si and Wang Ming can only view data for Xi'an or Beijing with sales greater than 10000; Zhang Si and Zhang Wu are unrestricted; everyone else has no access.
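The rules in Example 1 can be expressed as one predicate per user, with a missing rule meaning no access. This is a minimal sketch that filters rows in memory; the real permission system is configured through its UI and applied by the permission client, and the column names `region`, `city`, and `sales` are assumptions.

```python
from typing import Any, Callable

Row = dict[str, Any]
Predicate = Callable[[Row], bool]


def allow_all(row: Row) -> bool:
    """Unrestricted users see every row."""
    return True


def xian_or_beijing_over_10000(row: Row) -> bool:
    return row["city"] in {"Xi'an", "Beijing"} and row["sales"] > 10000


# Row-level rules from Example 1 (column names are hypothetical).
ROW_RULES: dict[str, Predicate] = {
    "Zhang San": lambda r: r["region"] in {"North China", "Northwest China"},
    "Li Si": xian_or_beijing_over_10000,
    "Wang Ming": xian_or_beijing_over_10000,
    "Zhang Si": allow_all,
    "Zhang Wu": allow_all,
}


def visible_rows(user: str, rows: list[Row]) -> list[Row]:
    """Return the rows a user may see; users without a rule see nothing."""
    rule = ROW_RULES.get(user)
    if rule is None:
        return []
    return [row for row in rows if rule(row)]
```

Defaulting to an empty result for unknown users keeps the "others have no access" rule from depending on someone remembering to add a deny entry.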
List of model row-level permission rules:
Row-level permission rule editing:
Example 2: everyone can view the report data, but only for their own city, and only rows where the amount is greater than 200 or less than 100.
Free combination of rule conditions and rule relationships:
Personnel tag management:
Example 3: column permission rules. You can set rules for users such as forbidding viewing or data masking.
Column-level permission configuration:
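As a rough illustration of what column rules such as forbidden viewing and masking do to a result row, consider the sketch below. The rule names `hidden` and `masked`, the masking format, the user name, and the column names are all hypothetical, and this sketch lets unlisted columns through unchanged, which may differ from the real system's default.

```python
from typing import Any

Row = dict[str, Any]

# Hypothetical column rules per user: column -> "hidden" or "masked".
COLUMN_RULES: dict[str, dict[str, str]] = {
    "analyst": {"cost": "hidden", "phone": "masked"},
}


def mask(value: Any) -> str:
    """Keep the first 3 and last 4 characters and star out the rest."""
    text = str(value)
    if len(text) <= 7:
        return "*" * len(text)
    return text[:3] + "*" * (len(text) - 7) + text[-4:]


def apply_column_rules(user: str, row: Row) -> Row:
    """Drop hidden columns and mask sensitive ones for the given user."""
    rules = COLUMN_RULES.get(user, {})
    visible: Row = {}
    for column, value in row.items():
        action = rules.get(column, "visible")
        if action == "hidden":
            continue
        visible[column] = mask(value) if action == "masked" else value
    return visible


row = {"city": "Beijing", "phone": "13812345678", "cost": 42.0}
print(apply_column_rules("analyst", row))
```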
Data warehouse backup
We use Doris as the core of storage and computing. Doris already stores data in multiple replicas, but for disaster recovery we still back up the core ingested data to HDFS. We therefore developed a data warehouse backup system that backs up Doris tables to HDFS on a schedule, either in full or by partition.
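The article does not say which mechanism the backup system uses. One plausible option is Doris's `EXPORT` statement with a Broker, and the sketch below only builds such a statement for a full or per-partition backup. The helper name, table, HDFS path, broker name, and credentials are hypothetical, the inputs are assumed to be trusted identifiers, and the exact syntax should be checked against the Doris documentation for your version.

```python
from typing import Optional, Sequence


def build_export_sql(
    table: str,
    hdfs_path: str,
    broker: str,
    partitions: Optional[Sequence[str]] = None,
    broker_props: Optional[dict[str, str]] = None,
) -> str:
    """Build an EXPORT statement for a whole table or for selected partitions."""
    partition_clause = f" PARTITION ({', '.join(partitions)})" if partitions else ""
    props = ", ".join(f'"{k}"="{v}"' for k, v in (broker_props or {}).items())
    props_clause = f" ({props})" if props else ""
    return (
        f'EXPORT TABLE {table}{partition_clause} TO "{hdfs_path}" '
        f'WITH BROKER "{broker}"{props_clause};'
    )


# Full-table backup (all names are placeholders).
print(build_export_sql("dw.sales", "hdfs://namenode:8020/backup/dw/sales/", "hdfs_broker"))

# Backup of a single partition.
print(
    build_export_sql(
        "dw.sales",
        "hdfs://namenode:8020/backup/dw/sales/",
        "hdfs_broker",
        partitions=["p202206"],
        broker_props={"username": "etl", "password": "secret"},
    )
)
```

A scheduler would run the generated statement over a MySQL-protocol connection to Doris, once per table or partition in the backup plan.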
Backup schedule configuration:
Backup scheduled task list:
Doris Application
We use Doris for the computation and storage behind data analysis. There is one more scenario: the data in business MySQL databases keeps growing, and the large volume of historical data hurts online performance, yet it cannot simply be deleted because historical data is still queried occasionally. So we built a business historical data archiving system on Doris. Historical data that will no longer change is archived incrementally on a schedule and served for queries through the data service system. The archived data is pushed to the business side, which verifies it and then deletes the historical data.
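The archive, verify, then delete order described above can be sketched with in-memory lists standing in for the business MySQL table and the Doris archive table. The function name, the `id` and `created` fields, and the `verify` callback (which represents the business side's check) are all hypothetical.

```python
from datetime import date
from typing import Any, Callable

Row = dict[str, Any]


def archive_batch(
    source: list[Row],
    archive: list[Row],
    cutoff: date,
    verify: Callable[[list[Row], list[Row]], bool],
) -> int:
    """Copy rows older than cutoff into the archive, then delete them from
    the source only if verification passes. Returns the number of rows removed."""
    batch = [row for row in source if row["created"] < cutoff]
    known_ids = {row["id"] for row in archive}
    # Skip rows that are already archived so a rerun does not duplicate them.
    archive.extend(row for row in batch if row["id"] not in known_ids)
    if not verify(batch, archive):
        return 0  # keep the source untouched when verification fails
    source[:] = [row for row in source if row["created"] >= cutoff]
    return len(batch)


def all_archived(batch: list[Row], archive: list[Row]) -> bool:
    """Hypothetical verification: every row in the batch exists in the archive."""
    return {row["id"] for row in batch} <= {row["id"] for row in archive}
```

Deleting only after verification means a failed or partial run leaves the online table intact and can simply be retried.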
Archive plan list:
Archive plan configuration:
Data push plan configuration:
Benefits
Doris is now our core data platform and supports the data query and analysis needs of dozens of the company's business systems. It provides excellent query performance for BI analysis and for each business system, and it greatly reduces the cost of data platform maintenance, data development, and data middle platform construction.
- Real-time data access is stable and reliable: through Stream Load, thousands of tables are ingested in real time, and the total volume ingested each day is on the order of 100 million records.
- High-concurrency, high-performance online analytical queries: Doris serves millions of online analytical queries per day, most SQL completes in milliseconds, slow SQL still has plenty of room for optimization, and Doris automatically optimizes queries in some scenarios.
- By querying the original ingested tables directly and creating materialized views and indexes, it supports many real-time query needs with low latency and high concurrency. Multi-table Join performance is also excellent.
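Stream Load, mentioned in the first point, is an HTTP interface: a client sends a `PUT` request with the data to `/api/{db}/{table}/_stream_load` on a Doris node. The sketch below only builds such a request for a JSON batch and does not send it. The host, database, table, credentials, and label are hypothetical, and port 8030 is assumed to be the FE HTTP port.

```python
import base64
import json
import uuid
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StreamLoadRequest:
    method: str
    url: str
    headers: dict[str, str]
    body: bytes


def build_stream_load(
    fe_host: str,
    db: str,
    table: str,
    rows: list[dict[str, Any]],
    user: str,
    password: str,
    http_port: int = 8030,
    label: Optional[str] = None,
) -> StreamLoadRequest:
    """Describe a Stream Load request that loads rows as a JSON array."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Expect": "100-continue",
        # A label identifies the batch so a retry is not loaded twice.
        "label": label or f"{table}_{uuid.uuid4().hex}",
        "format": "json",
        "strip_outer_array": "true",  # the body is a JSON array of row objects
    }
    url = f"http://{fe_host}:{http_port}/api/{db}/{table}/_stream_load"
    return StreamLoadRequest("PUT", url, headers, json.dumps(rows).encode("utf-8"))


request = build_stream_load(
    "doris-fe", "dw", "sales", [{"city": "Beijing", "amount": 12.5}], "etl", "secret"
)
print(request.method, request.url)
```

To actually send it, an HTTP client needs to follow the redirect from the FE to a BE node while keeping the authorization header, which is why this sketch stops at building the request.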
Other benefits:
- Doris has a simple overall architecture, very low operation and maintenance costs, and online rolling upgrades, which saves manpower so the team can focus on building the data middle platform and on business development.
- Doris is highly compatible with the MySQL protocol and supports interactive query analysis, which makes for an efficient data development experience.
- High availability: data is partitioned and stored in multiple replicas, so the failure of some nodes does not make the overall service unavailable.
- Broad ecosystem compatibility: the community provides plugins for big data tools such as Flink and DataX to interact with Doris, and importing and exporting data through Broker is simple and fast.
- The community is active, Doris's features and performance keep expanding and improving, and when you run into problems you can get attentive help from the community.
Join the community
We welcome more open source enthusiasts to join the Apache Doris community and take part in building it. Besides submitting PRs or issues on GitHub, you are also welcome to take part in the community's day-to-day activities, for example:
Contribute articles on technical analysis, application practice, and similar topics to the community's **call for articles**; speak at Doris community online and offline events; actively answer questions from Doris community users.
Finally, we welcome more open source technology enthusiasts to join the Apache Doris community, grow together, and build the community ecosystem.
SelectDB is an open source technology company that provides the Apache Doris community with a team of full-time engineers, product managers, and support engineers, to help the open source community ecosystem flourish and to create an international industry standard in the field of real-time analytical databases. SelectDB, a new-generation cloud-native real-time data warehouse developed on Apache Doris, runs on multiple clouds and gives users and customers out-of-the-box capabilities.
Related links:
SelectDB official website:
https://selectdb.com (We Are Coming Soon)
Apache Doris official website:
Apache Doris GitHub:
https://github.com/apache/doris
Apache Doris developer mailing list: