Application practice | Shuhai Supply Chain builds a data middle platform on Apache Doris
2022-07-04 21:38:00 【SelectDB】
Reading guide: Shuhai Supply Chain is a catering supply chain service company that integrates sales, R&D, procurement, production, quality assurance, warehousing, transportation, information, and finance. Because its business is relatively complex, it upgraded its core architecture with Apache Doris at the end of 2020 and in 2021 began building a data middle platform with Apache Doris at the core. This article covers data access, data service orchestration, data security, and Doris applications.
Author | Wang Yongxu, head of the big data team at Shuhai Supply Chain

Business background
Shuhai Supply Chain is a catering supply chain service company that integrates sales, R&D, procurement, production, quality assurance, warehousing, transportation, information, and finance. It provides end-to-end food supply chain solutions for catering chains and retail customers. Because its business is relatively complex, it upgraded its core architecture with Apache Doris at the end of 2020 and in 2021 began building a data middle platform with Apache Doris at the core.
Before using Doris, we ran on the CDH data platform. It involved many components, but the pipeline was too long, development and maintenance costs were high, and in the end we still had no good OLAP system.
Because we carried relatively little legacy data, after researching and testing Apache Doris we decided to build the data platform with Apache Doris at its core. It has the following advantages:
- Supports both high-concurrency point queries and high-throughput ad-hoc queries.
- Supports both offline bulk import and real-time data import.
- Supports both detail queries and aggregate queries.
- Compatible with the MySQL protocol and standard SQL.
- Supports Rollup tables and intelligent query routing to Rollup tables.
- Supports good multi-table Join strategies and flexible expression queries.
- Supports online Schema changes.
- Supports two-level partitioning by Range and Hash.
- High availability: it tolerates the failure of some nodes.
- Simple operations: deployment, maintenance, and upgrades are straightforward, and it does not depend on external components.
The architecture is as follows :

Metadata, data services, data quality during access, and data lineage have already been introduced elsewhere, so this article focuses on data access, data service orchestration, data security, and Doris applications.
Data access
Data access is an important part of data development. We built a data access system that is operated from the web and brings data into Doris with zero code. Its main functions are:
- Subscribe to the MySQL Binlog and load it into Doris tables.
- Subscribe to Kafka topics and load them into Doris tables.
- Dynamic data cleaning: code written on the page transforms the data before it is loaded.
- Access task merging: to save resources, one task can ingest sharded databases and tables, and one task can ingest multiple topics.
- Dynamic data quality verification: configure field quality rules to check the quality of incoming data.
- Encryption on ingestion: during access, sensitive data can be encrypted before it is written to the Doris table.
- Error data management: data that failed to load because of network or data errors can be reloaded from the page.
- Data access link monitoring, such as error data monitoring, anomaly monitoring of the data production link, anomaly monitoring of the data consumption link, per-task data access trend charts, and cluster-wide data access trend charts.
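The cleaning, quality verification, and error data management steps listed above can be pictured with a small sketch. This is only an illustration under assumptions: the field names (`order_id`, `amount`, `city`), the `clean` function, and the `QUALITY_RULES` table are hypothetical and do not come from the actual access system.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]


def clean(record: Record) -> Record:
    """Hypothetical cleaning step (stands in for code typed into the web page)."""
    cleaned = dict(record)
    cleaned["city"] = str(cleaned.get("city", "")).strip()
    return cleaned


# Hypothetical field quality rules: field name -> check that must pass.
QUALITY_RULES: Dict[str, Callable[[Any], bool]] = {
    "order_id": lambda v: v is not None,
    "amount": lambda v: isinstance(v, (int, float)) and v >= 0,
}


def process(records: List[Record]) -> Tuple[List[Record], List[Tuple[Record, str]]]:
    """Clean each record, then split the batch into loadable rows and error rows."""
    good: List[Record] = []
    errors: List[Tuple[Record, str]] = []
    for raw in records:
        row = clean(raw)
        failed = [field for field, ok in QUALITY_RULES.items() if not ok(row.get(field))]
        if failed:
            # Keep the raw record so it can be corrected and reloaded later.
            errors.append((raw, "failed rules: " + ", ".join(failed)))
        else:
            good.append(row)
    return good, errors
```

Rows in `good` would go on to the Doris table, while rows in `errors` correspond to the error data that can be reloaded from the page.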
Data access task list :

Data access task configuration :

Data access dynamic code processing :

Data service orchestration
The data service system is how business systems obtain data through APIs. On its pages, an API can be created, edited, debugged online, rate limited, and brought online or taken offline. Because APIs may depend on each other through business logic that cannot be configured inside a single API, we developed data service orchestration: by dragging and dropping, APIs can be arranged into a flow and pass data to one another, while only one API is exposed externally.
For example, users and the cities they belong to are stored in a MySQL data source, and the sales volume of each city is stored in a Doris data source. The API to be developed must let users view only the sales of their own city. Service orchestration handles this as follows: Node1 looks up the city by user ID, and Node2 takes the output of the upstream node (the city) as its input and returns that city's sales as the API output.

The input and output of each node can be customized. An input can come from the API request parameters, from the output of an upstream node, or from global parameters such as the user ID and paging parameters.
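The two-node example and the three input sources can be sketched as below. This is a minimal illustration, not the platform's implementation: the `Node` class, the `execute` function, and the in-memory `USER_CITY` and `CITY_SALES` tables (which stand in for the MySQL and Doris data sources, with made-up figures) are all assumptions.

```python
from typing import Any, Callable, Dict, List, Tuple

# An input binding says where a node parameter comes from:
# ("request", key), ("global", key), or ("node", upstream_node_name).
Binding = Tuple[str, str]


class Node:
    def __init__(self, name: str, inputs: Dict[str, Binding], run: Callable[..., Any]):
        self.name = name
        self.inputs = inputs
        self.run = run


def execute(nodes: List[Node], request: Dict[str, Any], global_params: Dict[str, Any]) -> Any:
    """Run nodes in order and return the last node's output as the API response."""
    outputs: Dict[str, Any] = {}
    result = None
    for node in nodes:
        sources = {"request": request, "global": global_params, "node": outputs}
        kwargs = {param: sources[kind][key] for param, (kind, key) in node.inputs.items()}
        result = node.run(**kwargs)
        outputs[node.name] = result
    return result


# Stand-ins for the MySQL (user -> city) and Doris (city -> sales) data sources.
USER_CITY = {"u1": "Xi'an", "u2": "Beijing"}
CITY_SALES = {"Xi'an": 12000, "Beijing": 9000}

node1 = Node("node1", {"user_id": ("global", "user_id")}, lambda user_id: USER_CITY[user_id])
node2 = Node(
    "node2",
    {"city": ("node", "node1")},
    lambda city: {"city": city, "sales": CITY_SALES[city]},
)
```

Calling `execute([node1, node2], request={}, global_params={"user_id": "u1"})` runs both nodes but returns a single response, which matches the idea that the orchestrated flow is still exposed as one API.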


Data security
Data security is a broad topic that touches everything. Here we briefly cover data encryption, data permissions, and data warehouse backup.
Encryption on ingestion
During data access, you can choose fields to encrypt. By the time the data lands in the Doris table it is already encrypted, and later analysis can decrypt it with the key.
Data access encryption configuration :

Data permissions
A wide range of people in the company view reports. For the same data model, sales, operations, and factory staff in each city and region, as well as managers, each see different data, so row permissions and column permissions must be controlled precisely. We therefore developed a data permission system on top of Doris. Permissions are set up through configuration and can be precise down to rows and columns. The BI reporting system, as an integrating party, imports the data permission client and implements the corresponding abstract methods.
Example 1: for a report model, Zhang San can view only data for North China or Northwest China; Li Si and Wang Ming can view only data for Xi'an or Beijing where the sales volume is greater than 10000; Zhang Si and Zhang Wu are unrestricted; everyone else has no access.
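The rules in example 1 can be written down as a small sketch. It is an illustration under assumptions: the column names `region`, `city`, and `sales` are hypothetical, and the rows are filtered in memory for clarity, whereas a real permission client would more likely turn the rules into query conditions.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Predicate = Callable[[Row], bool]


def unrestricted(row: Row) -> bool:
    return True


def xian_or_beijing_over_10000(row: Row) -> bool:
    return row["city"] in {"Xi'an", "Beijing"} and row["sales"] > 10000


# One row-level rule per user; users without an entry have no access at all.
ROW_RULES: Dict[str, Predicate] = {
    "Zhang San": lambda row: row["region"] in {"North China", "Northwest China"},
    "Li Si": xian_or_beijing_over_10000,
    "Wang Ming": xian_or_beijing_over_10000,
    "Zhang Si": unrestricted,
    "Zhang Wu": unrestricted,
}


def visible_rows(user: str, rows: List[Row]) -> List[Row]:
    """Return only the rows the user's rule allows; unknown users see nothing."""
    rule = ROW_RULES.get(user)
    if rule is None:
        return []
    return [row for row in rows if rule(row)]
```

Defaulting to an empty result for users without a rule mirrors the "everyone else has no access" part of the example.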
Model row-level permission rule list:

Editing row-level authorization rules:


Example 2: everyone can view the report data, but each person can view only their own city, and only rows where the amount is greater than 200 or less than 100.
Freely combining rule conditions and rule relationships:

Personnel label management :


Example 3: column permission rules can forbid a user from viewing a column or apply data masking (desensitization), among other rules.
Column-level permission configuration:
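As a rough sketch of the two column rules named in example 3, the snippet below drops forbidden columns and masks sensitive ones. The user name, the column names, and the masking format are all hypothetical and chosen only for illustration.

```python
from typing import Any, Dict

Row = Dict[str, Any]

# Hypothetical column rules: user -> {column: "forbid" | "mask"}.
COLUMN_RULES: Dict[str, Dict[str, str]] = {
    "analyst": {"cost_price": "forbid", "phone": "mask"},
}


def mask(value: Any) -> str:
    """Keep the first and last two characters and star out the rest."""
    text = str(value)
    if len(text) <= 4:
        return "*" * len(text)
    return text[:2] + "*" * (len(text) - 4) + text[-2:]


def apply_column_rules(user: str, row: Row) -> Row:
    """Drop forbidden columns and mask sensitive ones for the given user."""
    rules = COLUMN_RULES.get(user, {})
    result: Row = {}
    for column, value in row.items():
        action = rules.get(column)
        if action == "forbid":
            continue
        result[column] = mask(value) if action == "mask" else value
    return result
```

Here a user with no entry in `COLUMN_RULES` sees every column unchanged; whether the real system defaults to allow or deny at the column level is not stated in the article.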


Data warehouse backup
We use Doris as the core of storage and computing. Doris already stores data in multiple replicas, but for disaster recovery we still back up the core access data to HDFS. We therefore developed a data warehouse backup system that backs up Doris tables to HDFS on a schedule, either in full or by partition.
Backup schedule configuration :

Backup scheduled task list:

Doris applications
We use Doris to carry the computation and storage for data analysis. There is one more scenario: the data in the business MySQL databases keeps growing, and the large volume of historical data hurts online performance, yet it cannot simply be deleted because it is still queried occasionally. So we built a business historical data archiving system on Doris. Historical data that will no longer change is archived incrementally on a schedule and remains queryable through the data service system. The archived data is pushed to the business side, which verifies it and then deletes the historical data.
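The archive, verify, then delete sequence can be sketched as follows. This is a simplified illustration: plain lists stand in for the MySQL source table and the Doris archive table, the `id` and `created` fields are hypothetical, and verification and deletion are folded into one function even though the article assigns those steps to the business side.

```python
from datetime import date
from typing import Any, Dict, List

Row = Dict[str, Any]


def archive_before(source: List[Row], archive: List[Row], cutoff: date) -> int:
    """Copy rows older than the cutoff into the archive, verify, then delete them."""
    batch = [row for row in source if row["created"] < cutoff]

    # Skip rows archived by an earlier run so the job can be retried safely.
    already = {row["id"] for row in archive}
    archive.extend(row for row in batch if row["id"] not in already)

    # Verify every row in the batch is present in the archive before deleting.
    archived_ids = {row["id"] for row in archive}
    missing = [row["id"] for row in batch if row["id"] not in archived_ids]
    if missing:
        raise RuntimeError(f"archive verification failed for ids: {missing}")

    # Delete only after verification has passed.
    source[:] = [row for row in source if row["created"] >= cutoff]
    return len(batch)
```

Deleting only after the check passes is the point of the ordering described above: the source rows are never removed before the archived copy is confirmed.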
Archive plan list :

Archive plan configuration :

Data push plan configuration :

Benefits
Doris, as the core data platform, now supports the data query and data analysis needs of dozens of the company's business systems. It delivers excellent query performance for BI analysis and for each business system, and it greatly reduces the cost of platform maintenance, data development, and data middle platform construction.
- Real-time data access is stable and reliable. Thousands of tables are ingested in real time through Stream Load, and the number of rows ingested each day is on the order of 100 million.
- It supports high-concurrency, high-performance online analysis and queries. Doris serves on the order of millions of online analytical queries per day, most SQL completes in milliseconds, slow SQL still leaves plenty of room for tuning, and Doris automatically optimizes queries in some scenarios.
- By querying the original access tables directly and by building materialized views and indexes, it meets many real-time query requirements with low latency and high concurrency. Multi-table Join performance is also excellent.
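For readers unfamiliar with Stream Load, it is an HTTP-based import: the client sends the data in a PUT request to the `_stream_load` endpoint of a table. The sketch below only builds such a request and sends nothing. The host name, database, table, and credentials are placeholders, 8030 is assumed as the FE HTTP port (the usual default), and the article does not say which format or options Shuhai's access system actually uses.

```python
import base64
import json
from typing import Any, Dict, List, Tuple


def build_stream_load(
    fe_host: str,
    http_port: int,
    database: str,
    table: str,
    user: str,
    password: str,
    label: str,
    rows: List[Dict[str, Any]],
) -> Tuple[str, Dict[str, str], bytes]:
    """Build the URL, headers, and JSON body for a Stream Load PUT request."""
    url = f"http://{fe_host}:{http_port}/api/{database}/{table}/_stream_load"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Expect": "100-continue",
        "label": label,               # unique per load, lets Doris reject duplicates
        "format": "json",             # body is JSON instead of the default CSV
        "strip_outer_array": "true",  # body is a JSON array of row objects
    }
    body = json.dumps(rows).encode("utf-8")
    return url, headers, body
```

The returned URL, headers, and body would then be sent as an HTTP PUT by a client that can follow the redirect from the FE to a BE node.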
Other benefits:
- The overall architecture of Doris is simple, the operation and maintenance cost is very low, and it supports online rolling upgrades, which saves manpower so we can focus on building the data middle platform and on business development.
- Doris is highly compatible with the MySQL protocol and supports interactive query analysis, which makes for an efficient data development experience.
- High availability: data is partitioned and stored in multiple replicas, so the failure of some nodes does not make the overall service unavailable.
- Broad ecosystem compatibility: the community provides plug-ins that connect big data tools such as Flink and DataX to Doris, and importing and exporting data through Broker is simple and fast.
- The community is active. Doris keeps expanding its features and improving its performance, and when you run into problems you can get attentive help from the community.
Join the community
We welcome more people who love open source to join the Apache Doris community and take part in building it. Besides submitting PRs or Issues on GitHub, you are also welcome to take part in the community's day-to-day activities, for example:
Contribute technical analysis or application practice articles to the community's call for articles; speak at Doris community online and offline events; help answer questions from Doris community users.
Finally, we welcome more open source enthusiasts to join the Apache Doris community, grow together, and build the community ecosystem.



SelectDB is an open source technology company. It provides the Apache Doris community with a team of full-time engineers, product managers, and support engineers, works to grow the open source community ecosystem, and aims to create an international industry standard for real-time analytical databases. SelectDB, a new generation of cloud-native real-time data warehouse developed on Apache Doris, runs on multiple clouds and gives users and customers out-of-the-box capability.
Related links:
SelectDB official website:
https://selectdb.com
Apache Doris official website:
Apache Doris GitHub:
https://github.com/apache/doris
Apache Doris developer mailing list:
