Application practice | Building a data platform on Apache Doris at Shuhai Supply Chain
2022-07-04 21:38:00 【SelectDB】
Reading guide: Shuhai Supply Chain is a catering supply chain service company that integrates sales, R&D, procurement, production, quality assurance, warehousing, transportation, information, and finance. Because its business is relatively complex, it upgraded its core architecture with Apache Doris at the end of 2020 and in 2021 began building a data platform with Apache Doris at its core. This article covers data access, data service orchestration, data security, and Doris applications.
Author | Wang Yongxu, head of the big data team at Shuhai Supply Chain
Business background
Shuhai Supply Chain is a catering supply chain service company that integrates sales, R&D, procurement, production, quality assurance, warehousing, transportation, information, and finance, providing end-to-end food supply chain solutions to chain catering enterprises and retail customers. Because its business is relatively complex, it upgraded its core architecture with Apache Doris at the end of 2020 and in 2021 began building a data platform with Apache Doris at its core.
Before adopting Doris, we ran on the CDH data platform. It involved many components, the data pipeline was too long, development and maintenance costs were high, and in the end we still had no good OLAP system.
Because we carried relatively little legacy data, after researching and testing Apache Doris we decided to build our data platform with Apache Doris at its core. It offers the following advantages:
- Supports both high-concurrency point queries and high-throughput ad-hoc queries.
- Supports both offline bulk import and real-time data import.
- Supports both detail queries and aggregate queries.
- Compatible with the MySQL protocol and standard SQL.
- Supports Rollup tables and intelligent query routing to them.
- Supports good multi-table Join strategies and flexible expression queries.
- Supports online schema changes.
- Supports two-level partitioning by Range and Hash.
- High availability: it tolerates the failure of some nodes.
- Simple operations: deployment, maintenance, and upgrades are straightforward, and there is no dependency on external components.
The architecture is as follows:
Because metadata, data services, data quality on ingestion, and data lineage have been introduced previously, this article focuses on data access, data service orchestration, data security, and Doris applications.
Data access
Data access is an important part of data development. We built a data access system that is operated from a web UI and brings data into Doris with zero code. Its main functions are:
- Subscribe to MySQL Binlog and load the data into Doris tables.
- Subscribe to Kafka topics and load the data into Doris tables.
- Dynamic data cleaning: code written on the page performs the transformation before data is loaded.
- Access task merging: to save resources, sharded databases and tables can be ingested in one task, and so can multiple topics.
- Dynamic data quality verification: configure field-level quality rules to check incoming data.
- Encryption on ingestion: sensitive data can be encrypted during ingestion before it is written to Doris tables.
- Error data management: data that failed to load because of network problems, bad data, or other causes can be re-ingested from the page.
- Data access link monitoring: for example, error data monitoring, anomaly monitoring of the data production link, anomaly monitoring of the data consumption link, per-task ingestion trend charts, and cluster-wide ingestion trend charts.
Data access task list:
Data access task configuration:
Dynamic code processing for data access:
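The dynamic cleaning and field quality rules described above can be pictured with a small sketch. This is not the team's implementation; the field names (`order_id`, `amount`, `city`), the rule table, and the function names are all hypothetical, chosen only to show a record being cleaned, validated, and either accepted or set aside for later re-ingestion.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]

# Hypothetical field-level quality rules: field name -> (description, predicate).
QUALITY_RULES: Dict[str, Tuple[str, Callable[[Any], bool]]] = {
    "order_id": ("must be a non-empty string", lambda v: isinstance(v, str) and v != ""),
    "amount": ("must be a non-negative number", lambda v: isinstance(v, (int, float)) and v >= 0),
}


def clean(record: Record) -> Record:
    """Example transformation before loading: trim strings and capitalise the city."""
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    if isinstance(cleaned.get("city"), str):
        cleaned["city"] = cleaned["city"].capitalize()
    return cleaned


def validate(record: Record) -> List[str]:
    """Return the list of rule violations; an empty list means the record passes."""
    errors = []
    for field, (description, predicate) in QUALITY_RULES.items():
        if field not in record or not predicate(record[field]):
            errors.append(f"{field} {description}")
    return errors


def process(batch: List[Record]) -> Tuple[List[Record], List[Tuple[Record, List[str]]]]:
    """Split a batch into loadable rows and rejected rows kept for re-ingestion."""
    accepted, rejected = [], []
    for raw in batch:
        row = clean(raw)
        errors = validate(row)
        if errors:
            rejected.append((row, errors))
        else:
            accepted.append(row)
    return accepted, rejected
```

Keeping rejected rows together with the reasons they failed is what makes a page-driven "fix and re-ingest" workflow possible.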
Data service arrangement
The data service is the system through which business systems obtain data via APIs. On its pages an API can be created, edited, developed and debugged online, rate-limited, and brought online or taken offline. Because APIs may be related by business logic, and we do not want to configure the same API repeatedly, we developed a data service orchestration function: by dragging and dropping, APIs can be chained so that data passes between them, while externally the result is still exposed as a single API.
Example: users and the cities they belong to are stored in a MySQL data source, and the sales volume of each city is stored in a Doris data source. The API to be developed should let users view sales only for their own city. This can be done with service orchestration: Node1 looks up the city by user ID, and Node2 takes the output of the upstream node (the city) as input and returns that city's sales as the API output.
The input and output of each node can be customized. Input can come from API request parameters, from the output of an upstream node, or from global parameters such as the user ID and paging parameters.
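The two-node example above can be sketched as a tiny pipeline. The dictionaries stand in for the MySQL and Doris data sources, and every name and value here is made up for illustration; the real system is configured by drag and drop rather than written as code.

```python
from typing import Any, Callable, Dict, List

# In-memory stand-ins for the two data sources in the example (values are made up).
USER_CITY = {"u001": "Beijing", "u002": "Xi'an"}    # would live in MySQL
CITY_SALES = {"Beijing": 125000, "Xi'an": 98000}    # would live in Doris

Context = Dict[str, Any]
Node = Callable[[Context], Context]


def node1_city_by_user(ctx: Context) -> Context:
    """Node1: look up the user's city from the user_id parameter."""
    return {"city": USER_CITY[ctx["user_id"]]}


def node2_sales_by_city(ctx: Context) -> Context:
    """Node2: take the upstream output (city) as input and return its sales."""
    return {"city": ctx["city"], "sales": CITY_SALES[ctx["city"]]}


def run_pipeline(nodes: List[Node], request_params: Context, global_params: Context) -> Context:
    """Run nodes in order; each node sees globals, request params, and upstream outputs."""
    ctx: Context = {**global_params, **request_params}
    output: Context = {}
    for node in nodes:
        output = node(ctx)
        ctx.update(output)  # make this node's output available downstream
    return output  # only the last node's output is exposed by the API


result = run_pipeline([node1_city_by_user, node2_sales_by_city], {}, {"user_id": "u001"})
```

Because the user ID arrives as a global parameter rather than a request parameter, the caller cannot ask for another user's city, which is the point of the example.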
Data security construction
Data security is a broad topic that touches every part of the platform. Here we briefly cover three aspects: data encryption, data permissions, and data warehouse backup.
Encryption on ingestion
During data access, you can choose fields to encrypt. By the time the data lands in a Doris table it is already encrypted, and later analysis can decrypt it with the key.
Data access encryption configuration:
Data permissions
Because reports are viewed by a wide range of people in the company, the same data model must show different data to sales, operations, and factory staff and to managers in each city and region, so row and column permissions need precise control. We therefore developed a data permission system on top of Doris. Permissions are set up through configuration and can be precise down to rows and columns. The BI reporting system, as an accessing party, integrates the data permission client and implements the corresponding abstract methods.
Example 1: for a given report model, Zhang San can only view data for North China or Northwest China; Li Si and Wang Ming can only view data for Xi'an or Beijing with sales greater than 10000; Zhang Si and Zhang Wu are unrestricted; everyone else has no access.
List of row-level permission rules for a model:
Editing row-level authorization rules:
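One way to picture row-level rules like those in Example 1 is as a per-user list of conditions that is turned into a parameterized WHERE clause before the report query runs. The sketch below is an assumption about how such a client might work, not the system described in the article; the rule table, the column names (`region`, `city`, `sales`), and the user keys are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Rule = Tuple[str, str, object]

# Hypothetical rule table mirroring Example 1.
# An empty list means unrestricted; a user missing from the table has no access.
CITY_AND_SALES: List[Rule] = [("city", "IN", ("Xi'an", "Beijing")), ("sales", ">", 10000)]
ROW_RULES: Dict[str, List[Rule]] = {
    "zhang_san": [("region", "IN", ("North China", "Northwest China"))],
    "li_si": CITY_AND_SALES,
    "wang_ming": CITY_AND_SALES,
    "zhang_si": [],
    "zhang_wu": [],
}

ALLOWED_OPERATORS = {"IN", ">", "<", "="}


def row_filter(user: str) -> Optional[Tuple[str, list]]:
    """Return (where_clause, params) for the user, or None if the user has no access."""
    if user not in ROW_RULES:
        return None
    clauses, params = [], []
    for column, operator, value in ROW_RULES[user]:
        if operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {operator}")
        if operator == "IN":
            placeholders = ", ".join(["%s"] * len(value))
            clauses.append(f"{column} IN ({placeholders})")
            params.extend(value)
        else:
            clauses.append(f"{column} {operator} %s")
            params.append(value)
    # Conditions for one user are combined with AND; no conditions means no restriction.
    return (" AND ".join(clauses) or "1 = 1", params)
```

Values are passed as parameters (the `%s` style used by common MySQL drivers for Python) rather than concatenated into the SQL text, so rule values cannot alter the query structure.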
Example 2: everyone can view the report, but each person can only view data for their own city where the amount is greater than 200 or less than 100.
Freely combining rule conditions and rule relationships:
Personnel tag management:
Example 3: column permission rules. Rules such as denying access to a column or masking its data can be set per user.
Column-level permission configuration:
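Column rules such as "deny" and "mask" can be illustrated with a short sketch that post-processes a result row. The policy table, the user name, the column names, and the masking format are all hypothetical; the article does not describe how its permission client applies these rules.

```python
from typing import Any, Dict

# Hypothetical column policies per user: "deny" hides the column, "mask" desensitises it.
COLUMN_POLICIES: Dict[str, Dict[str, str]] = {
    "analyst": {"phone": "mask", "cost_price": "deny"},
}


def mask(value: Any) -> str:
    """Keep the first 3 and last 4 characters and replace the middle with asterisks."""
    text = str(value)
    if len(text) <= 7:
        return "*" * len(text)
    return text[:3] + "*" * (len(text) - 7) + text[-4:]


def apply_column_policy(user: str, row: Dict[str, Any]) -> Dict[str, Any]:
    """Drop denied columns and mask sensitive ones; other columns pass through unchanged."""
    policy = COLUMN_POLICIES.get(user, {})
    result = {}
    for column, value in row.items():
        action = policy.get(column)
        if action == "deny":
            continue
        result[column] = mask(value) if action == "mask" else value
    return result
```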
Data warehouse backup
We use Doris as the core of storage and computing. Doris already stores data in multiple replicas, but for disaster recovery we still back up the core ingested data to HDFS. We therefore developed a data warehouse backup system that backs up Doris tables to HDFS on a schedule, either in full or by partition.
Backup schedule configuration:
Backup scheduled task list:
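The article does not say which mechanism the backup system uses. One option Doris provides is the `EXPORT` statement, which writes a table or selected partitions to remote storage through a broker. The sketch below only builds the statement text for a full or per-partition export; the database, table, partition, HDFS path, and broker names are placeholders, and identifiers are not escaped, so it should only be fed trusted configuration values.

```python
from typing import Dict, Optional, Sequence


def build_export_sql(
    database: str,
    table: str,
    hdfs_path: str,
    broker_name: str,
    partitions: Optional[Sequence[str]] = None,
    broker_props: Optional[Dict[str, str]] = None,
) -> str:
    """Build a Doris EXPORT statement for a full table or for selected partitions."""
    # No partition list means the whole table is exported (a full backup).
    partition_clause = f" PARTITION ({', '.join(partitions)})" if partitions else ""
    props = ", ".join(f'"{k}" = "{v}"' for k, v in (broker_props or {}).items())
    broker_clause = f'WITH BROKER "{broker_name}"' + (f" ({props})" if props else "")
    return f'EXPORT TABLE {database}.{table}{partition_clause} TO "{hdfs_path}" {broker_clause};'


# Placeholder names for illustration only.
full_backup = build_export_sql("dw", "orders", "hdfs://namenode:8020/backup/orders/", "hdfs_broker")
partition_backup = build_export_sql(
    "dw", "orders", "hdfs://namenode:8020/backup/orders/", "hdfs_broker", partitions=["p20220701"]
)
```

A scheduler would submit the resulting statement over the MySQL protocol and then poll the export job until it finishes.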
Doris Application
We use Doris to carry the computation and storage for data analysis. There is also another scenario: the data in business MySQL databases keeps growing, and the large volume of historical data hurts online performance, yet it cannot simply be deleted because it is still queried occasionally. So we built a business historical data archiving system on Doris. Historical data that will no longer change is archived incrementally on a schedule and remains queryable through the data service system. The archived data is pushed to the business side, which verifies it and then deletes the historical data from its own database.
Archive plan list:
Archive plan configuration:
Data push plan configuration:
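The archive, verify, then delete order described above is the important part, and it can be shown with a runnable sketch. Here two in-memory SQLite databases stand in for the business MySQL database and the Doris archive; the table layout and function names are hypothetical, and in the article's setup the verification and deletion are performed by the business side rather than by one function.

```python
import sqlite3
from typing import Tuple


def archive_before(source: sqlite3.Connection, archive: sqlite3.Connection, cutoff: str) -> int:
    """Copy orders created before `cutoff` to the archive, verify, then delete from the source."""
    rows = source.execute(
        "SELECT id, created_at, amount FROM orders WHERE created_at < ?", (cutoff,)
    ).fetchall()
    if not rows:
        return 0
    with archive:
        # INSERT OR REPLACE keeps the step idempotent if an earlier run was interrupted.
        archive.executemany("INSERT OR REPLACE INTO orders_archive VALUES (?, ?, ?)", rows)
    # Verification: every candidate row must be in the archive before anything is deleted.
    archived = archive.execute(
        "SELECT COUNT(*) FROM orders_archive WHERE created_at < ?", (cutoff,)
    ).fetchone()[0]
    if archived < len(rows):
        raise RuntimeError("archive verification failed; source rows were not deleted")
    with source:
        source.execute("DELETE FROM orders WHERE created_at < ?", (cutoff,))
    return len(rows)


def demo() -> Tuple[int, int]:
    """Archive everything before 2022 and report (rows moved, rows left in the source)."""
    source = sqlite3.connect(":memory:")
    archive = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT, amount REAL)")
    archive.execute("CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, created_at TEXT, amount REAL)")
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "2020-03-01", 10.0), (2, "2021-06-15", 20.0), (3, "2022-07-01", 30.0)],
    )
    source.commit()
    moved = archive_before(source, archive, "2022-01-01")
    remaining = source.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    return moved, remaining
```

Deleting only after a successful count check means an interrupted run leaves the source intact and can simply be repeated.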
Benefits
Doris now serves as our core data platform and supports the data query and analysis needs of dozens of the company's business systems. It provides excellent query performance for BI analysis and for each business system, and has greatly reduced the cost of platform maintenance, data development, and data platform construction.
- Real-time data access is stable and reliable. Through Stream Load, thousands of tables are ingested in real time, and the total volume ingested each day is on the order of 100 million rows.
- It supports high-concurrency, high-performance online analytical queries. Doris serves on the order of millions of online analysis queries per day, and most SQL completes at the millisecond level. Slow SQL still has plenty of room for optimization, and Doris automatically optimizes queries in some scenarios.
- By querying the raw ingested tables directly and building materialized views and indexes, it serves a variety of real-time query needs with low latency and high concurrency. Multi-table Join performance is also excellent.
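Stream Load, mentioned in the list above, is Doris's HTTP import interface: the client sends an HTTP PUT to the FE's `/api/{db}/{table}/_stream_load` endpoint with basic authentication and load options in the headers. The sketch below only assembles such a request without sending it; the host, port, database, table, and credentials are placeholders, and the header set is a minimal assumption for JSON input rather than the team's actual configuration.

```python
import base64
import json
import uuid
from typing import Any, Dict, List, Tuple


def build_stream_load_request(
    fe_host: str,
    http_port: int,
    database: str,
    table: str,
    user: str,
    password: str,
    rows: List[Dict[str, Any]],
) -> Tuple[str, Dict[str, str], bytes]:
    """Return (url, headers, body) for a Stream Load PUT request carrying JSON rows."""
    url = f"http://{fe_host}:{http_port}/api/{database}/{table}/_stream_load"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Expect": "100-continue",
        # A unique label identifies the load, so a retried batch is not imported twice.
        "label": f"{table}_{uuid.uuid4().hex}",
        "format": "json",
        "strip_outer_array": "true",  # the body is a JSON array of row objects
    }
    body = json.dumps(rows).encode("utf-8")
    return url, headers, body


# Placeholder connection details for illustration only.
url, headers, body = build_stream_load_request(
    "fe.example.internal", 8030, "dw", "orders", "loader", "secret", [{"id": 1, "amount": 12.5}]
)
```

When the request is actually sent, the FE typically redirects it to a BE node, so the HTTP client needs to follow the redirect and resend the credentials and body.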
Other benefits:
- Doris has a simple overall architecture, very low operation and maintenance costs, and supports online rolling upgrades, which frees people to focus on building the data platform and developing the business.
- Doris is highly compatible with the MySQL protocol and supports interactive query analysis, which makes for an efficient data development experience.
- High availability: data is partitioned and stored in multiple replicas, so the overall service does not become unavailable when some nodes fail.
- Broad ecosystem compatibility: the community provides Doris plug-ins for big data tools such as Flink and DataX, and importing and exporting data through Broker is simple and fast.
- The community is active. Doris keeps expanding and improving in functionality and performance, and when you run into problems you can get prompt help from the community.
Join the community
We welcome more people who love open source to join the Apache Doris community and take part in building it. Besides submitting PRs or Issues on GitHub, you are also welcome to take part in the community's day-to-day work, for example:
Contributing technical analysis, application practice, and other articles to the community's call for articles; taking part in Doris community online and offline events as a speaker; and actively answering questions from Doris community users.
Finally, we welcome more open source technology enthusiasts to join the Apache Doris community, grow together, and build the community ecosystem.
SelectDB is an open source technology company committed to providing the Apache Doris community with a team of full-time engineers, product managers, and support engineers, to help the open source community ecosystem thrive and to create an international industry standard in real-time analytical databases. SelectDB, a new-generation cloud-native real-time data warehouse developed on Apache Doris, runs on multiple clouds and gives users and customers out-of-the-box capability.
Related links:
SelectDB official website:
https://selectdb.com (We Are Coming Soon)
Apache Doris official website:
Apache Doris GitHub:
https://github.com/apache/doris
Apache Doris developer mailing list: