Flume learning notes
2022-07-03 19:14:00 【Dream touch reincarnation】
Function
Distributed, real-time collection of files and network-port data streams: data from various sources can be collected to various destinations in real time
Characteristics
Real-time collection: monitors data sources in real time and collects data as soon as it is generated
Comprehensive: ready-made interfaces are provided for the data sources and destinations commonly used in big data
Customizable: Flume is written in Java and exposes interfaces for user-defined development
Simple to develop for: a collection job is a configuration file, so you only need to write the configuration
Supports distributed collection: Flume itself is not a distributed tool, but it can be deployed to collect in a distributed way
Architecture
Agent: one running Flume program is one Agent
Event: the data Flume collects is wrapped in Event objects for transmission
Source: monitors the data source in real time and collects data as soon as the source produces it
Channel: temporarily stores the collected data, buffering all Events
Sink: sends the data in the Channel to the destination, actively pulling Events from the Channel
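The components above map directly onto an agent configuration file. A minimal sketch follows, assuming an agent named a1 with the illustrative component names r1, c1 and k1, a netcat Source on port 44444 and a logger Sink; the names, port and capacities are placeholders, not values from these notes.

```properties
# example.conf: one Agent named a1 with one Source, one Channel, one Sink.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: listen on a network port and turn each received line into an Event.
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Channel: buffer Events in memory until the Sink takes them.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Sink: pull Events from the Channel and write them to the agent log.
a1.sinks.k1.type = logger

# Wiring: a Source may feed several Channels, a Sink reads from exactly one.
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

An agent with this file would typically be started with `flume-ng agent --conf conf --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console`, where `--name` must match the agent name used as the property prefix.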
Multi-data-source architecture

Design purpose: write one copy of the collected data to several different destinations
Multi-tier architecture

Design purpose: prevent many Flume programs from interacting with the destination directly, which would hurt the destination's performance
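One common way to build the multi-tier layout is to chain agents with an Avro Sink and an Avro Source. The sketch below assumes two agents named tier1 and tier2, a collector host called collector01, port 4545 and the file paths shown; all of these are hypothetical placeholders.

```properties
# Tier 1 (runs on each collecting machine): forward Events over Avro.
tier1.sources = r1
tier1.channels = c1
tier1.sinks = k1
tier1.sources.r1.type = exec
tier1.sources.r1.command = tail -F /var/log/app/app.log
tier1.sources.r1.channels = c1
tier1.channels.c1.type = memory
tier1.sinks.k1.type = avro
tier1.sinks.k1.hostname = collector01
tier1.sinks.k1.port = 4545
tier1.sinks.k1.channel = c1

# Tier 2 (one aggregating agent): receive from all tier-1 agents, write once.
tier2.sources = r1
tier2.channels = c1
tier2.sinks = k1
tier2.sources.r1.type = avro
tier2.sources.r1.bind = 0.0.0.0
tier2.sources.r1.port = 4545
tier2.sources.r1.channels = c1
tier2.channels.c1.type = memory
tier2.sinks.k1.type = hdfs
tier2.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/app
tier2.sinks.k1.channel = c1
```

With this arrangement only the tier2 agent opens connections to the destination, however many tier1 agents are running.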
Usage modes
Offline (collect to HDFS): configure the Source and Sink in the agent's configuration file, start Hive and HDFS, then submit and run the agent from the command line
Real time (collect to Kafka): configure the Source and Sink in the agent's configuration file and collect to Kafka, where real-time computing programs consume the data
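The two modes differ mainly in the Sink section of the configuration. The fragment below is a sketch under assumptions: the agent name a1, the HDFS path, the broker address and the topic name are illustrative, and the Kafka property names are those used by Flume 1.7 and later.

```properties
# Fragment only: assumes agent a1 already defines source r1 and channel c1.

# Option A, offline: write Events to HDFS, one directory per day.
a1.sinks = k1
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/events/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
# The date escapes in hdfs.path need a timestamp; take it from the local clock.
a1.sinks.k1.hdfs.useLocalTimeStamp = true
# Roll a new file every 300 seconds or 128 MiB, never by Event count.
a1.sinks.k1.hdfs.rollInterval = 300
a1.sinks.k1.hdfs.rollSize = 134217728
a1.sinks.k1.hdfs.rollCount = 0

# Option B, real time: replace the k1 lines above with a Kafka sink.
# a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
# a1.sinks.k1.channel = c1
# a1.sinks.k1.kafka.bootstrap.servers = broker1:9092
# a1.sinks.k1.kafka.topic = flume-events
```

Only one of the two options should be active for k1 at a time. Attaching both sinks to the same channel would split the Events between them rather than copy them.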
Comparison of similar software
Sqoop: built on MapReduce; suited to offline collection of large volumes of data
Flume: suited to real-time collection of files and network ports
Canal: suited to real-time collection from MySQL databases

Advanced components
Interceptor: when the Source converts each record into an Event, an interceptor can add key-value pairs to the Event header or filter the data
Adding data:
Timestamp Interceptor
Host Interceptor
Static Interceptor
Filtering data:
Regex Filtering Interceptor
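The four interceptors listed above can be chained on a single Source. A sketch follows, assuming agent a1 and source r1; the static key and value and the regex are made-up examples.

```properties
# Fragment: assumes agent a1 already defines source r1.
# Interceptors run in the order listed here.
a1.sources.r1.interceptors = i1 i2 i3 i4

# Timestamp Interceptor: adds a "timestamp" header (epoch milliseconds).
a1.sources.r1.interceptors.i1.type = timestamp

# Host Interceptor: adds a "host" header with the agent's hostname or IP.
a1.sources.r1.interceptors.i2.type = host
a1.sources.r1.interceptors.i2.useIP = false

# Static Interceptor: adds one fixed key-value pair to every Event.
a1.sources.r1.interceptors.i3.type = static
a1.sources.r1.interceptors.i3.key = datacenter
a1.sources.r1.interceptors.i3.value = dc-east

# Regex Filtering Interceptor: drop Events whose body matches the pattern.
a1.sources.r1.interceptors.i4.type = regex_filter
a1.sources.r1.interceptors.i4.regex = ^DEBUG
a1.sources.r1.interceptors.i4.excludeEvents = true
```

Setting `excludeEvents` to false would invert the filter, so that only matching Events are kept.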
Channel Selector
By default (replicating), the Source sends one copy of each Event to every Channel. A multiplexing selector instead sends Events to different Channels according to the value of a key in the Event header
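Both selector behaviors are chosen on the Source. The fragment below is a sketch that assumes agent a1 with channels c1, c2 and c3, and a hypothetical header key `type` with the values `order` and `click`; such a header could be set by a Static Interceptor or by the sending application.

```properties
# Fragment: assumes agent a1 defines source r1 and channels c1, c2, c3.
a1.sources.r1.channels = c1 c2 c3

# Default behavior, written out explicitly: every Channel gets a copy.
# a1.sources.r1.selector.type = replicating

# Multiplexing: route each Event by the value of its "type" header.
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = type
a1.sources.r1.selector.mapping.order = c1
a1.sources.r1.selector.mapping.click = c2
# Events with a missing or unmapped header value go here.
a1.sources.r1.selector.default = c3
```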
Sink Processor
Function
Load balancing
Multiple Sinks work together as a sink group; if one of them fails, collection continues normally
Failover
Of multiple Sinks, one works and the others stay idle; a standby Sink takes over only when the working Sink fails, which keeps collection running
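Both processor types are configured on a sink group. A sketch follows, assuming agent a1 with two Sinks k1 and k2 on the same Channel; the group name g1, the priorities and the penalty value are illustrative.

```properties
# Fragment: assumes agent a1 defines sinks k1 and k2 reading from channel c1.
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2

# Load balancing: spread Events across k1 and k2, backing off a failed Sink.
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.selector = round_robin
a1.sinkgroups.g1.processor.backoff = true

# Failover alternative: replace the three processor lines above with these.
# The Sink with the highest priority works; the other takes over on failure.
# a1.sinkgroups.g1.processor.type = failover
# a1.sinkgroups.g1.processor.priority.k1 = 10
# a1.sinkgroups.g1.processor.priority.k2 = 5
# a1.sinkgroups.g1.processor.maxpenalty = 10000
```

In the failover variant, `maxpenalty` is the upper bound in milliseconds on how long a failed Sink is kept out of rotation before it is retried.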