Flume learning notes
2022-07-03 19:14:00 【Dream touch reincarnation】
Function
Distributed, real-time collection of file and network-port data streams: data from a variety of sources can be collected and delivered to a variety of destinations in real time.
Features
Real-time collection: monitors data sources in real time and collects data as soon as it is generated.
Comprehensive functionality: the data sources and destinations commonly used in big data are already wrapped with corresponding interfaces.
Custom development allowed: Flume is written in Java and provides interfaces for user-defined development.
Relatively simple to develop: a collection job only requires writing a configuration file.
Distributed collection is possible: Flume itself is not a distributed tool, but it can be used to achieve distributed collection.
Architecture
Agent: one Flume program is one Agent.
Event: the data Flume collects is encapsulated in Event objects for transmission.
Source: monitors the data source in real time and collects data as soon as the source produces it.
Channel: temporarily stores the collected data; all Events are buffered here.
Sink: sends the data in the Channel to the destination; it actively pulls data from the Channel.
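As a rough illustration of how these components fit together, one Agent is described in a single properties file. The sketch below follows the netcat-to-logger pattern from the Flume user guide. The agent name a1, the component names r1, c1 and k1, and the port 44444 are illustrative choices, not taken from these notes.

```properties
# Name the components of agent a1 (all names here are illustrative)
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: listen on a network port and turn each line into an Event
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Channel: buffer Events in memory until the Sink takes them
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Sink: pull Events from the Channel and write them to the log
a1.sinks.k1.type = logger

# Wiring: a Source may feed several Channels, a Sink reads from exactly one
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

Assuming the file is saved as example.conf, the agent could be started with: bin/flume-ng agent --conf conf --conf-file example.conf --name a1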
Multi data source architecture

Design purpose: write a copy of the data to several different destinations
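One way to write a copy of the same data to two destinations is to attach two Channels to one Source, each drained by its own Sink. This is a minimal sketch. The agent and component names, the port, and the output directory /tmp/flume-out are assumptions made for illustration.

```properties
a1.sources = r1
a1.channels = c1 c2
a1.sinks = k1 k2

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# replicating is the default selector: every Event is copied to c1 and c2
a1.sources.r1.selector.type = replicating
a1.sources.r1.channels = c1 c2

a1.channels.c1.type = memory
a1.channels.c2.type = memory

# Destination 1: the agent's log
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1

# Destination 2: rolling files in a local directory (hypothetical path)
a1.sinks.k2.type = file_roll
a1.sinks.k2.sink.directory = /tmp/flume-out
a1.sinks.k2.channel = c2
```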
Multi tier architecture

Design purpose: prevent many Flume programs from interacting directly with the destination and degrading its performance
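A multi-tier layout is commonly built by pointing the Avro sink of each first-tier agent at the Avro source of a collector agent, so that only the collector talks to the destination. The fragment below sketches just that hand-off. The agent names tier1 and collector, the host name collector-host, the port 4545 and the HDFS path are hypothetical.

```properties
# --- First-tier agent (runs on each machine that produces data) ---
tier1.sources = r1
tier1.channels = c1
tier1.sinks = k1

tier1.sources.r1.type = netcat
tier1.sources.r1.bind = localhost
tier1.sources.r1.port = 44444
tier1.sources.r1.channels = c1

tier1.channels.c1.type = memory

# Forward Events to the collector instead of the final destination
tier1.sinks.k1.type = avro
tier1.sinks.k1.hostname = collector-host
tier1.sinks.k1.port = 4545
tier1.sinks.k1.channel = c1

# --- Collector agent (the only one that writes to the destination) ---
collector.sources = r1
collector.channels = c1
collector.sinks = k1

collector.sources.r1.type = avro
collector.sources.r1.bind = 0.0.0.0
collector.sources.r1.port = 4545
collector.sources.r1.channels = c1

collector.channels.c1.type = memory

collector.sinks.k1.type = hdfs
collector.sinks.k1.hdfs.path = /flume/events
collector.sinks.k1.channel = c1
```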
Usage modes
Offline (collect to HDFS): configure the Source and Sink in a configuration file, start Hive and HDFS, then submit and run the agent from the command line.
Real time (collect to Kafka): configure the Source and Sink in a configuration file and collect to Kafka, for consumption by real-time computing programs.
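The two modes differ mainly in the Sink definition. The sketch below tails a log file and writes to HDFS, and the second fragment shows the lines that would replace the sink section to write to Kafka instead. The file paths, topic name and broker address are placeholders, and the kafka.* property names assume Flume 1.7 or later.

```properties
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: follow a log file and remember the read position across restarts
a1.sources.r1.type = TAILDIR
a1.sources.r1.positionFile = /tmp/flume/taildir_position.json
a1.sources.r1.filegroups = f1
a1.sources.r1.filegroups.f1 = /var/log/app/app.log
a1.sources.r1.channels = c1

a1.channels.c1.type = memory

# Offline mode: write plain text files into a dated HDFS directory
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/events/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
# Use the agent's clock for %Y-%m-%d when Events carry no timestamp header
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.channel = c1
```

```properties
# Real-time mode: replace the k1 definition above with a Kafka sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.bootstrap.servers = localhost:9092
a1.sinks.k1.kafka.topic = flume-events
a1.sinks.k1.channel = c1
```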
Comparison of similar software
Sqoop: built on MapReduce underneath; suited to collecting large volumes of offline data.
Flume: suited to real-time collection from files and network ports.
Canal: suited to real-time collection from MySQL databases.

Advanced components
Interceptor: when the Source converts each piece of data into an Event, an interceptor can add key-value pairs to the Event header or filter the data.
Adding data:
Timestamp Interceptor
Host Interceptor
Static Interceptor
Filtering data:
Regex Filtering Interceptor
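Interceptors are attached to a Source in the agent's configuration file and run in the order they are listed. The fragment below is a sketch that assumes an agent a1 with a Source r1 already defined. The static key and value and the regular expression are made-up examples.

```properties
# Chain four interceptors on source r1; they run in the order listed
a1.sources.r1.interceptors = i1 i2 i3 i4

# Adds a "timestamp" header with the processing time in milliseconds
a1.sources.r1.interceptors.i1.type = timestamp

# Adds a "host" header with the agent's host name or IP
a1.sources.r1.interceptors.i2.type = host

# Adds a fixed key-value pair to every Event header (example values)
a1.sources.r1.interceptors.i3.type = static
a1.sources.r1.interceptors.i3.key = datacenter
a1.sources.r1.interceptors.i3.value = dc1

# Drops Events whose body matches the regex (example pattern)
a1.sources.r1.interceptors.i4.type = regex_filter
a1.sources.r1.interceptors.i4.regex = .*DEBUG.*
a1.sources.r1.interceptors.i4.excludeEvents = true
```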
Channel Selector
By default, the Source sends one copy of the data to every Channel. A selector can instead route Events to different Channels according to the value of a key in the Event header.
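Routing by header is configured with the multiplexing selector. The fragment below is a sketch that assumes an agent a1 whose Source r1 is connected to Channels c1, c2 and c3. The header name type and its values are hypothetical and would normally be set by an interceptor or by the sending client.

```properties
a1.sources.r1.channels = c1 c2 c3

# Route on the value of the "type" header instead of copying to every Channel
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = type
a1.sources.r1.selector.mapping.order = c1
a1.sources.r1.selector.mapping.click = c2
# Events with a missing or unmapped header value go here
a1.sources.r1.selector.default = c3
```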
Sink Processor
Function
Load balancing
Multiple Sinks work together as a sink group; if one of them fails, collection continues normally.
Failover
Of multiple Sinks, only one works while the others stand by; a standby Sink takes over only when the working Sink fails, so collection continues normally.
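Both behaviours are selected through a sink group in the agent configuration. The two fragments below are alternatives, not meant to be used together. They assume an agent a1 with Sinks k1 and k2 already defined, and the priority and penalty values are illustrative.

```properties
# Alternative 1, load balancing: spread Events across k1 and k2
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.selector = round_robin
# Temporarily skip a Sink that has just failed
a1.sinkgroups.g1.processor.backoff = true
```

```properties
# Alternative 2, failover: only the highest-priority healthy Sink is active
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
# k2 (higher number) is used first; k1 takes over if k2 fails
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
# Upper bound, in milliseconds, on how long a failed Sink is held back
a1.sinkgroups.g1.processor.maxpenalty = 10000
```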