Flume learning notes
2022-07-03 19:14:00 【Dream touch reincarnation】
Function
Distributed, real-time collection of files and network-port data streams: data from a variety of sources can be collected to a variety of destinations in real time.
Features
Real-time collection: monitors data sources in real time and collects data as soon as it is generated
Comprehensive functionality: the common big-data sources and destinations are already wrapped in corresponding interfaces
Custom development: Flume is developed in Java and provides interfaces for user-defined extensions
Simple development: writing a configuration file is all that is needed
Distributed collection: Flume itself is not a distributed tool, but it can be used to implement distributed collection
Architecture
Agent: one running Flume program is one Agent
Event: collected data is wrapped in Event objects for transmission
Source: monitors the data source in real time and collects data as soon as the source produces it
Channel: temporarily stores the collected data, buffering all Events
Sink: sends the data in the Channel to the destination; it actively pulls data from the Channel
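The relationship between these components can be sketched in a few lines of Python. This is only a toy model of the Source, Channel, Sink flow described above, not Flume's actual API; the class and function names (`Event`, `Channel`, `run_source`, `run_sink`) are made up for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Event:
    """Unit of transfer: a body plus optional key-value headers."""
    body: str
    headers: dict = field(default_factory=dict)


class Channel:
    """Temporarily buffers Events between the Source and the Sink."""

    def __init__(self):
        self._queue = deque()

    def put(self, event):
        self._queue.append(event)

    def take(self):
        # Return None when nothing is buffered.
        return self._queue.popleft() if self._queue else None


def run_source(records, channel):
    """Source: wrap each new record in an Event and put it in the Channel."""
    for record in records:
        channel.put(Event(body=record))


def run_sink(channel, destination):
    """Sink: actively pull Events from the Channel and deliver them."""
    while (event := channel.take()) is not None:
        destination.append(event.body)


channel = Channel()
delivered = []  # stands in for the real destination (hypothetical)
run_source(["line 1", "line 2"], channel)
run_sink(channel, delivered)
print(delivered)
```

The point of the sketch is the direction of control: the Source pushes into the Channel, while the Sink pulls from it, so the two sides never talk to each other directly.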
Multi-data-source architecture

Design purpose: write a copy of the data to several different destinations
Multi-tier architecture

Design purpose: prevent many Flume programs from interacting with the destination directly and degrading its performance
Usage modes
Offline (collect to HDFS): write the Source and Sink configuration file, start Hive and HDFS, then submit and run the agent from the command line
Real time (collect to Kafka): write the Source and Sink configuration file and collect to Kafka, where the data is consumed by real-time computing programs
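As a rough illustration of what such a configuration file contains, the Python helper below prints a minimal agent definition for the offline case (exec source, memory channel, HDFS sink). The agent and component names (`a1`, `r1`, `c1`, `k1`), the log path, and the HDFS URL are all assumptions chosen for the example; the property keys follow the usual Flume properties layout, but check them against the user guide for your Flume version.

```python
def render_agent_config(agent, log_path, hdfs_path):
    """Return Flume properties text: exec source -> memory channel -> HDFS sink."""
    props = {
        # Declare the components that belong to this agent.
        f"{agent}.sources": "r1",
        f"{agent}.channels": "c1",
        f"{agent}.sinks": "k1",
        # Source: follow a log file as it grows.
        f"{agent}.sources.r1.type": "exec",
        f"{agent}.sources.r1.command": f"tail -F {log_path}",
        f"{agent}.sources.r1.channels": "c1",
        # Channel: buffer events in memory.
        f"{agent}.channels.c1.type": "memory",
        # Sink: write events to HDFS.
        f"{agent}.sinks.k1.type": "hdfs",
        f"{agent}.sinks.k1.hdfs.path": hdfs_path,
        f"{agent}.sinks.k1.channel": "c1",
    }
    return "\n".join(f"{key} = {value}" for key, value in props.items())


# Hypothetical paths, for illustration only.
config_text = render_agent_config(
    "a1", "/tmp/app.log", "hdfs://namenode:8020/flume/events"
)
print(config_text)
```

Saved to a file (say `hdfs.conf`, a hypothetical name), a configuration like this is typically started with `flume-ng agent --conf conf --conf-file hdfs.conf --name a1`. Note that a source lists its `channels` (plural) while a sink is bound to a single `channel`.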
Comparison of similar software
Sqoop: built on MapReduce; suited to offline collection of large volumes of data
Flume: suited to real-time collection from files and network ports
Canal: suited to real-time collection from MySQL databases

Advanced components
Interceptor: when the Source converts each record into an Event, an interceptor can add key-value pairs to the Event header or filter the data
Adding data:
Timestamp Interceptor
Host Interceptor
Static Interceptor
Filtering data:
Regex Filtering Interceptor
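What the four interceptors listed above do to an Event can be imitated in plain Python. This is a simplified sketch, not Flume's Java Interceptor interface: Events are modeled as dictionaries, the function names are invented, and details such as the header keys or using the hostname rather than an IP address are assumptions.

```python
import re
import socket
import time


def timestamp_interceptor(event):
    # Add the processing time in milliseconds to the header.
    event["headers"]["timestamp"] = str(int(time.time() * 1000))
    return event


def host_interceptor(event):
    # Add the name of the machine running the agent to the header.
    event["headers"]["host"] = socket.gethostname()
    return event


def static_interceptor(key, value):
    # Add the same fixed key-value pair to every event.
    def intercept(event):
        event["headers"][key] = value
        return event
    return intercept


def regex_filter_interceptor(pattern):
    compiled = re.compile(pattern)

    def intercept(event):
        # Keep events whose body matches; return None to drop the rest.
        return event if compiled.search(event["body"]) else None
    return intercept


def apply_chain(event, interceptors):
    """Run interceptors in order; stop as soon as one drops the event."""
    for intercept in interceptors:
        event = intercept(event)
        if event is None:
            return None
    return event


chain = [
    timestamp_interceptor,
    host_interceptor,
    static_interceptor("env", "test"),
    regex_filter_interceptor(r"ERROR"),
]
kept = apply_chain({"headers": {}, "body": "ERROR disk full"}, chain)
dropped = apply_chain({"headers": {}, "body": "INFO all good"}, chain)
print(kept, dropped)
```

Interceptors run as a chain, so a filtering interceptor placed last still sees the headers added by the earlier ones.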
Channel Selector
By default the Source sends a copy of each Event to every Channel. Alternatively, Events can be routed to different Channels according to the value of a key in the Event header.
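The two selection strategies can be contrasted with a small Python sketch. The function names, the `type` header key, and the channel names are hypothetical; the code only illustrates the routing decision, not how Flume implements it.

```python
def replicating_select(event, channels):
    """Default behaviour: every channel receives a copy of the event."""
    return list(channels)


def multiplexing_select(event, header_key, mapping, default):
    """Choose channels by the value stored under header_key in the header."""
    value = event["headers"].get(header_key)
    return mapping.get(value, default)


channels = ["c1", "c2"]
mapping = {"order": ["c1"], "click": ["c2"]}  # header value -> channels

order_event = {"headers": {"type": "order"}, "body": "order created"}
other_event = {"headers": {}, "body": "no type header"}

print(replicating_select(order_event, channels))
print(multiplexing_select(order_event, "type", mapping, default=["c1"]))
print(multiplexing_select(other_event, "type", mapping, default=["c1"]))
```

Multiplexing depends on the header already carrying the routing key, which is why it is usually combined with an interceptor that sets that key.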
Sink Processor
Functions
Load balancing
Multiple Sinks work together as a sink group; if one of them fails, collection continues normally
Failover
Of multiple Sinks, only one works while the others stand by; a standby Sink takes over only when the working Sink fails, so collection continues normally
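The difference between the two modes can be shown with a toy Python model. The classes below are invented for illustration and assume a simple round-robin rotation for load balancing and a fixed priority order for failover; they are not Flume's real sink processor classes.

```python
from itertools import cycle


class Sink:
    """A stand-in sink that either accepts events or raises when down."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.received = []

    def send(self, event):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        self.received.append(event)


class LoadBalancingProcessor:
    """Rotate through all sinks; skip a sink that fails."""

    def __init__(self, sinks):
        self._sinks = sinks
        self._order = cycle(range(len(sinks)))

    def process(self, event):
        for _ in range(len(self._sinks)):
            sink = self._sinks[next(self._order)]
            try:
                sink.send(event)
                return sink.name
            except ConnectionError:
                continue
        raise RuntimeError("all sinks failed")


class FailoverProcessor:
    """Use the highest-priority sink; fall back only when it fails."""

    def __init__(self, sinks_by_priority):
        self._sinks = sinks_by_priority

    def process(self, event):
        for sink in self._sinks:
            try:
                sink.send(event)
                return sink.name
            except ConnectionError:
                continue
        raise RuntimeError("all sinks failed")


lb = LoadBalancingProcessor([Sink("k1"), Sink("k2")])
lb_targets = [lb.process(f"event-{i}") for i in range(4)]

fo = FailoverProcessor([Sink("k1", healthy=False), Sink("k2")])
fo_target = fo.process("event-0")
print(lb_targets, fo_target)
```

With both sinks healthy, the load-balancing processor alternates between them, while the failover processor would keep sending everything to the first sink and touch the second only after a failure.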