Flume learning notes
2022-07-03 19:14:00 【Dream touch reincarnation】
Function
Collects data streams from files and network ports in real time, in a distributed fashion. Data from many kinds of sources can be delivered to many kinds of destinations as it is produced.
Features
Real-time collection: data sources are monitored continuously, and data is collected as soon as it is generated.
Broad coverage: the data sources and destinations commonly used in big data already have ready-made interfaces.
Customizable: Flume is written in Java and exposes interfaces for user-defined extensions.
Simple to develop: in most cases you only write a configuration file.
Distributed collection: Flume is not itself a distributed tool, but distributed collection can be achieved, for example by running an Agent on every machine that produces data.
Architecture
Agent: a running Flume program is an Agent.
Event: the data Flume collects is wrapped in Event objects for transmission.
Source: monitors the data source in real time and collects data as soon as the source produces it.
Channel: temporarily buffers the collected data, holding every Event until it is delivered.
Sink: delivers the data in the Channel to the destination; it actively pulls Events from the Channel.
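As a minimal sketch of how these components fit together, the configuration below wires one Source, one Channel, and one Sink into a single Agent. The agent name `a1`, the component names `r1`, `c1`, `k1`, and the port number are illustrative assumptions; `netcat`, `memory`, and `logger` are standard Flume component types.

```properties
# Illustrative names: a1 (agent), r1 (source), c1 (channel), k1 (sink)
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: listens on a network port and turns each received line into an Event
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Channel: buffers Events in memory until a Sink takes them
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Sink: pulls Events from the Channel and writes them to the agent's log
a1.sinks.k1.type = logger

# Wiring: a Source may feed several Channels; a Sink reads from exactly one
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

Saved as, say, `example.conf`, an agent like this is typically started with `bin/flume-ng agent --conf conf --conf-file example.conf --name a1`.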
Multi data source architecture

Design purpose: write a copy of the same data to several different destinations
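One way to write the same data to two destinations is to attach two Channels to one Source and give each Channel its own Sink. The sketch below assumes that layout; the names, the port, and the output directory are placeholders, and the `logger` and `file_roll` sinks merely stand in for whatever destinations are actually used.

```properties
a1.sources = r1
a1.channels = c1 c2
a1.sinks = k1 k2

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# One Source, two Channels: each Event is copied to both
a1.sources.r1.channels = c1 c2
a1.sources.r1.selector.type = replicating

a1.channels.c1.type = memory
a1.channels.c2.type = memory

# Each Sink drains its own Channel toward its own destination
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
a1.sinks.k2.type = file_roll
a1.sinks.k2.sink.directory = /tmp/flume-copy
a1.sinks.k2.channel = c2
```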
Multi-tier architecture

Design purpose: prevent many Flume programs from talking to the destination directly, which would hurt the destination's performance
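Tiers are commonly linked with Flume's Avro sink and Avro source. The notes do not say which mechanism is used, so treat that choice, along with the host name, ports, and HDFS path below, as assumptions. In this sketch, first-tier agents forward to a single collector agent, and only the collector writes to the destination.

```properties
# First-tier agent (one per data-producing machine): forwards Events over Avro
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sources.r1.channels = c1
a1.channels.c1.type = memory
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = collector.example.com
a1.sinks.k1.port = 4545
a1.sinks.k1.channel = c1

# Second-tier agent (collector): the only agent that talks to the destination
a2.sources = r1
a2.channels = c1
a2.sinks = k1
a2.sources.r1.type = avro
a2.sources.r1.bind = 0.0.0.0
a2.sources.r1.port = 4545
a2.sources.r1.channels = c1
a2.channels.c1.type = memory
a2.sinks.k1.type = hdfs
a2.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/events
a2.sinks.k1.channel = c1
```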
Usage modes
Offline (collect to HDFS): write a configuration file defining the Source and Sink, start Hive and HDFS, then launch the agent from the command line.
Real time (collect to Kafka): write a configuration file defining the Source and Sink so that data is collected into Kafka, where real-time computing programs consume it.
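The two modes differ mainly in the Sink definition. The sketch below tails a log file into HDFS and shows, commented out, the Kafka sink that would replace it for the real-time case. File paths, the NameNode address, the broker address, and the topic name are all placeholders.

```properties
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: tails a log file and remembers its read position across restarts
a1.sources.r1.type = TAILDIR
a1.sources.r1.positionFile = /tmp/flume/taildir_position.json
a1.sources.r1.filegroups = f1
a1.sources.r1.filegroups.f1 = /var/log/app/app.log
a1.sources.r1.channels = c1

a1.channels.c1.type = memory

# Offline mode: write to HDFS, one directory per day
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/events/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
# Use the agent's clock for %Y-%m-%d when Events carry no timestamp header
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.channel = c1

# Real-time mode: replace the four hdfs lines above with a Kafka sink
# a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
# a1.sinks.k1.kafka.bootstrap.servers = localhost:9092
# a1.sinks.k1.kafka.topic = app-logs
```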
Comparison with similar tools
Sqoop: built on MapReduce; suited to offline collection of large data volumes.
Flume: suited to real-time collection from files and network ports.
Canal: suited to real-time collection from MySQL databases.

Advanced components
Interceptor: when the Source converts each record into an Event, an interceptor can add key-value pairs to the Event header or filter records out.
Adding data:
Timestamp Interceptor
Host Interceptor
Static Interceptor
Filtering data:
Regex Filtering Interceptor
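The four interceptors listed above can be chained on one Source. The fragment below is a sketch that assumes an agent `a1` with a source `r1` already defined; the static key and value and the regular expression are made-up examples.

```properties
# Interceptors run in the order listed, on every Event the Source creates
a1.sources.r1.interceptors = i1 i2 i3 i4

# Timestamp Interceptor: adds a "timestamp" header (milliseconds since epoch)
a1.sources.r1.interceptors.i1.type = timestamp

# Host Interceptor: adds a "host" header with the agent's host name or IP
a1.sources.r1.interceptors.i2.type = host

# Static Interceptor: adds a fixed key-value pair (placeholder values)
a1.sources.r1.interceptors.i3.type = static
a1.sources.r1.interceptors.i3.key = datacenter
a1.sources.r1.interceptors.i3.value = dc-01

# Regex Filtering Interceptor: drops Events whose body matches the regex
a1.sources.r1.interceptors.i4.type = regex_filter
a1.sources.r1.interceptors.i4.regex = ^DEBUG.*
a1.sources.r1.interceptors.i4.excludeEvents = true
```

Setting `excludeEvents = false` (the default) inverts the filter, keeping only the Events that match.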
Channel Selector
By default, the Source sends a copy of each Event to every Channel (replicating). Alternatively, Events can be routed to different Channels according to the value of a key in the Event header (multiplexing).
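Routing by header value might look like the fragment below. It assumes an agent `a1` whose source `r1` and channels `c1`, `c2`, `c3` are defined elsewhere; the header name `type` and its values `order` and `click` are hypothetical and would usually be set by an interceptor.

```properties
a1.sources.r1.channels = c1 c2 c3
a1.sources.r1.selector.type = multiplexing

# Route on the value of the (hypothetical) "type" header
a1.sources.r1.selector.header = type
a1.sources.r1.selector.mapping.order = c1
a1.sources.r1.selector.mapping.click = c2

# Events with any other value, or without the header, go here
a1.sources.r1.selector.default = c3
```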
Sink Processor
Function
Load balancing
Multiple Sinks work together as a sink group; if one of them fails, collection continues normally through the others.
Failover
Of multiple Sinks, only one is active while the others stand by; a standby Sink takes over only when the active one fails, so collection is not interrupted.
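Both behaviours are configured on a sink group. The fragment below is a sketch assuming an agent `a1` with sinks `k1` and `k2` already defined; the group name `g1` and the priority numbers are arbitrary.

```properties
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2

# Load balancing: both Sinks are active and share the Events
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.selector = round_robin
# Temporarily skip a Sink that has just failed
a1.sinkgroups.g1.processor.backoff = true

# Failover alternative: only the highest-priority healthy Sink is active
# a1.sinkgroups.g1.processor.type = failover
# a1.sinkgroups.g1.processor.priority.k1 = 10
# a1.sinkgroups.g1.processor.priority.k2 = 5
# a1.sinkgroups.g1.processor.maxpenalty = 10000
```

With the failover settings, `k1` handles all Events because it has the larger priority value, and `k2` takes over only while `k1` is failing.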