Flume learning notes
2022-07-03 19:14:00 【Dream touch reincarnation】
Function
Flume collects data streams from files and network ports in real time, in a distributed fashion. It can deliver data from a variety of sources to a variety of destinations as the data is produced.
Characteristics
Real-time collection: data sources are monitored continuously, and data is collected as soon as it is generated
Comprehensive: interfaces are already provided for the data sources and destinations commonly used in big data
Extensible: Flume is written in Java and exposes interfaces for custom development
Simple to develop: a collection job is defined by writing a configuration file
Distributed collection: Flume itself is not a distributed tool, but it can be used to collect data in a distributed way
Architecture
Agent: each running Flume program is an Agent
Event: collected data is wrapped in Event objects for transmission
Source: monitors the data source in real time and collects data as soon as the source produces it
Channel: temporarily stores all collected Events
Sink: actively pulls data from the Channel and sends it to the destination
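The relationship between these components can be sketched in a few lines of Python. This is only an illustration of the data flow, not Flume's actual Java API: the `Event` class, the `source` and `sink` functions, and the list used as a destination are all hypothetical stand-ins.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class Event:
    """Collected data wrapped for transmission: headers plus a body."""
    body: bytes
    headers: dict = field(default_factory=dict)


def source(lines, channel):
    """Source: wrap each incoming record in an Event and put it on the channel."""
    for line in lines:
        channel.put(Event(body=line.encode("utf-8")))


def sink(channel, destination):
    """Sink: actively pull Events from the channel and deliver them."""
    while True:
        try:
            event = channel.get_nowait()
        except queue.Empty:
            break  # channel drained
        destination.append(event.body.decode("utf-8"))


channel = queue.Queue()  # Channel: temporary buffer between source and sink
destination = []         # hypothetical stand-in for HDFS, Kafka, and so on

source(["login user=1", "click item=42"], channel)
sink(channel, destination)
print(destination)
```

The point of the buffer in the middle is that the Source and the Sink never talk to each other directly, so either side can run at its own pace.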
Multi data source architecture
Design purpose: write a copy of the same data to several different destinations
Multi-tier architecture
Design purpose: prevent many Flume programs from interacting with the destination directly, which would hurt the destination's performance
Usage modes
Offline (collect to HDFS): configure the Source and Sink in a configuration file, start Hive and HDFS, then submit and run the agent from the command line
Real-time (collect to Kafka): configure the Source and Sink in a configuration file and collect to Kafka, where the data is consumed by real-time computing programs
Comparison of similar software
Sqoop: built on MapReduce; suited to collecting large volumes of offline data
Flume: suited to real-time collection from files and network ports
Canal: suited to real-time collection from MySQL databases
Advanced components
Interceptor: when the Source converts each piece of data into an Event, an interceptor can add key-value pairs to the Event header or filter the data
Adding data:
Timestamp Interceptor
Host Interceptor
Static Interceptor
Filtering data:
Regex Filtering Interceptor
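The two kinds of interceptor listed above (adding header key-value pairs and filtering) can be illustrated with a small Python sketch. It is not Flume's Java interceptor interface: events are plain dicts, and the function names, the `datacenter` key, and the `ERROR` pattern are assumptions made up for the example.

```python
import re
import socket
import time


def timestamp_interceptor(event):
    """Add the current time in milliseconds to the event header."""
    event["headers"]["timestamp"] = str(int(time.time() * 1000))
    return event


def host_interceptor(event):
    """Add the name of the collecting machine to the event header."""
    event["headers"]["host"] = socket.gethostname()
    return event


def static_interceptor(key, value):
    """Build an interceptor that adds one fixed key-value pair."""
    def intercept(event):
        event["headers"][key] = value
        return event
    return intercept


def regex_filter(pattern):
    """Build an interceptor that keeps only events whose body matches."""
    compiled = re.compile(pattern)

    def intercept(event):
        return event if compiled.search(event["body"]) else None
    return intercept


def run_chain(event, chain):
    """Apply interceptors in order; stop as soon as one drops the event."""
    for interceptor in chain:
        event = interceptor(event)
        if event is None:
            return None
    return event


chain = [
    regex_filter(r"ERROR"),                 # hypothetical filter pattern
    timestamp_interceptor,
    host_interceptor,
    static_interceptor("datacenter", "dc1"),  # hypothetical key and value
]
raw = ["INFO started", "ERROR disk full"]
events = [{"headers": {}, "body": line} for line in raw]
kept = [e for e in (run_chain(ev, chain) for ev in events) if e is not None]
print(kept)
```

Putting the filter first means dropped events never pay the cost of the header-adding interceptors.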
Channel Selector
By default the Source sends a copy of every Event to each Channel. Alternatively, Events can be routed to different Channels according to the value of a key in the Event header
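The two routing behaviours can be compared side by side in a short Python sketch. The function names, the `type` header, and the channel names `c1` and `c2` are hypothetical; the sketch models only the routing decision, not Flume's own selector classes.

```python
def replicating_select(event, channel_names):
    """Default behaviour: every channel receives a copy of the event."""
    return list(channel_names)


def multiplexing_select(event, header, mapping, default):
    """Route by the value of one header key, falling back to a default channel."""
    return [mapping.get(event["headers"].get(header), default)]


def dispatch(event, targets, channels):
    """Append the event to each selected channel."""
    for name in targets:
        channels[name].append(event["body"])


events = [
    {"headers": {"type": "order"}, "body": "order #1"},
    {"headers": {"type": "click"}, "body": "click /home"},
    {"headers": {}, "body": "no type header"},
]

replicated = {"c1": [], "c2": []}
for ev in events:
    dispatch(ev, replicating_select(ev, replicated.keys()), replicated)

multiplexed = {"c1": [], "c2": []}
mapping = {"order": "c1", "click": "c2"}  # header value -> channel name
for ev in events:
    dispatch(ev, multiplexing_select(ev, "type", mapping, "c1"), multiplexed)

print(replicated)
print(multiplexed)
```

An interceptor that sets the header is what usually makes header-based routing useful, since the routing key has to be on the Event before the selector sees it.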
Sink Processor
Functions
Load balancing
Multiple Sinks work together as a sink group; if one of them fails, collection continues normally
Failover
Of multiple Sinks, only one is active while the others stand by. A standby Sink takes over only after the active Sink fails, which keeps collection running
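The difference between the two modes can be shown with a minimal Python sketch. Everything here is hypothetical: the `Sink` class, the sink names, and round-robin as the balancing strategy are assumptions chosen for illustration, and the sketch ignores details such as back-off and recovery of failed sinks.

```python
class Sink:
    """Toy sink that either stores an event or raises when it is down."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.delivered = []

    def send(self, event):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        self.delivered.append(event)


def failover_send(event, sinks):
    """Failover: try sinks from highest to lowest priority; first success wins."""
    for s in sinks:
        try:
            s.send(event)
            return s.name
        except ConnectionError:
            continue
    raise RuntimeError("all sinks failed")


class RoundRobinGroup:
    """Load balancing: rotate across sinks, skipping any that fail."""

    def __init__(self, sinks):
        self.sinks = sinks
        self._next = 0

    def send(self, event):
        for attempt in range(len(self.sinks)):
            index = (self._next + attempt) % len(self.sinks)
            try:
                self.sinks[index].send(event)
            except ConnectionError:
                continue
            self._next = (index + 1) % len(self.sinks)
            return self.sinks[index].name
        raise RuntimeError("all sinks failed")


# Failover: k2 stays idle until k1 goes down.
primary, backup = Sink("k1"), Sink("k2")
failover_used = [failover_send(e, [primary, backup]) for e in ("e1", "e2")]
primary.healthy = False
failover_used.append(failover_send("e3", [primary, backup]))

# Load balancing: work is shared, and the broken sink "b" is skipped.
group = RoundRobinGroup([Sink("a"), Sink("b", healthy=False), Sink("c")])
balanced = [group.send(e) for e in ("e1", "e2", "e3", "e4")]

print(failover_used)
print(balanced)
```

In failover mode the standby sink does no work while the primary is healthy, whereas in load-balancing mode every healthy sink in the group shares the traffic.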