Flume-ng Configuration
2022-06-29 19:57:00 【星哥玩云】
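1) Introduction
Flume is a distributed, reliable, and highly available system for aggregating large volumes of log data. It supports customizable data senders for collecting data, and it can apply simple processing to that data before writing it to a variety of (also customizable) data receivers.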
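Design goals:
(1) Reliability: when a node fails, logs can be delivered to other nodes without being lost. Flume provides three levels of reliability, from strongest to weakest: end-to-end (the agent that receives the data first writes the event to disk and deletes it only after the data has been delivered successfully; if delivery fails, the data can be resent), store on failure (when the receiver crashes, the data is written locally and sending resumes after the receiver recovers), and best effort (no acknowledgment is made after the data is sent to the receiver).
(2) Scalability: Flume uses a three-tier architecture of agent, collector, and storage, and each tier can be scaled horizontally. All agents and collectors are managed centrally by the master, which makes the system easy to monitor and maintain. Multiple masters are allowed (managed and load-balanced with ZooKeeper), which avoids a single point of failure.
(3) Manageability: all agents and collectors are managed centrally by the master, which makes the system easy to maintain. With multiple masters, Flume uses ZooKeeper and gossip to keep the dynamic configuration data consistent. On the master, users can view the status of each data source or data flow, and can configure and dynamically load each data source. Flume offers two ways to manage data flows: a web interface and shell script commands.
(4) Functional extensibility: users can add their own agents, collectors, or storage as needed. Flume also ships with many components, including various agents (file, syslog, and so on), collectors, and storage (File, HDFS, HBase, and so on).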
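The "store on failure" level described above can be illustrated with a small shell sketch. This is not how Flume implements it; it only mimics the idea with plain files. The names send_event, store_on_failure, replay_spool, SPOOL, OUT, and RECEIVER_UP are all hypothetical and exist only for this illustration.

```shell
# Hypothetical locations: a local buffer file and a file standing in for the receiver.
SPOOL=${SPOOL:-/tmp/flume-sketch-spool}
OUT=${OUT:-/tmp/flume-sketch-out}

# send_event EVENT: stand-in for network delivery.
# It fails whenever RECEIVER_UP is set to anything other than "yes".
send_event() {
  [ "${RECEIVER_UP:-yes}" = "yes" ] || return 1
  printf '%s\n' "$1" >> "$OUT"
}

# store_on_failure EVENT: try to deliver; if that fails, keep the event locally.
store_on_failure() {
  send_event "$1" || printf '%s\n' "$1" >> "$SPOOL"
}

# replay_spool: resend buffered events once the receiver is reachable again.
# The buffer is removed only after every buffered event has been delivered.
replay_spool() {
  [ -f "$SPOOL" ] || return 0
  while IFS= read -r event; do
    send_event "$event" || return 1
  done < "$SPOOL"
  rm -f "$SPOOL"
}
```

If a replay is interrupted halfway, the next replay resends events that were already delivered, so this sketch gives at-least-once behavior rather than exactly-once.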
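2) Configuration
Hadoop and HBase were configured earlier, so both need to be started first before files can be written to HDFS and HBase. For the hadoop-2.2.0 and hbase-0.96.0 setup, see "Distributed Configuration of Hadoop-2.2.0 on Ubuntu and CentOS" http://www.linuxidc.com/Linux/2014-01/95799.htm and "Installing HBase-0.96.0 in a Distributed CentOS Environment" http://www.linuxidc.com/Linux/2014-01/95801.htm .
The environment for this setup is a test cluster of two CentOS machines. The machine with hostname master collects the logs, and the machine with hostname node writes them. Two write targets are configured here: an ordinary directory and HDFS.
First, download the flume-ng binary archive from http://flume.apache.org/download.html and extract it. Then edit the /etc/profile file and add the following lines: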
export FLUME_HOME=/home/aaron/apache-flume-1.4.0-bin
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$PATH:$FLUME_HOME/bin
After adding them, remember to run $ source /etc/profile so that the changes take effect.
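A quick way to confirm that the profile changes are visible in the current shell is sketched below. The function name check_flume_env is hypothetical; it only inspects the two variables set above and does not start Flume.

```shell
# check_flume_env: succeed only if FLUME_HOME is set and $FLUME_HOME/bin is on PATH.
check_flume_env() {
  if [ -z "${FLUME_HOME:-}" ]; then
    echo "FLUME_HOME is not set" >&2
    return 1
  fi
  # Wrap PATH in colons so the first and last entries match the same pattern.
  case ":$PATH:" in
    *":$FLUME_HOME/bin:"*) return 0 ;;
    *) echo "$FLUME_HOME/bin is not on PATH" >&2; return 1 ;;
  esac
}
```

For example, check_flume_env && echo "environment ok" can be run after $ source /etc/profile. If the check passes, running flume-ng version should then print the installed release.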
In the conf directory of the Flume folder on master, create a file named flume-master.conf with the following contents:
agent.sources = seqGenSrc
agent.channels = memoryChannel
agent.sinks = remoteSink
# For each one of the sources, the type is defined
agent.sources.seqGenSrc.type = exec
agent.sources.seqGenSrc.command = tail -F /home/aaron/test
# The channel can be defined as follows.
agent.sources.seqGenSrc.channels = memoryChannel
# Each sink's type must be defined
# (loggerSink is not listed in agent.sinks above, so it is not started)
agent.sinks.loggerSink.type = logger
# Specify the channel the sink should use
agent.sinks.loggerSink.channel = memoryChannel
# Each channel's type is defined.
agent.channels.memoryChannel.type = memory
# Other config values specific to each type of channel (sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent.channels.memoryChannel.capacity = 100
agent.channels.memoryChannel.keep-alive = 100
# The avro sink forwards events to the avro source on node
agent.sinks.remoteSink.type = avro
agent.sinks.remoteSink.hostname = node
agent.sinks.remoteSink.port = 23004
agent.sinks.remoteSink.channel = memoryChannel
On the node machine, add the same settings to /etc/profile. Then create a file named flume-node.conf in conf with the following contents:
agent.sources = seqGenSrc1
agent.channels = memoryChannel
agent.sinks = fileSink
# For each one of the sources, the type is defined
agent.sources.seqGenSrc1.type = avro
agent.sources.seqGenSrc1.bind = node
agent.sources.seqGenSrc1.port = 23004
# The channel can be defined as follows.
agent.sources.seqGenSrc1.channels = memoryChannel
# Each sink's type must be defined
# (loggerSink is not listed in agent.sinks above, so it is not started)
agent.sinks.loggerSink.type = logger
# Specify the channel the sink should use
agent.sinks.loggerSink.channel = memoryChannel
# Each channel's type is defined.
agent.channels.memoryChannel.type = memory
# Other config values specific to each type of channel (sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent.channels.memoryChannel.capacity = 100
agent.channels.memoryChannel.keep-alive = 100
# fileSink is a sink, so its properties belong under agent.sinks;
# file_roll writes the received events to files in a local directory
agent.sinks.fileSink.type = file_roll
agent.sinks.fileSink.channel = memoryChannel
agent.sinks.fileSink.sink.directory = /home/aaron/
agent.sinks.fileSink.sink.serializer.appendNewline = true
Run the following command on master:
$ bin/flume-ng agent --conf ./conf/ -f conf/flume-master.conf -Dflume.root.logger=DEBUG,console -n agent
Run the following command on node:
$ bin/flume-ng agent --conf ./conf/ -f conf/flume-node.conf -Dflume.root.logger=DEBUG,console -n agent
Once both are started, the two agents can communicate: the contents of the file on master are sent to node, and when new content is appended to the test file on master, node receives it as well.
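One way to check this end to end is to append a recognizable line on master and then search for it on node. The sketch below assumes the node agent writes received events into files under the directory set as sink.directory (/home/aaron/ above). The function names append_marker and marker_arrived are hypothetical helpers, not Flume commands.

```shell
# append_marker FILE: append a unique line to the file that the exec source
# tails, and print that line so it can be searched for on the receiving side.
append_marker() {
  marker="flume-check-$(date +%s)-$$"
  printf '%s\n' "$marker" >> "$1"
  printf '%s\n' "$marker"
}

# marker_arrived DIR MARKER: succeed if any file directly under DIR contains
# MARKER as a fixed string. -s hides warnings about subdirectories.
marker_arrived() {
  grep -qsF -- "$2" "$1"/*
}
```

On master, m=$(append_marker /home/aaron/test) writes the line and keeps it in m. On node, marker_arrived /home/aaron "<the marker printed on master>" succeeds once the event has been written. Delivery is not instantaneous, so the check may need to be repeated after a short wait.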
To write the content to Hadoop instead, modify the flume-node.conf file on node as follows:
agent.sinks = k2
agent.sinks.k2.type = hdfs
agent.sinks.k2.channel = memoryChannel
agent.sinks.k2.hdfs.path = hdfs://master:8089/hbase
agent.sinks.k2.hdfs.fileType = DataStream
agent.sinks.k2.hdfs.writeFormat = Text
Here, hdfs://master:8089/hbase is the HDFS file path in Hadoop.
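Flume starts only the sinks named on the agent.sinks line, so after switching that line to k2, any other sink properties left in flume-node.conf remain defined but inactive. The sketch below lists such sinks so that an edit like this one can be double-checked. The function name inactive_sinks is hypothetical, and the parsing is a rough approximation of the properties format (it assumes one key per line and simple sink names).

```shell
# inactive_sinks FILE [AGENT]: print sink names that have properties in FILE
# but are not listed on the "<AGENT>.sinks = ..." line. AGENT defaults to "agent".
inactive_sinks() {
  conf_file=$1
  agent=${2:-agent}
  # Names on the activation line, split into one name per line.
  active=$(sed -n "s/^[[:space:]]*$agent\.sinks[[:space:]]*=[[:space:]]*//p" "$conf_file" \
    | tr -s ' \t' '\n\n')
  # Names taken from per-sink keys such as agent.sinks.NAME.type (comments are skipped).
  defined=$(sed -n "s/^[[:space:]]*$agent\.sinks\.\([^.[:space:]=]*\)\..*/\1/p" "$conf_file" \
    | sort -u)
  for name in $defined; do
    printf '%s\n' "$active" | grep -qxF -- "$name" || printf '%s\n' "$name"
  done
}
```

Running inactive_sinks conf/flume-node.conf prints one sink name per line, and prints nothing when every defined sink is also activated.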