Flume ng configuration
2022-06-29 20:05:00 【Brother Xing plays with the clouds】
1) Introduction
Flume is a distributed, reliable, and highly available system for aggregating massive volumes of log data. It supports custom data senders for collecting data, and it can perform simple processing on the data before writing it out to any of various (customizable) data receivers.
Design goals:

(1) Reliability. When a node fails, logs can be delivered to other nodes without loss. Flume offers three levels of reliability guarantee, from strongest to weakest: end-to-end (the receiving agent first writes the event to disk and deletes it only after the transfer succeeds; if delivery fails, the event can be resent), store on failure (when the receiver crashes, data is written locally and sending resumes after recovery), and best effort (data is sent to the receiver with no acknowledgment).

(2) Scalability. Flume uses a three-tier architecture of agents, collectors, and storage, and each tier can scale horizontally. All agents and collectors are managed centrally by a master, which makes the system easy to monitor and maintain. More than one master is allowed (using ZooKeeper for management and load balancing), which avoids a single point of failure.

(3) Manageability. Because all agents and collectors are managed by the master, the system is easy to maintain. With multiple masters, Flume uses ZooKeeper and gossip to keep dynamic configuration data consistent. Users can inspect each data source or data flow on the master and can configure and load data sources dynamically. Flume provides both a web interface and shell script commands for managing data flows.

(4) Functional extensibility. Users can add their own agents, collectors, or storage. Flume also ships with many components, including several kinds of agents (file, syslog, etc.), collectors, and storage backends (File, HDFS, HBase, etc.).
2) Configuration
Hadoop and HBase were configured earlier, so start both of them first in order to write files to HDFS and HBase. For the configuration of hadoop-2.2.0 and hbase-0.96.0, refer to "Distributed Configuration of Hadoop-2.2.0 on Ubuntu and CentOS" (http://www.linuxidc.com/Linux/2014-01/95799.htm) and "Installing HBase-0.96.0 in a Distributed CentOS Environment" (http://www.linuxidc.com/Linux/2014-01/95801.htm).
The test environment is a two-node CentOS cluster. The machine with hostname master collects the logs, and the machine with hostname node writes them out. Two output destinations are configured in this walkthrough: writing to an ordinary local directory, and writing to HDFS.
First, download the flume-ng binary tarball from http://flume.apache.org/download.html and unpack it. Then edit /etc/profile and add the following lines:
export FLUME_HOME=/home/aaron/apache-flume-1.4.0-bin
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$PATH:$FLUME_HOME/bin

After adding these lines, remember to run source /etc/profile so the changes take effect.
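As a quick sanity check, you can confirm the variables resolve as expected in a fresh shell. This is a sketch assuming the install path used above; adjust FLUME_HOME to your own location:

```shell
# Assumes the install path from this article; adjust FLUME_HOME as needed.
export FLUME_HOME=/home/aaron/apache-flume-1.4.0-bin
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$PATH:$FLUME_HOME/bin
# The conf directory should resolve under the install root:
echo "$FLUME_CONF_DIR"   # -> /home/aaron/apache-flume-1.4.0-bin/conf
```

With flume on the PATH, running flume-ng version should print the release information.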
In the conf directory under the flume folder on master, create a new file named flume-master.conf with the following contents:
agent.sources = seqGenSrc
agent.channels = memoryChannel
agent.sinks = remoteSink
# For each one of the sources, the type is defined
agent.sources.seqGenSrc.type = exec
agent.sources.seqGenSrc.command = tail -F /home/aaron/test
# The channel can be defined as follows.
agent.sources.seqGenSrc.channels = memoryChannel
# Each sink's type must be defined
agent.sinks.loggerSink.type = logger
# Specify the channel the sink should use
agent.sinks.loggerSink.channel = memoryChannel
# (loggerSink is not listed in agent.sinks above, so the two lines
# above are inactive unless it is added there.)
# Each channel's type is defined.
agent.channels.memoryChannel.type = memory
# Other config values specific to each type of channel (sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent.channels.memoryChannel.capacity = 100
agent.channels.memoryChannel.keep-alive = 100
agent.sinks.remoteSink.type = avro
agent.sinks.remoteSink.hostname = node
agent.sinks.remoteSink.port = 23004
agent.sinks.remoteSink.channel = memoryChannel

On node, add the same lines to /etc/profile as well. Then, in node's conf directory, create a new file flume-node.conf with the following contents:
agent.sources = seqGenSrc1
agent.channels = memoryChannel
agent.sinks = fileSink
# For each one of the sources, the type is defined
agent.sources.seqGenSrc1.type = avro
agent.sources.seqGenSrc1.bind = node
agent.sources.seqGenSrc1.port = 23004
# The channel can be defined as follows.
agent.sources.seqGenSrc1.channels = memoryChannel
# Each sink's type must be defined
agent.sinks.loggerSink.type = logger
# Specify the channel the sink should use
agent.sinks.loggerSink.channel = memoryChannel
# (loggerSink is not listed in agent.sinks above, so the two lines
# above are inactive unless it is added there.)
# Each channel's type is defined.
agent.channels.memoryChannel.type = memory
# Other config values specific to each type of channel (sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent.channels.memoryChannel.capacity = 100
agent.channels.memoryChannel.keep-alive = 100
# The file_roll sink writes the received events into a local directory
agent.sinks.fileSink.type = file_roll
agent.sinks.fileSink.channel = memoryChannel
agent.sinks.fileSink.sink.directory = /home/aaron/
agent.sinks.fileSink.serializer.appendNewline = true

On master, run the following command:
$ bin/flume-ng agent --conf ./conf/ -f conf/flume-master.conf -Dflume.root.logger=DEBUG,console -n agent

On node, run:
$ bin/flume-ng agent --conf ./conf/ -f conf/flume-node.conf -Dflume.root.logger=DEBUG,console -n agent

After both agents start, you can see that the two machines communicate: the file on master is forwarded to node, and when content is appended to the test file on master, node receives it as well.
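You can reproduce locally what the exec source sees while the agents run. A minimal sketch, using /tmp/flume_demo_test as a stand-in for the /home/aaron/test file that the master agent tails:

```shell
# Stand-in for /home/aaron/test, the file the exec source tails with tail -F.
TESTFILE=/tmp/flume_demo_test
echo "seed line" > "$TESTFILE"
# Appending while the agents are running is what produces new events on node:
echo "appended line" >> "$TESTFILE"
tail -n 1 "$TESTFILE"   # -> appended line
```

Each appended line becomes one Flume event, travels over the Avro hop to node, and ends up in a rolled file under the sink directory there.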
If you want to write the content to Hadoop instead, modify flume-node.conf on node as follows:
agent.sinks = k2
agent.sinks.k2.type = hdfs
agent.sinks.k2.channel = memoryChannel
agent.sinks.k2.hdfs.path = hdfs://master:8089/hbase
agent.sinks.k2.hdfs.fileType = DataStream
agent.sinks.k2.hdfs.writeFormat = Text

Here, hdfs://master:8089/hbase is an HDFS path on the Hadoop cluster.
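The HDFS sink also controls how output files are rolled, and its defaults tend to create many small files under steady traffic. The following properties are a hedged sketch of commonly tuned roll settings; the values are illustrative and not from the original article:

```properties
# Roll a new HDFS file every 30 seconds or at ~10 MB, whichever comes
# first; a value of 0 disables that trigger (here, event-count rolling).
agent.sinks.k2.hdfs.rollInterval = 30
agent.sinks.k2.hdfs.rollSize = 10485760
agent.sinks.k2.hdfs.rollCount = 0
```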