Flume ng configuration
2022-06-29 20:05:00 【Brother Xing plays with the clouds】
1) Introduction
Flume is a distributed, reliable, and highly available system for aggregating massive amounts of log data. It supports custom data senders throughout the system for collecting data; at the same time, Flume provides the ability to perform simple processing on the data and write it to a variety of (customizable) data receivers.
Design objectives:
(1) Reliability: when a node fails, logs can still be delivered to other nodes without loss. Flume provides three levels of reliability guarantees, from strongest to weakest: end-to-end (the receiving agent first writes the event to disk and deletes it once delivery succeeds; if delivery fails, the event can be resent), store on failure (when the receiver crashes, the data is written locally and sending resumes after recovery), and best effort (no acknowledgment after the data is sent to the receiver).
(2) Scalability: Flume adopts a three-tier architecture of agent, collector, and storage, and each tier can be scaled horizontally. All agents and collectors are managed centrally by a master, which makes the system easy to monitor and maintain; more than one master is allowed (managed and load-balanced with ZooKeeper), which avoids a single point of failure.
(3) Manageability: because all agents and collectors are managed centrally by the master, the system is easy to maintain. With multiple masters, Flume uses ZooKeeper and gossip to keep dynamic configuration data consistent. Users can check the status of each data source or data flow on the master and configure and load data sources dynamically. Flume offers two forms of data-flow management: a web interface and shell script commands.
(4) Functional extensibility: users can add their own agents, collectors, or storage. In addition, Flume ships with many components, including a variety of agents (file, syslog, etc.), collectors, and storage backends (file, HDFS, HBase, etc.).
2) Configuration
Hadoop and HBase were configured earlier, so we need to start hadoop and hbase in order to write files to HDFS and HBase. For the configuration of hadoop-2.2.0 and hbase-0.96.0, refer to "Distributed Configuration of Hadoop-2.2.0 on Ubuntu and CentOS" (http://www.linuxidc.com/Linux/2014-01/95799.htm) and "Installing HBase-0.96.0 in a Distributed CentOS Environment" (http://www.linuxidc.com/Linux/2014-01/95801.htm).
The test environment is a cluster of two CentOS machines. The machine with hostname master is responsible for collecting the logs, and the machine with hostname node is responsible for writing them out. Two output methods are configured this time: writing to an ordinary directory, and writing to HDFS.
First, download the flume-ng binary tarball from http://flume.apache.org/download.html and unpack it. Then edit the /etc/profile file and add the following lines:
export FLUME_HOME=/home/aaron/apache-flume-1.4.0-bin
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$PATH:$FLUME_HOME/bin

Remember to run source /etc/profile afterwards so the changes take effect.
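As a quick sanity check (a sketch assuming the install path used above), you can confirm that the variables resolve as expected in a fresh shell:

```shell
# Re-create the three profile lines and verify the results.
export FLUME_HOME=/home/aaron/apache-flume-1.4.0-bin
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$PATH:$FLUME_HOME/bin

echo "$FLUME_CONF_DIR"    # /home/aaron/apache-flume-1.4.0-bin/conf

# The bin directory should now appear in PATH.
case ":$PATH:" in
  *":$FLUME_HOME/bin:"*) echo "flume bin is on PATH" ;;
  *) echo "flume bin is NOT on PATH" ;;
esac
```

If flume-ng is not found later, this PATH check is the first thing to revisit.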
In the conf directory of the flume folder on master, create a new file named flume-master.conf with the following contents:
agent.sources = seqGenSrc
agent.channels = memoryChannel
agent.sinks = remoteSink
# For each one of the sources, the type is defined
agent.sources.seqGenSrc.type = exec
agent.sources.seqGenSrc.command = tail -F /home/aaron/test
# The channel can be defined as follows.
agent.sources.seqGenSrc.channels = memoryChannel
# Each sink's type must be defined
agent.sinks.loggerSink.type = logger
#Specify the channel the sink should use
agent.sinks.loggerSink.channel = memoryChannel
# Each channel's type is defined.
agent.channels.memoryChannel.type = memory
# Other config values specific to each type of channel(sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent.channels.memoryChannel.capacity = 100
agent.channels.memoryChannel.keep-alive = 100
agent.sinks.remoteSink.type = avro
agent.sinks.remoteSink.hostname = node
agent.sinks.remoteSink.port = 23004
agent.sinks.remoteSink.channel = memoryChannel

On the node machine, add the same lines to /etc/profile as well. Then, in its conf directory, create a new file named flume-node.conf with the following contents:
agent.sources = seqGenSrc1
agent.channels = memoryChannel
#agent.sinks = fileSink
agent.sinks = fileSink
# For each one of the sources, the type is defined
agent.sources.seqGenSrc1.type = avro
agent.sources.seqGenSrc1.bind = node
agent.sources.seqGenSrc1.port = 23004
# The channel can be defined as follows.
agent.sources.seqGenSrc1.channels = memoryChannel
# Each sink's type must be defined
agent.sinks.loggerSink.type = logger
#Specify the channel the sink should use
agent.sinks.loggerSink.channel = memoryChannel
# Each channel's type is defined.
agent.channels.memoryChannel.type = memory
# Other config values specific to each type of channel(sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent.channels.memoryChannel.capacity = 100
agent.channels.memoryChannel.keep-alive = 100
agent.sinks.fileSink.type = file_roll
agent.sinks.fileSink.channel = memoryChannel
agent.sinks.fileSink.sink.directory = /home/aaron/
agent.sinks.fileSink.sink.serializer = text
agent.sinks.fileSink.sink.serializer.appendNewline = true

Run the following command on master:
$ bin/flume-ng agent --conf ./conf/ -f conf/flume-master.conf -Dflume.root.logger=DEBUG,console -n agent

Then run the following command on node:
$ bin/flume-ng agent --conf ./conf/ -f conf/flume-node.conf -Dflume.root.logger=DEBUG,console -n agent

After both agents start, you can see that the two machines communicate with each other: content from master is sent to node, and when you append new content to the test file on master, node receives it as well.
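To watch the pipeline move data, append lines to the tailed file on master; tail -F picks up each new line and the avro sink ships it to node. The sketch below imitates the append-and-read step with a temporary file instead of /home/aaron/test, purely for illustration:

```shell
# Imitate appending to the tailed file and reading the newest line,
# which is what the exec source does for every event it ships to node.
LOG=$(mktemp)
echo "hello from master" >> "$LOG"
LAST=$(tail -n 1 "$LOG")
echo "$LAST"    # hello from master
rm -f "$LOG"
```

On the real cluster you would append to /home/aaron/test on master and watch the line arrive in a rolled file under /home/aaron/ on node.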
If you want to write the content to hadoop instead, modify the flume-node.conf file on node as follows:
agent.sinks = k2
agent.sinks.k2.type = hdfs
agent.sinks.k2.channel = memoryChannel
agent.sinks.k2.hdfs.path = hdfs://master:8089/hbase
agent.sinks.k2.hdfs.fileType = DataStream
agent.sinks.k2.hdfs.writeFormat = Text

Here, hdfs://master:8089/hbase is the HDFS file path on hadoop.
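With only the settings above, the HDFS sink rolls to a new file on its defaults (roughly every 30 seconds). The stock Flume NG HDFS sink also accepts roll controls such as the following (property names taken from the standard HDFS sink; verify them against the user guide for your Flume version before relying on them):

```properties
# Roll the current HDFS file by time, size, or event count (0 disables a rule).
# rollInterval is in seconds, rollSize is in bytes, rollCount counts events.
agent.sinks.k2.hdfs.rollInterval = 60
agent.sinks.k2.hdfs.rollSize = 134217728
agent.sinks.k2.hdfs.rollCount = 0
```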