Flume ng configuration
2022-06-29 20:05:00 【Brother Xing plays with the clouds】
1) Introduction
Flume is a distributed, reliable, and highly available system for aggregating massive amounts of log data. It supports custom data senders throughout the system for collecting data; at the same time, Flume provides the ability to perform simple processing on the data and write it to a variety of (customizable) data receivers.
Design objectives:
(1) Reliability: when a node fails, logs can still be delivered to other nodes without loss. Flume provides three levels of reliability guarantees, from strongest to weakest: end-to-end (the receiving agent first writes the event to disk and deletes it once delivery succeeds; if delivery fails, the event can be resent), store on failure (when the receiver crashes, the data is written locally and sending resumes after recovery), and best effort (no acknowledgment after the data is sent to the receiver).
(2) Scalability: Flume adopts a three-tier architecture of agent, collector, and storage, and each tier can be scaled horizontally. All agents and collectors are managed centrally by a master, which makes the system easy to monitor and maintain; more than one master is allowed (managed and load-balanced with ZooKeeper), which avoids a single point of failure.
(3) Manageability: because all agents and collectors are managed centrally by the master, the system is easy to maintain. With multiple masters, Flume uses ZooKeeper and gossip to keep dynamic configuration data consistent. Users can check the status of each data source or data flow on the master and configure and load data sources dynamically. Flume offers two forms of data-flow management: a web interface and shell script commands.
(4) Functional extensibility: users can add their own agents, collectors, or storage. In addition, Flume ships with many components, including a variety of agents (file, syslog, etc.), collectors, and storage backends (file, HDFS, HBase, etc.).
2) Configuration
Hadoop and HBase were configured earlier, so we need to start hadoop and hbase in order to write files to HDFS and HBase. For the configuration of hadoop-2.2.0 and hbase-0.96.0, refer to "Distributed Configuration of Hadoop-2.2.0 on Ubuntu and CentOS" (http://www.linuxidc.com/Linux/2014-01/95799.htm) and "Installing HBase-0.96.0 in a Distributed CentOS Environment" (http://www.linuxidc.com/Linux/2014-01/95801.htm).
The test environment is a cluster of two CentOS machines. The machine with hostname master is responsible for collecting the logs, and the machine with hostname node is responsible for writing them out. Two output methods are configured this time: writing to an ordinary directory, and writing to HDFS.
First, download the flume-ng binary tarball from http://flume.apache.org/download.html and unpack it. Then edit the /etc/profile file and add the following lines:
export FLUME_HOME=/home/aaron/apache-flume-1.4.0-bin
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$PATH:$FLUME_HOME/bin

Remember to run source /etc/profile afterwards so the changes take effect.
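As a quick sanity check (a sketch assuming the install path used above), you can confirm that the variables resolve as expected in a fresh shell:

```shell
# Re-create the three profile lines and verify the results.
export FLUME_HOME=/home/aaron/apache-flume-1.4.0-bin
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$PATH:$FLUME_HOME/bin

echo "$FLUME_CONF_DIR"    # /home/aaron/apache-flume-1.4.0-bin/conf

# The bin directory should now appear in PATH.
case ":$PATH:" in
  *":$FLUME_HOME/bin:"*) echo "flume bin is on PATH" ;;
  *) echo "flume bin is NOT on PATH" ;;
esac
```

If flume-ng is not found later, this PATH check is the first thing to revisit.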
In the conf directory of the flume folder on master, create a new file named flume-master.conf with the following contents:
agent.sources = seqGenSrc
agent.channels = memoryChannel
agent.sinks = remoteSink
# For each one of the sources, the type is defined
agent.sources.seqGenSrc.type = exec
agent.sources.seqGenSrc.command = tail -F /home/aaron/test
# The channel can be defined as follows.
agent.sources.seqGenSrc.channels = memoryChannel
# Each sink's type must be defined
agent.sinks.loggerSink.type = logger
#Specify the channel the sink should use
agent.sinks.loggerSink.channel = memoryChannel
# Each channel's type is defined.
agent.channels.memoryChannel.type = memory
# Other config values specific to each type of channel(sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent.channels.memoryChannel.capacity = 100
agent.channels.memoryChannel.keep-alive = 100
agent.sinks.remoteSink.type = avro
agent.sinks.remoteSink.hostname = node
agent.sinks.remoteSink.port = 23004
agent.sinks.remoteSink.channel = memoryChannel

On the node machine, add the same lines to /etc/profile as well. Then, in its conf directory, create a new file named flume-node.conf with the following contents:
agent.sources = seqGenSrc1
agent.channels = memoryChannel
#agent.sinks = fileSink
agent.sinks = fileSink
# For each one of the sources, the type is defined
agent.sources.seqGenSrc1.type = avro
agent.sources.seqGenSrc1.bind = node
agent.sources.seqGenSrc1.port = 23004
# The channel can be defined as follows.
agent.sources.seqGenSrc1.channels = memoryChannel
# Each sink's type must be defined
agent.sinks.loggerSink.type = logger
#Specify the channel the sink should use
agent.sinks.loggerSink.channel = memoryChannel
# Each channel's type is defined.
agent.channels.memoryChannel.type = memory
# Other config values specific to each type of channel(sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent.channels.memoryChannel.capacity = 100
agent.channels.memoryChannel.keep-alive = 100
agent.sinks.fileSink.type = file_roll
agent.sinks.fileSink.channel = memoryChannel
agent.sinks.fileSink.sink.directory = /home/aaron/
agent.sinks.fileSink.sink.serializer = text
agent.sinks.fileSink.sink.serializer.appendNewline = true

Run the following command on master:
$ bin/flume-ng agent --conf ./conf/ -f conf/flume-master.conf -Dflume.root.logger=DEBUG,console -n agent

Then run the following command on node:
$ bin/flume-ng agent --conf ./conf/ -f conf/flume-node.conf -Dflume.root.logger=DEBUG,console -n agent

After both agents start, you can see that the two machines communicate with each other: content from master is sent to node, and when you append new content to the test file on master, node receives it as well.
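To watch the pipeline move data, append lines to the tailed file on master; tail -F picks up each new line and the avro sink ships it to node. The sketch below imitates the append-and-read step with a temporary file instead of /home/aaron/test, purely for illustration:

```shell
# Imitate appending to the tailed file and reading the newest line,
# which is what the exec source does for every event it ships to node.
LOG=$(mktemp)
echo "hello from master" >> "$LOG"
LAST=$(tail -n 1 "$LOG")
echo "$LAST"    # hello from master
rm -f "$LOG"
```

On the real cluster you would append to /home/aaron/test on master and watch the line arrive in a rolled file under /home/aaron/ on node.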
If you want to write the content to hadoop instead, modify the flume-node.conf file on node as follows:
agent.sinks = k2
agent.sinks.k2.type = hdfs
agent.sinks.k2.channel = memoryChannel
agent.sinks.k2.hdfs.path = hdfs://master:8089/hbase
agent.sinks.k2.hdfs.fileType = DataStream
agent.sinks.k2.hdfs.writeFormat = Text

Here, hdfs://master:8089/hbase is the HDFS file path on hadoop.
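With only the settings above, the HDFS sink rolls to a new file on its defaults (roughly every 30 seconds). The stock Flume NG HDFS sink also accepts roll controls such as the following (property names taken from the standard HDFS sink; verify them against the user guide for your Flume version before relying on them):

```properties
# Roll the current HDFS file by time, size, or event count (0 disables a rule).
# rollInterval is in seconds, rollSize is in bytes, rollCount counts events.
agent.sinks.k2.hdfs.rollInterval = 60
agent.sinks.k2.hdfs.rollSize = 134217728
agent.sinks.k2.hdfs.rollCount = 0
```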