Flume ng configuration
2022-06-29 20:05:00 【Brother Xing plays with the clouds】
1) Introduction
Flume is a distributed, reliable, and highly available system for aggregating large volumes of log data. It supports custom data senders for collecting data, and it can apply simple processing to the data and write it to a variety of customizable data receivers.
Design objectives:
(1) Reliability. When a node fails, logs can be delivered to other nodes without loss. Flume offers three levels of reliability, from strongest to weakest: end-to-end (the receiving agent first writes the event to disk, deletes it once the transfer succeeds, and resends it if the transfer fails), store on failure (when the data receiver crashes, the data is written locally and sending resumes once the receiver recovers), and best effort (data is sent to the receiver without any acknowledgement).
(2) Scalability. Flume uses a three-tier architecture of agent, collector, and storage, and each tier can be scaled horizontally. All agents and collectors are managed by a master, which makes the system easy to monitor and maintain. More than one master is allowed (managed and load-balanced through ZooKeeper), which avoids a single point of failure.
(3) Manageability. Because all agents and collectors are managed by the master, the system is easy to maintain. With multiple masters, Flume uses ZooKeeper and gossip to keep dynamic configuration data consistent. On the master, users can check the execution of each data source or data flow, and they can configure and load data sources dynamically. Flume provides two ways to manage data flows: a web interface and shell script commands.
(4) Functional extensibility. Users can add their own agents, collectors, or storage. Flume also ships with many components, including a variety of agents (file, syslog, etc.), collectors, and storage (File, HDFS, HBase, etc.).
2) Configuration
Hadoop and HBase were configured previously, so both must be started before files can be written to HDFS and HBase. For the configuration of hadoop-2.2.0 and hbase-0.96.0, refer to "Distributed configuration of Hadoop-2.2.0 on Ubuntu and CentOS" http://www.linuxidc.com/Linux/2014-01/95799.htm and "Installing HBase-0.96.0 in a distributed CentOS environment" http://www.linuxidc.com/Linux/2014-01/95801.htm .
The test environment is a cluster of two machines running CentOS. The machine with host name master collects the logs, and the machine with host name node writes them. The write targets configured here are an ordinary directory and HDFS.
First, download the flume-ng binary archive from http://flume.apache.org/download.html and unpack it. Then edit the /etc/profile file and add the following lines:
export FLUME_HOME=/home/aaron/apache-flume-1.4.0-bin
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$PATH:$FLUME_HOME/bin
After adding these lines, remember to run $ source /etc/profile so that the change takes effect.
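A quick way to confirm that the new variables are visible in the current shell is sketched below. The helper name check_flume_env is made up for this example and is not part of Flume; it only reports whether FLUME_HOME and FLUME_CONF_DIR are set.

```shell
# Sketch: confirm the variables added to /etc/profile are visible in this shell.
# check_flume_env is a hypothetical helper name, not a Flume command.
check_flume_env() {
    missing=0
    for name in FLUME_HOME FLUME_CONF_DIR; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            missing=1
        else
            echo "$name=$value"
        fi
    done
    # Returns 0 when both variables are set, 1 otherwise.
    return "$missing"
}

check_flume_env || echo "run 'source /etc/profile' and try again"
```

If both variables are reported, running flume-ng version should also work, since $FLUME_HOME/bin is now on the PATH.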
On master, in the conf directory of the Flume folder, create a new file named flume-master.conf with the following contents:
agent.sources = seqGenSrc
agent.channels = memoryChannel
agent.sinks = remoteSink
# For each one of the sources, the type is defined
# (the exec source turns each line printed by the command into an event)
agent.sources.seqGenSrc.type = exec
agent.sources.seqGenSrc.command = tail -F /home/aaron/test
# The channel can be defined as follows.
agent.sources.seqGenSrc.channels = memoryChannel
# Each sink's type must be defined
# (loggerSink is not listed in agent.sinks above, so it stays inactive)
agent.sinks.loggerSink.type = logger
# Specify the channel the sink should use
agent.sinks.loggerSink.channel = memoryChannel
# Each channel's type is defined.
agent.channels.memoryChannel.type = memory
# Other config values specific to each type of channel (sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent.channels.memoryChannel.capacity = 100
# Timeout in seconds for adding or removing an event
agent.channels.memoryChannel.keep-alive = 100
# The Avro sink forwards events to the Avro source on node
agent.sinks.remoteSink.type = avro
agent.sinks.remoteSink.hostname = node
agent.sinks.remoteSink.port = 23004
agent.sinks.remoteSink.channel = memoryChannel
On the node machine, add the same settings to /etc/profile. Then create a new file named flume-node.conf in the conf directory with the following contents:
agent.sources = seqGenSrc1
agent.channels = memoryChannel
agent.sinks = fileSink
# For each one of the sources, the type is defined
# (the Avro source listens for events sent by the Avro sink on master)
agent.sources.seqGenSrc1.type = avro
agent.sources.seqGenSrc1.bind = node
agent.sources.seqGenSrc1.port = 23004
# The channel can be defined as follows.
agent.sources.seqGenSrc1.channels = memoryChannel
# Each sink's type must be defined
# (loggerSink is not listed in agent.sinks above, so it stays inactive)
agent.sinks.loggerSink.type = logger
# Specify the channel the sink should use
agent.sinks.loggerSink.channel = memoryChannel
# Each channel's type is defined.
agent.channels.memoryChannel.type = memory
# Other config values specific to each type of channel (sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent.channels.memoryChannel.capacity = 100
agent.channels.memoryChannel.keep-alive = 100
# The file_roll sink writes events to files in a local directory
agent.sinks.fileSink.type = file_roll
agent.sinks.fileSink.channel = memoryChannel
agent.sinks.fileSink.sink.directory = /home/aaron/
agent.sinks.fileSink.serializer.appendNewline = true
Run the following command on master:
$ bin/flume-ng agent --conf ./conf/ -f conf/flume-master.conf -Dflume.root.logger=DEBUG,console -n agent
Run the following command on node:
$ bin/flume-ng agent --conf ./conf/ -f conf/flume-node.conf -Dflume.root.logger=DEBUG,console -n agent
After both agents start, the two machines can communicate: the contents of the test file on master are sent to node, and any content appended to the test file on master afterwards is received by node as well.
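One way to exercise this flow is to append a few recognizable lines to the tailed file on master and then look at the output directory on node. The sketch below assumes the paths from the configuration above; append_test_lines is a made-up helper name, not a Flume command.

```shell
# Sketch: append numbered lines to the file that the exec source tails.
# append_test_lines is a hypothetical helper, not part of Flume.
append_test_lines() {
    file=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        # tail -F picks up each appended line, and the exec source
        # turns each line into one event.
        printf 'test event %d\n' "$i" >> "$file"
        i=$((i + 1))
    done
}

# On master (path taken from flume-master.conf):
#   append_test_lines /home/aaron/test 5
# On node, list the newest files in the sink directory from flume-node.conf:
#   ls -lt /home/aaron/ | head
```

Numbered lines make it easy to see on node whether any event is missing or arrived out of order.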
To write the content to Hadoop instead, modify flume-node.conf on node as follows:
agent.sinks = k2
agent.sinks.k2.type = hdfs
agent.sinks.k2.channel = memoryChannel
agent.sinks.k2.hdfs.path = hdfs://master:8089/hbase
agent.sinks.k2.hdfs.fileType = DataStream
agent.sinks.k2.hdfs.writeFormat = Text
Here, hdfs://master:8089/hbase is the HDFS path that the sink writes to.
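A sink only runs when its name appears in agent.sinks and its own settings live under agent.sinks.<name>, so a small check before restarting the agent can catch naming slips. The following is a rough sketch, not a full validator: check_sinks is a made-up helper that only verifies that every listed sink has a type line and a channel line.

```shell
# Sketch: report sinks listed in <agent>.sinks that lack a type or channel line.
# check_sinks is a hypothetical helper, not part of Flume.
check_sinks() {
    conf=$1
    agent=${2:-agent}
    status=0
    # Use the last uncommented "<agent>.sinks = ..." line; names are space-separated.
    sinks=$(grep -E "^${agent}\.sinks[[:space:]]*=" "$conf" | tail -n 1 | cut -d= -f2-)
    for sink in $sinks; do
        for key in type channel; do
            if ! grep -Eq "^${agent}\.sinks\.${sink}\.${key}[[:space:]]*=" "$conf"; then
                echo "sink '$sink' has no '$key' setting"
                status=1
            fi
        done
    done
    # Returns 0 when every listed sink has both settings, 1 otherwise.
    return "$status"
}

# Usage on node, assuming FLUME_CONF_DIR is set as in /etc/profile:
#   check_sinks "$FLUME_CONF_DIR/flume-node.conf" agent
# Once the agent is running and events arrive, the target path can be listed with:
#   hdfs dfs -ls hdfs://master:8089/hbase
```

The check is purely textual, so it does not verify that the channel exists or that the sink type is valid; Flume's own startup log remains the authoritative source for configuration errors.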