MapReduce example (II): computing the average
2022-07-27 16:15:00 【Laugh at Fengyun Road】
Computing an average with MapReduce
Hello everyone, I'm Fengyun. Welcome to my blog and to my WeChat official account 【Laugh at Fengyun Road】. In the days ahead, let's learn big data technologies together, work hard together, and become better versions of ourselves!
Approach
Computing an average is a common MapReduce task, and the algorithm is fairly simple. The map side reads the input and emits key/value pairs. Before the data reaches the reduce side it passes through the shuffle phase, which gathers all values that the map function emitted under the same key into a value-list. The reduce side then sums the values in each list, counts the records, and divides the sum by the count. The principle is shown in the figure below:
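The flow described above can be tried out without a cluster. The following is only a plain-Java sketch of the map, shuffle, and reduce steps, not Hadoop code; the class name AverageFlowSketch, its method names, and the sample records are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of the map -> shuffle -> reduce flow; no Hadoop required.
final class AverageFlowSketch {

    // Map + shuffle: parse each line and group the values under their key,
    // producing <key, value-list>.
    static Map<String, List<Integer>> shuffle(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            // Map step: split "category<TAB>clicks" into a key and an int value.
            String[] fields = line.split("\t");
            grouped.computeIfAbsent(fields[0], k -> new ArrayList<>())
                   .add(Integer.parseInt(fields[1]));
        }
        return grouped;
    }

    // Reduce step: sum the value-list, count its elements, and divide.
    static int average(List<Integer> values) {
        int sum = 0;
        int count = 0;
        for (int v : values) {
            sum += v;
            count++;
        }
        return sum / count; // integer division, as in the article's reducer
    }

    public static void main(String[] args) {
        // Hypothetical sample records, not taken from the original data set.
        List<String> lines = Arrays.asList("52127\t5", "52120\t93", "52127\t15");
        shuffle(lines).forEach((key, values) ->
                System.out.println(key + "\t" + average(values)));
    }
}
```

In this sketch the TreeMap stands in for the shuffle phase, which in a real job also delivers keys to each reducer in sorted order.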
Writing the code
Mapper Code
public static class Map extends Mapper<Object, Text, Text, IntWritable> {
    private static Text newKey = new Text();

    // Implement the map function
    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        // Convert one line of the plain-text input file to a String
        String line = value.toString();
        System.out.println(line); // debug output of each input line
        // Fields are tab-separated: arr[0] is the category, arr[1] the click count
        String[] arr = line.split("\t");
        newKey.set(arr[0]);
        int click = Integer.parseInt(arr[1]);
        context.write(newKey, new IntWritable(click));
    }
}
With Hadoop's default input format, the map side receives each line of the file as value. The map function splits that line on tabs with split(), converts the product click-count field to an IntWritable and uses it as the output value, uses the product category field as the output key, and then writes the key/value pair directly.
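The mapper above assumes every line has a category and a numeric click count separated by a tab; a line that breaks that assumption makes arr[1] or Integer.parseInt throw. The following is a minimal sketch of a more defensive parse step, written in plain Java so it can be run on its own. The class ClickLineParser and its parse method are hypothetical helpers, not part of the original code or of Hadoop.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map.Entry;
import java.util.Optional;

// Hypothetical helper: parses "category<TAB>clicks" and skips malformed lines.
final class ClickLineParser {

    // Returns (category, clicks) for a well-formed line, or empty if the line should be skipped.
    static Optional<Entry<String, Integer>> parse(String line) {
        String[] fields = line.split("\t");
        if (fields.length < 2) {
            return Optional.empty(); // missing click-count field
        }
        try {
            int clicks = Integer.parseInt(fields[1].trim());
            Entry<String, Integer> record = new SimpleEntry<>(fields[0], clicks);
            return Optional.of(record);
        } catch (NumberFormatException e) {
            return Optional.empty(); // click count is not an integer
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample lines, not taken from the original data set.
        System.out.println(parse("52127\t5").isPresent());   // prints true
        System.out.println(parse("52127").isPresent());      // prints false
        System.out.println(parse("52127\tabc").isPresent()); // prints false
    }
}
```

Inside a mapper, the same check would mean calling context.write only when the result is present, so that one bad record does not fail the whole task.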
Reducer Code
public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    // Implement the reduce function
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        int num = 0;
        int count = 0;
        for (IntWritable val : values) {
            num += val.get(); // add each element to the running sum num
            count++;          // count the elements
        }
        int avg = num / count; // average; integer division drops the fractional part
        context.write(key, new IntWritable(avg));
    }
}
The <key,value> pairs emitted by map are merged by the shuffle phase into <key,values> pairs, which are then handed to reduce. On receiving values, the reduce side copies the input key directly to the output key, loops over values to accumulate the sum num and the element count count, divides num by count to get the average avg, sets avg as the output value, and finally writes the <key,value> pair. Because num and count are both int, the division truncates any fractional part of the average.
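To make the effect of dividing two int values concrete, here is a small plain-Java comparison of the integer average used by the reducer with a floating-point average. The class AverageDivisionSketch and its methods are illustrative names only. Emitting the fractional result from a real job would also require changing the reducer's output value type, for example to Hadoop's DoubleWritable, which the original code does not do.

```java
// Compares int division (as in the reducer) with a floating-point average.
final class AverageDivisionSketch {

    // Mirrors the reducer: int sum divided by int count truncates toward zero.
    static int intAverage(int[] values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Casting to double before dividing keeps the fractional part;
    // a long accumulator also avoids int overflow on large sums.
    static double doubleAverage(int[] values) {
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return (double) sum / values.length;
    }

    public static void main(String[] args) {
        int[] clicks = {1, 2}; // hypothetical click counts
        System.out.println(intAverage(clicks));    // prints 1
        System.out.println(doubleAverage(clicks)); // prints 1.5
    }
}
```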
Complete code
package mapreduce;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class MyAverage {
    public static class Map extends Mapper<Object, Text, Text, IntWritable> {
        private static Text newKey = new Text();

        @Override
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            String line = value.toString();
            System.out.println(line); // debug output of each input line
            String[] arr = line.split("\t");
            newKey.set(arr[0]);
            int click = Integer.parseInt(arr[1]);
            context.write(newKey, new IntWritable(click));
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int num = 0;
            int count = 0;
            for (IntWritable val : values) {
                num += val.get();
                count++;
            }
            int avg = num / count; // integer division drops the fractional part
            context.write(key, new IntWritable(avg));
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        System.out.println("start");
        // Job.getInstance replaces the deprecated new Job(conf, name) constructor
        Job job = Job.getInstance(conf, "MyAverage");
        job.setJarByClass(MyAverage.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        Path in = new Path("hdfs://localhost:9000/mymapreduce4/in/goods_click");
        Path out = new Path("hdfs://localhost:9000/mymapreduce4/out");
        FileInputFormat.addInputPath(job, in);
        FileOutputFormat.setOutputPath(job, out);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
-------------- end ----------------