MapReduce Examples (II): Computing an Average
2022-07-27 16:15:00 【Laugh at Fengyun Road】
Computing an average with MapReduce
Hello everyone, I am Fengyun. Welcome to my blog and my WeChat official account 【Laugh at Fengyun Road】. In the days to come, let's learn big data technologies together, work hard, and meet a better version of ourselves!
The approach
Averaging is a common MapReduce algorithm, and it is fairly simple to implement. One approach: the Map side reads the data and emits each record as a key/value pair; before the records reach Reduce, the shuffle phase collects all values that share the same map-output key into a single value-list; the Reduce side then sums the values in each list, counts the records, and computes the per-key average. The principle is shown in the figure below:
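To make the shuffle step concrete, here is a small standalone sketch in plain Java (no Hadoop required) that mimics what shuffle does: it groups (category, clicks) pairs by key into value-lists before the per-key averaging step. The sample categories and numbers are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ShuffleSketch {
    // Group map-output pairs by key, like shuffle builds each key's value-list
    public static Map<String, List<Integer>> group(List<String[]> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String[] p : pairs) {
            // same key -> all values collected into one list
            grouped.computeIfAbsent(p[0], k -> new ArrayList<>())
                   .add(Integer.parseInt(p[1]));
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<String[]> mapOutput = Arrays.asList(
            new String[]{"shoes", "10"},
            new String[]{"shoes", "20"},
            new String[]{"books", "7"});
        System.out.println(group(mapOutput)); // {books=[7], shoes=[10, 20]}
    }
}
```

Each entry of the resulting map corresponds to one `reduce(key, values, context)` call.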
Writing the code
Mapper Code
public static class Map extends Mapper<Object, Text, Text, IntWritable> {
    private static Text newKey = new Text();

    // Implement the map function
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        // Convert the input line of plain text into a String
        String line = value.toString();
        System.out.println(line);
        String[] arr = line.split("\t");
        newKey.set(arr[0]);
        int click = Integer.parseInt(arr[1]);
        context.write(newKey, new IntWritable(click));
    }
}
On the map side, after Hadoop's default input format reads each line, the map function splits the input value with split(), converts the click-count field of each record to IntWritable and sets it as the value, sets the commodity-category field as the key, and writes the key/value pair out directly.
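As a quick sanity check on that parsing step, this plain-Java snippet (using a hypothetical sample record) shows what the split/parse logic produces for one tab-separated line:

```java
public class MapParseDemo {
    // Parse one tab-separated record into (category, clicks), as the Mapper does
    static Object[] parse(String line) {
        String[] arr = line.split("\t");        // same split the Mapper performs
        return new Object[]{arr[0], Integer.parseInt(arr[1])};
    }

    public static void main(String[] args) {
        Object[] kv = parse("shoes\t15");       // hypothetical sample record
        System.out.println(kv[0] + " -> " + kv[1]); // shoes -> 15
    }
}
```

Note that `Integer.parseInt` will throw if a line is malformed (missing the tab or containing a non-numeric count), so real input should be clean or guarded.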
Reducer Code
public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    // Implement the reduce function
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int num = 0;
        int count = 0;
        for (IntWritable val : values) {
            num += val.get();   // sum the elements into num
            count++;            // count the elements
        }
        int avg = num / count;  // compute the average
        context.write(key, new IntWritable(avg));
    }
}
The map-side <key,value> pairs are merged by the shuffle phase into <key,values> pairs, which are handed to reduce. When reduce receives the values, it copies the input key directly to the output key, loops over values to accumulate the sum num and the element count count, divides num by count to get the average avg, sets avg as the value, and writes the <key,value> pair out.
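One detail worth flagging: `num / count` is integer division, so the result truncates toward zero (e.g. an average of 3.5 becomes 3). The reducer's loop can be sketched standalone like this, with invented sample values:

```java
import java.util.Arrays;
import java.util.List;

public class AvgSketch {
    // The same loop the reducer runs over one key's value-list
    static int intAverage(List<Integer> values) {
        int num = 0;
        int count = 0;
        for (int v : values) {
            num += v;   // sum the elements
            count++;    // count the elements
        }
        return num / count; // integer division truncates the fraction
    }

    public static void main(String[] args) {
        System.out.println(intAverage(Arrays.asList(3, 4))); // prints 3, not 3.5
    }
}
```

If the fractional part matters, the reducer could instead emit a DoubleWritable and compute `(double) num / count`; that also requires changing the job's output value class accordingly.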
Complete code
package mapreduce;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class MyAverage {

    public static class Map extends Mapper<Object, Text, Text, IntWritable> {
        private static Text newKey = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            System.out.println(line);
            String[] arr = line.split("\t");
            newKey.set(arr[0]);
            int click = Integer.parseInt(arr[1]);
            context.write(newKey, new IntWritable(click));
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int num = 0;
            int count = 0;
            for (IntWritable val : values) {
                num += val.get();
                count++;
            }
            int avg = num / count;
            context.write(key, new IntWritable(avg));
        }
    }

    public static void main(String[] args)
            throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        System.out.println("start");
        // Job.getInstance replaces the deprecated new Job(conf, name) constructor
        Job job = Job.getInstance(conf, "MyAverage");
        job.setJarByClass(MyAverage.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        Path in = new Path("hdfs://localhost:9000/mymapreduce4/in/goods_click");
        Path out = new Path("hdfs://localhost:9000/mymapreduce4/out");
        FileInputFormat.addInputPath(job, in);
        FileOutputFormat.setOutputPath(job, out);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
-------------- end ----------------