MapReduce implements matrix multiplication - implementation code
2022-07-03 13:22:00 【Brother Xing plays with the clouds】
I previously wrote an article analyzing the MapReduce matrix multiplication algorithm: http://www.linuxidc.com/Linux/2014-09/106646.htm
To give you a more intuitive picture of how the program executes, today I am sharing the implementation code for your reference.
Programming environment:
- java version "1.7.0_40"
- Eclipse Kepler
- Windows7 x64
- Ubuntu 12.04 LTS
- Hadoop2.2.0
- Vmware 9.0.0 build-812388
Input data:
Matrix A storage path: hdfs://singlehadoop:8020/wordspace/dataguru/hadoopdev/week09/matrixmultiply/matrixA/matrixa
Matrix A content (2 rows × 3 columns, one row per line):
3 4 6
4 0 8
Matrix B storage path: hdfs://singlehadoop:8020/wordspace/dataguru/hadoopdev/week09/matrixmultiply/matrixB/matrixb
Matrix B content (3 rows × 2 columns, one row per line):
2 3
3 0
4 1
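Before looking at the MapReduce version, it helps to know what result to expect. The following is a minimal plain-Java check (no Hadoop involved) that multiplies the two matrices above directly; the class and method names are just for illustration:

```java
import java.util.Arrays;

// Plain-Java sanity check of the expected product C = A * B.
// A is 2x3 and B is 3x2, so C is 2x2.
public class MatrixCheck {

    public static int[][] multiply(int[][] a, int[][] b) {
        int rows = a.length, cols = b[0].length, inner = b.length;
        int[][] c = new int[rows][cols];
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                for (int k = 0; k < inner; k++)
                    c[i][j] += a[i][k] * b[k][j];
        return c;
    }

    public static void main(String[] args) {
        int[][] a = {{3, 4, 6}, {4, 0, 8}};   // matrix A from the input data
        int[][] b = {{2, 3}, {3, 0}, {4, 1}}; // matrix B from the input data
        for (int[] row : multiply(a, b))
            System.out.println(Arrays.toString(row));
    }
}
```

Running this prints the 2×2 result matrix, which is what the MapReduce job should reproduce in matrixC.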
Implementation code:
There are three classes:
- Driver class MMDriver
- Map class MMMapper
- Reduce class MMReducer
You can combine them into a single class if you prefer.
MMDriver.java
package dataguru.matrixmultiply;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MMDriver {

    public static void main(String[] args) throws Exception {
        // set configuration
        Configuration conf = new Configuration();

        // create job
        Job job = new Job(conf, "MatrixMultiply");
        job.setJarByClass(MMDriver.class);

        // specify Mapper & Reducer
        job.setMapperClass(MMMapper.class);
        job.setReducerClass(MMReducer.class);

        // specify output types of mapper and reducer
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);

        // specify input and output directories
        Path inPathA = new Path("hdfs://singlehadoop:8020/wordspace/dataguru/hadoopdev/week09/matrixmultiply/matrixA");
        Path inPathB = new Path("hdfs://singlehadoop:8020/wordspace/dataguru/hadoopdev/week09/matrixmultiply/matrixB");
        Path outPath = new Path("hdfs://singlehadoop:8020/wordspace/dataguru/hadoopdev/week09/matrixmultiply/matrixC");
        FileInputFormat.addInputPath(job, inPathA);
        FileInputFormat.addInputPath(job, inPathB);
        FileOutputFormat.setOutputPath(job, outPath);

        // delete the output directory if it already exists, so the job can run again.
        // Note: do not close the FileSystem here -- it is a cached instance that the
        // job itself still needs.
        try {
            FileSystem hdfs = outPath.getFileSystem(conf);
            if (hdfs.exists(outPath))
                hdfs.delete(outPath, true); // true = delete recursively
        } catch (Exception e) {
            e.printStackTrace();
            return;
        }

        // run the job
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
MMMapper.java
package dataguru.matrixmultiply;
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class MMMapper extends Mapper<Object, Text, Text, Text> {

    private String tag;          // which matrix the current split belongs to
    private int crow = 2;        // number of rows of matrix A
    private int ccol = 2;        // number of columns of matrix B
    private static int arow = 0; // current row of A
    private static int brow = 0; // current row of B

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // use the parent directory name of the input split as the matrix tag
        FileSplit fs = (FileSplit) context.getInputSplit();
        tag = fs.getPath().getParent().getName();
    }

    /**
     * The input data consists of two matrix files.
     */
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer str = new StringTokenizer(value.toString());
        if ("matrixA".equals(tag)) {
            // left matrix: element A[arow][col] contributes to every cell (arow, i) of C
            int col = 0;
            while (str.hasMoreTokens()) {
                String item = str.nextToken();
                for (int i = 0; i < ccol; i++) {
                    Text outkey = new Text(arow + "," + i);
                    Text outvalue = new Text("a," + col + "," + item);
                    context.write(outkey, outvalue);
                    System.out.println(outkey + " | " + outvalue);
                }
                col++;
            }
            arow++;
        } else if ("matrixB".equals(tag)) {
            // right matrix: element B[brow][col] contributes to every cell (i, col) of C
            int col = 0;
            while (str.hasMoreTokens()) {
                String item = str.nextToken();
                for (int i = 0; i < crow; i++) {
                    Text outkey = new Text(i + "," + col);
                    Text outvalue = new Text("b," + brow + "," + item);
                    context.write(outkey, outvalue);
                    System.out.println(outkey + " | " + outvalue);
                }
                col++;
            }
            brow++;
        }
    }
}
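To make the map output concrete, here is a minimal plain-Java sketch (no Hadoop classes) of the same emission rule for a row of matrix A; the class and method names are hypothetical, introduced only for this illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the mapper's emission rule for the left matrix:
// element A[row][col] is sent to every output cell (row, j),
// tagged with its source matrix and inner index col.
public class EmitSketch {

    public static List<String[]> emitA(int row, int[] line, int bCols) {
        List<String[]> out = new ArrayList<>();
        for (int col = 0; col < line.length; col++)
            for (int j = 0; j < bCols; j++)
                out.add(new String[]{row + "," + j, "a," + col + "," + line[col]});
        return out;
    }

    public static void main(String[] args) {
        // first row of A ("3 4 6"); B has 2 columns
        for (String[] kv : emitA(0, new int[]{3, 4, 6}, 2))
            System.out.println(kv[0] + " | " + kv[1]);
    }
}
```

Each of the three elements is duplicated once per column of B, so the row produces six key/value pairs, such as `0,0 | a,0,3` and `0,1 | a,2,6`. The rows of matrix B are handled symmetrically.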
MMReducer.java
package dataguru.matrixmultiply;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MMReducer extends Reducer<Text, Text, Text, Text> {

    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {

        Map<String, String> matrixa = new HashMap<String, String>();
        Map<String, String> matrixb = new HashMap<String, String>();

        // values look like "a,0,4" or "b,0,2": source matrix, inner index, element
        for (Text val : values) {
            StringTokenizer str = new StringTokenizer(val.toString(), ",");
            String sourceMatrix = str.nextToken();
            if ("a".equals(sourceMatrix)) {
                matrixa.put(str.nextToken(), str.nextToken()); // (0, 4)
            }
            if ("b".equals(sourceMatrix)) {
                matrixb.put(str.nextToken(), str.nextToken()); // (0, 2)
            }
        }

        // multiply entries sharing the same inner index and sum them up
        int result = 0;
        for (String mapkey : matrixa.keySet()) {
            result += Integer.parseInt(matrixa.get(mapkey))
                    * Integer.parseInt(matrixb.get(mapkey));
        }

        context.write(key, new Text(String.valueOf(result)));
    }
}
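The reducer is computing one dot product per output cell. The following plain-Java sketch (no Hadoop, hypothetical class name) isolates that logic so it can be checked against the sample data:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the reducer's dot product for a single output cell of C.
// Values arriving for key "i,j" look like "a,k,v" or "b,k,v"; entries
// that share the same inner index k are multiplied and summed.
public class ReduceSketch {

    public static int dot(String[] values) {
        Map<String, Integer> a = new HashMap<>();
        Map<String, Integer> b = new HashMap<>();
        for (String v : values) {
            String[] parts = v.split(",");
            if ("a".equals(parts[0])) a.put(parts[1], Integer.parseInt(parts[2]));
            else b.put(parts[1], Integer.parseInt(parts[2]));
        }
        int result = 0;
        for (Map.Entry<String, Integer> e : a.entrySet())
            result += e.getValue() * b.get(e.getKey());
        return result;
    }

    public static void main(String[] args) {
        // values grouped under key "0,0": row 0 of A and column 0 of B
        System.out.println(dot(new String[]{"a,0,3", "a,1,4", "a,2,6",
                                            "b,0,2", "b,1,3", "b,2,4"}));
        // prints 42, i.e. 3*2 + 4*3 + 6*4
    }
}
```

This matches C[0][0] of the direct multiplication of the sample matrices.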