Compiling Hudi
2022-07-31 02:40:00 【hyunbar】

Versions used
centos:centos8
hudi:0.10.1
spark:3.1.3
scala:2.12
1. Installing Maven
-------
### 1.1 Manual installation
(1) Download Maven
https://maven.apache.org/download.cgi

(2) Upload the archive to the server and extract it
tar -zxvf apache-maven-3.6.1-bin.tar.gz -C /bigdata/
(3) Add the following environment variables to /etc/profile, then reload it
#MAVEN_HOME
export MAVEN_HOME=/bigdata/apache-maven-3.6.1
export PATH=$PATH:$MAVEN_HOME/bin
source /etc/profile
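Step (3) can also be scripted so that running it twice does not duplicate the entries. The following is a minimal sketch; `add_maven_env` is a hypothetical helper name, and the path is the one used in this article.

```shell
#!/usr/bin/env bash
# Append the Maven variables to a profile file only if they are not there yet.
# add_maven_env is a hypothetical helper, not part of Maven.
add_maven_env() {
  local profile="$1" maven_home="$2"
  # Skip when an export for MAVEN_HOME already exists in the file.
  if grep -q '^export MAVEN_HOME=' "$profile" 2>/dev/null; then
    return 0
  fi
  {
    echo '#MAVEN_HOME'
    echo "export MAVEN_HOME=$maven_home"
    # Single quotes keep $PATH and $MAVEN_HOME unexpanded in the file.
    echo 'export PATH=$PATH:$MAVEN_HOME/bin'
  } >> "$profile"
}

# Usage (as root):
#   add_maven_env /etc/profile /bigdata/apache-maven-3.6.1 && source /etc/profile
```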
(4) Verify the installation
$ mvn -v
Apache Maven 3.6.3
Maven home: /bigdata/apache-maven-3.6.1
Java version: 1.8.0_321, vendor: Oracle Corporation, runtime: /bigdata/module/jdk1.8.0_321/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "5.13.0-44-generic", arch: "aarch64", family: "unix"
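For a manual tarball install, it can be worth checking that the `mvn` found on PATH is the copy that was just unpacked. The sketch below parses `mvn -v` output and compares the reported version with the reported Maven home. The function names are hypothetical, and the check assumes the install directory name carries the version number, which is true for the tarball layout but not for a package-manager install.

```shell
#!/usr/bin/env bash
# Print the version from the "Apache Maven x.y.z" line of `mvn -v` output (stdin).
maven_version() {
  sed -n 's/^Apache Maven \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Print the path from the "Maven home:" line of the same output (stdin).
maven_home_reported() {
  sed -n 's/^Maven home: //p' | head -n 1
}

# Succeed only when the reported home path contains the reported version string.
maven_home_matches_version() {
  local out="$1" ver home
  ver=$(printf '%s\n' "$out" | maven_version)
  home=$(printf '%s\n' "$out" | maven_home_reported)
  # An empty version means the output could not be parsed.
  [ -n "$ver" ] || return 1
  case "$home" in
    *"$ver"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   maven_home_matches_version "$(mvn -v)" || echo "mvn on PATH differs from the unpacked copy" >&2
```

Applied to the transcript above, this check would fail, because the version line reads 3.6.3 while Maven home points at apache-maven-3.6.1.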
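(5) Edit settings.xml to point the central repository at the Aliyun mirror
<mirror>
    <id>nexus-aliyun</id>
    <mirrorOf>central</mirrorOf>
    <name>Nexus aliyun</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public</url>
</mirror>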
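The mirror entry belongs inside the `<mirrors>` element of settings.xml (per user in `~/.m2/settings.xml`, or globally in `$MAVEN_HOME/conf/settings.xml`). As a rough sketch, a complete minimal file could be generated like this. `write_mirror_settings` is a hypothetical helper, and the id, name and default URL are the values from this article.

```shell
#!/usr/bin/env bash
# Write a minimal settings.xml that routes requests for "central" through one mirror.
# write_mirror_settings is a hypothetical helper; the default URL is the article's.
write_mirror_settings() {
  local target="$1"
  local url="${2:-http://maven.aliyun.com/nexus/content/groups/public}"
  mkdir -p "$(dirname "$target")"
  # Keep a backup rather than silently overwriting an existing file.
  if [ -f "$target" ]; then
    cp "$target" "$target.bak"
  fi
  cat > "$target" <<EOF
<settings>
  <mirrors>
    <mirror>
      <id>nexus-aliyun</id>
      <mirrorOf>central</mirrorOf>
      <name>Nexus aliyun</name>
      <url>$url</url>
    </mirror>
  </mirrors>
</settings>
EOF
}

# Usage:
#   write_mirror_settings "$HOME/.m2/settings.xml"
```

Because `<mirrorOf>` is `central`, only requests for Maven Central are redirected; other repositories declared in a project's pom.xml are still contacted directly.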

### 1.2 Installing via apt or yum
apt install maven
2. Installing git
-------
yum install git
$ git --version
git version 2.25.1
3. Building Hudi
--------
### 3.1 Cloning the source from a mirror in China
git clone --branch release-0.10.1 https://gitee.com/apache/Hudi.git
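### 3.2 Editing pom.xml
vim Hudi/pom.xml
Add the Aliyun repository to the repositories section:
<repository>
    <id>nexus-aliyun</id>
    <name>nexus-aliyun</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
    <releases>
        <enabled>true</enabled>
    </releases>
    <snapshots>
        <enabled>false</enabled>
    </snapshots>
</repository>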
### 3.3 Building
Build options for different Spark versions:
| Maven build options | Expected Spark bundle jar name | Notes |
| :-- | :-- | :-- |
| (empty) | hudi-spark-bundle\_2.11 (legacy bundle name) | For Spark 2.4.4 and Scala 2.11 (default options) |
| `-Dspark2.4` | hudi-spark2.4-bundle\_2.11 | For Spark 2.4.4 and Scala 2.11 (same as default) |
| `-Dspark2.4 -Dscala-2.12` | hudi-spark2.4-bundle\_2.12 | For Spark 2.4.4 and Scala 2.12 |
| `-Dspark3.1 -Dscala-2.12` | hudi-spark3.1-bundle\_2.12 | For Spark 3.1.x and Scala 2.12 |
| `-Dspark3.2 -Dscala-2.12` | hudi-spark3.2-bundle\_2.12 | For Spark 3.2.x and Scala 2.12 |
| `-Dspark3` | hudi-spark3-bundle\_2.12 (legacy bundle name) | For Spark 3.2.x and Scala 2.12 |
| `-Dscala-2.12` | hudi-spark-bundle\_2.12 (legacy bundle name) | For Spark 2.4.4 and Scala 2.12 |
mvn clean package -DskipTests -Dspark3 -Dscala-2.12
The build took a full weekend day, but it finally succeeded.
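The explicit Spark rows of the table can be turned into a small lookup so the flags are not typed from memory, and the resulting bundle jar can be located afterwards. This is only a sketch: both function names are hypothetical, the mapping simply mirrors the table, and whether a given flag exists depends on the Hudi release that is checked out (this article used `-Dspark3 -Dscala-2.12` on release-0.10.1).

```shell
#!/usr/bin/env bash
# Map a Spark/Scala pair to the Maven options listed in the table above.
# hudi_build_flags is a hypothetical helper; it only covers the explicit rows.
hudi_build_flags() {
  local spark="$1" scala="${2:-2.12}"
  case "$spark/$scala" in
    2.4/2.11) echo "-Dspark2.4" ;;
    2.4/2.12) echo "-Dspark2.4 -Dscala-2.12" ;;
    3.1/2.12) echo "-Dspark3.1 -Dscala-2.12" ;;
    3.2/2.12) echo "-Dspark3.2 -Dscala-2.12" ;;
    *) echo "no table row for Spark $spark with Scala $scala" >&2; return 1 ;;
  esac
}

# List Spark bundle jars found under any target/ directory below the given root.
list_bundle_jars() {
  find "${1:-.}" -type f -path '*/target/*' -name 'hudi-spark*-bundle_*.jar'
}

# Usage, from the Hudi source directory (the substitution is left unquoted on
# purpose so that the two options are passed as separate arguments):
#   mvn clean package -DskipTests $(hudi_build_flags 3.1 2.12)
#   list_bundle_jars .
```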

### 4. Troubleshooting
#### **Q1: Failed to collect dependencies at io.confluent:kafka-avro-serializer:jar**
[ERROR] Failed to execute goal on project hudi-utilities_2.12: Could not resolve dependencies for project org.apache.hudi:hudi-utilities_2.12:jar:0.10.1: Failed to collect dependencies at io.confluent:kafka-avro-serializer:jar:5.3.4: Failed to read artifact descriptor for io.confluent:kafka-avro-serializer:jar:5.3.4: Could not transfer artifact io.confluent:kafka-avro-serializer:pom:5.3.4 from/to maven-default-http-blocker (http://0.0.0.0/): Blocked mirror for repositories: [nexus-aliyun (http://maven.aliyun.com/nexus/content/groups/public/, default, releases)] -> [Help 1]
Solution: re-enable the original mirror as well, because the Aliyun repository does not host this artifact.
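The `maven-default-http-blocker` named in the error is the built-in mirror that Maven 3.8.1 and later use to block repositories served over plain `http://`. A quick way to see which configured URLs are affected is to scan the Maven XML files for them. The sketch below uses a hypothetical helper name; it matches every `<url>` element, so in a pom.xml it may also list non-repository links such as license or project URLs.

```shell
#!/usr/bin/env bash
# Print every plain-http <url> value found in the given Maven XML files.
# list_http_repo_urls is a hypothetical helper; it does not distinguish
# repository URLs from other <url> elements (license links, project pages).
list_http_repo_urls() {
  grep -ho '<url>http://[^<]*</url>' "$@" 2>/dev/null \
    | sed -e 's/<url>//' -e 's/<\/url>//'
}

# Usage:
#   list_http_repo_urls "$HOME/.m2/settings.xml" Hudi/pom.xml
```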
Note: starting from version 0.11, Hudi no longer requires spark-avro to be specified using --packages.

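#### **Q2: The goal you specified requires a project to execute but there is no POM in this directory (/root). Please verify you invoked Maven from the correct directory**
Solution: Maven was run from /root, which has no pom.xml. Change into the directory that contains pom.xml (the cloned Hudi directory) and run the command there.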
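A small wrapper can guard against this mistake by refusing to start when the target directory has no pom.xml. This is a sketch only: `mvn_in` and the `MVN` override variable are hypothetical names introduced here for illustration.

```shell
#!/usr/bin/env bash
# Run Maven inside a project directory, failing early if it has no pom.xml.
# mvn_in is a hypothetical helper; MVN is a hypothetical variable that lets
# the Maven executable be swapped out, e.g. for a dry run.
mvn_in() {
  local project_dir="$1"
  shift
  if [ ! -f "$project_dir/pom.xml" ]; then
    echo "no pom.xml in $project_dir" >&2
    return 1
  fi
  # A subshell keeps the caller's working directory unchanged.
  ( cd "$project_dir" && "${MVN:-mvn}" "$@" )
}

# Usage:
#   mvn_in Hudi clean package -DskipTests -Dspark3 -Dscala-2.12
```

Maven's own `-f` (`--file`) option, as in `mvn -f Hudi/pom.xml ...`, is another way to point it at a pom.xml without changing directory.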