4. Installing and Deploying Spark (Spark on YARN Mode)
2022-07-06 09:15:00 【@小蜗牛】
Contents
- 4.1 Extract the Spark archive to the user's home directory
- 4.2 Configure the Hadoop environment variables
- 4.3 Verify the Spark installation
- 4.4 Restart the Hadoop cluster (to apply the configuration)
- 4.5 Enter the Spark installation directory
- 4.6 Install and deploy Spark SQL
4.1 Use the following commands to extract the Spark archive to the user's home directory:
[zkpk@master ~]$ cd /home/zkpk/tgz/spark/
[zkpk@master spark]$ tar -xzvf spark-2.1.1-bin-hadoop2.7.tgz -C /home/zkpk/
[zkpk@master spark]$ cd
[zkpk@master ~]$ cd spark-2.1.1-bin-hadoop2.7/
[zkpk@master spark-2.1.1-bin-hadoop2.7]$ ls -l
Running `ls -l` lists the files shipped with the Spark distribution, as shown in the screenshot:
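For reference, a typical spark-2.1.1-bin-hadoop2.7 distribution contains entries along these lines (a sketch; exact sizes and dates will differ):

```
LICENSE  NOTICE  README.md  RELEASE
R/  bin/  conf/  data/  examples/  jars/  licenses/  python/  sbin/  yarn/
```

The `bin/` directory holds the spark-submit, spark-shell, and spark-sql scripts used below, and `examples/jars/` holds the example jar run in section 4.5.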
4.2 Configure the Hadoop environment variables
4.2.1 Running Spark on YARN requires the HADOOP_CONF_DIR, YARN_CONF_DIR, and HDFS_CONF_DIR environment variables
4.2.1.1 Commands:
[zkpk@master ~]$ cd
[zkpk@master ~]$ gedit ~/.bash_profile
4.2.1.2 Append the following lines at the end of the file, then save and exit
#SPARK ON YARN
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HDFS_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
4.2.1.3 Re-source the file so the variables take effect
[zkpk@master ~]$ source ~/.bash_profile
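All three variables point at the same Hadoop configuration directory. A minimal, self-contained sketch of how they are derived (it falls back to the tutorial's hypothetical Hadoop location of /home/zkpk/hadoop-2.7.3 when HADOOP_HOME is unset):

```shell
#!/bin/sh
# Fall back to the tutorial's hypothetical Hadoop location if HADOOP_HOME is unset.
HADOOP_HOME="${HADOOP_HOME:-/home/zkpk/hadoop-2.7.3}"

# All three Spark-on-YARN variables point at the same config directory.
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"
export HDFS_CONF_DIR="$HADOOP_HOME/etc/hadoop"
export YARN_CONF_DIR="$HADOOP_HOME/etc/hadoop"

# Print them for a quick eyeball check after `source ~/.bash_profile`.
echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR"
echo "HDFS_CONF_DIR=$HDFS_CONF_DIR"
echo "YARN_CONF_DIR=$YARN_CONF_DIR"
```

After sourcing ~/.bash_profile, each `echo` should print a path ending in `etc/hadoop`; an empty value means the profile edit did not take effect.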
4.3 Verify the Spark installation
4.3.1 Modify ${HADOOP_HOME}/etc/hadoop/yarn-site.xml
Note: make the same change on the master, slave01, and slave02 nodes. (These two properties disable YARN's physical and virtual memory checks, which can otherwise kill Spark containers on small test machines.)
4.3.2 Add the following two properties
[zkpk@master ~]$ vim ~/hadoop-2.7.3/etc/hadoop/yarn-site.xml
<property>
  <name>yarn.nodemanager.pmem-check-enabled</name>
  <value>false</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
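A quick way to confirm the edit landed on each node is to count the two property names in the file. The sketch below writes the fragment to a temporary file purely so it runs anywhere; on a real node, point CONF at ~/hadoop-2.7.3/etc/hadoop/yarn-site.xml instead:

```shell
#!/bin/sh
# Illustrative stand-in for yarn-site.xml; on a real node use the actual file path.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<property>
  <name>yarn.nodemanager.pmem-check-enabled</name>
  <value>false</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
EOF

# Both property names contain "mem-check-enabled", so a correct edit yields 2.
COUNT=$(grep -c 'mem-check-enabled' "$CONF")
echo "$COUNT"   # 2
rm -f "$CONF"
```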
![screenshot](https://img-blog.csdnimg.cn/30b25836994545c191442ff18f227621.png)
4.4 Restart the Hadoop cluster (to apply the configuration)
[zkpk@master ~]$ stop-all.sh
[zkpk@master ~]$ start-all.sh
4.5 Enter the Spark installation directory
[zkpk@master ~]$ cd ~/spark-2.1.1-bin-hadoop2.7
4.5.1 Run the following command (note: it is a single line):
[zkpk@master spark-2.1.1-bin-hadoop2.7]$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --num-executors 3 --driver-memory 1g --executor-memory 1g --executor-cores 1 examples/jars/spark-examples*.jar 10
4.5.2 The command produces output like the screenshot below:
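The flags above ask YARN for one driver container plus three executor containers. A back-of-envelope sketch of the total memory requested, assuming Spark's default per-container overhead of max(384 MB, 10% of the heap) as in Spark 2.1:

```shell
#!/bin/sh
HEAP_MB=1024                 # --driver-memory 1g and --executor-memory 1g
NUM_EXECUTORS=3              # --num-executors 3

# Per-container overhead: max(384 MB, 10% of the heap). 10% of 1024 is 102, so 384 wins.
OVERHEAD_MB=$(( HEAP_MB / 10 > 384 ? HEAP_MB / 10 : 384 ))

CONTAINER_MB=$(( HEAP_MB + OVERHEAD_MB ))           # 1408 MB per container
TOTAL_MB=$(( (NUM_EXECUTORS + 1) * CONTAINER_MB ))  # driver + 3 executors
echo "$TOTAL_MB"   # 5632
```

In practice YARN also rounds each request up to a multiple of yarn.scheduler.minimum-allocation-mb, so the figure shown in the ResourceManager UI can be somewhat higher.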
4.5.3 Web UI verification
4.5.3.1 Start the spark-shell interactive terminal:
[zkpk@master spark-2.1.1-bin-hadoop2.7]$ ./bin/spark-shell
4.5.3.2 Open a browser and visit http://master:4040/ to view the running application's UI
4.5.3.3 Exit the interactive terminal by pressing Ctrl+D, or typing:
scala> :quit
4.6 Install and deploy Spark SQL
4.6.1 Copy hdfs-site.xml from the Hadoop configuration directory into Spark's conf directory
[zkpk@master spark-2.1.1-bin-hadoop2.7]$ cd
[zkpk@master ~]$ cd hadoop-2.7.3/etc/hadoop/
[zkpk@master hadoop]$ cp hdfs-site.xml /home/zkpk/spark-2.1.1-bin-hadoop2.7/conf
4.6.2 Copy hive-site.xml from Hive's conf subdirectory into Spark's conf directory
[zkpk@master hadoop]$ cd
[zkpk@master ~]$ cd apache-hive-2.1.1-bin/conf/
[zkpk@master conf]$ cp hive-site.xml /home/zkpk/spark-2.1.1-bin-hadoop2.7/conf/
4.6.3 Edit hive-site.xml in the Spark conf directory
[zkpk@master conf]$ cd
[zkpk@master ~]$ cd spark-2.1.1-bin-hadoop2.7/conf/
[zkpk@master conf]$ vim hive-site.xml
4.6.3.1 Add the following property
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/spark/warehouse</value>
</property>
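This property points the Spark SQL warehouse at an HDFS path. If the directory does not exist yet, it can be created ahead of time, in the tutorial's transcript style (a sketch; the path matches the property above):

```
[zkpk@master ~]$ hdfs dfs -mkdir -p /user/spark/warehouse
[zkpk@master ~]$ hdfs dfs -ls /user/spark
```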
4.6.4 Copy the MySQL JDBC driver into Spark's jars subdirectory
[zkpk@master conf]$ cd
[zkpk@master ~]$ cd apache-hive-2.1.1-bin/lib/
[zkpk@master lib]$ cp mysql-connector-java-5.1.28.jar /home/zkpk/spark-2.1.1-bin-hadoop2.7/jars/
4.6.5 Restart the Hadoop cluster and verify spark-sql; if the spark-sql client starts as in the screenshot below, Spark SQL is configured correctly
[zkpk@master lib]$ cd
[zkpk@master ~]$ stop-all.sh
[zkpk@master ~]$ start-all.sh
[zkpk@master ~]$ cd ~/spark-2.1.1-bin-hadoop2.7
[zkpk@master spark-2.1.1-bin-hadoop2.7]$ ./bin/spark-sql --master yarn
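Once the spark-sql prompt appears, a simple query confirms that the Hive metastore connection works (a sketch in the tutorial's transcript style; the database and table lists will vary):

```
spark-sql> show databases;
spark-sql> show tables;
```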
4.6.6 Press Ctrl+D to exit the spark-sql client
4.6.7 If the Hadoop cluster is no longer needed, shut it down
[zkpk@master spark-2.1.1-bin-hadoop2.7]$ cd
[zkpk@master ~]$ stop-all.sh