Integrating spark-sql with CDH 6.x
2022-08-04 01:49:00 【涤生大数据】
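Preface
CDH releases strip the spark-sql tool out of their bundled Spark by default, yet many companies still need it. We had this requirement in production, so this post describes how we added spark-sql support on CDH 6.x.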
Versions
| Component | Version |
| CDH | CDH 6.2.1 |
| spark | spark-2.4.8 |
Step 1: Download upstream Apache Spark
# cd /opt/cloudera/parcels/CDH/lib
# wget http://archive.apache.org/dist/spark/spark-2.4.8/spark-2.4.8-bin-hadoop2.7.tgz
# tar zxvf spark-2.4.8-bin-hadoop2.7.tgz
# ln -s spark-2.4.8-bin-hadoop2.7 spark2
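The link is what lets every later path refer to /opt/cloudera/parcels/CDH/lib/spark2. Because ln -s takes the existing target first and the new link name second, it can be worth rehearsing the command in a scratch directory before running it under the parcel path. The sketch below does that; the scratch directory and the empty stand-in directory are illustrative only.

```shell
# Rehearse the symlink step in a throwaway directory (nothing here touches CDH).
scratch=$(mktemp -d)
mkdir "$scratch/spark-2.4.8-bin-hadoop2.7"   # stand-in for the unpacked tarball

# ln -s TARGET LINK_NAME: spark2 becomes a link pointing at the unpacked directory
ln -s spark-2.4.8-bin-hadoop2.7 "$scratch/spark2"

# Print what the link points at
readlink "$scratch/spark2"
```

If the link was created in the right direction, readlink prints the versioned directory name.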
Step 2: Edit the Spark configuration files
2.1 Configure spark-env.sh
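A minimal sketch of the kind of entries this file needs, not the author's exact listing. HADOOP_CONF_DIR and YARN_CONF_DIR are standard Spark environment variables that tell Spark where the cluster's client configuration lives. The value /etc/hadoop/conf is an assumption about where the CDH client configuration is deployed on the host, so check it against your own nodes.

```shell
# spark-env.sh sketch; both paths are assumptions about the CDH client layout
export HADOOP_CONF_DIR=/etc/hadoop/conf   # HDFS/YARN client configs read by Spark
export YARN_CONF_DIR=/etc/hadoop/conf     # assumed to be the same directory here
```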
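2.2 Configure spark-defaults.conf
Tip: copy the configuration file from the existing Spark installation and edit that copy.
# vim /opt/cloudera/parcels/CDH/lib/spark2/conf/spark-defaults.conf
Tip: only the entries highlighted in red need to change; everything else can keep its default value.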
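When only a few keys differ from the copied file, the changes can be scripted instead of made by hand. The helper below is a hypothetical sketch (set_spark_conf is not part of Spark): it replaces any existing line for a key and appends the new value, relying on the whitespace-separated "key value" layout that spark-defaults.conf uses.

```shell
# Set or replace one "key value" entry in a spark-defaults.conf style file.
# $1: file path, $2: property name, $3: property value
set_spark_conf() {
  conf_file=$1
  key=$2
  value=$3
  touch "$conf_file"
  # Keep every line whose first field is not the key, then append the new entry
  awk -v k="$key" '$1 != k' "$conf_file" > "$conf_file.tmp"
  printf '%s %s\n' "$key" "$value" >> "$conf_file.tmp"
  mv "$conf_file.tmp" "$conf_file"
}
```

For example, set_spark_conf /opt/cloudera/parcels/CDH/lib/spark2/conf/spark-defaults.conf spark.master yarn would point this Spark at YARN. Which keys actually need changing depends on the file you copied.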
2.3 Configure the log level
# vim /opt/cloudera/parcels/CDH/lib/spark2/conf/log4j.properties
Append the log-level entries to the end of the file and leave everything else at its defaults.
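Spark 2.4 logs through log4j 1.x, and its bundled log4j.properties.template sets log4j.rootCategory=INFO, console. As a sketch of the append step, the hypothetical helper below adds an overriding root level to the end of the file. It assumes the file was created from that template, so an appender named console already exists, and the WARN default is only an illustration, not the author's setting.

```shell
# Append a root log level override to a log4j 1.x properties file.
# $1: path to log4j.properties, $2: level (defaults to WARN)
set_root_log_level() {
  conf_file=$1
  level=${2:-WARN}
  # When a key is repeated in a properties file, the later entry wins
  printf 'log4j.rootCategory=%s, console\n' "$level" >> "$conf_file"
}
```

A call such as set_root_log_level /opt/cloudera/parcels/CDH/lib/spark2/conf/log4j.properties keeps the interactive spark-sql prompt free of INFO-level output.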
Step 3: Configure dependency jars
3.1 Upload the Spark dependency jars
3.2 Configure the LZO jar
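Both sub-steps come down to placing jar files where this Spark installation loads them, which for a standard Spark layout is the jars directory under the installation root. The helper below is a hypothetical sketch of that copy with a check that the source exists. Which jars are required, and where they live on your cluster, are not stated here and must come from your own environment.

```shell
# Copy one jar into a Spark jars directory, failing loudly if the source is absent.
# $1: path to the jar, $2: destination jars directory
install_jar() {
  src=$1
  jars_dir=$2
  if [ ! -f "$src" ]; then
    echo "missing jar: $src" >&2
    return 1
  fi
  cp "$src" "$jars_dir/"
}
```

As an illustration only, on clusters where hadoop-lzo is delivered by the GPLEXTRAS parcel, the call might look like install_jar /opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/hadoop-lzo.jar /opt/cloudera/parcels/CDH/lib/spark2/jars. The GPLEXTRAS path is an assumption and may differ on your hosts.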
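Step 4: Configure global environment variables for spark-sql
# vim /etc/profile.d/spark.sh
To apply the change in the current shell, source the file (for example, source /etc/profile.d/spark.sh) or log in again.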
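A minimal sketch of what /etc/profile.d/spark.sh could contain, assuming the spark2 link from step 1. The variable name spark_sql_home is made up for this example. The sketch appends to PATH rather than prepending, so commands that are already on PATH keep their priority, and it deliberately leaves SPARK_HOME unset on the assumption that the Spark 2.4 launcher scripts locate their installation from their own path in that case. Both are choices of this sketch, not necessarily the author's.

```shell
# Sketch of /etc/profile.d/spark.sh; the directory matches the spark2 link from step 1
spark_sql_home=/opt/cloudera/parcels/CDH/lib/spark2

# Appended, so commands already on PATH keep their priority
export PATH="$PATH:$spark_sql_home/bin"
```

Once the file is sourced, command -v spark-sql should print a path under the spark2 directory.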
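Step 5: Test
Run the spark-sql command from any directory. Note: the Linux user running it must have permission to submit jobs to YARN.
If everything is set up correctly, running show databases; lists all databases in the cluster.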
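The same check can be run non-interactively, which is handy when verifying several client machines. The function below is a hypothetical sketch: it first confirms that spark-sql resolves on PATH, then uses the -e option of spark-sql to run a single statement and exit.

```shell
# Run a one-off query through spark-sql if the command is on PATH.
spark_sql_smoke_test() {
  if ! command -v spark-sql >/dev/null 2>&1; then
    echo "spark-sql not found on PATH" >&2
    return 1
  fi
  # -e runs the quoted statement and exits instead of opening the prompt
  spark-sql -e 'show databases;'
}
```

Call spark_sql_smoke_test as a user that is allowed to submit jobs to YARN; a non-zero exit status means either the command is missing or the query failed.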

Note: if other client machines need this environment, scp all of the configuration above to them.