How to Deploy Apache DolphinScheduler 1.2.0 on CDH 5.16.2
2022-06-12 04:52:00 【sinat_ twenty-eight million three hundred and seventy-one thous】
Apache DolphinScheduler
Component introduction
Apache DolphinScheduler is a distributed, extensible, visual DAG workflow task scheduling system. It aims to untangle the complex dependencies that arise in data processing, so that the scheduling system works out of the box in data processing flows.
Official website: https://dolphinscheduler.apache.org/en-us/
GitHub: https://github.com/apache/incubator-dolphinscheduler
Deployment environment
- CDH test environment
  - 6 machines
  - The worker is deployed on the gateway node
  - The master and the monitoring web UI are deployed on the CM (Cloudera Manager) node
  - The Hive and Spark gateways are already deployed on the gateway node
- Platform versions
  - CDH 5.16.2
  - DolphinScheduler 1.2.0
- Basic software
  - PostgreSQL or MySQL to store metadata
Front-end deployment
Download the installation package
https://dolphinscheduler.apache.org/en-us/docs/release/download.html

- Create the deployment folder /opt/ds, upload the tar package into it, and unpack it
# create the deployment directory
mkdir -p /opt/ds/ds-ui;
# unpack into /opt/ds
tar -zxvf apache-dolphinscheduler-incubating-1.2.1-SNAPSHOT-dolphinscheduler-front-bin.tar.gz -C /opt/ds/;
# rename the unpacked directory (run from /opt/ds, where the archive was extracted)
cd /opt/ds;
mv apache-dolphinscheduler-incubating-1.2.1-SNAPSHOT-dolphinscheduler-front-bin ds-1.2.0-ui;


- Check the yum repositories. This is a development environment, so access to the external network goes through a proxy. nginx needs to be installed
- Enter the ds-1.2.0-ui directory and run the install-dolphinscheduler-ui.sh setup script
  - Change the front-end port to 8886 to avoid a conflict with the Hue port
  - Set the api-server IP
  - Set the api-server port
  - Choose the CentOS 7 installation option

Modify the nginx upload size parameter
- Add the nginx setting client_max_body_size 1024m;
- Restart nginx
- This step is required. Otherwise, large resource files cannot be uploaded to the resource center
vi /etc/nginx/nginx.conf
# add this parameter (for example inside the http block)
client_max_body_size 1024m;
# restart nginx
systemctl restart nginx
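
Before restarting, it can help to confirm that the setting actually landed in the configuration file. The following is a minimal sketch, assuming the file path used above; the helper name has_body_size_limit is hypothetical and not part of nginx or DolphinScheduler.

```shell
# Hypothetical helper: succeeds if the given nginx config file contains
# an uncommented client_max_body_size directive.
has_body_size_limit() {
    grep -Eq '^[[:space:]]*client_max_body_size[[:space:]]+[0-9]+[kKmMgG]?[[:space:]]*;' "$1"
}

# Example usage against the file edited above (path taken from this guide).
if [ -f /etc/nginx/nginx.conf ]; then
    if has_body_size_limit /etc/nginx/nginx.conf; then
        echo "client_max_body_size is set"
    else
        echo "client_max_body_size is missing"
    fi
fi
```

Running nginx -t before the restart additionally checks the whole configuration for syntax errors.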

Visit the front-end page on port 8888 (8886 after the customization above). When the loading page appears, the front-end web installation is complete.

Back-end deployment
Download the installation package
https://dolphinscheduler.apache.org/en-us/docs/release/download.html

Upload the tar package to /opt/ds and unpack it
tar -zxvf apache-dolphinscheduler-incubating-1.2.1-SNAPSHOT-dolphinscheduler-backend-bin.tar.gz -C /opt/ds/;
# rename the unpacked directory (run from /opt/ds, where the archive was extracted)
cd /opt/ds;
mv apache-dolphinscheduler-incubating-1.2.1-SNAPSHOT-dolphinscheduler-backend-bin ds-1.2.0-backend;


Create the deployment user
- Create the deployment user and set its password (on all deployment machines)
- Add the deployment user to the hadoop group so that HDFS can be used as the resource center
- Configure passwordless sudo
# add user dscheduler
useradd dscheduler;
# set the user's password
passwd dscheduler;
# add the user to the hadoop group
usermod -a -G hadoop dscheduler;
# grant passwordless sudo: add the line below to /etc/sudoers
vi /etc/sudoers;
dscheduler ALL=(ALL) NOPASSWD: ALL

- Switch to the deployment user and configure passwordless SSH login between the machines. A pseudo-distributed setup also needs passwordless login to the local machine
su dscheduler;
ssh-keygen -t rsa;
# configure passwordless login both between machines and to the local machine; replace [hostname] with each host that needs it
ssh-copy-id -i ~/.ssh/id_rsa.pub dscheduler@[hostname];
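
With several machines, the ssh-copy-id step has to be repeated once per host. The sketch below only prints the commands so the list can be reviewed first; the helper name print_copy_id_cmds and the host names node1 to node3 are placeholders, not values from this cluster.

```shell
# Hypothetical helper: print one ssh-copy-id command per host so the list can
# be reviewed before running it. Usage: print_copy_id_cmds USER HOST...
print_copy_id_cmds() {
    user="$1"
    shift
    for host in "$@"; do
        echo "ssh-copy-id -i ~/.ssh/id_rsa.pub ${user}@${host}"
    done
}

# node1..node3 are placeholder host names; substitute the real ones.
print_copy_id_cmds dscheduler node1 node2 node3
```

Each printed command is then run as the deployment user, including one for the local host when the setup is pseudo-distributed.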


Database initialization
- Log in to the MySQL instance of the CDH cluster
  - mysql -uroot -p
- The default database is PostgreSQL. To use MySQL, add the mysql-connector-java package to the lib directory
- Run the database initialization statements and set the access account and password
CREATE DATABASE dscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL PRIVILEGES ON dscheduler.* TO 'dscheduler'@'%' IDENTIFIED BY 'xxxx';
GRANT ALL PRIVILEGES ON dscheduler.* TO 'dscheduler'@'localhost' IDENTIFIED BY 'xxxx';
FLUSH PRIVILEGES;


- Create the tables and import the base data
  - Edit the application-dao.properties file in the conf directory
  - Comment out the PostgreSQL settings and use the MySQL settings
  - Add the mysql-connector-java package to the lib directory

- Run create-dolphinscheduler.sh in the script directory

Configure environment variables
- Modify directory permissions
chown -R dscheduler:dscheduler ds-1.2.0-backend/;
chmod -R 755 ds-1.2.0-backend/;

- Edit the .dolphinscheduler_env.sh file in the conf/env directory
  - In DS 1.2.0 the Spark task component can only submit Spark1 tasks
  - Set both SPARK_HOME1 and SPARK_HOME2 to the cluster's Spark2 home
  - Alternatively, comment out SPARK_HOME1
  - Flink is not deployed in this cluster, so leave its parameters unchanged

- Symlink the JDK's java binary to /usr/bin/java
ln -s /usr/java/jdk1.8.0_131/bin/java /usr/bin/java

- Edit the install.sh configuration to match your own cluster
  - Parameters to note
    - installPath - where to install DS, for example /opt/ds-agent
    - zkQuorum - must be in ip:2181 form; remember to include port 2181
    - deployUser - the deployment user, which needs permission to operate on HDFS
  - To use HDFS as the resource center in an HA setup, copy the cluster's core-site.xml and hdfs-site.xml files into the conf directory
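
Since a zkQuorum entry without its port is an easy mistake to make, the value can be checked before running the installer. This is a minimal sketch, assuming zkQuorum is a comma-separated list; the helper name check_zk_quorum and the addresses are hypothetical.

```shell
# Hypothetical helper: succeeds only if every comma-separated entry in the
# given string has the form host:2181.
check_zk_quorum() {
    [ -n "$1" ] || return 1
    old_ifs="$IFS"
    IFS=','
    for entry in $1; do
        case "$entry" in
            ?*:2181) ;;
            *) IFS="$old_ifs"; return 1 ;;
        esac
    done
    IFS="$old_ifs"
    return 0
}

# The addresses below are placeholders, not values from this cluster.
zkQuorum="192.168.1.10:2181,192.168.1.11:2181,192.168.1.12:2181"
if check_zk_quorum "$zkQuorum"; then
    echo "zkQuorum looks well-formed"
else
    echo "zkQuorum has an entry without :2181"
fi
```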
Install kazoo
- Install the Python ZooKeeper client
- The CDH cluster defaults to Python 2.7
yum -y install python-pip;
pip install kazoo;

- Run the install script: sh install.sh
- Use jps on the worker and master machines to check that the services have started
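
The jps check can be scripted so that missing processes stand out. The sketch below is an illustration only: the helper name missing_services is hypothetical, and the process names MasterServer and WorkerServer are assumptions about what jps prints for the DolphinScheduler servers, so adjust them to what appears on your nodes.

```shell
# Hypothetical helper: read `jps` output on stdin and print each expected
# process name that is absent. Usage: jps | missing_services NAME...
missing_services() {
    # jps prints "<pid> <name>" per line; keep only the names
    names="$(awk '{print $2}')"
    for name in "$@"; do
        if ! printf '%s\n' "$names" | grep -Fx "$name" >/dev/null; then
            echo "$name"
        fi
    done
}

# Demonstration with sample input instead of a live jps call.
printf '%s\n' '1234 MasterServer' '5678 Jps' | missing_services MasterServer WorkerServer
```

On a real node the call would be jps | missing_services followed by the process names expected on that machine; empty output means everything expected is running.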


- Access the front end
  - User name: admin
  - Password: dolphinscheduler123


- DolphinScheduler 1.2.0 deployment is complete
DAG test
- Create a tenant

- Create a user
  - If tenant creation fails, check whether the resource center is enabled

- Create a new project and a new workflow

- Run the workflow and view the execution results

- This completes the DolphinScheduler 1.2.0 DAG demo test