How to deploy dolphin scheduler 1.3.1 on cdh5
2022-06-12 04:53:00 【sinat_ twenty-eight million three hundred and seventy-one thous】
This post records in detail the process of integrating Dolphin Scheduler 1.3.1 on a CDH 5.16.2 cluster. Pay special attention to the MySQL database connection string!
1. Purpose of this document
- Record in detail the process of deploying Dolphin Scheduler 1.3.1 on CDH5
- Deploy Dolphin Scheduler in distributed mode
2. Deployment environment and dependent components
To adapt to the Hive version shipped with CDH5, the DS source code has to be recompiled before deployment. The compiled CDH5 build is available for download at the end of this post.
Cluster environment
- CDH 5.16.2
- HDFS and YARN are both single-node (no HA)
- DS official website: https://dolphinscheduler.apache.org/en-us/
DS dependent components
- MySQL: stores the Dolphin Scheduler metadata. PostgreSQL works too; MySQL is used here mainly because the CDH cluster already runs it.
- Zookeeper: reuse the CDH cluster's ZooKeeper.
3. Dolphin Scheduler 1.3.1 cluster planning

| DS service | master.eights.com | dn1.eights.com | dn2.eights.com |
|---|---|---|---|
| api | | √ | |
| master | √ | √ | |
| worker/log | √ | | √ |
| alert | | √ | |
4. Source code compilation
Prerequisites
- maven
- jdk
- nvm
Pull the code

```shell
git clone https://github.com/apache/incubator-dolphinscheduler.git
```
Create a CDH5 branch

```shell
git checkout 1.3.1-release
git checkout -b ds-1.3.1-cdh5.16.2
```
Modify the pom
In the root pom.xml, change the hadoop version, hive version, and version information; set the version of every module to 1.3.1-cdh5.16.2.

```xml
<hadoop.version>2.6.0</hadoop.version>
<hive.jdbc.version>1.1.0</hive.jdbc.version>
<version>1.3.1-cdh5.16.2</version>
```
Remove the scope of the mysql dependency, so the MySQL JDBC driver is bundled into the distribution.
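For reference, the change is just to drop the `<scope>` element on the MySQL connector dependency so the driver jar lands in the packaged lib directory. A sketch of what the dependency ends up looking like (the exact scope value removed may differ in your checkout):

```xml
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>${mysql.connector.version}</version>
    <!-- the <scope> element (e.g. <scope>test</scope>) is removed here
         so the driver ships with the distribution -->
</dependency>
```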
Run the compile command to build the source:

```shell
mvn -U clean package -Prelease -Dmaven.test.skip=true
```
After compilation, the dolphinscheduler-dist directory contains apache-dolphinscheduler-incubating-1.3.1-cdh5.16.2-dolphinscheduler-bin.tar.gz. In 1.3.1 the front end and back end are packaged together, so there are no longer two separate packages.
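A small sketch for locating the artifact after the build finishes (`find_ds_package` is our helper name, not part of DS; the glob tolerates version-string differences):

```shell
# find_ds_package: locate the binary tarball the build drops under the
# given dist directory (hypothetical helper, not part of DS).
find_ds_package() {
  find "$1" -name 'apache-dolphinscheduler-*-bin.tar.gz' 2>/dev/null
}
# Usage: find_ds_package dolphinscheduler-dist
```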
5. Component deployment
Preparation
- Create a deployment user and configure passwordless SSH
Create the deployment user on every deployment machine. When DS executes a task it uses `sudo -u [linux-user]` to run the job; here dscheduler is used as the deployment user.
```shell
# Add the deployment user
useradd dscheduler
# Set its password
echo "dscheduler" | passwd --stdin dscheduler
# Allow passwordless sudo
echo 'dscheduler ALL=(ALL) NOPASSWD: ALL' >> /etc/sudoers
# Switch to the deployment user and set up the SSH key
su dscheduler
ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub dscheduler@[hostname]
```
Create the DS metadata database in MySQL

```sql
CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE USER 'dscheduler'@'%' IDENTIFIED BY 'dscheduler';
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dscheduler'@'%' IDENTIFIED BY 'dscheduler';
flush privileges;
```
Unpack the installation package & adjust permissions
Upload the installation package to the /opt directory on the cluster and unpack it:

```shell
# Unpack the installation package
tar -zxvf apache-dolphinscheduler-incubating-1.3.1-cdh5.16.2-dolphinscheduler-bin.tar.gz -C /opt/
# Rename
mv apache-dolphinscheduler-incubating-1.3.1-cdh5.16.2-dolphinscheduler-bin ds-1.3.1-cdh5.16.2
# Adjust file permissions and ownership
chmod -R 755 ds-1.3.1-cdh5.16.2
chown -R dscheduler:dscheduler ds-1.3.1-cdh5.16.2
```
Initialize MySQL
Modify the database configuration
Pay special attention to the database connection configuration!

```properties
# /opt/ds-1.3.1-cdh5.16.2/conf/datasource.properties
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.url=jdbc:mysql://cm.eights.com:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8&allowMultiQueries=true
spring.datasource.username=dscheduler
spring.datasource.password=dscheduler
```
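Typos in this connection string are the usual cause of initialization failures. A minimal pre-flight sketch that checks the three URL parameters called out above (`check_jdbc_url` is our helper name, not a DS script):

```shell
# check_jdbc_url: verify a datasource.properties file carries the URL
# parameters DS 1.3.1 expects (hypothetical helper, not part of DS).
check_jdbc_url() {
  local props="$1" p
  for p in useUnicode=true characterEncoding=UTF-8 allowMultiQueries=true; do
    grep -q "spring.datasource.url=.*$p" "$props" || { echo "missing: $p"; return 1; }
  done
  echo "datasource URL looks good"
}
# Usage: check_jdbc_url /opt/ds-1.3.1-cdh5.16.2/conf/datasource.properties
```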
Run the database initialization script from the DS installation directory:

```shell
./script/create-dolphinscheduler.sh
```
Configure the environment variables DS needs, and get them right: DS sources dolphinscheduler_env.sh before executing each task.

```shell
# vi /opt/ds-1.3.1-cdh5.16.2/conf/env/dolphinscheduler_env.sh
# datax and flink are not on the test cluster; ignore those entries if unused
export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
export HADOOP_CONF_DIR=/opt/cloudera/parcels/CDH/lib/hadoop/etc/hadoop
export SPARK_HOME1=/opt/cloudera/parcels/CDH/lib/spark
export SPARK_HOME2=/opt/cloudera/parcels/SPARK2/lib/spark2
export PYTHON_HOME=/usr/local/anaconda3/bin/python
export JAVA_HOME=/usr/java/jdk1.8.0_131
export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
export FLINK_HOME=/opt/soft/flink
export DATAX_HOME=/opt/soft/datax/bin/datax.py
```
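A quick sanity check after editing this file can catch mistyped paths before a task fails at runtime. A minimal sketch assuming a POSIX shell (`check_env_homes` is our helper; DS itself performs no such check):

```shell
# check_env_homes: read an env file and verify every exported path exists
# on this node (hypothetical helper, not part of DS).
check_env_homes() {
  local env_file="$1" line dir missing=0
  while IFS= read -r line; do
    case "$line" in
      'export '*'='*)
        dir="${line#*=}"
        [ -e "$dir" ] || { echo "not found: $dir"; missing=1; }
        ;;
    esac
  done < "$env_file"
  return "$missing"
}
# Usage: check_env_homes /opt/ds-1.3.1-cdh5.16.2/conf/env/dolphinscheduler_env.sh
```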
Write the DS configuration file
Before 1.3.0, the one-click deployment configuration lived in install.sh. Since 1.3.0, install.sh is only the deployment script; the deployment configuration lives in conf/config/install_config.conf. install_config.conf drops many rarely-needed parameters; for further tuning, edit the configuration files of the corresponding modules under conf.
The cluster deployment configuration used for this deployment follows:
```shell
# NOTICE: if the following config has special characters in the variable `.*[]^${}\+?|()@#&`, please escape them, for example `[` escapes to `\[`
# postgresql or mysql
dbtype="mysql"
# db config
# db address and port
dbhost="cm.eights.com:3306"
# db username
username="dscheduler"
# database name
dbname="dolphinscheduler"
# db password
# NOTICE: if there are special characters, please use \ to escape, for example `[` escapes to `\[`
password="dscheduler"
# zk cluster
zkQuorum="master.eights.com:2181,dn1.eights.com:2181,cm.eights.com:2181"
# Note: the target installation path for dolphinscheduler; do not configure it as the same as the current path (pwd)
# The directory DS is installed into on each deployment machine; role logs and task logs live under it
installPath="/opt/ds-1.3.1-agent"
# deployment user
# Note: the deployment user needs sudo privileges and permission to operate hdfs. If hdfs is enabled, the root directory needs to be created by the user
deployUser="dscheduler"
# Mail: an intranet mail server is used here; Internet mail configuration is shown later
# alert config
# mail server host
mailServerHost="xxxx"
# mail server port
# note: different protocols and encryption methods correspond to different ports; when SSL/TLS is enabled, make sure the port is correct.
mailServerPort="25"
# sender
mailSender="xxxx"
# user
mailUser="xxxx"
# sender password
# note: mail.passwd is the email service authorization code, not the email login password.
mailPassword="xxxx"
# TLS mail protocol support
starttlsEnable="false"
# SSL mail protocol support
# only one of TLS and SSL can be true.
sslEnable="false"
# note: sslTrust is the same as mailServerHost
sslTrust="xxxxxx"
# resource storage type: HDFS, S3, NONE
resourceStorageType="HDFS"
# With single-node HDFS and YARN these can be configured directly
# if resourceStorageType is HDFS, defaultFS is the namenode address; with HA you need to put core-site.xml and hdfs-site.xml in the conf directory.
# if S3, write the S3 address, for example: s3a://dolphinscheduler
# Note: with s3, be sure to create the root directory /dolphinscheduler
defaultFS="hdfs://master.eights.com:8020"
# if resourceStorageType is S3, the following three configurations are required; otherwise ignore them
s3Endpoint="http://192.168.xx.xx:9010"
s3AccessKey="xxxxxxxxxx"
s3SecretKey="xxxxxxxxxx"
# Note: even with a single-node YARN it is better to fill in yarnHaIps
# if resourcemanager HA is enabled, type the HA ips; if resourcemanager is single, leave this value empty
yarnHaIps="master.eights.com"
# if resourcemanager HA is enabled or resourcemanager is not used, skip this value; if resourcemanager is single, replace it with the actual resourcemanager hostname.
singleYarnIp="master.eights.com"
# resource store path on HDFS/S3; resource files will be stored under this hdfs path. Make sure the directory exists on hdfs with read/write permissions. /dolphinscheduler is recommended
resourceUploadPath="/dolphinscheduler"
# who has permission to create directories under the HDFS/S3 root path
# Note: if kerberos is enabled, please config hdfsRootUser=
hdfsRootUser="hdfs"
# kerberos config
# whether kerberos is enabled; if so, the following four items need to be configured, otherwise ignore them
kerberosStartUp="false"
# kdc krb5 config file path
krb5ConfPath="$installPath/conf/krb5.conf"
# keytab username
keytabUserName="[email protected]"
# username keytab path
keytabPath="$installPath/conf/hdfs.headless.keytab"
# api server port
apiServerPort="12345"
# install hosts
# Note: hostname list of the machines to install on. For pseudo-distributed, just write the one hostname
ips="master.eights.com,dn1.eights.com,dn2.eights.com"
# ssh port, default 22
# Note: if the ssh port is not the default, modify it here
sshPort="22"
# run master machines
# Note: hostname list of the hosts deploying master
masters="master.eights.com,dn1.eights.com"
# run worker machines
# 1.3.1 moved worker groups from MySQL into ZooKeeper, so each worker needs a group label here
# A worker cannot currently belong to more than one group
# note: write the worker group name for each worker; the default value is "default"
workers="master.eights.com:default,dn2.eights.com:sqoop"
# run alert machine
# note: hostname list of the machines deploying the alert server
alertServer="dn1.eights.com"
# run api machines
# note: hostname list of the machines deploying the api server
apiServers="dn1.eights.com"
```

Add the Hadoop cluster configuration files
- If HA is not enabled on the cluster, configure defaultFS directly in install_config.conf
- If HA is enabled, copy hadoop's hdfs-site.xml and core-site.xml into the conf directory
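The NOTICE at the top of install_config.conf asks for special characters in values (the password in particular) to be backslash-escaped. A minimal sketch of a hypothetical helper for that (`escape_conf` is our name, not part of DS; it covers the bracket/anchor/dollar subset of the listed characters, so extend the class if your value uses others):

```shell
# escape_conf: backslash-escape characters that install_config.conf treats
# specially before pasting a value such as the db password into the file.
# (Hypothetical helper; handles [ ] ^ $ . * \ from the NOTICE's list.)
escape_conf() {
  printf '%s' "$1" | sed -e 's/[][^$.*\\]/\\&/g'
}
# Example: escape_conf 'p[ss$word'
```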
Modify the JVM parameters
- Two files:
- /bin/dolphinscheduler-daemon.sh
- /scripts/dolphinscheduler-daemon.sh
```shell
export DOLPHINSCHEDULER_OPTS="-server -Xmx16g -Xms1g -Xss512k -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:LargePageSizeInBytes=128m -XX:+UseFastAccessorMethods -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70"
```
One-click deployment & process check & per-service start/stop

```shell
# Run the DS deployment script
sh install.sh
# Check the processes
jps
```
Service start/stop

```shell
# Stop everything
sh ./bin/stop-all.sh
# Start everything
sh ./bin/start-all.sh
# Start/stop master
sh ./bin/dolphinscheduler-daemon.sh start master-server
sh ./bin/dolphinscheduler-daemon.sh stop master-server
# Start/stop worker
sh ./bin/dolphinscheduler-daemon.sh start worker-server
sh ./bin/dolphinscheduler-daemon.sh stop worker-server
# Start/stop api-server
sh ./bin/dolphinscheduler-daemon.sh start api-server
sh ./bin/dolphinscheduler-daemon.sh stop api-server
# Start/stop logger
sh ./bin/dolphinscheduler-daemon.sh start logger-server
sh ./bin/dolphinscheduler-daemon.sh stop logger-server
# Start/stop alert
sh ./bin/dolphinscheduler-daemon.sh start alert-server
sh ./bin/dolphinscheduler-daemon.sh stop alert-server
```
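After start-all.sh it is worth confirming that the roles planned for each node are actually running. A sketch that checks `jps` output against an expected role list (`check_ds_roles` is our helper; the role names assume the JVM main-class names `jps` typically reports for DS 1.3.x, such as MasterServer and WorkerServer):

```shell
# check_ds_roles: given the output of `jps` and a list of expected DS role
# names, report which are up and which are missing on this node.
# (Hypothetical helper, not part of DS.)
check_ds_roles() {
  local procs="$1"; shift
  local role ok=0
  for role in "$@"; do
    case "$procs" in
      *"$role"*) echo "up: $role" ;;
      *) echo "DOWN: $role"; ok=1 ;;
    esac
  done
  return "$ok"
}
# Usage, e.g. on master.eights.com:
#   check_ds_roles "$(jps)" MasterServer WorkerServer LoggerServer
```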
Front-end access
The dolphinscheduler-1.3.1 front end no longer needs nginx; access it directly at apiserver:12345/dolphinscheduler and log in with account admin, password dolphinscheduler123.
Check worker groups
- As you can see, in 1.3.1 worker grouping is configured through install_config.conf and can no longer be modified on the page.
Check the services
The compiled Dolphin Scheduler 1.3.1-cdh5.16.2 package
Baidu Netdisk link: https://pan.baidu.com/s/1gEwEF2R2XJVRv76SgiW0hA
Extraction code: joyq
6. Summary
dolphinscheduler-1.3.1 greatly simplifies the install.sh configuration, letting users deploy quickly. Experienced users who want further tuning still need to modify the configuration files of the corresponding modules under the conf directory.
A separate front-end nginx is no longer needed, which makes for a better deployment experience.