Technical Practice of Apache DolphinScheduler on Kubernetes
2022-06-11 09:23:00 【CSDN cloud computing】


Author | Yangdian
Editor | warrior_
* Editor's note
Kubernetes is a container-based cluster system that implements container orchestration and provides microservices and service-bus capabilities. It spans a large body of knowledge.
Drawing on the author's hands-on work experience, this article shows how DolphinScheduler is used in real scenarios and shares the techniques involved. We hope it offers some inspiration to readers in similar situations.
Why we use DolphinScheduler: the value it brings and the problems we met
DolphinScheduler is an excellent distributed, extensible, visual workflow task scheduling platform.
In the author's industry, adopting DolphinScheduler quickly solved ten pain points in data development:
Multi-source data connection and access: most common data sources in the field can be connected, and adding a new data source requires little change;
Diverse, specialized, and massive data task management: it genuinely focuses on scheduling big data tasks (the Hadoop family, Flink, etc.), which sets it apart from traditional schedulers;
Graphical task orchestration: the user experience is excellent and comparable to commercial products, whereas most open source products from abroad cannot generate data tasks by direct drag and drop;
Task details: rich views of atomic tasks, log views, and timeline displays meet developers' need for fine-grained management of data tasks and help locate slow SQL and performance bottlenecks quickly;
Support for multiple distributed file systems, giving users more choices for unstructured data;
Native multi-tenant management, meeting the data task management and isolation requirements of large organizations;
A fully automatic distributed scheduling algorithm that balances all scheduled tasks;
Built-in cluster monitoring of CPU, memory, connection counts, and ZooKeeper status, suitable for one-stop operations in small and medium-sized enterprises;
Built-in task alerting, minimizing the risk of task operation;
Strong community operation: it listens to real customer feedback, keeps adding new features, and continuously improves the user experience;
In the projects where the author helped bring DolphinScheduler into production, we also met many new challenges:
How can DolphinScheduler be deployed with fewer people? Can fully automatic cluster installation and deployment be achieved?
How can the implementation of technical components be standardized?
Can the system run unattended and heal itself?
Given network security controls, how can installation and updates be done in air-gap mode?
Can capacity be expanded automatically and transparently?
How can the monitoring system be built and integrated?
To address these challenges, we integrated DolphinScheduler into our existing Kubernetes cloud-native system, which resolved the pain points and made DolphinScheduler more powerful.
The Kubernetes technical system: new technical features for DolphinScheduler
After we began managing DolphinScheduler with Kubernetes, the overall solution quickly gained rich and efficient technical features, which also resolved the practical challenges above:
Independent deployments per project, with development and production environments set up quickly, all through one-click deployment and one-click upgrade;
Full support for offline installation without Internet access, which also makes installation faster;
Installation configuration unified as far as possible, reducing configuration anomalies across projects; all configuration items can be managed per project in the enterprise Git;
Integration with object storage technology, unifying the handling of unstructured data;
Convenient monitoring, integrated with the existing Prometheus monitoring system;
Mixed use of multiple schedulers;
Fully automatic resource adjustment;
Fast self-healing, with automatic restart on failure and probe-based restart;
All cases in this article are based on DolphinScheduler 1.3.9.
Automated and efficient deployment based on Helm
First, we introduce the Helm-based installation method. Helm is the best way to find, share, and use software built for Kubernetes. It is also one of the CNCF graduated projects.

The DolphinScheduler website and GitHub repository provide detailed configuration files and examples. Here we focus on questions and problems that come up often in the community.
Official documentation:
https://dolphinscheduler.apache.org/zh-cn/docs/1.3.9/user_doc/kubernetes-deployment.html
GitHub directory:
https://github.com/apache/dolphinscheduler/tree/1.3.9-release/docker/kubernetes/dolphinscheduler
Modify the image settings in the values.yaml file to support offline (air-gap) installation:
image:
  repository: "apache/dolphinscheduler"
  tag: "1.3.9"
  pullPolicy: "IfNotPresent"
For an internally installed Harbor, or another private registry on a public cloud, pull, tag, and push the image. Here we assume that the private registry address is harbor.abc.com, that the host building the image has already run docker login harbor.abc.com, and that the apache project has been created in the private registry with the necessary permissions.
Run the shell commands:
docker pull apache/dolphinscheduler:1.3.9
docker tag apache/dolphinscheduler:1.3.9 harbor.abc.com/apache/dolphinscheduler:1.3.9
docker push harbor.abc.com/apache/dolphinscheduler:1.3.9
Then replace the image information in the values file. We recommend the Always pull policy: in production, check each time that the image content is up to date, so that the deployed software is correct. In addition, many people set the tag to latest, or build images without a tag at all. This is very dangerous in production: anyone who pushes an image changes what latest points to, and there is no way to tell which version latest is. We therefore recommend giving every release an explicit tag and using Always.
image:
  repository: "harbor.abc.com/apache/dolphinscheduler"
  tag: "1.3.9"
  pullPolicy: "Always"
Copy the entire directory https://github.com/apache/dolphinscheduler/tree/1.3.9-release/docker/kubernetes/dolphinscheduler to a host that can run helm commands, then follow the official documentation and run:
kubectl create ns ds139
helm install dolphinscheduler . -n ds139
This completes the offline installation.
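When no single host can reach both Docker Hub and the private registry, the image can be carried across the air gap as a tar archive. The following is a minimal sketch under that assumption, not part of the original procedure: the archive file name and the run helper are illustrative, and commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: move the image across an air gap as a tar archive.
# Assumptions: harbor.abc.com is the private registry from the text;
# the archive name and the run() helper are illustrative only.
set -eu

SRC_IMAGE="apache/dolphinscheduler:1.3.9"
REGISTRY="harbor.abc.com"
DST_IMAGE="${REGISTRY}/${SRC_IMAGE}"
ARCHIVE="dolphinscheduler-1.3.9.tar"

# Print each command by default; set DRY_RUN=0 to execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# On a host with Internet access:
run docker pull "$SRC_IMAGE"
run docker save -o "$ARCHIVE" "$SRC_IMAGE"

# On a host inside the isolated network, after copying the archive over:
run docker load -i "$ARCHIVE"
run docker tag "$SRC_IMAGE" "$DST_IMAGE"
run docker push "$DST_IMAGE"
```

Printing the commands first makes it easy to review the exact image names before anything is pushed to the registry.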
To integrate DataX and the MySQL and Oracle client components, first download the following:
https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.49/mysql-connector-java-5.1.49.jar
https://repo1.maven.org/maven2/com/oracle/database/jdbc/ojdbc8/
https://github.com/alibaba/DataX/blob/master/userGuid.md
Compile and build DataX following the instructions in its guide. The package is located at
{DataX_source_code_home}/target/datax/datax/
Based on the components above, create a new Dockerfile. The base image can be the one already pushed to the private registry.
FROM harbor.abc.com/apache/dolphinscheduler:1.3.9
COPY *.jar /opt/dolphinscheduler/lib/
RUN mkdir -p /opt/soft/datax
COPY datax /opt/soft/datax
Save the Dockerfile and run the shell commands:
docker build -t harbor.abc.com/apache/dolphinscheduler:1.3.9-mysql-oracle-datax . # do not forget the trailing dot
docker push harbor.abc.com/apache/dolphinscheduler:1.3.9-mysql-oracle-datax
Modify the values file:
image:
  repository: "harbor.abc.com/apache/dolphinscheduler"
  tag: "1.3.9-mysql-oracle-datax"
  pullPolicy: "Always"
Run helm install dolphinscheduler . -n ds139, or run helm upgrade dolphinscheduler . -n ds139. You can also run helm uninstall dolphinscheduler -n ds139 first and then run helm install dolphinscheduler . -n ds139 again.
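The upgrade path can be wrapped in a small script that refuses unpinned tags, in line with the advice on latest, and then waits for the StatefulSets to roll out. This is a sketch rather than the author's tooling: is_pinned_tag and run are hypothetical helpers, the --set override is optional when the values file already carries the tag, and commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: upgrade the release to the custom image and wait for the rollout.
# is_pinned_tag() and run() are illustrative helpers, not part of Helm.
set -eu

RELEASE="dolphinscheduler"
NAMESPACE="ds139"
IMAGE_TAG="1.3.9-mysql-oracle-datax"

# Succeed only for a non-empty tag other than "latest".
is_pinned_tag() {
  case "$1" in
    ""|latest) return 1 ;;
    *) return 0 ;;
  esac
}

# Print each command by default; set DRY_RUN=0 to execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

if ! is_pinned_tag "$IMAGE_TAG"; then
  echo "refusing to deploy an unpinned image tag" >&2
  exit 1
fi

# Run from the chart directory, as in the text.
run helm upgrade "$RELEASE" . -n "$NAMESPACE" --set image.tag="$IMAGE_TAG"
run kubectl rollout status "statefulset/${RELEASE}-master" -n "$NAMESPACE"
run kubectl rollout status "statefulset/${RELEASE}-worker" -n "$NAMESPACE"
```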
In production we generally recommend an independent external PostgreSQL as the management database and a separately installed ZooKeeper environment (this case uses the zookeeper operator https://github.com/pravega/zookeeper-operator, in the same Kubernetes cluster as DolphinScheduler). We found that, with the external database, after DolphinScheduler was completely deleted from Kubernetes and then redeployed, task data, tenant data, user data, and so on were all retained, which once again verifies the high availability and data integrity of the system. (If you delete the PVC, the historical job logs are lost.)
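The connection string for an external ZooKeeper ensemble follows the per-pod DNS names of a headless service. A minimal sketch that builds it for a given replica count, assuming the StatefulSet, service, and namespace names used in this case (zookeeper, zookeeper-headless, zookeeper); build_quorum is a hypothetical helper:

```shell
#!/bin/sh
# Sketch: build the ZooKeeper connection string for an N-member ensemble.
# The names below are the ones used in this case; adjust them to your cluster.
set -eu

# build_quorum REPLICAS: print a comma-separated host:port list.
build_quorum() {
  replicas="$1"
  name="zookeeper"              # StatefulSet name
  service="zookeeper-headless"  # headless service name
  namespace="zookeeper"
  port=2181
  quorum=""
  i=0
  while [ "$i" -lt "$replicas" ]; do
    host="${name}-${i}.${service}.${namespace}.svc.cluster.local:${port}"
    if [ -z "$quorum" ]; then
      quorum="$host"
    else
      quorum="${quorum},${host}"
    fi
    i=$((i + 1))
  done
  echo "$quorum"
}

build_quorum 3
```

The printed string is the value expected by externalZookeeper.zookeeperQuorum in the values file.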
## If not exists external database, by default, Dolphinscheduler's database will use it.
postgresql:
  enabled: false
  postgresqlUsername: "root"
  postgresqlPassword: "root"
  postgresqlDatabase: "dolphinscheduler"
  persistence:
    enabled: false
    size: "20Gi"
    storageClass: "-"
## If exists external database, and set postgresql.enable value to false.
## external database will be used, otherwise Dolphinscheduler's database will be used.
externalDatabase:
  type: "postgresql"
  driver: "org.postgresql.Driver"
  host: "192.168.1.100"
  port: "5432"
  username: "admin"
  password: "password"
  database: "dolphinscheduler"
  params: "characterEncoding=utf8"
## If not exists external zookeeper, by default, Dolphinscheduler's zookeeper will use it.
zookeeper:
  enabled: false
  fourlwCommandsWhitelist: "srvr,ruok,wchs,cons"
  persistence:
    enabled: false
    size: "20Gi"
    storageClass: "storage-nfs"
  zookeeperRoot: "/dolphinscheduler"
## If exists external zookeeper, and set zookeeper.enable value to false.
## If zookeeper.enable is false, Dolphinscheduler's zookeeper will use it.
externalZookeeper:
  zookeeperQuorum: "zookeeper-0.zookeeper-headless.zookeeper.svc.cluster.local:2181,zookeeper-1.zookeeper-headless.zookeeper.svc.cluster.local:2181,zookeeper-2.zookeeper-headless.zookeeper.svc.cluster.local:2181"
  zookeeperRoot: "/dolphinscheduler"
GitOps deployment based on Argo CD
Argo CD is a declarative GitOps continuous delivery tool for Kubernetes. It is a CNCF incubating project and a best-practice tool for GitOps. For an explanation of GitOps, see https://about.gitlab.com/topics/gitops/

GitOps brings the following advantages to a DolphinScheduler rollout:
Graphical, one-click installation of clustered software;
The full release process recorded in Git, with one-click rollback;
Convenient viewing of DolphinScheduler component logs;
Installation steps using Argo CD:
Download the DolphinScheduler source code from GitHub and modify the values file, following the changes described in the Helm section above;
Create a new Git project from the modified source directory and push it to the company's internal GitLab. The directory in the GitHub source is docker/kubernetes/dolphinscheduler;
Configure the GitLab information in Argo CD; we use HTTPS mode;

Create a new deployment project in Argo CD and fill in the relevant information;


Refresh and pull the deployment information from Git to complete the deployment. You can see the pods, ConfigMaps, Secrets, Services, Ingress, and other resources. Argo CD also shows the commit information and committer user name of the preceding git push, so every release event is fully recorded. You can also roll back to a historical version with one click.
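The steps above use the Argo CD web UI. As a rough command-line equivalent, the sketch below uses the argocd CLI. The GitLab repository URL is hypothetical, the chart is assumed to sit at the root of that repository, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: create, sync, and inspect the application with the argocd CLI.
# REPO_URL is a hypothetical internal GitLab project; run() is illustrative.
set -eu

APP="dolphinscheduler"
REPO_URL="https://gitlab.abc.com/bigdata/dolphinscheduler.git"  # hypothetical
CHART_PATH="."                                  # chart at the repository root
DEST_SERVER="https://kubernetes.default.svc"    # the cluster Argo CD runs in
DEST_NAMESPACE="ds139"

# Print each command by default; set DRY_RUN=0 to execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run argocd app create "$APP" \
  --repo "$REPO_URL" --path "$CHART_PATH" \
  --dest-server "$DEST_SERVER" --dest-namespace "$DEST_NAMESPACE"
run argocd app sync "$APP"

# List the recorded deployments; an ID from this list can be passed to
# "argocd app rollback" to return to a historical version.
run argocd app history "$APP"
```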


The related resources can be inspected with kubectl commands;
$ kubectl get po -n ds139
NAME                                     READY   STATUS    RESTARTS   AGE
dolphinscheduler-alert-96c74dc84-72cc9   1/1     Running   0          22m
dolphinscheduler-api-78db664b7b-gsltq    1/1     Running   0          22m
dolphinscheduler-master-0                1/1     Running   0          22m
dolphinscheduler-master-1                1/1     Running   0          22m
dolphinscheduler-master-2                1/1     Running   0          22m
dolphinscheduler-worker-0                1/1     Running   0          22m
dolphinscheduler-worker-1                1/1     Running   0          22m
dolphinscheduler-worker-2                1/1     Running   0          22m
$ kubectl get statefulset -n ds139
NAME                      READY   AGE
dolphinscheduler-master   3/3     22m
dolphinscheduler-worker   3/3     22m
$ kubectl get cm -n ds139
NAME                      DATA   AGE
dolphinscheduler-alert    15     23m
dolphinscheduler-api      1      23m
dolphinscheduler-common   29     23m
dolphinscheduler-master   10     23m
dolphinscheduler-worker   7      23m
$ kubectl get service -n ds139
NAME                               TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)              AGE
dolphinscheduler-api               ClusterIP   10.43.238.5   <none>        12345/TCP            23m
dolphinscheduler-master-headless   ClusterIP   None          <none>        5678/TCP             23m
dolphinscheduler-worker-headless   ClusterIP   None          <none>        1234/TCP,50051/TCP   23m
$ kubectl get ingress -n ds139
NAME               CLASS    HOSTS           ADDRESS
dolphinscheduler   <none>   ds139.abc.com
You can see that the pods are spread across different hosts in the Kubernetes cluster; for example, worker 1 and worker 2 run on different nodes.
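Pod placement can also be checked from the command line. The sketch below counts how many distinct nodes the worker pods run on; distinct_nodes is a hypothetical helper, the '-worker-' filter relies on the pod names shown above, and the cluster is only queried when QUERY_CLUSTER=1 is set.

```shell
#!/bin/sh
# Sketch: count how many distinct nodes a set of pods is scheduled on.
set -eu

# distinct_nodes: read "POD NODE" lines on stdin and print the number of
# distinct node names found in the second column.
distinct_nodes() {
  awk 'NF >= 2 { print $2 }' | sort -u | wc -l | tr -d ' '
}

# Set QUERY_CLUSTER=1 to run this against the ds139 namespace from the text.
if [ "${QUERY_CLUSTER:-0}" = "1" ]; then
  kubectl get pods -n ds139 --no-headers \
    -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName \
    | grep -- '-worker-' | distinct_nodes
fi
```

A result equal to the number of worker replicas means every worker landed on its own node.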


We have configured an Ingress, and a wildcard domain name is set up inside the company so that the service can be reached by domain name;

You can then log in through the domain name.

The configuration is set in the values file:
ingress:
  enabled: true
  host: "ds139.abc.com"
  path: "/dolphinscheduler"
  tls:
    enabled: false
    secretName: "dolphinscheduler-tls"
The internal logs of each DolphinScheduler component are easy to view:

Checking the deployed system shows that the 3 masters, 3 workers, and ZooKeeper are all configured normally;



With Argo CD it is easy to change the number of replicas of components such as master, worker, api, and alert. The DolphinScheduler Helm configuration also reserves settings for CPU and memory. Here we modify the replica values in the values file. After the change, git push it to the company's internal GitLab.
master:
  ## PodManagementPolicy controls how pods are created during initial scale up, when replacing pods on nodes, or when scaling down.
  podManagementPolicy: "Parallel"
  ## Replicas is the desired number of replicas of the given Template.
  replicas: "5"
worker:
  ## PodManagementPolicy controls how pods are created during initial scale up, when replacing pods on nodes, or when scaling down.
  podManagementPolicy: "Parallel"
  ## Replicas is the desired number of replicas of the given Template.
  replicas: "5"
alert:
  ## Number of desired pods. This is a pointer to distinguish between explicit zero and not specified. Defaults to 1.
  replicas: "3"
api:
  ## Number of desired pods. This is a pointer to distinguish between explicit zero and not specified. Defaults to 1.
  replicas: "3"
Then simply click Sync in Argo CD, and the corresponding pods are added as required.


$ kubectl get po -n ds139
NAME                                     READY   STATUS    RESTARTS   AGE
dolphinscheduler-alert-96c74dc84-72cc9   1/1     Running   0          43m
dolphinscheduler-alert-96c74dc84-j6zdh   1/1     Running   0          2m27s
dolphinscheduler-alert-96c74dc84-rn9wb   1/1     Running   0          2m27s
dolphinscheduler-api-78db664b7b-6j8rj    1/1     Running   0          2m27s
dolphinscheduler-api-78db664b7b-bsdgv    1/1     Running   0          2m27s
dolphinscheduler-api-78db664b7b-gsltq    1/1     Running   0          43m
dolphinscheduler-master-0                1/1     Running   0          43m
dolphinscheduler-master-1                1/1     Running   0          43m
dolphinscheduler-master-2                1/1     Running   0          43m
dolphinscheduler-master-3                1/1     Running   0          2m27s
dolphinscheduler-master-4                1/1     Running   0          2m27s
dolphinscheduler-worker-0                1/1     Running   0          43m
dolphinscheduler-worker-1                1/1     Running   0          43m
dolphinscheduler-worker-2                1/1     Running   0          43m
dolphinscheduler-worker-3                1/1     Running   0          2m27s
dolphinscheduler-worker-4                1/1     Running   0          2m27s
Integrating DolphinScheduler with S3 object storage
Many people ask in the DolphinScheduler community how to configure integration with S3 or MinIO. Here is the Helm-based configuration on Kubernetes.
Modify the S3 section of the values file. We recommend pointing to the MinIO server by IP address and port.
common:
  ## Configmap
  configmap:
    DOLPHINSCHEDULER_OPTS: ""
    DATA_BASEDIR_PATH: "/tmp/dolphinscheduler"
    RESOURCE_STORAGE_TYPE: "S3"
    RESOURCE_UPLOAD_PATH: "/dolphinscheduler"
    FS_DEFAULT_FS: "s3a://dfs"
    FS_S3A_ENDPOINT: "http://192.168.1.100:9000"
    FS_S3A_ACCESS_KEY: "admin"
    FS_S3A_SECRET_KEY: "password"
In MinIO, the bucket that stores DolphinScheduler files is named dolphinscheduler. Here we create new folders and files to test. The MinIO directory sits under the tenant that performed the upload.
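Before deploying, it can help to confirm that the configured endpoint is reachable from inside the network. A minimal sketch, assuming the endpoint from the values above and MinIO's liveness path /minio/health/live; the run helper is illustrative, and the request is only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: probe the MinIO endpoint configured in FS_S3A_ENDPOINT.
set -eu

FS_S3A_ENDPOINT="http://192.168.1.100:9000"
# Strip a trailing slash, if any, before appending the liveness path.
HEALTH_URL="${FS_S3A_ENDPOINT%/}/minio/health/live"

# Print each command by default; set DRY_RUN=0 to execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# -f makes curl exit non-zero on an HTTP error status.
run curl -fsS -o /dev/null "$HEALTH_URL"
```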


Integrating DolphinScheduler with kube-prometheus
We use the kube-prometheus operator stack in Kubernetes. After DolphinScheduler is deployed, resource monitoring of each DolphinScheduler component is available automatically.
Note that the kube-prometheus version must match the Kubernetes major version. https://github.com/prometheus-operator/kube-prometheus



Integrating DolphinScheduler with a service mesh
A service mesh provides observability analysis of both the service calls inside DolphinScheduler and the external calls to the DolphinScheduler API, which supports optimizing the DolphinScheduler services themselves.
We use Linkerd as the service mesh product. Linkerd is also one of the CNCF graduated projects.

Simply modify the annotations in the values file of the DolphinScheduler Helm chart and redeploy to get the mesh proxy sidecar injected quickly. This can be done for master, worker, api, alert, and other components.
annotations: #{}
  linkerd.io/inject: enabled
You can then observe the quality of service communication between components, the number of requests per second, and so on.


The future of DolphinScheduler based on cloud-native technology
As a big data tool for the new cloud-native generation, DolphinScheduler can integrate more excellent tools and features from the Kubernetes ecosystem in the future to serve more user groups and scenarios.
Integration with argo-workflow, so that single jobs, DAG jobs, and periodic jobs of argo-workflow can be invoked from DolphinScheduler through the API, CLI, etc.;
Using HPA to scale workers out and in automatically, achieving horizontal scaling without human intervention;
Integrating the Spark operator and Flink operator for Kubernetes, for a fully cloud-native approach;
Distributed job scheduling across multiple clouds and clusters, strengthening serverless and FaaS-style architecture;
Using a sidecar to delete worker job logs periodically, further reducing the operations burden;
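As an illustration of the HPA item in the list above, the following sketch attaches a HorizontalPodAutoscaler to the worker StatefulSet with kubectl autoscale. It is not something the article reports having deployed: the replica bounds and CPU target are made-up values, the cluster is assumed to have a metrics source and CPU requests set on the worker pods, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: scale the worker StatefulSet automatically on CPU utilization.
# MIN/MAX replicas and the CPU target are illustrative values only.
set -eu

NAMESPACE="ds139"
TARGET="statefulset/dolphinscheduler-worker"
MIN_REPLICAS=3
MAX_REPLICAS=10
CPU_PERCENT=70

# Print each command by default; set DRY_RUN=0 to execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run kubectl autoscale "$TARGET" -n "$NAMESPACE" \
  --min="$MIN_REPLICAS" --max="$MAX_REPLICAS" --cpu-percent="$CPU_PERCENT"
run kubectl get hpa -n "$NAMESPACE"
```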
