Azkaban Overview
2022-07-05 02:41:00 【一个正在努力的菜鸡】
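What is Azkaban
1. Term
- A batch workflow job scheduler
2. Explanation
- It is mainly used to run a set of jobs and processes in a specific order within a workflow. It is configured through simple key:value pairs, and the dependencies entry in that configuration declares the dependencies between jobs
- Azkaban uses job configuration files to establish the dependencies between tasks, and provides an easy-to-use web UI for maintaining and tracking your workflows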
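To make the key:value configuration concrete, here is a minimal sketch that renders three job files from Python. It assumes the classic `.job` properties syntax (`key=value` lines with `type`, `command`, and `dependencies`); the job names and `echo` commands are invented for illustration.

```python
# Sketch: render Azkaban-style .job files (key=value properties) from Python.
# Job names and commands below are hypothetical examples.

def render_job(command, dependencies=()):
    """Return the text of a command-type .job file."""
    lines = ["type=command", f"command={command}"]
    if dependencies:
        # Upstream jobs are listed by name, comma-separated.
        lines.append("dependencies=" + ",".join(dependencies))
    return "\n".join(lines) + "\n"

jobs = {
    "extract": render_job("echo extract"),
    "transform": render_job("echo transform", ["extract"]),
    "load": render_job("echo load", ["transform"]),
}

for name, text in jobs.items():
    print(f"# {name}.job")
    print(text)
```

Each job names only its direct upstream jobs, so the full chain extract, transform, load is implied by the three files together rather than written down in one place.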
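Why a workflow scheduling system is needed
1. Resolving dependencies between task units
- A complete data analysis system usually consists of a large number of task units (shell scripts, Java programs, MapReduce programs, Hive scripts, etc.)
- These task units have ordering constraints in time and depend on one another
- To organize such a complex execution plan well, a workflow scheduling system is needed to schedule the execution
2. Timed scheduling
- The whole execution process requires manual involvement, and someone has to keep watching the progress of each task. However, many of our tasks run in the middle of the night, launched by scripts scheduled with crontab
- In fact, the whole process resembles a directed acyclic graph (DAG)
- Each subtask corresponds to a node in the larger task. In other words, what we need is a workflow scheduler, and Azkaban is a scheduler that addresses the problems above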
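The DAG view can be sketched in a few lines of Python: given each task's upstream dependencies, a topological sort yields an order in which every task runs only after the tasks it depends on. The task names are hypothetical, and this shows the general technique rather than Azkaban's own implementation.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each key is a task, each value is the set of
# tasks that must finish before it can start.
workflow = {
    "extract": set(),
    "clean": {"extract"},
    "aggregate": {"clean"},
    "report": {"aggregate", "clean"},
}

def execution_order(dag):
    """Return one valid run order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(dag).static_order())

order = execution_order(workflow)
print(order)
```

The "acyclic" part matters: if two tasks depended on each other, no valid order would exist, and the sorter raises an error instead of returning one.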
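Azkaban features
1. Compatible with any version of Hadoop
2. Easy-to-use web UI that keeps operation simple
3. Modular and pluggable plugin mechanism
4. Authentication/authorization (permission management)
5. Ability to kill and restart workflows
6. Email alerts on failure and success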
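Common workflow scheduling systems
1. Simple task scheduling
- Implemented directly with crontab
2. Complex task scheduling
- Build a scheduling platform in-house, or use an existing open-source scheduling system such as Oozie or Azkaban
Feature comparison of Oozie and Azkaban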
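Azkaban architecture
1. Architecture diagram
2. Explanation
- AzkabanWebServer: the main manager of the whole Azkaban workflow system. It handles user login authentication, project management, scheduled execution of workflows, tracking of workflow execution progress, and related tasks
- AzkabanExecutorServer: responsible for the actual submission and execution of workflows. The servers coordinate task execution through the MySQL database
- Relational database (MySQL): stores most of the execution flow state. Both AzkabanWebServer and AzkabanExecutorServer need to access the database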
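The idea of two servers coordinating through shared database state can be illustrated with a toy model. This is only a sketch: it uses an in-memory SQLite database in place of MySQL, and the table, column, and function names are invented here and do not reflect Azkaban's real schema or dispatch protocol.

```python
import sqlite3

# Toy stand-in for the shared database. Table and column names are
# hypothetical, not Azkaban's actual schema.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE executions (id INTEGER PRIMARY KEY, flow TEXT, status TEXT)"
)

def web_server_submit(flow):
    """Web server role: record a new flow execution and return its id."""
    cur = db.execute(
        "INSERT INTO executions (flow, status) VALUES (?, 'PENDING')", (flow,)
    )
    db.commit()
    return cur.lastrowid

def executor_run_next():
    """Executor role: take the oldest pending execution and record its result."""
    row = db.execute(
        "SELECT id FROM executions WHERE status = 'PENDING' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    exec_id = row[0]
    db.execute("UPDATE executions SET status = 'RUNNING' WHERE id = ?", (exec_id,))
    # The real work of running the flow's jobs would happen here.
    db.execute("UPDATE executions SET status = 'SUCCEEDED' WHERE id = ?", (exec_id,))
    db.commit()
    return exec_id

def web_server_status(exec_id):
    """Web server role: read execution progress back from the database."""
    return db.execute(
        "SELECT status FROM executions WHERE id = ?", (exec_id,)
    ).fetchone()[0]

exec_id = web_server_submit("daily_etl")
print(web_server_status(exec_id))  # PENDING
executor_run_next()
print(web_server_status(exec_id))  # SUCCEEDED
```

Neither role calls the other directly in this sketch; each only reads and writes execution state, which is why both servers need access to the database.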