Airflow 2.2.3 containerized installation
2022-06-29 04:15:00 【Official account: Yunyuan ecosystem】
The previous post gave a brief introduction to Airflow's concepts and use cases. Today we install Airflow with Docker, so that its specific features can be explored in depth through hands-on use.
1 Airflow containerized deployment
Host environment on Alibaba Cloud:
- Operating system: Ubuntu 20.04.3 LTS
- Kernel version: Linux 5.4.0-91-generic
Install Docker
Install Docker by following the official documentation [1]. This is a clean system, so there is no old version to uninstall. Because the host runs on a cloud platform, you can take a snapshot beforehand in case the configuration damages the environment.
# Update the package index
sudo apt-get update
sudo apt-get install \
ca-certificates \
curl \
gnupg \
lsb-release
# Add Docker's GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
# Set up the Docker stable repository
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# List the installable docker-ce versions
apt-cache madison docker-ce
docker-ce | 5:20.10.12~3-0~ubuntu-focal | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.11~3-0~ubuntu-focal | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.10~3-0~ubuntu-focal | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.9~3-0~ubuntu-focal | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages
# Install command format:
# sudo apt-get install docker-ce=<VERSION_STRING> docker-ce-cli=<VERSION_STRING> containerd.io
# Install a specific version
sudo apt-get install docker-ce=5:20.10.12~3-0~ubuntu-focal docker-ce-cli=5:20.10.12~3-0~ubuntu-focal containerd.io
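The version string passed to apt-get is the second column of the apt-cache madison listing. The following is a minimal Python sketch of that mapping, assuming the pipe-separated output format shown above; the helper names madison_versions and install_command are hypothetical and only illustrate the idea.

```python
from typing import List


def madison_versions(output: str, package: str = "docker-ce") -> List[str]:
    """Return the version column of `apt-cache madison` output, in listed order."""
    versions = []
    for line in output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Each row looks like: package | version | repository
        if len(fields) == 3 and fields[0] == package:
            versions.append(fields[1])
    return versions


def install_command(version: str) -> str:
    """Build the pinned install command in the format used in this article."""
    return (
        f"sudo apt-get install docker-ce={version} "
        f"docker-ce-cli={version} containerd.io"
    )


# Two rows copied from the listing above.
sample = (
    "docker-ce | 5:20.10.12~3-0~ubuntu-focal | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages\n"
    "docker-ce | 5:20.10.11~3-0~ubuntu-focal | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages\n"
)
versions = madison_versions(sample)
print(install_command(versions[0]))
```

The first row of the listing is the newest version available in the repository, which is the one installed here.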
Tune the Docker configuration
Edit /etc/docker/daemon.json. JSON does not allow comments, so replace the registry-mirrors entry below with your own accelerated mirror address (for example, the one provided by Alibaba Cloud):
{
  "data-root": "/var/lib/docker",
  "exec-opts": [
    "native.cgroupdriver=systemd"
  ],
  "registry-mirrors": [
    "https://****.mirror.aliyuncs.com"
  ],
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  }
}
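The Docker daemon refuses to start when daemon.json is not valid JSON, and JSON has no comment syntax, so an inline # note inside the file is enough to break it. Parsing the file before restarting Docker catches this early. A minimal sketch using only the Python standard library; the function name check_daemon_json is illustrative, and the two sample strings are made up for the demonstration.

```python
import json


def check_daemon_json(text: str):
    """Return (ok, message) for the contents of a daemon.json file."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return False, f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    if not isinstance(config, dict):
        return False, "top-level value must be a JSON object"
    return True, "ok"


# Hypothetical samples: a valid fragment, and one with an inline comment.
good = '{"storage-driver": "overlay2", "log-opts": {"max-size": "100m"}}'
bad = '{"registry-mirrors": ["https://example.invalid" # a note\n]}'

print(check_daemon_json(good))
print(check_daemon_json(bad))
```

To check the real file, pass its contents, for example check_daemon_json(open("/etc/docker/daemon.json").read()).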
Enable Docker to start at boot
systemctl daemon-reload
systemctl enable --now docker.service
Containerized installation of Airflow
Database selection
According to the official documentation, the recommended databases are MySQL 8+ and PostgreSQL 9.6+. The official docker-compose script [2] uses PostgreSQL, so the content of docker-compose.yml has to be adjusted to use MySQL:
---
version: '3'
x-airflow-common:
  &airflow-common
  # In order to add custom dependencies or upgrade provider packages you can use your extended image.
  # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml
  # and uncomment the "build" line below, Then run `docker-compose build` to build the images.
  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.2.3}
  # build: .
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: mysql+mysqldb://airflow:aaaa@mysql/airflow  # replaced with the MySQL connection string
    AIRFLOW__CELERY__RESULT_BACKEND: db+mysql://airflow:aaaa@mysql/airflow  # replaced with the MySQL connection string
    AIRFLOW__CELERY__BROKER_URL: redis://:xxxx@redis:6379/0  # Redis authentication is enabled for security, so replace xxxx with the Redis password
    AIRFLOW__CORE__FERNET_KEY: ''
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CORE__LOAD_EXAMPLES: 'true'
    AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth'
    _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-}
  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins
  user: "${AIRFLOW_UID:-50000}:0"
  depends_on:
    &airflow-common-depends-on
    redis:
      condition: service_healthy
    mysql:  # changed to the MySQL service name
      condition: service_healthy
services:
  mysql:
    image: mysql:8.0.27  # changed to a recent MySQL image
    environment:
      MYSQL_ROOT_PASSWORD: bbbb  # password of the MySQL root account
      MYSQL_USER: airflow
      MYSQL_PASSWORD: aaaa  # password of the airflow user
      MYSQL_DATABASE: airflow
    command:
      - --default-authentication-plugin=mysql_native_password  # default authentication plugin
      - --collation-server=utf8mb4_general_ci  # collation for the utf8mb4 character set
      - --character-set-server=utf8mb4  # character set recommended by the official documentation
    volumes:
      - /apps/airflow/mysqldata8:/var/lib/mysql  # persist the MySQL data
      - /apps/airflow/my.cnf:/etc/my.cnf  # persist the MySQL configuration file
    healthcheck:
      test: mysql --user=$$MYSQL_USER --password=$$MYSQL_PASSWORD -e 'SHOW DATABASES;'  # healthcheck command
      interval: 5s
      retries: 5
    restart: always
  redis:
    image: redis:6.2
    expose:
      - 6379
    command: redis-server --requirepass xxxx  # enable password authentication in redis-server
    healthcheck:
      test: ["CMD", "redis-cli", "-a", "xxxx", "ping"]  # the healthcheck authenticates with the Redis password
      interval: 5s
      timeout: 30s
      retries: 50
    restart: always
  airflow-webserver:
    <<: *airflow-common
    command: webserver
    ports:
      - 8080:8080
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
      interval: 10s
      timeout: 10s
      retries: 5
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully
  airflow-scheduler:
    <<: *airflow-common
    command: scheduler
    healthcheck:
      test: ["CMD-SHELL", 'airflow jobs check --job-type SchedulerJob --hostname "$${HOSTNAME}"']
      interval: 10s
      timeout: 10s
      retries: 5
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully
  airflow-worker:
    <<: *airflow-common
    command: celery worker
    healthcheck:
      test:
        - "CMD-SHELL"
        - 'celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}"'
      interval: 10s
      timeout: 10s
      retries: 5
    environment:
      <<: *airflow-common-env
      # Required to handle warm shutdown of the celery workers properly
      # See https://airflow.apache.org/docs/docker-stack/entrypoint.html#signal-propagation
      DUMB_INIT_SETSID: "0"
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully
  airflow-triggerer:
    <<: *airflow-common
    command: triggerer
    healthcheck:
      test: ["CMD-SHELL", 'airflow jobs check --job-type TriggererJob --hostname "$${HOSTNAME}"']
      interval: 10s
      timeout: 10s
      retries: 5
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully
  airflow-init:
    <<: *airflow-common
    entrypoint: /bin/bash
    # yamllint disable rule:line-length
    command:
      - -c
      - |
        function ver() {
          printf "%04d%04d%04d%04d" $${1//./ }
        }
        airflow_version=$$(gosu airflow airflow version)
        airflow_version_comparable=$$(ver $${airflow_version})
        min_airflow_version=2.2.0
        min_airflow_version_comparable=$$(ver $${min_airflow_version})
        if (( airflow_version_comparable < min_airflow_version_comparable )); then
          echo
          echo -e "\033[1;31mERROR!!!: Too old Airflow version $${airflow_version}!\e[0m"
          echo "The minimum Airflow version supported: $${min_airflow_version}. Only use this or higher!"
          echo
          exit 1
        fi
        if [[ -z "${AIRFLOW_UID}" ]]; then
          echo
          echo -e "\033[1;33mWARNING!!!: AIRFLOW_UID not set!\e[0m"
          echo "If you are on Linux, you SHOULD follow the instructions below to set "
          echo "AIRFLOW_UID environment variable, otherwise files will be owned by root."
          echo "For other operating systems you can get rid of the warning with manually created .env file:"
          echo "    See: https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html#setting-the-right-airflow-user"
          echo
        fi
        one_meg=1048576
        mem_available=$$(($$(getconf _PHYS_PAGES) * $$(getconf PAGE_SIZE) / one_meg))
        cpus_available=$$(grep -cE 'cpu[0-9]+' /proc/stat)
        disk_available=$$(df / | tail -1 | awk '{print $$4}')
        warning_resources="false"
        if (( mem_available < 4000 )) ; then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough memory available for Docker.\e[0m"
          echo "At least 4GB of memory required. You have $$(numfmt --to iec $$((mem_available * one_meg)))"
          echo
          warning_resources="true"
        fi
        if (( cpus_available < 2 )); then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough CPUS available for Docker.\e[0m"
          echo "At least 2 CPUs recommended. You have $${cpus_available}"
          echo
          warning_resources="true"
        fi
        if (( disk_available < one_meg * 10 )); then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough Disk space available for Docker.\e[0m"
          echo "At least 10 GBs recommended. You have $$(numfmt --to iec $$((disk_available * 1024 )))"
          echo
          warning_resources="true"
        fi
        if [[ $${warning_resources} == "true" ]]; then
          echo
          echo -e "\033[1;33mWARNING!!!: You have not enough resources to run Airflow (see above)!\e[0m"
          echo "Please follow the instructions to increase amount of resources available:"
          echo "   https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html#before-you-begin"
          echo
        fi
        mkdir -p /sources/logs /sources/dags /sources/plugins
        chown -R "${AIRFLOW_UID}:0" /sources/{logs,dags,plugins}
        exec /entrypoint airflow version
    # yamllint enable rule:line-length
    environment:
      <<: *airflow-common-env
      _AIRFLOW_DB_UPGRADE: 'true'
      _AIRFLOW_WWW_USER_CREATE: 'true'
      _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
      _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}
    user: "0:0"
    volumes:
      - .:/sources
  airflow-cli:
    <<: *airflow-common
    profiles:
      - debug
    environment:
      <<: *airflow-common-env
      CONNECTION_CHECK_MAX_COUNT: "0"
    # Workaround for entrypoint issue. See: https://github.com/apache/airflow/issues/16252
    command:
      - bash
      - -c
      - airflow
  flower:
    <<: *airflow-common
    command: celery flower
    ports:
      - 5555:5555
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:5555/"]
      interval: 10s
      timeout: 10s
      retries: 5
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully
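The compose file leaves AIRFLOW__CORE__FERNET_KEY as an empty string. Airflow uses this key to encrypt connection passwords and variables stored in the metadata database, so for anything beyond a trial run it is probably worth setting a real value (to my understanding, an empty key means these values are stored unencrypted). A Fernet key is 32 random bytes in URL-safe base64 encoding. The sketch below produces one with the Python standard library alone; the function name generate_fernet_key is illustrative, and this is an assumption-level substitute for the key generator shipped with the cryptography package, not the article author's method.

```python
import base64
import os


def generate_fernet_key() -> str:
    """Return 32 random bytes, URL-safe base64 encoded (the Fernet key format)."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


key = generate_fernet_key()
# Paste the printed value in place of the empty string in docker-compose.yml.
print(f"AIRFLOW__CORE__FERNET_KEY: '{key}'")
```

All Airflow containers must share the same key, which the x-airflow-common environment block already guarantees. Keep the key stable once data has been encrypted with it.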
Compared with the official docker-compose.yaml, only the x-airflow-common, MySQL, and Redis settings were changed. The next step is to start the containers. Before starting them, create the persistent directories:
mkdir -p ./dags ./logs ./plugins
echo -e "AIRFLOW_UID=$(id -u)" > .env
Note: make sure AIRFLOW_UID is the UID of a regular user, and that this user has permission to create these persistent directories. If it is not a regular user's UID, the containers fail at run time with an error saying the airflow module cannot be found.
docker-compose up airflow-init # initialize the database and create the tables
docker-compose up -d # create the Airflow containers
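Once the containers are up, the webserver's /health endpoint (the same URL the compose healthcheck calls with curl) returns a JSON document with one status entry per component. A minimal sketch for reading it from the host, assuming port 8080 is published as in the compose file and that the payload has the shape shown in the sample; the function names are hypothetical.

```python
import json
import urllib.request


def fetch_health(url: str = "http://localhost:8080/health", timeout: float = 5.0) -> dict:
    """Download and decode the webserver health document."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def unhealthy_components(payload: dict) -> list:
    """Return the sorted names of components whose status is not 'healthy'."""
    return sorted(
        name
        for name, info in payload.items()
        if isinstance(info, dict) and info.get("status") != "healthy"
    )


# Assumed payload shape, used here so the example runs without a live server.
sample = {
    "metadatabase": {"status": "healthy"},
    "scheduler": {"status": "unhealthy", "latest_scheduler_heartbeat": None},
}
print(unhealthy_components(sample))
```

Against a running deployment, call unhealthy_components(fetch_health()); an empty list means every reported component is healthy.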
When a container's status is unhealthy, run docker inspect $container_name to find the cause of the error. With that, the Airflow installation is complete.
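The full docker inspect output is long; the part that matters for an unhealthy container is the State.Health object, which holds the current status and a log of recent healthcheck probes. A minimal sketch that extracts it, assuming the docker CLI is on the PATH; the function names and the sample data are illustrative.

```python
import json
import subprocess


def container_health(name: str) -> dict:
    """Return the .State.Health object of a container (empty if it has no healthcheck)."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{json .State.Health}}", name],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout) or {}


def summarize_health(health: dict) -> str:
    """Format the status and the output of the most recent healthcheck probe."""
    log = health.get("Log") or []
    last = log[-1] if log else {}
    return (
        f"status={health.get('Status')} "
        f"failing_streak={health.get('FailingStreak')} "
        f"last_exit={last.get('ExitCode')} "
        f"last_output={str(last.get('Output', '')).strip()!r}"
    )


# Made-up sample in the shape docker reports, so the example runs without Docker.
sample = {
    "Status": "unhealthy",
    "FailingStreak": 3,
    "Log": [{"ExitCode": 1, "Output": "curl: (7) Failed to connect\n"}],
}
print(summarize_health(sample))
```

For a real container, call summarize_health(container_health(name)) with a name taken from docker ps; the names depend on the directory the compose project runs in.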
References
[1] Install Docker Engine on Ubuntu: https://docs.docker.com/engine/install/ubuntu/
[2] Official docker-compose.yaml: https://airflow.apache.org/docs/apache-airflow/2.2.3/docker-compose.yaml