Installing NCCL, mpirun, Horovod, and nvidia-tensorflow (3090 Ti)
2022-07-26 05:04:00 【zoetu】
Environment
GPU: NVIDIA 3090 Ti
CUDA 11.1
cuDNN 8.0
OS: Ubuntu 20.x
TensorFlow: nvidia-tensorflow (installation steps follow)
TensorFlow 1.x does not support newer GPUs such as the A100 and 3090, so nvidia-tensorflow is used instead. (Without this build, TensorFlow will not use the GPU even when the card is detected.) In addition, nvidia-tensorflow only supports Ubuntu 20.04.
Introduction to Horovod
Horovod is a distributed training framework developed by Uber. It lets you scale a single-GPU training script to multi-GPU parallel training with as few code changes as possible, while also speeding up training. It currently supports TensorFlow, Keras, PyTorch, and MXNet. The underlying communication relies mainly on NCCL or Gloo (in testing, NCCL was the fastest), and MPI is also supported (faster for CPU training). Because its training speedup is much better than that of TensorFlow's native DistributionStrategy, it is the recommended choice for distributed training.
The rest of this article covers distributed training with TensorFlow 1.x.
The environment depends on TensorFlow 1.x, Horovod, NCCL, and MPI. There are two ways to set it up, locally or with Docker; this article covers the local installation.
NCCL
https://github.com/NVIDIA/nccl
Install NCCL and pay attention to the version. For compatibility with the A100, v2.8.3-1 is recommended.
# Build NCCL
git clone https://github.com/NVIDIA/nccl.git
cd nccl && git checkout v2.8.3-1
make -j src.build
# On a first-time install, the packaging dependencies are needed
# Install the tools for creating Debian packages
sudo apt install build-essential devscripts debhelper fakeroot
# Build the NCCL deb package
make pkg.debian.build
ls build/pkg/deb/
# Install
sudo make install
This step takes a while; just wait for it to finish.
Verify the installation
# Confirm that Horovod links against the correct NCCL version and path (run this after Horovod is installed)
ldd /usr/local/lib/python3.8/dist-packages/horovod/tensorflow/mpi_lib.cpython-38-x86_64-linux-gnu.so
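The ldd check above can also be scripted. The following is a minimal sketch, assuming the usual `name => path (address)` layout of ldd output; the helper names `find_nccl_links` and `ldd` and the sample text are illustrative and were not captured from a real machine. Note that libnccl is only expected to appear here when Horovod was built with HOROVOD_NCCL_LINK=SHARED, as in the install command later in this article.

```python
import re
import subprocess

# Matches ldd lines such as:
#   libnccl.so.2 => /usr/local/lib/libnccl.so.2 (0x00007f...)
#   libnccl.so.2 => not found
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(not found|\S+)")


def find_nccl_links(ldd_output):
    """Return (library, resolved_path) pairs for every NCCL entry in ldd output.

    resolved_path is None when ldd reports "not found".
    """
    links = []
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if match and "nccl" in match.group(1).lower():
            path = match.group(2)
            links.append((match.group(1), None if path == "not found" else path))
    return links


def ldd(shared_object):
    """Run ldd on a shared object and return its text output."""
    result = subprocess.run(
        ["ldd", shared_object], capture_output=True, text=True, check=True
    )
    return result.stdout


# Illustrative sample only (hypothetical addresses and paths).
SAMPLE = """\
\tlinux-vdso.so.1 (0x00007ffd2a5f2000)
\tlibnccl.so.2 => /usr/local/lib/libnccl.so.2 (0x00007f3c1a000000)
\tlibmpi.so.40 => /usr/local/lib/libmpi.so.40 (0x00007f3c19c00000)
"""
print(find_nccl_links(SAMPLE))
```

On a real machine, pass the path of Horovod's mpi_lib shared object to `ldd()` and feed the result to `find_nccl_links()`. An empty list or a `None` path suggests that the NCCL library is not being picked up from where you installed it.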
mpirun
https://www.open-mpi.org/
# Install dependencies
sudo apt-get install libnuma-dev
# Download the source package
wget https://www.open-mpi.org/software/ompi/v4.0/downloads/openmpi-4.0.0.tar.gz
# Extract
tar zxf openmpi-4.0.0.tar.gz
# Build and install
cd openmpi-4.0.0
./configure --enable-orterun-prefix-by-default
make -j $(nproc) all
sudo make install # requires root privileges
sudo ldconfig # requires root privileges
Check the version
mpirun --version
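If another MPI installation sits earlier on PATH, the version reported may not be the one just built. The sketch below shows one way to check this from a script, assuming the first line of the output has the form `mpirun (Open MPI) X.Y.Z`; the function name and the sample string are illustrative.

```python
import re


def parse_open_mpi_version(version_output):
    """Extract (major, minor, patch) from `mpirun --version` output, or None."""
    match = re.search(r"\(Open MPI\)\s+(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


# Illustrative sample of the expected output shape.
SAMPLE = "mpirun (Open MPI) 4.0.0\n\nReport bugs to http://www.open-mpi.org/community/help/\n"
print(parse_open_mpi_version(SAMPLE))
```

A result of `None` means the mpirun found on PATH did not identify itself as Open MPI.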
nvidia-tensorflow
Install with pip or conda
pip install nvidia-pyindex
pip install nvidia-tensorflow
If the download fails, try a different network.

The happy moment for the 3090: GPU support
Run the following to check whether the GPU can be used:
import tensorflow as tf
print(tf.test.is_gpu_available())

horovod
Horovod must be installed with NCCL support, and installing the latest version is recommended.
If the installation reports an error, install the specific version shown in the commands below.
Uninstall the previous Horovod version
pip show horovod && pip uninstall horovod

Installation command
HOROVOD_WITH_MPI=1 HOROVOD_WITHOUT_GLOO=1 HOROVOD_GPU_OPERATIONS=NCCL HOROVOD_WITH_TENSORFLOW=1 HOROVOD_NCCL_LINK=SHARED pip3 install horovod==0.24.3 --no-cache-dir
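The install command packs several build switches into one line. The sketch below restates them as a commented dictionary and rebuilds the same command as a dry run (it prints the command and does not run pip). The helper names are illustrative, and the comments paraphrase the Horovod installation documentation.

```python
import os
import shlex

# Build-time switches taken from the install command above.
HOROVOD_BUILD_FLAGS = {
    "HOROVOD_WITH_MPI": "1",           # fail the build if MPI support cannot be compiled
    "HOROVOD_WITHOUT_GLOO": "1",       # do not build the Gloo controller
    "HOROVOD_GPU_OPERATIONS": "NCCL",  # run GPU collectives through NCCL
    "HOROVOD_WITH_TENSORFLOW": "1",    # fail the build if TensorFlow support cannot be compiled
    "HOROVOD_NCCL_LINK": "SHARED",     # link NCCL dynamically instead of statically
}


def build_env(base_env=None):
    """Return a copy of the environment with the Horovod build flags applied."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(HOROVOD_BUILD_FLAGS)
    return env


def pip_command(version="0.24.3"):
    """Return the pip invocation as an argument list."""
    return ["pip3", "install", f"horovod=={version}", "--no-cache-dir"]


def shell_line(version="0.24.3"):
    """Render the flags and pip invocation as a single shell command line."""
    assignments = " ".join(
        f"{name}={shlex.quote(value)}" for name, value in HOROVOD_BUILD_FLAGS.items()
    )
    return f"{assignments} {shlex.join(pip_command(version))}"


# Dry run: print the command instead of executing it.
print(shell_line())
```

To run the install from Python rather than print it, `subprocess.run(pip_command(), env=build_env(), check=True)` would be one option.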

Check the Horovod build
horovodrun --check-build
A checked box ([X]) in front of an item means that item is supported.
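Reading the checkboxes can be automated as well. The sketch below assumes the output is grouped into sections whose items look like `[X] Name` or `[ ] Name`; the sample text, the `REQUIRED` mapping, and the function names are illustrative, chosen to match the flags used in the install command (TensorFlow, MPI, NCCL).

```python
import re

# An item line looks like "    [X] TensorFlow" or "    [ ] PyTorch".
_ITEM = re.compile(r"^\s*\[(X| )\]\s+(.+?)\s*$")


def parse_check_build(text):
    """Map each section title to {feature: enabled} from check-build output."""
    sections = {}
    current = None
    for line in text.splitlines():
        item = _ITEM.match(line)
        if item and current is not None:
            sections[current][item.group(2)] = item.group(1) == "X"
        elif line.strip().endswith(":") and not line.startswith(" "):
            current = line.strip().rstrip(":")
            sections[current] = {}
    # Drop headings that have no items under them (such as the version banner).
    return {title: items for title, items in sections.items() if items}


def missing_features(sections, required):
    """List 'Section / Feature' entries that are required but not enabled."""
    return [
        f"{title} / {feature}"
        for title, features in required.items()
        for feature in features
        if not sections.get(title, {}).get(feature, False)
    ]


# What this article's build is expected to enable (an assumption for illustration).
REQUIRED = {
    "Available Frameworks": ["TensorFlow"],
    "Available Controllers": ["MPI"],
    "Available Tensor Operations": ["NCCL"],
}

# Illustrative sample, not captured from a real machine.
SAMPLE = """\
Horovod v0.24.3:

Available Frameworks:
    [X] TensorFlow
    [ ] PyTorch
    [ ] MXNet

Available Controllers:
    [X] MPI
    [ ] Gloo

Available Tensor Operations:
    [X] NCCL
    [ ] DDL
    [ ] CCL
    [X] MPI
    [ ] Gloo
"""
print(missing_features(parse_check_build(SAMPLE), REQUIRED))
```

An empty list means every required item is checked; any entry in the list names a feature that the build did not enable.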

References:
https://www.jianshu.com/p/975f5cca88e4
https://xv44586.github.io/2022/05/25/horovod/index.html#huan-jing-zhong-cai-guo-de-keng