Cluster task scheduling systems for HPC scenarios: LSF/SGE/Slurm/PBS
2022-07-07 10:34:00 【Entering】
There are currently four mainstream scheduler families on the market: LSF, SGE, Slurm, and PBS.
Different industries tend to prefer different schedulers, depending on their usage habits and on how well each scheduler supports their applications. Universities and supercomputing centers often use Slurm, semiconductor companies most commonly use LSF and SGE, and industrial manufacturing tends to use PBS.
The LSF family
Spectrum LSF, Platform LSF, OpenLava
The main schedulers based on LSF (Load Sharing Facility) are Spectrum LSF, Platform LSF, and OpenLava.
LSF grew out of the Utopia system developed at the University of Toronto.
In 2007, Platform Computing open-sourced Platform Lava, a simplified version based on an earlier release of LSF.
This open source project was discontinued in 2011 and succeeded by OpenLava.
In 2011, Platform employee David Bigagli created OpenLava 1.0 from code derived from Platform Lava. In 2014, some Platform employees founded Teraproc to provide development and commercial support for OpenLava. In 2016, IBM sued Teraproc over LSF copyright. IBM won the lawsuit in 2018, and OpenLava was shut down.
After the Platform Lava open source project was discontinued in 2011, IBM acquired Platform Computing in January 2012. Spectrum LSF is the commercial version IBM launched after the acquisition. Its current version is 10.1.0, it supports both Linux and Windows, it scales beyond 6,000 nodes, and commercial support is available in China.
Platform LSF is the earlier version of LSF and, like Spectrum LSF, belongs to IBM. Its current version is 9.1.3, and it appears to have stopped receiving feature updates and to be in maintenance mode.
Of the three schedulers, only Spectrum LSF supports cluster auto-scaling (Auto-Scale). It can also burst to the cloud through the LSF resource connector. Supported cloud vendors include AWS, Azure, and Google Cloud.
The SGE family
UGE, SGE
Schedulers based on SGE (Sun Grid Engine) include UGE (Univa Grid Engine) and SGE (Son of Grid Engine).
Grid Engine was released as commercial software in 1993, first under the name CODINE (Computing in Distributed Networked Environments) and later GRD (Global Resource Director). In 1999 it was first brought to market by Genias Software, which was subsequently acquired by Gridware. After Sun acquired it in 2000, it was officially renamed Sun Grid Engine, and an open source version was released in 2001.
After Oracle's acquisition in 2010, it was renamed Oracle Grid Engine and became closed source, with no source code provided. The original open source project repository was closed to user modifications.
In response, the Grid Engine community started the open source SGE (Son of Grid Engine) project. Its last release was 8.1.9 in 2016. Because of copyright risks, it has not been maintained or updated for a long time.
In 2013, Univa acquired Oracle Grid Engine and became the sole commercial provider, offering **UGE (Univa Grid Engine)**. The latest UGE version is 8.6.15, and it supports both Linux and Windows. No information is available about commercial support in China.
In September 2020, Altair acquired Univa.
Users can move workloads to the cloud with Univa's Navops Launch product, which supports both UGE and Slurm clusters. Navops Launch also supports cloud vendors such as AWS, Azure, and Google Cloud, and provides cloud cost monitoring and cluster auto-scaling (Auto-Scale).
Slurm: the only purely open source family of the four
Slurm stands for Simple Linux Utility for Resource Management. It was initially developed by Lawrence Livermore National Laboratory, SchedMD, Linux NetworX, Hewlett-Packard, and Groupe Bull, and was inspired by the closed source Quadrics RMS.
The latest Slurm version is 20.02. It is currently maintained jointly by the community and SchedMD and remains free and open source, with commercial support provided by SchedMD. It supports only Linux and scales beyond 120,000 nodes.
Slurm offers high fault tolerance, support for heterogeneous resources, and high scalability, and can accept more than 1,000 job submissions per second. Because it is an open framework, it is highly configurable, with more than 100 plugins, so it is applicable to a wide range of workloads.
About 60% of the world's TOP500 supercomputing centers and very large clusters (including China's Tianhe-2) use Slurm as their scheduling system. Our own TOP500 run also used Slurm to schedule resources in the cloud.
We support automatic cluster scaling and cloud cost monitoring on Slurm, across cloud vendors including AWS, Alibaba Cloud, Azure, Tencent Cloud, Huawei Cloud, and Google Cloud.
fastone's Auto-Scale feature automatically monitors the number of jobs submitted by users and their resource demands, and dynamically starts the required compute resources on demand, reducing cost while improving efficiency.
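The scaling decision described above can be illustrated with a minimal Python sketch. It is not fastone's implementation: the `Job` class, the `nodes_to_start` function, and all of its parameters are hypothetical names chosen for illustration, and the rule used (total requested cores minus idle cores, rounded up to whole nodes and capped) is only one simple way such a decision could be made.

```python
import math
from dataclasses import dataclass


@dataclass
class Job:
    """A pending job and the CPU cores it requests (hypothetical model)."""
    name: str
    cores: int


def nodes_to_start(pending, idle_cores, cores_per_node, max_new_nodes):
    """Return how many extra nodes to start so the pending demand fits.

    Assumption: an aggregate-demand rule that ignores per-job packing
    and per-node memory limits.
    """
    # Total cores requested by everything still waiting in the queue.
    demand = sum(job.cores for job in pending)
    # Cores that the currently idle capacity cannot cover.
    shortfall = max(0, demand - idle_cores)
    # Round up to whole nodes, then respect the configured ceiling.
    needed = math.ceil(shortfall / cores_per_node)
    return min(needed, max_new_nodes)


if __name__ == "__main__":
    queue = [Job("sim-a", 16), Job("sim-b", 32), Job("sim-c", 8)]
    print(nodes_to_start(queue, idle_cores=8, cores_per_node=16, max_new_nodes=10))
```

A real auto-scaler would also need a scale-down rule (for example, releasing nodes that stay idle for some time), which is where most of the cost saving comes from.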
The PBS family
OpenPBS, PBS Pro, Moab/TORQUE
Schedulers based on PBS (Portable Batch System) include OpenPBS, PBS Pro, and Moab/TORQUE.
PBS was originally a job scheduling system that MRJ Technology Solutions began developing for NASA in June 1991. MRJ was acquired by Veridian in the late 1990s. In 2003, Altair acquired the PBS technology and intellectual property from Veridian.
PBS Pro is the commercial version that Altair offers as part of PBS Works. It provides a graphical interface and scales beyond 50,000 nodes.
In 2016, Altair released an open source licensed version based on PBS Pro. Together with the original open source version that MRJ released in 1998, this is roughly what is now OpenPBS. Compared with the Pro version it has many more restrictions, but both support Linux and Windows.
**Moab/TORQUE together provide the functionality of a complete scheduler, and both now belong to the same company, Adaptive Computing.** In the mid-1990s, David Jackson of MHPCC developed Maui, and he later founded Adaptive Computing.
Moab is the commercial successor to the Maui Cluster Scheduler, developed by Adaptive Computing (formerly Cluster Resources). Maui was originally free and open source, whereas Moab is commercial software and is not free.
TORQUE (Terascale Open-source Resource and QUEue Manager) is a branch of OpenPBS maintained by the same company, first released in 2003. Early TORQUE was also free, open source software, but since June 2018 TORQUE has no longer been open source.
Both support only Linux, provide a graphical interface, and scale to several thousand nodes.
For cloud services, PBS Pro can burst from on-premises to the cloud and auto-scale clusters through the Altair Control product. Supported cloud vendors include AWS, Azure, and Google Cloud.
Moab/TORQUE can extend on-premises clusters to the cloud through the NODUS Cloud OS product, which supports TORQUE or Slurm clusters and auto-scaling. Supported cloud vendors include AWS, Azure, Google Cloud, and Huawei Cloud, and cloud cost monitoring is provided through the Account Manager product.