
Cluster job scheduling systems for HPC scenarios: LSF, SGE, Slurm, and PBS

2022-07-07 10:34:00 Entering

Cluster job scheduling systems for HPC scenarios


There are currently four mainstream scheduler families on the market: LSF, SGE, Slurm, and PBS.

Industries tend to prefer different schedulers, depending on their usage habits and on how well each scheduler supports their applications. For example, universities and supercomputing centers often use Slurm, semiconductor companies most commonly use LSF and SGE, and industrial manufacturing tends to use PBS more.

The LSF family

Spectrum LSF, Platform LSF, OpenLava

The main schedulers based on LSF (Load Sharing Facility) are Spectrum LSF, Platform LSF, and OpenLava.

LSF grew out of the Utopia system developed at the University of Toronto.
In 2007, Platform Computing open-sourced Platform Lava, a simplified version based on an earlier release of LSF.

That open source project ended in 2011 and was succeeded by OpenLava.
In 2011, Platform employee David Bigagli created OpenLava 1.0 from code derived from Platform Lava. In 2014, some former Platform employees founded Teraproc to provide development and commercial support for OpenLava. In 2016, IBM sued Teraproc over the LSF copyright. IBM won the case in 2018, and use of OpenLava was prohibited.

Figure: OpenLava scheduler information

After the Platform Lava open source project was discontinued in 2011, IBM acquired Platform Computing in January 2012. Spectrum LSF is the commercial version IBM released after the acquisition. It is currently at version 10.1.0, supports both Linux and Windows, scales to more than 6,000 nodes, and has commercial support available in China.
Platform LSF is the earlier version of LSF and, like Spectrum LSF, belongs to IBM. Its current version is 9.1.3. It appears to have stopped receiving feature updates and to be in maintenance mode.

Figure: Platform LSF scheduler information

Figure: Spectrum LSF scheduler information

Of the three schedulers, only Spectrum LSF supports Auto-Scale (automatic cluster scaling). It can also burst to the cloud through the LSF resource connector. Supported cloud vendors include AWS, Azure, and Google Cloud.

The SGE family

UGE, SGE

Schedulers based on SGE (Sun Grid Engine) include UGE (Univa Grid Engine) and SGE (Son of Grid Engine).

Grid Engine was first released as commercial software in 1993, and was named in turn CODINE (Computing in Distributed Networked Environments) and GRD (Global Resource Director). In 1999, it was first brought to market by Genias Software, which was then acquired by Gridware. After the acquisition by Sun in 2000, it was officially renamed Sun Grid Engine, and an open source version was released in 2001.

After the acquisition by Oracle in 2010, it was renamed Oracle Grid Engine and became closed source, with no source code provided. Users were no longer allowed to modify the original open source project repository.
As a result, the Grid Engine community started the open source Son of Grid Engine (SGE) project. The scheduler was last updated in 2016, to version 8.1.9. Because of copyright risks, SGE has not been maintained or updated for a long time.

Figure: SGE scheduler information

In 2013, Univa acquired Oracle Grid Engine and became the only provider of the commercial software **UGE (Univa Grid Engine)**. The latest UGE version is 8.6.15, and it supports both Linux and Windows. No information was found about commercial support in China.
In September 2020, Altair acquired Univa.

Figure: UGE scheduler information

Users can move workloads to the cloud with the Univa product Navops Launch, which supports both UGE and Slurm clusters. Navops Launch works with cloud vendors such as AWS, Azure, and Google Cloud, and provides cloud spend monitoring and Auto-Scale (automatic cluster scaling).

Slurm: the only purely open source family of the four

Slurm stands for Simple Linux Utility for Resource Management. It was initially developed by Lawrence Livermore National Laboratory, SchedMD, Linux NetworX, Hewlett-Packard, and Groupe Bull, and was inspired by the closed source software Quadrics RMS.

The latest Slurm version is 20.02. It is currently maintained jointly by the community and SchedMD, remains open source and free, and has commercial support from SchedMD. It supports only Linux and scales to more than 120,000 nodes.
Slurm offers high fault tolerance, support for heterogeneous resources, and high scalability, and can accept more than 1,000 job submissions per second. Because it is an open, highly configurable framework with more than 100 plugins, it is widely applicable.

Figure: Slurm scheduler information

60% of the world's TOP500 supercomputing centers and very large clusters (including China's Tianhe-2) use Slurm as their scheduling system. Our own TOP500 work also used Slurm to schedule resources on the cloud.

We support automatic cluster scaling and cloud cost monitoring on Slurm, and we support cloud vendors such as AWS, Alibaba Cloud, Azure, Tencent Cloud, Huawei Cloud, and Google Cloud.
fastone's Auto-Scale feature automatically monitors the number of jobs users submit and their resource demand, and dynamically starts the required compute resources on demand, reducing cost while improving efficiency.
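
The scaling decision described above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not fastone's implementation: the `QueueState` fields, the `nodes_to_change` function, the fixed cores-per-node figure, and the `max_nodes` cap are all hypothetical names and parameters introduced for the example. A real scheduler would read the queue state from its own tooling and would start or stop cloud instances through the cloud vendor's API.

```python
import math
from dataclasses import dataclass


@dataclass
class QueueState:
    """Hypothetical snapshot of a cluster, as a scheduler might report it."""
    pending_cores: int  # cores requested by jobs still waiting in the queue
    idle_nodes: int     # nodes that are powered on but running no jobs
    running_nodes: int  # all nodes that are powered on (busy plus idle)


def nodes_to_change(state: QueueState, cores_per_node: int, max_nodes: int) -> int:
    """Return a positive number of nodes to start, a negative number to stop,
    or 0 when the cluster already matches demand."""
    if state.pending_cores > 0:
        # Idle nodes can absorb part of the waiting demand first.
        idle_cores = state.idle_nodes * cores_per_node
        shortfall = max(0, state.pending_cores - idle_cores)
        needed = math.ceil(shortfall / cores_per_node)
        # Never grow past the configured cluster size limit.
        headroom = max(0, max_nodes - state.running_nodes)
        return min(needed, headroom)
    # Nothing is waiting: release every idle node to save cost.
    return -state.idle_nodes


if __name__ == "__main__":
    busy = QueueState(pending_cores=100, idle_nodes=1, running_nodes=3)
    print(nodes_to_change(busy, cores_per_node=32, max_nodes=10))
    quiet = QueueState(pending_cores=0, idle_nodes=2, running_nodes=5)
    print(nodes_to_change(quiet, cores_per_node=32, max_nodes=10))
```

The sketch scales up only for demand that idle nodes cannot cover and scales down only when the queue is empty, which is one simple way to trade cost against waiting time. Production systems usually add a cool-down period before stopping nodes so that short gaps between jobs do not cause repeated start and stop cycles.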

The PBS family

OpenPBS, PBS Pro, Moab/TORQUE

Schedulers based on PBS (Portable Batch System) include OpenPBS, PBS Pro, and Moab/TORQUE.

PBS was originally a job scheduling system that MRJ Technology Solutions began developing for NASA in June 1991. MRJ was acquired by Veridian in the late 1990s. In 2003, Altair acquired the PBS technology and intellectual property from Veridian.
PBS Pro is the commercial version that Altair offers as part of PBS Works. It provides a graphical interface and supports more than 50,000 nodes.

Figure: PBS Pro scheduler information

In 2016, Altair released an open source licensed version based on PBS Pro. Together with the original open source version that MRJ released in 1998, this is roughly what is now OpenPBS. Compared with the Pro version it has many more restrictions, but both support Linux and Windows.

Figure: OpenPBS scheduler information

**Moab and TORQUE together provide the functionality of a complete scheduler, and both now belong to the same company, Adaptive Computing.** In the mid-1990s, David Jackson of MHPCC developed Maui. He later founded Adaptive Computing.

Moab is the OpenPBS branch maintained by Adaptive Computing (formerly Cluster Resources, the company that developed the Maui Cluster Scheduler), and was released in 2003. The project was originally open source and free, but it later became the commercial software Moab and is no longer free.

TORQUE (Terascale Open-source Resource and QUEue Manager) was also open source and free in its early days, but since June 2018 TORQUE has no longer been open source.
Both support only Linux, provide a graphical interface, and scale to roughly a few thousand nodes.

Figure: Moab/TORQUE scheduler information

For cloud services, PBS Pro can burst from on-premises to multiple clouds and use Auto-Scale (automatic cluster scaling) through the Altair Control product. Supported cloud vendors include AWS, Azure, and Google Cloud.

Moab/TORQUE can extend on-premises clusters to the cloud through the NODUS Cloud OS product, which supports TORQUE or Slurm clusters and auto scaling. Supported cloud vendors include AWS, Azure, Google Cloud, and Huawei Cloud, and cloud spend monitoring is available through the Account Manager product.


Copyright notice
This article was written by [Entering]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/188/202207070813139337.html