Typical case: XDFS & Aerospace Institute HPC cluster
2022-07-29 06:06:00 【Taocloud Avenue】
Aerospace Information Research Center of the Chinese Academy of Sciences
Project background
The National Space Science Center of the Chinese Academy of Sciences is the lead research institution for China's space science and its satellite engineering projects, and serves as a national innovation platform for space science. It is responsible for organizing research on the national space science development plan and for organizing and implementing the space science pilot project of the Chinese Academy of Sciences. It carries out innovative scientific and technological research in space science and related application fields, provides scientific and technological support for the pilot project and its future development, leads the development of space science, and drives innovation in space technology.
Project requirements
● Meet the performance requirements of the HPC cluster
The HPC cluster at the computing layer has nearly 20 compute nodes and runs a large volume of space information analysis, computation, preprocessing, and similar applications, which places extremely high performance demands on the whole cluster. These include stable bandwidth, very low response latency, and sufficient concurrency.
● Meet the requirements of capacity and scalability
The first phase of construction is planned to reach PB-level usable capacity. As collected and computed data accumulate, future capacity needs are hard to predict, so the storage system must also scale well, with capacity and performance growing linearly to meet the load at each stage.
● Meet the requirements of openness and compatibility
Standard access protocols are adopted to provide access interfaces for the HPC cluster and the business platforms, so that HPC computing results can later be used conveniently by the front-end business platform. This also makes it easier to integrate different computing-model interfaces in the future.
● Be technologically advanced to a certain degree
The technical solution should not only follow the future direction of the industry but also be technologically advanced to a certain degree and hold a leading position among similar systems, which helps improve the computing and processing capacity of the whole system.
Solution
Network topology

Solution overview
The solution uses two XDFS distributed storage clusters: a high-performance SSD storage pool and a capacity-oriented HDD storage pool. Both use erasure-code redundancy, which achieves a higher usable-capacity ratio of the storage disks while still meeting the performance requirements, satisfying both capacity and cost-control goals.
● High-performance SSD storage pool
The storage media are enterprise-class SATA SSDs, providing about 1 PB of high-performance storage pool capacity. The pool is mounted over the POSIX protocol, which performs better than NFS. Combined with 100Gb IB links and RDMA technology, it fully supports the high-load performance pressure and capacity demand of HPC computation, speeds up the computing process, and improves overall HPC efficiency.
● Capacity-oriented HDD storage pool
The storage media are enterprise-class SATA disks, providing about 5 PB of flexible storage space that is easy to expand, with capacity and performance growing together. 40GE links combined with RDMA technology ensure faster data access. The pool is mounted over NFS, which is more widely applicable and simple to deploy, making it convenient for business servers to use the HPC computing results.
Solution advantages
● The optimized erasure-code algorithm ensures both space utilization and performance output
Both the SSD and HDD storage pools use 4+1 erasure coding for better space utilization, while the optimized algorithm fully meets the access requirements of an HPC cluster of nearly 20 compute nodes.
● Give full play to hardware performance
XDFS's streamlined system kernel, file-oriented data distribution, distributed metadata management strategy, and xMate acceleration module greatly improve the hardware processing efficiency of the storage nodes and make better use of the SSDs and the 100Gb IB network. By bonding multiple 10 Gigabit Ethernet ports, users in a 10GE network environment can also obtain performance output comparable to the 100Gb IB network for non-core workloads.
● Deep integration with user application scenarios to realize data flow
XDFS is deeply integrated with the tape library backup system, so that cold data in XDFS flows to the tape library according to policy. This simplifies data management, frees up online storage pool space, improves data management efficiency, and saves cost, achieving a better return on investment.
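The space-utilization benefit of 4+1 erasure coding can be made concrete with a little arithmetic. The sketch below assumes the quoted pool sizes (about 1 PB and about 5 PB) are usable capacities, and compares 4+1 against three-way replication, a common alternative that this article does not mention. The function names are illustrative and are not part of XDFS.

```python
def usable_fraction(data_blocks: int, parity_blocks: int) -> float:
    """Fraction of raw disk space that holds user data in a k+m layout."""
    return data_blocks / (data_blocks + parity_blocks)


def raw_capacity_needed(usable_pb: float, data_blocks: int, parity_blocks: int) -> float:
    """Raw capacity (PB) required to deliver the given usable capacity (PB)."""
    return usable_pb / usable_fraction(data_blocks, parity_blocks)


if __name__ == "__main__":
    # Assumption: the pool sizes from the article are usable capacities.
    pools = {"SSD pool": 1.0, "HDD pool": 5.0}
    for name, usable_pb in pools.items():
        ec_raw = raw_capacity_needed(usable_pb, 4, 1)       # 4+1 erasure coding
        replica_raw = raw_capacity_needed(usable_pb, 1, 2)  # 3 copies = 1 data + 2 extra
        print(f"{name}: {usable_pb:.2f} PB usable needs "
              f"{ec_raw:.2f} PB raw with 4+1, {replica_raw:.2f} PB raw with 3 replicas")
```

With a 4+1 layout, four fifths of the raw space holds data, whereas three-way replication leaves only one third. The trade-off is that a 4+1 stripe tolerates the loss of a single block.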
Application effect
Storage pools with different configurations serve application systems at different tiers, meeting the performance requirements of different business units and fully meeting design expectations, while improving the return on investment and helping the user truly build on demand. Connecting the data archiving module to the tape library backup system simplifies the user's data management operations, improves the collaboration efficiency of the user's overall business systems, and lets computing tasks complete more efficiently.
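The article does not describe how the XDFS archiving policy is expressed, so the following is only a generic sketch of what "cold data flows to the tape library according to policy" could look like. It assumes a hypothetical rule of "not accessed for N days" and uses only the Python standard library. The function name `find_cold_files` and the idle threshold are illustrative assumptions, not XDFS interfaces.

```python
import os
import time
from pathlib import Path
from typing import Iterator, Optional

SECONDS_PER_DAY = 86400


def find_cold_files(root: Path, max_idle_days: float,
                    now: Optional[float] = None) -> Iterator[Path]:
    """Yield files under root whose last access is older than max_idle_days.

    Hypothetical policy: a file is "cold" if its access time (st_atime)
    is older than the cutoff. A real archiving module would hand these
    paths to the tape library backup system instead of just listing them.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = Path(dirpath) / filename
            if path.stat().st_atime < cutoff:
                yield path


if __name__ == "__main__":
    # List candidates for archiving under the current directory.
    for cold_path in find_cold_files(Path("."), max_idle_days=90):
        print(cold_path)
```

Access times are only a rough signal, since file systems mounted with options such as `noatime` do not update them. A production policy would more likely rely on metadata kept by the storage system itself.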