The Practice of Replacing Ceph with Curve at NetEase Cloud Music
2022-06-28 16:48:00 【InfoQ】

- Poor performance: Single-volume performance was poor (mainly high I/O latency and IOPS that could not be pushed up, and volumes were easily affected by other heavily loaded volumes in the cluster). As a result, Ceph cloud disks could only be used as system disks, or as cloud disks that applications write logs to, and could not support middleware workloads.
- I/O jitter: We observed that an I/O latency above 2 s can drive disk util to 100%, which triggers widespread business alarms, causes requests to pile up, and in severe cases leads to an avalanche effect. Over the previous two years, Ceph cloud disk I/O jitter was very frequent (roughly every month) and each episode lasted on the order of minutes, so many core applications switched to local storage to avoid such problems.
- Jitter: Since the move to Curve cloud disks, disk I/O util monitoring has never raised a 100% alarm caused by the distributed storage system. Business stability has improved greatly, and core services have gradually moved back to Curve cloud disks (after all, the high space utilization, reliability, portability, and fast recovery of cloud disks are also highly valued by the business).
- Performance: On the same hardware, Curve single-volume performance is more than twice that of a Ceph volume, and latency is also much lower. Refer to the following figure for the detailed performance comparison:

- Service upgrade: Common reasons for upgrading the client include bug fixes, new features, and version upgrades. We hit a 32-bit sequence number overflow bug in the Ceph community's messaging module. The bug appears on long-running clients and causes I/O hangs, and both the client and the server must be updated to fix it. There are two ways to update the client: restart the virtual machine's QEMU process, or live-migrate the virtual machine. Both are feasible for a handful of virtual machines, but doing this for hundreds of virtual machines is barely operable and clearly unacceptable to the business. In addition, restarting OSD processes during a server upgrade also causes some I/O jitter, so the upgrade has to be done during off-peak hours and the business has to temporarily silence its disk util alarms.
- Performance: Operations staff mainly care about the overall performance of the storage cluster. If the cluster's total capacity does not match its total performance, performance can run out while capacity is still available. The choice is then either to create fewer volumes and waste capacity, or to keep creating volumes and hurt the I/O latency and throughput of every single volume. In addition, once the number of volumes in a Ceph cluster reaches a certain scale, overall cluster performance declines as more volumes are added, which degrades single-volume performance even further.
- Algorithm: Because of the limitations of the CRUSH algorithm, data is distributed very unevenly across Ceph OSDs and a lot of space is wasted. In our observation, the gap in space utilization between the fullest and the emptiest OSD can reach 50%, so data rebalancing is needed frequently. Rebalancing, however, migrates large amounts of data and causes I/O jitter, and it still cannot fully fix the imbalance in OSD capacity utilization.
- I/O jitter: Triggers include replacing failed disks, node outages, high I/O load, data rebalancing after capacity expansion (no new pool is created, because too many pools make OpenStack maintenance complicated), NIC packet loss, slow disks, and so on.
- Service upgrade: The client supports hot upgrade. Running QEMU processes need neither a restart nor a migration, and the millisecond-level impact is almost imperceptible to the business inside the virtual machine. For the architecture design of hot upgrade, please refer to ①. When the Curve server is upgraded, thanks to the quorum-based consistency protocol Raft, upgrading one replica domain at a time keeps the impact on business I/O at the second level. I/O latency does not exceed 2 s, so util does not reach 100%.
- Performance: At the same capacity, a Curve cluster can host more volumes while maintaining stable performance.
- Algorithm: Curve data placement is handled by the centralized MDS service, which keeps the distribution highly balanced. The gap in space utilization between the fullest and the emptiest chunkserver does not exceed 10%, so no data rebalancing is needed.
- I/O jitter: In the scenarios where Ceph cloud disks are prone to I/O jitter, Curve cloud disks perform more stably. Details are given in the Curve vs. Ceph comparison.
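
The server-upgrade point in the list above rests on Raft's majority rule: a group keeps serving I/O as long as more than half of its replicas are up. The sketch below is only an illustration of that reasoning, not Curve's actual upgrade tooling. It assumes three replicas per Raft group, each in a different replica domain; the names `Replica`, `has_quorum`, `safe_upgrade_order`, and the `domain-a`/`cs-a1` labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    chunkserver: str  # hypothetical chunkserver name
    domain: str       # hypothetical replica-domain label (e.g. a rack or zone)


def has_quorum(replicas, domain_down):
    """True if a strict majority of replicas stays up while `domain_down` restarts."""
    alive = sum(1 for r in replicas if r.domain != domain_down)
    return alive > len(replicas) // 2


def safe_upgrade_order(raft_groups):
    """Return the domains in upgrade order; raise if any step would break a quorum."""
    domains = sorted({r.domain for reps in raft_groups.values() for r in reps})
    for domain in domains:
        for name, reps in raft_groups.items():
            if not has_quorum(reps, domain):
                raise ValueError(
                    f"upgrading {domain} would leave {name} without a quorum"
                )
    return domains


# Assumed layout: every group has one replica in each of three domains.
raft_groups = {
    "group-1": [Replica("cs-a1", "domain-a"), Replica("cs-b1", "domain-b"),
                Replica("cs-c1", "domain-c")],
    "group-2": [Replica("cs-a2", "domain-a"), Replica("cs-b2", "domain-b"),
                Replica("cs-c2", "domain-c")],
}

print(safe_upgrade_order(raft_groups))  # ['domain-a', 'domain-b', 'domain-c']
```

With this layout, taking down one domain leaves two of three replicas in every group, so a majority always survives. If two replicas of a group shared a domain, restarting that domain would drop the group below a majority, and the check raises instead of returning an order.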



- Explore cloud-native middleware scenarios built on Curve block storage, for example running adapted Redis, Kafka, and message queue services on Curve block storage volumes to reduce failover time.
- Bring the cloud-native database based on CurveBS + PolarFS + MySQL online.
- Switch the remaining virtual machines that still use Ceph cloud disks or local storage to Curve block storage volumes.
- GitHub: https://github.com/opencurve/curve
- WeChat group: search for and add the group assistant's WeChat account, OpenCurve_bot