A Post-Mortem Analysis of the Most Serious Outage in Facebook's History, October 4, 2021

2022-06-09 13:18:00 AIwenIPgeolocation

1. Related news
According to the BBC and other media reports, at 15:39 UTC on October 4, 2021 (23:39 Beijing time on October 4), the social network Facebook and its subsidiaries Messenger, Instagram, and WhatsApp became unavailable worldwide for 7 hours.

Facebook issued an official statement on its Twitter account: "Our engineering teams have learned that configuration changes on the backbone routers that coordinate network traffic between our data centers caused issues that interrupted this communication. This disruption to network traffic had a cascading effect on the way our data centers communicate, bringing our services to a halt."

The official statement does not clearly explain what went wrong, so in the sections below we work out the root cause of the outage.

2. Downdetector detected Facebook network fluctuations

Figure 1: Downdetector detected Facebook network fluctuations

The Downdetector website infers outages by collecting outage reports from social networks, as shown in Figure 1. At 11:44 EDT on October 4 (23:44 Beijing time on October 4), Downdetector detected the Facebook network fluctuation, but it did not explain the specific cause.
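The timestamps in this article are given in three time zones, so it helps to put them on a common clock. The following is a minimal sketch, assuming fixed offsets of UTC-4 for EDT and UTC+8 for Beijing time; the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets (assumption: EDT = UTC-4, Beijing time = UTC+8).
UTC = timezone.utc
EDT = timezone(timedelta(hours=-4), "EDT")
BEIJING = timezone(timedelta(hours=8), "CST")

# Downdetector's detection time as quoted above.
detected_edt = datetime(2021, 10, 4, 11, 44, tzinfo=EDT)
detected_utc = detected_edt.astimezone(UTC)
detected_beijing = detected_edt.astimezone(BEIJING)

# Outage start time as reported by the media (section 1).
reported_start_utc = datetime(2021, 10, 4, 15, 39, tzinfo=UTC)

print("Detected (UTC):    ", detected_utc.isoformat())
print("Detected (Beijing):", detected_beijing.isoformat())
print("Delay after reported start:", detected_edt - reported_start_utc)
```

On this common clock, the Downdetector detection falls a few minutes after the outage start time reported by the media.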

3. Cause analysis of the Facebook and WhatsApp service interruption

Facebook's autonomous system (AS) is AS32934, and WhatsApp's is AS11917.

At 0:00 Beijing time on October 5 (16:00 UTC on October 4), network fluctuations were observed in Facebook (AS32934): both its prefix count and its IP count dropped. They did not recover until 7:00 Beijing time on October 5, as shown in Figure 2. Specifically, the prefix count fell from 129 at 23:30 on October 4 to 103 at 0:00 on October 5, a loss of 26 prefixes totaling 5,888 IPs. The lost IP blocks are as follows:

129.134.25.0/24, 129.134.26.0/24, 129.134.27.0/24, 129.134.28.0/24, 129.134.29.0/24, 129.134.30.0/23, 129.134.30.0/24, 129.134.31.0/24, 129.134.65.0/24, 129.134.66.0/24, 129.134.67.0/24, 129.134.68.0/24, 129.134.69.0/24, 129.134.70.0/24, 129.134.71.0/24, 129.134.72.0/24, 129.134.73.0/24, 129.134.74.0/24, 129.134.75.0/24, 129.134.76.0/24, 129.134.79.0/24, 157.240.207.0/24, 185.89.218.0/23, 185.89.218.0/24, 185.89.219.0/24, 69.171.250.0/24
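The list contains overlapping entries (for example, 129.134.30.0/23 already covers 129.134.30.0/24 and 129.134.31.0/24), so any address total depends on how the overlaps are counted. The following is a minimal sketch using Python's standard `ipaddress` module that merges the listed prefixes and counts each address once. The counting convention is an assumption of this sketch, not something the monitoring data specifies.

```python
import ipaddress

# Prefixes copied from the list above.
LOST_PREFIXES = """
129.134.25.0/24 129.134.26.0/24 129.134.27.0/24 129.134.28.0/24
129.134.29.0/24 129.134.30.0/23 129.134.30.0/24 129.134.31.0/24
129.134.65.0/24 129.134.66.0/24 129.134.67.0/24 129.134.68.0/24
129.134.69.0/24 129.134.70.0/24 129.134.71.0/24 129.134.72.0/24
129.134.73.0/24 129.134.74.0/24 129.134.75.0/24 129.134.76.0/24
129.134.79.0/24 157.240.207.0/24 185.89.218.0/23 185.89.218.0/24
185.89.219.0/24 69.171.250.0/24
""".split()

networks = [ipaddress.ip_network(p) for p in LOST_PREFIXES]

# collapse_addresses merges overlapping and adjacent networks into the
# smallest equivalent set, so no address is counted twice.
merged = list(ipaddress.collapse_addresses(networks))
unique_addresses = sum(net.num_addresses for net in merged)

print(len(networks), "withdrawn prefixes")
print(unique_addresses, "unique addresses after merging overlaps")
```

The total under this convention need not equal the 5,888 figure quoted above, whose counting method is not described here.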

Figure 2: Obvious fluctuations captured for Facebook (AS32934)

Facebook has 4 authoritative DNS servers: a.ns.facebook.com (129.134.30.12), b.ns.facebook.com (129.134.31.12), c.ns.facebook.com (185.89.218.12), and d.ns.facebook.com (185.89.219.12). The IPs of all 4 DNS servers fall within the lost IP blocks.
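The claim that every name server sits inside a withdrawn block can be checked mechanically. The following is a minimal sketch using Python's standard `ipaddress` module; the helper name `covering_prefixes` is hypothetical, and the data is copied from the prefix list and server addresses given in this section.

```python
import ipaddress

# Withdrawn prefixes, copied from the list in this section.
LOST_PREFIXES = """
129.134.25.0/24 129.134.26.0/24 129.134.27.0/24 129.134.28.0/24
129.134.29.0/24 129.134.30.0/23 129.134.30.0/24 129.134.31.0/24
129.134.65.0/24 129.134.66.0/24 129.134.67.0/24 129.134.68.0/24
129.134.69.0/24 129.134.70.0/24 129.134.71.0/24 129.134.72.0/24
129.134.73.0/24 129.134.74.0/24 129.134.75.0/24 129.134.76.0/24
129.134.79.0/24 157.240.207.0/24 185.89.218.0/23 185.89.218.0/24
185.89.219.0/24 69.171.250.0/24
""".split()

# Authoritative name servers and their addresses, as given above.
NAME_SERVERS = {
    "a.ns.facebook.com": "129.134.30.12",
    "b.ns.facebook.com": "129.134.31.12",
    "c.ns.facebook.com": "185.89.218.12",
    "d.ns.facebook.com": "185.89.219.12",
}


def covering_prefixes(ip, prefixes):
    """Return every prefix (as a string) that contains the given address."""
    addr = ipaddress.ip_address(ip)
    return [str(net) for net in prefixes if addr in net]


lost = [ipaddress.ip_network(p) for p in LOST_PREFIXES]
coverage = {name: covering_prefixes(ip, lost) for name, ip in NAME_SERVERS.items()}

for name, hits in coverage.items():
    print(name, "->", ", ".join(hits) if hits else "not in any lost prefix")
```

Each server is covered by both a /23 and a more specific /24 in the list, so once all of these announcements were withdrawn, none of the four addresses had a listed route left.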

Therefore, the cause of this failure is that a configuration change on the backbone routers that coordinate network traffic between data centers led the Border Gateway Protocol (BGP) to withdraw the IP address blocks under Facebook's autonomous system AS32934 that contain the IPs of Facebook's domain name servers. This erased the routing information needed to reach Facebook's DNS, which took the DNS servers offline, so users could neither resolve Facebook and related domain names nor access the services.

Meanwhile, starting at 0:00 Beijing time on October 5, WhatsApp (AS11917) was also observed to lose all of its prefixes, IPs, and paths, as shown in Figure 3.

Figure 3: Obvious fluctuations captured for WhatsApp (AS11917)

The reason the WhatsApp service also became inaccessible is that in 2019 Facebook consolidated and centralized all of its services so that the company could gain a unified view of users' Internet usage habits. However, this also meant that a single point of failure could affect the entire Facebook service system.

In summary, Aiwen Technology's network monitoring captured network fluctuations in Facebook's AS32934 and WhatsApp's AS11917, and the timing of the fluctuations coincides with the Facebook outage. The service interruption occurred because a configuration change on the backbone routers led the Border Gateway Protocol (BGP) to withdraw the IP address prefixes of Facebook's domain name servers, which triggered a series of service failures.

Copyright notice
This article was written by [AIwenIPgeolocation]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/160/202206091221528828.html