"Learning Cloud Networking with Teacher Tang" - Troubleshooting - The Host Has Network Access but the Container Does Not
2022-07-30 11:16:00 【HUAWEI CLOUD】
The network failure this time: the host can access the destination URL, but a container on it cannot. The problem is a bit strange, so let us follow Teacher Tang's analysis and see how a typical network problem is tracked down.
Symptoms
First, Docker on this node is a fresh install, and the CentOS system was also just installed. A colleague in the department reported the problem: after starting Docker, there is no network access inside the container, but access from the host is clearly fine.
The error:

The first red line is inside the container; the second red line is on the host.
Illustrated below:

The specific scenario: inside the container, apt-get is pointed at a HUAWEI CLOUD mirror (mirrors.tools.huawei.com) to install packages. But the container reports an error: Temporary failure resolving 'mirrors.tools.huawei.com'.
The host, however, connects to this source just fine.
Think about it: what could the cause be?
IP Forwarding
To start troubleshooting, I logged into the environment and planned to start a blank container and try it out first.
But as soon as the container started, I immediately saw this warning:
docker run -it ubuntu:18.04 /bin/bash
WARNING: IPv4 forwarding is disabled. Networking will not work
As you can see, this warning does not appear when a container starts normally. So clearly, this problem has to be solved first.
Check:
cat /proc/sys/net/ipv4/ip_forward
0
This makes sense. Seen from the host, the container is just another "computer". Packets reach the container through the host, which means the host "forwards" them on the container's behalf. So the host must have packet forwarding enabled. P.S. If this is hard to follow, look back at Teacher Tang's earlier lesson, "Learning Cloud Networking with Teacher Tang - Network Namespaces".
So we turn on the ip_forward switch on the host:
Edit /etc/sysctl.conf and add the line net.ipv4.ip_forward = 1 to the file.
Then run sysctl -p /etc/sysctl.conf to apply it.
After this change, starting a container no longer prints the warning. But DNS still cannot resolve the name; the error Temporary failure resolving 'mirrors.tools.huawei.com' is still there.
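The check and the persistent setting above can be sketched in Python. This is a minimal illustration, not part of the original troubleshooting session; the function names are hypothetical, and the parser only handles the simple key = value lines that sysctl.conf uses.

```python
def forwarding_enabled(flag_text: str) -> bool:
    """Interpret the content of /proc/sys/net/ipv4/ip_forward ("0" or "1")."""
    return flag_text.strip() == "1"


def sysctl_conf_enables_forwarding(conf_text: str) -> bool:
    """Return True if the last net.ipv4.ip_forward assignment in the text is 1.

    Lines starting with '#' or ';' are comments in sysctl.conf and are skipped.
    """
    enabled = False
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "net.ipv4.ip_forward":
            enabled = value.strip() == "1"
    return enabled
```

On the host, forwarding_enabled(open("/proc/sys/net/ipv4/ip_forward").read()) would report the live state, while sysctl_conf_enables_forwarding tells you whether the setting will survive a reboot.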
The search Field in DNS
Since there is no network access from the container, let's capture packets and take a look. I captured at the point where all container packets leave the host, namely docker0:

tcpdump -nn -i docker0 port 53
Then, inside the container, run:
ping mirrors.tools.huawei.com
In the outgoing DNS query packets, why is there a tail appended to the domain name I am asking about? (red line in the figure below)

Remember the chapter "Learning Cloud Networking with Teacher Tang - DNS Phone Book"? The /etc/resolv.conf configuration file has an advanced parameter, the search field. When a domain name is looked up, it appends something to the end of the target name (that is, it adds a suffix). Kubernetes uses this for its Service names.
So we comment out the search field in /etc/resolv.conf, since this parameter is of no use to us for now.
# Generated by NetworkManager
#search novalocal
nameserver 100.79.1.250
nameserver 100.79.1.46
P.S. Final verification showed that the search parameter was not the reason the container network was blocked. Commenting it out is still reasonable, because we do not need the novalocal suffix.
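To see where the tail in the captured query comes from, here is a simplified model of how a stub resolver expands a name using the search list. It is a sketch under assumptions: the function name is hypothetical, and the ordering follows the commonly documented ndots rule (default 1) rather than any particular resolver's source code.

```python
from typing import List


def candidate_names(name: str, search: List[str], ndots: int = 1) -> List[str]:
    """List the names a resolver would try, in order (simplified model).

    - A name ending in '.' is absolute and is tried only as-is.
    - A name with at least `ndots` dots is tried as-is first, then with
      each search suffix appended.
    - Otherwise the search suffixes are tried first, then the bare name.
    """
    if name.endswith("."):
        return [name[:-1]]
    expanded = [f"{name}.{suffix}" for suffix in search]
    if name.count(".") >= ndots:
        return [name] + expanded
    return expanded + [name]
```

With search novalocal, a failed lookup of mirrors.tools.huawei.com is followed by a query for mirrors.tools.huawei.com.novalocal, which matches the suffixed name seen in the capture. That second query is a symptom of the first one failing, not a cause.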
DNS Not Reachable
Continuing the analysis: the container cannot resolve the domain name "mirrors.tools.huawei.com". Look at the /etc/resolv.conf file inside the container; its content is:
# Generated by NetworkManager
#search novalocal
nameserver 100.79.1.250
nameserver 100.79.1.46
As you can see, it is a copy of the host's /etc/resolv.conf. Since the DNS configuration is the same as the host's, let's first see how the host resolves this domain name.
On the host, run:
nslookup mirrors.tools.huawei.com
** server can't find mirrors.tools.huawei.com: NXDOMAIN
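It turns out the host cannot resolve the domain name either... Awkward. So how does apt-get on that host manage to fetch packages? Recall the chapter "Learning Cloud Networking with Teacher Tang - DNS Phone Book": does resolving a domain name depend purely on the DNS server?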

So I quickly checked the /etc/hosts file and found a hard-coded IP record. No wonder... It turned out that a colleague, unable to resolve the domain name earlier, had taken a "lazy" shortcut and added it directly. (P.S. To give a container the same hard-coded domain-to-IP mapping, you can use the docker run --add-host parameter. But I did not want to merely work around the issue; I wanted to actually fix the network problem.)
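The lookup order that let apt-get work on the host can be sketched as follows. This assumes the common "files, then dns" order for host lookups; the function names and the 192.0.2.10 address (a documentation-only range) are hypothetical, since the article does not show the actual record.

```python
from typing import Callable, Dict, Optional


def parse_hosts(text: str) -> Dict[str, str]:
    """Map each hostname or alias in an /etc/hosts-style text to its IP.

    The first entry for a name wins; '#' starts a comment.
    """
    table: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), ip)
    return table


def resolve(name: str, hosts_text: str,
            dns_lookup: Callable[[str], Optional[str]]) -> Optional[str]:
    """Consult the hosts table first and fall back to DNS only on a miss."""
    ip = parse_hosts(hosts_text).get(name.lower())
    return ip if ip is not None else dns_lookup(name)
```

With a hard-coded record present, resolve never reaches the broken DNS servers, which is why the host appeared healthy. A container has its own /etc/hosts, so it does not benefit from the host's entry.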
So I found a machine that could resolve the domain name, copied its DNS server address, and put it into the /etc/resolv.conf configuration file. See the first DNS server record below.
# cat /etc/resolv.conf
# Generated by NetworkManager
#search novalocal
nameserver 10.129.2.34
nameserver 100.79.1.250
nameserver 100.79.1.46
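Because the resolver tries nameserver entries in the order they are listed, putting the working server first matters. A small sketch of reading that order out of a resolv.conf-style text is shown below; the function name is hypothetical and only the nameserver and search keywords are handled.

```python
from typing import List, Tuple


def parse_resolv_conf(text: str) -> Tuple[List[str], List[str]]:
    """Return (nameservers in listed order, search suffixes).

    Lines starting with '#' or ';' are comments, so a commented-out
    search line contributes nothing.
    """
    nameservers: List[str] = []
    search: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            nameservers.append(fields[1])
        elif fields[0] == "search":
            search = fields[1:]
    return nameservers, search
```

Running it on the file above gives 10.129.2.34 as the first server and an empty search list.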
Then I deleted the record from /etc/hosts and tried again: the host could now resolve the domain name.
Next I recreated the container and confirmed that the content of /etc/resolv.conf inside the container was also correct.
But the domain name still could not be resolved...
Promiscuous Mode
Since it still did not work, it was time to bring out the big gun, tcpdump (see the chapter "Learning Cloud Networking with Teacher Tang - tcpdump"). I went back to capturing packets on docker0 on the host:
tcpdump -nn -i docker0 port 53
Then, inside the container, run the same command again:
ping mirrors.tools.huawei.com
Strangely, this time it worked. (So I stopped tcpdump and told my colleague, who then reported that it still did not work. I tried again, and indeed it did not.)
After a few attempts I found that the network works whenever tcpdump is capturing, and stops working as soon as tcpdump is stopped. What does that remind you of?
Readers of Teacher Tang's tcpdump chapter may remember that the tcpdump command puts the network interface into promiscuous mode, so that it accepts packets not addressed to itself.
That's right. Query the interface flags with:
cat /sys/class/net/docker0/flags
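This shows the flag state (the third hex digit from the right, 0 or 1, indicates whether the interface is in promiscuous mode).
The result is 0x1003.
While tcpdump is running, it is 0x1103.
That explains the problem: it is caused by docker0 not being in promiscuous mode by default.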
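The two flag values can be decoded bit by bit. The sketch below uses the interface flag constants from the Linux header linux/if.h (only the four bits that appear in these values); the function names are hypothetical.

```python
from typing import List

# Interface flag bits as defined in linux/if.h (subset).
IFF_UP = 0x1
IFF_BROADCAST = 0x2
IFF_PROMISC = 0x100
IFF_MULTICAST = 0x1000

FLAG_NAMES = {
    IFF_UP: "UP",
    IFF_BROADCAST: "BROADCAST",
    IFF_PROMISC: "PROMISC",
    IFF_MULTICAST: "MULTICAST",
}


def decode_flags(text: str) -> List[str]:
    """Decode the hex string read from /sys/class/net/<dev>/flags."""
    value = int(text.strip(), 16)
    return [name for bit, name in FLAG_NAMES.items() if value & bit]


def is_promiscuous(text: str) -> bool:
    """True if the IFF_PROMISC bit (0x100) is set in the flags value."""
    return bool(int(text.strip(), 16) & IFF_PROMISC)
```

Here 0x1003 decodes to UP, BROADCAST, MULTICAST, and 0x1103 adds PROMISC, so the only difference between the two readings is the 0x100 bit.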
So the fix is to put the docker0 interface directly into promiscuous mode:
ifconfig docker0 promisc
or
ip link set docker0 promisc on
Then verify: everything is OK.
ip link set docker0 promisc off
As a second confirmation, after promiscuous mode is turned off again with the command above, the container goes back to failing.
At this point, the whole network problem is solved.
Possible Root Cause
My guess is that because this machine was freshly installed, the IP forwarding switch was off at first. Docker was then installed manually from an offline package, which left docker0 in non-promiscuous mode.
After turning on the IP forwarding switch, running the following:
ip link delete docker0
systemctl restart docker.service
should, in principle, also fix the problem.
Summary
To summarize why the container network was not working this time:
- IP forwarding was not turned on for the host. ==> Cause 1
- The search field was present in resolv.conf. ==> Red herring
- The host's DNS server address was not set correctly. ==> Cause 2
- Promiscuous mode was not turned on for the docker0 bridge. ==> Cause 3
This network failure did not involve any iptables issues, so it was relatively simple. Thanks for reading.