Flannel's host-gw and calico
2022-06-26 11:13:00 【Famine - Yu Xi】
Switching the cluster to flannel host-gw mode
Configure the cluster to use host-gw:
Edit the ConfigMap
kubectl edit -n kube-system configmaps kube-flannel-cfg
Change the content:
...
net-conf.json: |
  {
    "Network": "10.244.0.0/16",
    "Backend": {
      "Type": "vxlan"
    }
  }
...
Change "Type": "vxlan" to "Type": "host-gw".
Restart the flannel DaemonSet
kubectl rollout restart -n kube-system daemonset kube-flannel-ds
Check whether it started successfully
kubectl logs -n kube-system kube-flannel-ds-467p2|grep "host-gw"
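The ConfigMap edit described above (switching Backend.Type from vxlan to host-gw) can also be sketched in code. This is only an illustration of the change to the net-conf.json value; the helper name set_backend_type is hypothetical, and the sketch does not talk to the cluster.

```python
import json

# The net-conf.json value as it appears in the kube-flannel-cfg ConfigMap above.
net_conf = '{ "Network": "10.244.0.0/16", "Backend": { "Type": "vxlan" } }'


def set_backend_type(net_conf_json: str, backend_type: str) -> str:
    """Return the net-conf.json text with Backend.Type replaced (hypothetical helper)."""
    conf = json.loads(net_conf_json)
    conf.setdefault("Backend", {})["Type"] = backend_type
    return json.dumps(conf, indent=2)


updated = set_backend_type(net_conf, "host-gw")
print(updated)
```

The "Network" value is left untouched; only the backend type changes, which is why the pod subnets stay the same after the restart.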
Check the node routing table:
[[email protected] ~]# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.88.8.254 0.0.0.0 UG 100 0 0 ens192
10.88.8.0 0.0.0.0 255.255.252.0 U 100 0 0 ens192
10.244.0.0 0.0.0.0 255.255.255.0 U 0 0 0 cni0
10.244.1.0 10.88.10.182 255.255.255.0 UG 0 0 0 ens192
10.244.2.0 10.88.10.183 255.255.255.0 UG 0 0 0 ens192
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
In host-gw mode, the gateway of each remote flannel subnet in the routing table is the IP of the host that owns that subnet. In VXLAN mode, by contrast, the device for every flannel subnet in the routing table is flannel.1.
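As a rough illustration of how to read that table, the sketch below picks out the pod subnets that are reached through a gateway. The rows are copied from the route -n output above; the function name remote_pod_routes is made up for this example.

```python
import ipaddress

# Rows from the master node's `route -n` output above:
# (destination, gateway, genmask, interface)
ROUTES = [
    ("0.0.0.0", "10.88.8.254", "0.0.0.0", "ens192"),
    ("10.88.8.0", "0.0.0.0", "255.255.252.0", "ens192"),
    ("10.244.0.0", "0.0.0.0", "255.255.255.0", "cni0"),
    ("10.244.1.0", "10.88.10.182", "255.255.255.0", "ens192"),
    ("10.244.2.0", "10.88.10.183", "255.255.255.0", "ens192"),
    ("172.17.0.0", "0.0.0.0", "255.255.0.0", "docker0"),
]
# The flannel "Network" value from net-conf.json.
POD_CIDR = ipaddress.ip_network("10.244.0.0/16")


def remote_pod_routes(routes):
    """Return (subnet, gateway, iface) for pod subnets reached through a gateway."""
    result = []
    for dest, gateway, mask, iface in routes:
        subnet = ipaddress.ip_network(f"{dest}/{mask}")
        # A gateway of 0.0.0.0 means the subnet is directly connected.
        if subnet.subnet_of(POD_CIDR) and gateway != "0.0.0.0":
            result.append((subnet, gateway, iface))
    return result


for subnet, gateway, iface in remote_pod_routes(ROUTES):
    print(f"{subnet} -> next hop {gateway} via {iface}")
```

Both remote subnets leave through ens192 with a node IP as the gateway, which is the signature of host-gw mode.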
host-gw
host-gw is a layer-3 cross-host networking solution. Unlike the virtual layer-2 network of VXLAN mode, it forwards purely on IP addresses.
Communication between pods on the same host is skipped here; we start directly with communication between pods on different hosts.
The same pods as in the previous test are used: a packet is sent from the pod on the master node (IP 10.244.0.5) to the pod on the worker1 node (IP 10.244.1.17).
[[email protected] ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
network-tools-66b6674fd9-pjpz6 1/1 Running 0 27h 10.244.0.5 master <none> <none>
network-tools-66b6674fd9-tmgcs 1/1 Running 0 27h 10.244.2.13 worker2 <none> <none>
network-tools-66b6674fd9-zk58f 1/1 Running 0 27h 10.244.1.17 worker1 <none> <none>
First, enter the pod on the master node and check its routing table:
[email protected]:/# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.244.0.1 0.0.0.0 UG 0 0 0 eth0
10.244.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
10.244.0.0 10.244.0.1 255.255.0.0 UG 0 0 0 eth0
Nothing has changed here. 0.0.0.0 is the default route, and the 10.244.0.0 entry with gateway 0.0.0.0 is the directly connected route for this node's own pod subnet, which does not apply to this destination. The packet therefore follows the third route (10.244.0.0/16 via 10.244.0.1). Next, check on the host which network device holds this IP.
[[email protected] ~]# ifconfig |grep "10.244.0.1" -B 5 -A 3
cni0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 10.244.0.1 netmask 255.255.255.0 broadcast 10.244.0.255
inet6 fe80::f03d:adff:fea2:563e prefixlen 64 scopeid 0x20<link>
ether f2:3d:ad:a2:56:3e txqueuelen 1000 (Ethernet)
RX packets 1988416 bytes 269346410 (256.8 MiB)
It is still cni0.
cni0 is the bridge created on a Kubernetes node in place of docker0 (with flannel it is set up by the CNI bridge plugin). A network plugin, whether flannel or calico, only has to follow the Kubernetes (CNI) conventions, so you do not need to care which container runtime is used underneath or which network it is paired with.
As before, traffic flows out of cni0 into the host network and is then matched against the host routing table. View the host routes:
[[email protected] ~]# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.88.8.254 0.0.0.0 UG 100 0 0 ens192
10.88.8.0 0.0.0.0 255.255.252.0 U 100 0 0 ens192
10.244.0.0 0.0.0.0 255.255.255.0 U 0 0 0 cni0
10.244.1.0 10.88.10.182 255.255.255.0 UG 0 0 0 ens192
10.244.2.0 10.88.10.183 255.255.255.0 UG 0 0 0 ens192
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
This is where things differ. The interface (Iface) for the target subnet is no longer the flannel.1 virtual device but the host's ens192 NIC, and the gateway is now 10.88.10.182, which is the IP of a node on the host network segment.
The ip route command gives more detail:
[[email protected] ~]# ip route
default via 10.88.8.254 dev ens192 proto static metric 100
10.88.8.0/22 dev ens192 proto kernel scope link src 10.88.10.181 metric 100
10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1
10.244.1.0/24 via 10.88.10.182 dev ens192
10.244.2.0/24 via 10.88.10.183 dev ens192
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
You can see that each remote flannel subnet has a via entry, for example via 10.88.10.182. via means the "next hop" address, and the packet is sent out through the ens192 NIC.
This is a rule the host can apply directly. When a packet comes out of cni0, the host forwards it as is: ens192 wraps it in a layer-2 frame addressed to the MAC address of the next hop (no extra IP layer is needed, because what leaves the container is already an IP packet). The frame then travels over the physical network to the NIC of the peer host.
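The route selection described above is a longest-prefix match. A minimal sketch of it, using the routes from the ip route output on the master node (the function name lookup is chosen for this example, and real kernels also consider metrics and policy rules):

```python
import ipaddress

# Host routes on the master node, taken from the `ip route` output above:
# (prefix, next hop or None when directly connected, device)
MASTER_ROUTES = [
    ("0.0.0.0/0", "10.88.8.254", "ens192"),
    ("10.88.8.0/22", None, "ens192"),
    ("10.244.0.0/24", None, "cni0"),
    ("10.244.1.0/24", "10.88.10.182", "ens192"),
    ("10.244.2.0/24", "10.88.10.183", "ens192"),
    ("172.17.0.0/16", None, "docker0"),
]


def lookup(routes, dst):
    """Longest-prefix match: return the most specific route that contains dst."""
    addr = ipaddress.ip_address(dst)
    candidates = [
        (ipaddress.ip_network(prefix), via, dev)
        for prefix, via, dev in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    return max(candidates, key=lambda route: route[0].prefixlen)


net, via, dev = lookup(MASTER_ROUTES, "10.244.1.17")
print(net, via, dev)
```

For 10.244.1.17 this selects 10.244.1.0/24 with next hop 10.88.10.182 on ens192, while a local pod address such as 10.244.0.5 resolves to cni0 with no next hop.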
After the peer host receives the frame, it strips the layer-2 header and matches the destination IP directly against its own routing table:
[[email protected] ~]# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.88.8.254 0.0.0.0 UG 100 0 0 ens192
10.88.8.0 0.0.0.0 255.255.252.0 U 100 0 0 ens192
10.244.0.0 10.88.10.181 255.255.255.0 UG 0 0 0 ens192
10.244.1.0 0.0.0.0 255.255.255.0 U 0 0 0 cni0
10.244.2.0 10.88.10.183 255.255.255.0 UG 0 0 0 ens192
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
The destination address 10.244.1.17 matches the fourth rule. Its gateway is 0.0.0.0, which means it is a directly connected route, and the corresponding device is cni0.
The packet is then delivered through cni0 to the target pod.
Because the host node's IP is used as the packet's "next hop" address, host-gw mode requires the hosts to have layer-2 connectivity to each other.
The source IP and destination IP in the layer-3 header are still the container IPs. So for the packet to be forwarded, the layer-2 frame must carry the MAC address of the worker1 node, and the next-hop address is what lets the sending host address the frame to worker1.
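One way to sanity-check this requirement is to confirm that every next hop lies inside the network directly connected to ens192. The sketch below does only that; being in the same IP subnet is used here as a stand-in for layer-2 reachability, and 192.168.1.10 is a made-up address added as a counterexample.

```python
import ipaddress

# Network directly connected to ens192 on the master node (from `ip route` above).
HOST_LAN = ipaddress.ip_network("10.88.8.0/22")


def next_hop_is_on_link(next_hop: str, connected=HOST_LAN) -> bool:
    """True if the next hop sits in the directly connected network (reachable via ARP)."""
    return ipaddress.ip_address(next_hop) in connected


# 192.168.1.10 is a hypothetical node behind a router, not part of the cluster above.
for hop in ("10.88.10.182", "10.88.10.183", "192.168.1.10"):
    print(hop, next_hop_is_on_link(hop))
```

A next hop that fails this check could not be used by host-gw, because the sending host would have no way to resolve its MAC address on the local segment.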
Once Kubernetes is involved, iptables rules also take part in forwarding.
Even so, the path clearly has one hop fewer because flannel.1 is no longer traversed, so host-gw performs better than VXLAN. Reportedly, host-gw mode loses about 10% compared with direct host-to-host transmission, while VXLAN loses between 20% and 30%.
calico
calico's network goes even further. It does not use a bridge at all; it connects each container to the host directly with a veth pair (the virtual devices it creates on the host start with cali).
As before, suppose a pod on the master node (IP: 10.244.235.130) accesses a pod on worker1 (IP: 10.244.235.129).
[[email protected] ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
network-tools-66b6674fd9-kf77w 1/1 Running 0 7m37s 10.244.235.130 worker1 <none> <none>
network-tools-66b6674fd9-rqf8x 1/1 Running 0 7m37s 10.244.189.67 worker2 <none> <none>
network-tools-66b6674fd9-wppm4 1/1 Running 0 7m37s 10.244.235.129 worker1 <none> <none>
After the packet reaches the host node, it is routed directly according to the host's routing table:
[[email protected] ~]# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.88.8.254 0.0.0.0 UG 100 0 0 ens192
10.88.8.0 0.0.0.0 255.255.252.0 U 100 0 0 ens192
10.244.0.0 0.0.0.0 255.255.255.0 U 0 0 0 cni0
10.244.1.0 10.88.10.182 255.255.255.255 UGH 0 0 0 tunl0
10.244.1.0 10.88.10.182 255.255.255.0 UG 0 0 0 ens192
10.244.2.0 10.88.10.183 255.255.255.255 UGH 0 0 0 tunl0
10.244.2.0 10.88.10.183 255.255.255.0 UG 0 0 0 ens192
10.244.189.64 10.88.10.183 255.255.255.192 UG 0 0 0 tunl0
10.244.219.64 0.0.0.0 255.255.255.192 U 0 0 0 *
10.244.219.65 0.0.0.0 255.255.255.255 UH 0 0 0 cali88526110d1d
10.244.235.128 10.88.10.182 255.255.255.192 UG 0 0 0 tunl0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
It clearly matches the second-to-last rule:
10.244.235.128 10.88.10.182 255.255.255.192 UG 0 0 0 tunl0
Its gateway is 10.88.10.182, the host IP of the worker1 node. This is very similar to flannel's host-gw approach: the route specifies the next-hop address.
However, this packet has to go out through the tunl0 device. When the packet reaches tunl0 it is encapsulated again: the IP packet sent by the container (its original source and destination IP header together with its data) is treated as the payload, and a new IP header is wrapped around it. This outer IP header hides the original one and "disguises" the packet as ordinary traffic from the master host to the worker1 host, which is then sent out through the ens192 NIC over the host network.
When worker1 receives it, the host first strips the outer layers and hands the packet to the tunl0 device, which restores the original layer-3 packet sent by the container, that is, the one with destination IP 10.244.235.129.
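The encapsulation step can be sketched with plain byte manipulation. This is a simplified model of what the kernel's tunl0 device does, not its actual implementation: the headers carry no options, the TTL of 64 and the 4-byte payload are placeholders, and the helper names are made up for this example. Protocol number 4 is the IANA value for IPv4-in-IPv4.

```python
import socket
import struct

IPPROTO_IPIP = 4  # IANA protocol number for IPv4-in-IPv4 encapsulation
HEADER_FORMAT = "!BBHHHBBH4s4s"  # 20-byte IPv4 header without options


def checksum(header: bytes) -> int:
    """16-bit one's-complement checksum used by the IPv4 header."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src: str, dst: str, proto: int, payload_len: int) -> bytes:
    """Build a minimal IPv4 header (version 4, IHL 5, TTL 64)."""
    fields = [0x45, 0, 20 + payload_len, 0, 0, 64, proto, 0,
              socket.inet_aton(src), socket.inet_aton(dst)]
    fields[7] = checksum(struct.pack(HEADER_FORMAT, *fields))
    return struct.pack(HEADER_FORMAT, *fields)


def encapsulate(inner: bytes, outer_src: str, outer_dst: str) -> bytes:
    """Wrap a whole IP packet in a new outer IP header (what tunl0 does on send)."""
    return ipv4_header(outer_src, outer_dst, IPPROTO_IPIP, len(inner)) + inner


def decapsulate(packet: bytes) -> bytes:
    """Strip the outer IP header and return the original packet."""
    if packet[9] != IPPROTO_IPIP:
        raise ValueError("not an IPIP packet")
    header_len = (packet[0] & 0x0F) * 4
    return packet[header_len:]


# Inner packet between the pod IPs from the text; protocol 1 (ICMP) and the
# payload b"ping" are placeholders, not a real ICMP message.
inner = ipv4_header("10.244.235.130", "10.244.235.129", 1, 4) + b"ping"
outer = encapsulate(inner, "10.88.10.181", "10.88.10.182")
print(len(inner), len(outer))
```

The outer header only mentions the two host IPs, so routers between the nodes never see the pod addresses; decapsulate(outer) returns the inner packet byte for byte.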
Where the packet is forwarded next depends on the routing table of the worker1 host.
In calico mode, for every pod that is created, calico creates a corresponding virtual network device connecting it to the host and adds a route on the host that records the mapping between the container IP and that device. This is similar to a "border gateway".
[[email protected] ~]# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.88.8.254 0.0.0.0 UG 100 0 0 ens192
10.88.8.0 0.0.0.0 255.255.252.0 U 100 0 0 ens192
10.244.0.0 10.88.10.181 255.255.255.255 UGH 0 0 0 tunl0
10.244.0.0 10.88.10.181 255.255.255.0 UG 0 0 0 ens192
10.244.1.0 0.0.0.0 255.255.255.0 U 0 0 0 cni0
10.244.2.0 10.88.10.183 255.255.255.255 UGH 0 0 0 tunl0
10.244.2.0 10.88.10.183 255.255.255.0 UG 0 0 0 ens192
10.244.189.64 10.88.10.183 255.255.255.192 UG 0 0 0 tunl0
10.244.219.64 10.88.10.181 255.255.255.192 UG 0 0 0 tunl0
10.244.235.128 0.0.0.0 255.255.255.192 U 0 0 0 *
10.244.235.129 0.0.0.0 255.255.255.255 UH 0 0 0 calia5b07904eb8
10.244.235.130 0.0.0.0 255.255.255.255 UH 0 0 0 calif1f8fa46c64
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
Next, the packet hits the route for 10.244.235.129 (third from the bottom):
10.244.235.129 0.0.0.0 255.255.255.255 UH 0 0 0 calia5b07904eb8
The packet is forwarded directly through the calia5b07904eb8 device. This device is the host end of a veth pair whose other end is inside the container, so the packet goes straight into the container.
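The relationship between the node's address block and the per-pod routes on worker1 can be sketched as follows. The values are copied from the route -n output above; the function name device_for is made up, and the behaviour for an unused address (returning None) is a simplification chosen for this example.

```python
import ipaddress

# Block route and per-pod host routes on worker1, copied from `route -n` above.
block = ipaddress.ip_network("10.244.235.128/255.255.255.192")
pod_routes = {
    "10.244.235.129": "calia5b07904eb8",
    "10.244.235.130": "calif1f8fa46c64",
}


def device_for(dst: str):
    """Return the veth device for a local pod IP, or None if no pod uses that IP."""
    if ipaddress.ip_address(dst) not in block:
        raise ValueError(f"{dst} is not in this node's block {block}")
    return pod_routes.get(dst)


print(block, block.num_addresses)
print(device_for("10.244.235.129"))
```

The netmask 255.255.255.192 corresponds to a /26 block of 64 addresses, and each pod inside it gets its own 255.255.255.255 (single address) route pointing at its cali device.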
This mode, in which tunl0 encapsulates and decapsulates the packets, is called IPIP mode (the name is quite literal: one IP layer wrapped around another IP layer). Its performance is about the same as flannel's VXLAN. Its advantage is that it still works in environments where traffic between nodes has to be routed (for example, when the cluster nodes are spread across two LANs).
flannel's VXLAN mode is designed to overlay a virtual layer-2 network on top of an existing layer-3 network, so it forwards packets based on MAC addresses. host-gw mode uses the peer host's IP from the routing table as the packet's "next hop" address, so it requires layer-2 connectivity between the hosts. After calico's IPIP encapsulation, a packet is disguised as ordinary traffic from one node to another, so it can be forwarded by routers. Performance is somewhat lower, but more nodes can be supported (provided that two nodes in different network segments can reach each other through the router). calico also has another mode that needs no IPIP encapsulation; like host-gw, it configures the peer host's IP address as the "next hop" address.
calico's second mode
Node-to-Node Mesh mode
calico uses **BGP (Border Gateway Protocol)** to maintain the routing information of each node.
A small BGP program runs on each node and sends the node's routing information to the other nodes. The program on each receiving node parses it and adds it to that node's own routing table.
The BGP programs on all nodes form a mesh and synchronize routing table information with each other.
In this mode, a container's packets go through the veth pair straight to the host and are matched against the host routing table with no further encapsulation. For each peer host, BGP adds a route to the host routing table whose destination is that host's container network segment, whose gateway is the peer host's IP address, and whose outgoing device is the host's physical NIC, like this:
10.244.1.0 10.88.10.182 255.255.255.0 UG 0 0 0 ens192
The packet matches this route directly and treats the NIC of the peer host as a "router" (the next-hop address). After arriving at the peer host, it is matched against that host's routing table, where calico has added each local container's IP together with the name of its veth pair device, like this:
10.244.235.130 0.0.0.0 255.255.255.255 UH 0 0 0 calif1f8fa46c64
The packet then matches this route and is forwarded into the pod.
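The route exchange in Node-to-Node Mesh mode can be modelled with a toy simulation. This is not how the BGP daemon is implemented; it only shows the end result, where every node learns every other node's block with the advertising node's host IP as next hop. The node IPs and blocks are taken from the routing tables shown earlier, and full_mesh_sync is a name made up for this example.

```python
# Toy model only: a real deployment runs a BGP daemon on each node.
NODES = {
    "master": {"ip": "10.88.10.181", "block": "10.244.219.64/26"},
    "worker1": {"ip": "10.88.10.182", "block": "10.244.235.128/26"},
    "worker2": {"ip": "10.88.10.183", "block": "10.244.189.64/26"},
}


def full_mesh_sync(nodes):
    """Each node advertises its block to every other node, which installs a route."""
    tables = {name: {} for name in nodes}
    for advertiser, info in nodes.items():
        for receiver in nodes:
            if receiver != advertiser:
                # Route on the receiver: advertiser's block via advertiser's host IP.
                tables[receiver][info["block"]] = info["ip"]
    return tables


tables = full_mesh_sync(NODES)
for name, routes in tables.items():
    print(name, routes)
```

Each node ends up with one route per remote node and none for its own block, which it serves locally through the per-pod cali routes.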
Border gateway

As shown in the figure, host 10.10.0.2 can access host 172.17.0.2. When it sends the packet, it applies its subnet mask and finds that the peer host is not in the same network segment as itself, so it sends the packet to its gateway.
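The mask operation mentioned above is a bitwise AND of each address with the netmask, followed by a comparison. A minimal sketch, assuming a netmask of 255.255.255.0 (the text does not state one) and using 10.10.0.3 as a made-up neighbour on the same LAN:

```python
import ipaddress


def same_segment(src: str, dst: str, netmask: str) -> bool:
    """Apply the sender's netmask to both addresses and compare the network parts."""
    mask = int(ipaddress.ip_address(netmask))
    src_net = int(ipaddress.ip_address(src)) & mask
    dst_net = int(ipaddress.ip_address(dst)) & mask
    return src_net == dst_net


# 255.255.255.0 is an assumed mask; 10.10.0.3 is a hypothetical local host.
print(same_segment("10.10.0.2", "172.17.0.2", "255.255.255.0"))
print(same_segment("10.10.0.2", "10.10.0.3", "255.255.255.0"))
```

When the result is False the sender hands the packet to its gateway; when it is True it can deliver the packet directly on the local segment.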
When the packet arrives at the gateway, the router strips the outer layer and gets the layer-3 IP packet. The gateway's routing table has an entry (172.17.0.2, Route2) indicating that packets for this network segment are to be sent to another router, Route2. So route1 forwards the packet to route2, whose routing table records the port corresponding to this IP, and route2 forwards the packet directly to the target host through that port.
But the other way around, 172.17.0.2 cannot access 10.10.0.2 at all, because its gateway has no information about the other LAN. So hosts in the as1 LAN can access hosts in the as2 LAN without problems, while the reverse direction is completely blocked. A router like this, which connects two network segments together, or which has another router's subnet in its routing table, is called a "border gateway".
In the calico network above, every node acts as a "border gateway". Together they form one large network, and the border gateways synchronize routes with each other over the BGP protocol. But the more nodes there are, the more routing information has to be synchronized, so calico offers one more mode, Route Reflector. In this mode calico selects a few nodes that establish sessions with all the border gateways and synchronize routing information with them. The other nodes only need to synchronize with these selected nodes.
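A rough way to see why this helps is to count BGP sessions. The sketch below assumes a simple topology that the text does not spell out: in Route Reflector mode the reflectors peer with each other, and every other node peers with every reflector. The function names and the choice of two reflectors are assumptions for illustration.

```python
def full_mesh_sessions(nodes: int) -> int:
    """Sessions when every node peers with every other node."""
    return nodes * (nodes - 1) // 2


def route_reflector_sessions(nodes: int, reflectors: int) -> int:
    """Sessions when reflectors peer with each other and each other node peers with every reflector."""
    if not 0 < reflectors <= nodes:
        raise ValueError("reflectors must be between 1 and the number of nodes")
    clients = nodes - reflectors
    return reflectors * (reflectors - 1) // 2 + clients * reflectors


# Two reflectors is an arbitrary choice for this comparison.
for n in (3, 50, 200):
    print(n, full_mesh_sessions(n), route_reflector_sessions(n, reflectors=2))
```

Under these assumptions a 200-node cluster needs 19900 sessions in a full mesh but only 397 with two reflectors, so the session count grows linearly instead of quadratically.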