iptables causes heartbeat split-brain
2022-07-04 11:02:00 【Brother Xing plays with the clouds】
There are still many things to watch out for when running heartbeat in a production environment; a small oversight can leave heartbeat unable to fail over, or can cause split-brain. This post describes a split-brain that was caused by iptables.
Primary: 192.168.3.218
192.168.4.218 (heartbeat IP)
usvr-218 (hostname)
Standby: 192.168.3.128
192.168.4.128 (heartbeat IP)
usvr-128 (hostname)
Symptom: after heartbeat is started on the primary, the VIP comes up on 218; heartbeat is then started on the standby, and the VIP comes up on 128 as well. At this point split-brain has occurred, and access to the service fails.
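One quick way to confirm the symptom is to look for the VIP in each node's address list. The sketch below is only an illustration: the helper name holds_vip is made up here, and 192.168.3.200 is a placeholder because the post does not state the actual VIP. The function reads `ip -o -4 addr show` output on stdin, so the same check can be run on either node.

```shell
#!/bin/sh
# holds_vip VIP: reads `ip -o -4 addr show` output on stdin and succeeds
# (exit status 0) if the given address is configured on any interface.
# grep -F treats the address as a fixed string, so the dots are literal.
holds_vip() {
    grep -qF "inet $1/"
}

# Usage on each node (192.168.3.200 is a placeholder VIP, not taken from the post):
#   ip -o -4 addr show | holds_vip 192.168.3.200 && echo "VIP is active on this node"
```

If the check succeeds on both 218 and 128 at the same time, both nodes hold the VIP, which is the split-brain described above.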
Troubleshooting:
1. Check the logs on the primary and the standby
Log on primary 218 (excerpt):
heartbeat[27330]: 2015/01/27_09:05:29 ERROR: Message hist queue is filling up (500 messages in queue)
heartbeat[27330]: 2015/01/27_09:05:30 ERROR: Message hist queue is filling up (500 messages in queue)
heartbeat[27330]: 2015/01/27_09:05:30 ERROR: Message hist queue is filling up (500 messages in queue)
heartbeat[27330]: 2015/01/27_09:05:31 ERROR: Message hist queue is filling up (500 messages in queue)
heartbeat[27330]: 2015/01/27_09:05:32 ERROR: Message hist queue is filling up (500 messages in queue)
heartbeat[27330]: 2015/01/27_09:05:32 ERROR: Message hist queue is filling up (500 messages in queue)
heartbeat[27330]: 2015/01/27_09:05:33 WARN: node usvr-128: is dead
heartbeat[27330]: 2015/01/27_09:05:33 info: Cancelling pending standby operation
heartbeat[27330]: 2015/01/27_09:05:33 info: Dead node usvr-128 gave up resources.
heartbeat[27330]: 2015/01/27_09:05:33 info: all clients are now resumed
heartbeat[27330]: 2015/01/27_09:05:33 ERROR: lowseq cannnot be greater than ackseq
heartbeat[27330]: 2015/01/27_09:05:33 info: hist->ackseq =74575, old_ackseq=0
heartbeat[27330]: 2015/01/27_09:05:33 info: hist->lowseq =74576, hist->hiseq=74824, send_cluster_msg_level=1
heartbeat[27333]: 2015/01/27_09:05:34 CRIT: Emergency Shutdown: Master Control process died.
heartbeat[27333]: 2015/01/27_09:05:34 CRIT: Killing pid 27330 with SIGTERM
heartbeat[27333]: 2015/01/27_09:05:34 CRIT: Killing pid 27334 with SIGTERM
heartbeat[27333]: 2015/01/27_09:05:34 CRIT: Killing pid 27335 with SIGTERM
heartbeat[27333]: 2015/01/27_09:05:34 CRIT: Killing pid 27336 with SIGTERM
heartbeat[27333]: 2015/01/27_09:05:34 CRIT: Killing pid 27337 with SIGTERM
heartbeat[27333]: 2015/01/27_09:05:34 CRIT: Emergency Shutdown(MCP dead): Killing ourselves.
Log on standby 128 (excerpt):
Jan 27 10:11:35 heartbeat: [15999]: info: glib: ucast: bound receive socket to device: eth0
Jan 27 10:11:35 heartbeat: [15999]: info: glib: ucast: set SO_REUSEPORT(w)
Jan 27 10:11:35 heartbeat: [15999]: info: glib: ucast: started on port 694 interface eth0 to 192.168.4.218
Jan 27 10:11:35 heartbeat: [15999]: info: glib: ping heartbeat started.
Jan 27 10:11:35 heartbeat: [15999]: info: G_main_add_TriggerHandler: Added signal manual handler
Jan 27 10:11:35 heartbeat: [15999]: info: G_main_add_TriggerHandler: Added signal manual handler
Jan 27 10:11:35 heartbeat: [15999]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Jan 27 10:11:35 heartbeat: [15999]: info: Local status now set to: 'up'
Jan 27 10:11:35 heartbeat: [15999]: info: Link 192.168.3.1:192.168.3.1 up.
Jan 27 10:11:35 heartbeat: [15999]: info: Status update for node 192.168.3.1: status ping
Jan 27 10:13:35 heartbeat: [15999]: WARN: node usvr-218: is dead
Jan 27 10:13:35 heartbeat: [15999]: info: Comm_now_up(): updating status to active
Jan 27 10:13:35 heartbeat: [15999]: info: Local status now set to: 'active'
Jan 27 10:13:35 heartbeat: [15999]: info: Starting child client "/usr/lib64/heartbeat/ipfail" (498,498)
Jan 27 10:13:35 heartbeat: [15999]: WARN: No STONITH device configured.
Jan 27 10:13:35 heartbeat: [15999]: WARN: Shared disks are not protected.
Jan 27 10:13:35 heartbeat: [15999]: info: Resources being acquired from localsv218.
As the logs show, each node concludes that the other node is dead and takes over the VIP, which results in split-brain.
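The "is dead" lines are the ones to look for on both nodes. As a minimal sketch, the hypothetical helper below (dead_nodes is a name introduced here, not part of heartbeat) extracts the node names from log lines of the form shown above. The log file path in the usage comment is an assumption; the actual location depends on the logging settings in ha.cf.

```shell
#!/bin/sh
# dead_nodes: reads heartbeat log text on stdin and prints, one per line and
# without duplicates, every node name that the log declares dead.
# It matches lines of the form "WARN: node <name>: is dead".
dead_nodes() {
    sed -n 's/.*WARN: node \([^:]*\): is dead.*/\1/p' | sort -u
}

# Usage (the log path is an assumption; adjust it to your ha.cf logging setup):
#   dead_nodes < /var/log/ha-log
```

If 218 reports usvr-128 as dead while 128 reports usvr-218 as dead, neither node is receiving the other's heartbeat packets.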
2. The initial conclusion was that communication between the primary and the standby was failing or suffering from network delay. Could unsynchronized clocks be the cause? A small clock difference has little effect on heartbeat, but a large one is bound to cause problems, so synchronize the time on both nodes.
/usr/sbin/ntpdate ntp.api.bz && hwclock -w
echo "0 23 * * * root (/usr/sbin/ntpdate ntp.api.bz && hwclock -w) > /dev/null 2>&1" >> /etc/crontab
3. After the clocks were synchronized, the logs still showed the same errors. Checking the configuration files on both nodes again turned up no problems. The one remaining factor was that both the primary and the standby run a firewall. heartbeat communicates over UDP port 694 by default, so allow UDP port 694 through the firewall.
On primary 218, add:
/sbin/iptables -A INPUT -i eth0 -p udp -s 192.168.4.128 --dport 694 -m comment --comment "heartbeat-slave" -j ACCEPT
On standby 128, add:
/sbin/iptables -A INPUT -i eth0 -p udp -s 192.168.4.218 --dport 694 -m comment --comment "heartbeat-master" -j ACCEPT
Notes:
1. If the firewall policy is strict, the heartbeat IPs must also be allowed through; otherwise UDP communication will still fail.
2. The inbound interface (-i) must be the network interface that carries the heartbeat IP.
With the firewall configured, the primary and the standby communicate normally. Under normal conditions the primary node holds the VIP; when the primary node goes down or its heartbeat service stops, the standby node takes over the VIP.
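To double-check that the rule is actually in place, the following sketch may help. Keep in mind that `-A` appends to the end of the INPUT chain, so the rule only takes effect if no earlier rule already drops or rejects the packet. The helper has_heartbeat_rule is a hypothetical name introduced here, and its pattern assumes the usual `iptables-save` output layout (`-s` before `-p` before `--dport`), which may differ between iptables versions.

```shell
#!/bin/sh
# has_heartbeat_rule PEER_IP: reads `iptables-save` output on stdin and succeeds
# if the INPUT chain contains an ACCEPT rule for UDP port 694 from PEER_IP.
# Dots in PEER_IP act as regex wildcards, which is acceptable for a quick check.
has_heartbeat_rule() {
    grep -Eq -- "^-A INPUT .*-s $1(/32)? .*-p udp .*--dport 694 .*-j ACCEPT"
}

# Usage on primary 218 (run as root; swap the peer address on standby 128):
#   iptables-save | has_heartbeat_rule 192.168.4.128 || echo "heartbeat rule missing"
#   iptables -L INPUT -n -v --line-numbers   # rule order and per-rule packet counters
#   tcpdump -ni eth0 udp port 694            # heartbeat packets from the peer should show up
```

If the packet counter of the ACCEPT rule stays at zero while tcpdump shows traffic arriving on port 694, an earlier rule in the chain is probably matching first.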