iptables Causes heartbeat Split-Brain
2022-07-04 11:02:00 【Brother Xing plays with the clouds】
Running heartbeat in production requires attention to many details; a small oversight can leave heartbeat unable to fail over, or cause split-brain. This post describes a split-brain incident caused by iptables.
Primary: 192.168.3.218
192.168.4.218 (heartbeat IP)
usvr-218 (hostname)
Standby: 192.168.3.128
192.168.4.128 (heartbeat IP)
usvr-128 (hostname)
Symptom: after heartbeat is started on the primary, the VIP comes up on 218. When heartbeat is then started on the standby, the VIP also comes up on 128. At that point split-brain has occurred, and access to the service becomes abnormal.
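A quick way to confirm the symptom is to check on each node whether the VIP is currently bound. The sketch below is only an illustration: the function name has_vip and the address 192.168.3.200 are assumptions, since the post does not state the actual VIP.

```shell
# has_vip VIP
# Reads the output of `ip -o -4 addr show` on stdin and succeeds (exit 0)
# if VIP is bound as an IPv4 address on any interface.
# In that output the fourth field is ADDRESS/PREFIX, e.g. 192.168.3.200/24.
has_vip() {
    awk -v vip="$1" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit !found }
    '
}

# Usage (run on both nodes; 192.168.3.200 is a hypothetical VIP):
#   ip -o -4 addr show | has_vip 192.168.3.200 && echo "VIP is bound on $(hostname)"
```

If the message is printed on both 218 and 128 at the same time, both nodes hold the VIP, which is the split-brain state described above.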
Troubleshooting:
1. Check the logs on the primary and the standby
Log on the primary, 218 (excerpt):
heartbeat[27330]: 2015/01/27_09:05:29 ERROR: Message hist queue is filling up (500 messages in queue)
heartbeat[27330]: 2015/01/27_09:05:30 ERROR: Message hist queue is filling up (500 messages in queue)
heartbeat[27330]: 2015/01/27_09:05:30 ERROR: Message hist queue is filling up (500 messages in queue)
heartbeat[27330]: 2015/01/27_09:05:31 ERROR: Message hist queue is filling up (500 messages in queue)
heartbeat[27330]: 2015/01/27_09:05:32 ERROR: Message hist queue is filling up (500 messages in queue)
heartbeat[27330]: 2015/01/27_09:05:32 ERROR: Message hist queue is filling up (500 messages in queue)
heartbeat[27330]: 2015/01/27_09:05:33 WARN: node usvr-128: is dead
heartbeat[27330]: 2015/01/27_09:05:33 info: Cancelling pending standby operation
heartbeat[27330]: 2015/01/27_09:05:33 info: Dead node usvr-128 gave up resources.
heartbeat[27330]: 2015/01/27_09:05:33 info: all clients are now resumed
heartbeat[27330]: 2015/01/27_09:05:33 ERROR: lowseq cannnot be greater than ackseq
heartbeat[27330]: 2015/01/27_09:05:33 info: hist->ackseq =74575, old_ackseq=0
heartbeat[27330]: 2015/01/27_09:05:33 info: hist->lowseq =74576, hist->hiseq=74824, send_cluster_msg_level=1
heartbeat[27333]: 2015/01/27_09:05:34 CRIT: Emergency Shutdown: Master Control process died.
heartbeat[27333]: 2015/01/27_09:05:34 CRIT: Killing pid 27330 with SIGTERM
heartbeat[27333]: 2015/01/27_09:05:34 CRIT: Killing pid 27334 with SIGTERM
heartbeat[27333]: 2015/01/27_09:05:34 CRIT: Killing pid 27335 with SIGTERM
heartbeat[27333]: 2015/01/27_09:05:34 CRIT: Killing pid 27336 with SIGTERM
heartbeat[27333]: 2015/01/27_09:05:34 CRIT: Killing pid 27337 with SIGTERM
heartbeat[27333]: 2015/01/27_09:05:34 CRIT: Emergency Shutdown(MCP dead): Killing ourselves.
Log on the standby, 128 (excerpt):
Jan 27 10:11:35 heartbeat: [15999]: info: glib: ucast: bound receive socket to device: eth0
Jan 27 10:11:35 heartbeat: [15999]: info: glib: ucast: set SO_REUSEPORT(w)
Jan 27 10:11:35 heartbeat: [15999]: info: glib: ucast: started on port 694 interface eth0 to 192.168.4.218
Jan 27 10:11:35 heartbeat: [15999]: info: glib: ping heartbeat started.
Jan 27 10:11:35 heartbeat: [15999]: info: G_main_add_TriggerHandler: Added signal manual handler
Jan 27 10:11:35 heartbeat: [15999]: info: G_main_add_TriggerHandler: Added signal manual handler
Jan 27 10:11:35 heartbeat: [15999]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Jan 27 10:11:35 heartbeat: [15999]: info: Local status now set to: 'up'
Jan 27 10:11:35 heartbeat: [15999]: info: Link 192.168.3.1:192.168.3.1 up.
Jan 27 10:11:35 heartbeat: [15999]: info: Status update for node 192.168.3.1: status ping
Jan 27 10:13:35 heartbeat: [15999]: WARN: node usvr-218: is dead
Jan 27 10:13:35 heartbeat: [15999]: info: Comm_now_up(): updating status to active
Jan 27 10:13:35 heartbeat: [15999]: info: Local status now set to: 'active'
Jan 27 10:13:35 heartbeat: [15999]: info: Starting child client "/usr/lib64/heartbeat/ipfail" (498,498)
Jan 27 10:13:35 heartbeat: [15999]: WARN: No STONITH device configured.
Jan 27 10:13:35 heartbeat: [15999]: WARN: Shared disks are not protected.
Jan 27 10:13:35 heartbeat: [15999]: info: Resources being acquired from localsv218.
As the logs show, each side concludes that the other node is dead and takes over the VIP, which results in split-brain.
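The "is dead" lines are the key evidence in both logs. A minimal sketch for pulling them out of a heartbeat log is shown here; the function name dead_nodes is made up for illustration, and the log path in the usage comment is an assumption that depends on the logfile setting in ha.cf.

```shell
# dead_nodes
# Reads heartbeat log text on stdin and prints the name of every node
# reported as dead, one per line, without duplicates.
# Matches lines such as:
#   heartbeat[27330]: 2015/01/27_09:05:33 WARN: node usvr-128: is dead
dead_nodes() {
    sed -n 's/.*WARN: node \([^:]*\): is dead.*/\1/p' | sort -u
}

# Usage (the log path is an assumption; check the logfile directive in ha.cf):
#   dead_nodes < /var/log/ha-log
```

If the primary's log lists the standby as dead while the standby's log lists the primary as dead, neither side is receiving the other's heartbeat packets.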
2. The initial diagnosis was a communication failure or network delay between the primary and the standby. Could unsynchronized clocks be the cause? A small clock difference has little effect on heartbeat, but a large one is bound to cause problems, so synchronize the time on both nodes.
/usr/sbin/ntpdate ntp.api.bz&&hwclock -w
echo "0 23 * * * root (/usr/sbin/ntpdate ntp.api.bz && hwclock -w) > /dev/null 2>&1" >> /etc/crontab
3. After synchronizing the time, the same errors still appeared in the logs. Checking the primary and standby configuration files again turned up no problems. The one thing that stood out was that both nodes run a firewall. heartbeat communicates over UDP port 694 by default, so allow UDP port 694 through the firewall.
On the primary (218), add:
/sbin/iptables -A INPUT -i eth0 -p udp -s 192.168.4.128 --dport 694 -m comment --comment "heartbeat-slave" -j ACCEPT
On the standby (128), add:
/sbin/iptables -A INPUT -i eth0 -p udp -s 192.168.4.218 --dport 694 -m comment --comment "heartbeat-master" -j ACCEPT
Notes: 1. If the firewall policy is strict, the heartbeat IPs must also be allowed through; otherwise UDP communication will still fail.
2. The inbound interface (-i) must be the network interface that carries the heartbeat IP.
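To check that the rule actually landed in the INPUT chain, the rule listing printed by `iptables -S INPUT` can be searched for it. The helper below is a rough sketch, not part of the original procedure: the name allows_heartbeat is hypothetical, and it assumes the listing prints the source as ADDRESS/32, as iptables does for a single host.

```shell
# allows_heartbeat PEER_IP
# Reads the output of `iptables -S INPUT` on stdin and succeeds (exit 0)
# if some rule accepts UDP port 694 from PEER_IP.
# grep -F treats the patterns as fixed strings, so dots in the IP are literal;
# -e keeps patterns that start with "-" from being parsed as options.
allows_heartbeat() {
    grep -F -e "-s $1/32 " \
        | grep -F -e '-p udp ' \
        | grep -F -e '--dport 694 ' \
        | grep -q -F -e '-j ACCEPT'
}

# Usage on the primary (218), checking the rule for the standby's heartbeat IP:
#   /sbin/iptables -S INPUT | allows_heartbeat 192.168.4.128 && echo "rule present"
```

This only confirms that the rule exists. Because `-A` appends to the end of the chain, an earlier DROP or REJECT rule that matches the same packets would still win; in that case the rule would need to be inserted ahead of it with `-I`.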
Once the firewall is configured, the primary and standby communicate normally. Under normal conditions the primary node holds the VIP; when the primary node goes down or its heartbeat service stops, the standby node takes over the VIP.