Cluster Random Failover
Our MSP upgraded our firmware to 12.6.2 on the 22nd/23rd of August, and since then the cluster has been failing over at random. In the last few days it has been limited to once per day, but it's frustrating, and at the moment we don't have the time to rebuild the cluster, which was the last suggestion we got from WatchGuard support. I am afraid it may come to that unless someone here has some advice.
Are there any recommendations on how to narrow this glitch down? I have increased the lost heartbeat threshold to 10, which may or may not have helped.
Dimension Server shows that the cluster ports changed status, which caused the switchover. We've replaced these cables with new ones as well.
2020-09-01 14:27:01 networkd 4 [eth2 (Optional-1)] Interface link status changed to up
2020-09-01 14:27:01 networkd 4 [eth3 (Optional-2)] Interface link status changed to up
Here are today's Cluster HA events:
Tue Sep 1 14:27:00 2020 Role: Member 801003B4F806D becomes IDLE. (devSt=14)
Tue Sep 1 14:27:00 2020 Formation: On 801003B4F806D, HA port eth2 is DOWN
Tue Sep 1 14:27:03 2020 Formation: On 801003B4F806D, HA port eth2 is UP
Tue Sep 1 14:27:09 2020 Election: cluster election event, Master, rcvd. Current opState=IDLE
Tue Sep 1 14:27:09 2020 Role: Member 801003B4F806D becomes MASTER. (devSt=14)
Tue Sep 1 14:27:26 2020 Formation: On 801003B4F806D, HA port eth2 is DOWN
Tue Sep 1 14:27:40 2020 Formation: On 801003B4F806D, HA port eth2 is UP
Tue Sep 1 14:27:48 2020 Formation: Member 801003D57354C: Device has joined the cluster.Device State=14
Tue Sep 1 14:28:09 2020 Role: Master 801003B4F806D assigns Member 801003D57354C as BACKUP Master, Mode PASSIVE
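In case it helps anyone compare notes, below is a minimal sketch of how the HA port flaps in the events above could be measured over a longer export. It is plain Python, not a WatchGuard tool; the names `EVENT_RE`, `port_flaps`, and `SAMPLE` are made up for illustration, and it assumes the events are available as text lines in exactly the format pasted above.

```python
import re
from datetime import datetime

# Assumed line format, copied from the Cluster HA events above, e.g.
# "Tue Sep 1 14:27:00 2020 Formation: On 801003B4F806D, HA port eth2 is DOWN"
EVENT_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}) Formation: "
    r"On (?P<member>\w+), HA port (?P<port>\w+) is (?P<state>UP|DOWN)$"
)


def port_flaps(lines):
    """Pair each DOWN event with the next UP event on the same member/port.

    Returns a list of (member, port, down_time, seconds_down) tuples.
    Lines that do not match the assumed format are skipped.
    """
    down_since = {}  # (member, port) -> time the port went down
    flaps = []
    for line in lines:
        m = EVENT_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(m["ts"], "%a %b %d %H:%M:%S %Y")
        key = (m["member"], m["port"])
        if m["state"] == "DOWN":
            down_since[key] = ts
        elif key in down_since:
            start = down_since.pop(key)
            flaps.append((key[0], key[1], start, (ts - start).total_seconds()))
    return flaps


# Sample taken from the events above (non-matching lines are ignored).
SAMPLE = """
Tue Sep 1 14:27:00 2020 Role: Member 801003B4F806D becomes IDLE. (devSt=14)
Tue Sep 1 14:27:00 2020 Formation: On 801003B4F806D, HA port eth2 is DOWN
Tue Sep 1 14:27:03 2020 Formation: On 801003B4F806D, HA port eth2 is UP
Tue Sep 1 14:27:26 2020 Formation: On 801003B4F806D, HA port eth2 is DOWN
Tue Sep 1 14:27:40 2020 Formation: On 801003B4F806D, HA port eth2 is UP
"""

if __name__ == "__main__":
    for member, port, start, seconds in port_flaps(SAMPLE.splitlines()):
        print(f"{member} {port} down at {start:%H:%M:%S} for {seconds:.0f}s")
```

On the sample, this reports eth2 on 801003B4F806D going down twice, for 3 and 14 seconds. Running the same thing over several days of events might show whether the drops always hit the same port or member, which could help separate a cabling or port problem from a firmware one.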