Very weird bug with interfaces reassigning themselves??
We've got a few WGs and have been using them for years with good results.
Last night I was planning on doing a firmware update to one of our M370s. It's on 12.6.2.
It had been running for a few months without a reboot, so before doing the upgrade, I decided to reboot it.
I rebooted and waited, and waited, and waited. It never came back online. We have two WANs coming into this WG, and neither was functioning.
I was doing all of this from 400 miles away, so I got hold of someone local and asked them to go in. They connected their laptop to their cellphone hotspot and also plugged it in via Ethernet. I was able to get into their laptop via our remote support app.
So I logged into the WG and found it was up and running internally just fine. However, our primary internet connection, on eth5, showed the interface as "down". Our secondary one showed as "up" on eth4, but eth4 was not receiving any data and could not reach the ISP's gateway.
I asked the onsite person to send me a photo of the unit, and when I got it, I was even more confused. The photo showed eth4 with no link lights and eth5 with link lights. This is the opposite of what was reported in the WG web interface.
The onsite person power-cycled the ISPs' routers, but that didn't solve anything. We rebooted the WG again, and that didn't solve anything either.
So, grasping at straws and acting on a bit of a hunch, I moved the IP info for the ISP that was physically connected to eth5 over to the eth4 section in the WG, and the connection came up immediately.
So now we have a connection that is physically plugged into eth5 but configured in software as eth4. All this happened after a reboot; the upgrade still hasn't even been attempted yet.
Has anyone ever seen anything like this before? It seems like the WG interface numbering might be shifted somehow.
The only other bit of oddness I can think of was the reason we were going to do the upgrade in the first place. Now that staff are returning to the office, we've noticed that when they are on Zoom/Teams/Meet calls, they semi-frequently get a warning that their connection isn't stable, and others reported they would freeze up briefly. This wasn't limited to any one user. Looking into it, we saw odd latency spikes: for example, latency between two servers that is normally sub-1ms would shoot up to over 2000ms for a few seconds. After some digging, we concluded it was the WG. The WG is acting as the router between our different VLANs too, and if we bypassed the WG, we didn't see those latency spikes. CPU and RAM usage on the WG were both normal and low. We're only talking about a couple dozen people in the office at this point, using maybe 30Mbps of our 200Mbps bandwidth. The latency is all internal, not internet-related at all. That latency is still there after tonight's issues. Maybe the upgraded firmware will resolve it, but I'm not going to test that out at 1am and risk breaking the whole unit again.
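For anyone who wants to catch this kind of spike with numbers, here is a rough Python sketch of how inter-VLAN latency could be sampled and flagged. It is only an illustration, not our actual tooling: the function names, the example host and port, and the 100 ms threshold are all made-up placeholders, and it times a TCP connect rather than an ICMP ping, so the numbers include a little handshake overhead.

```python
import socket
import time


def tcp_rtt_ms(host, port, timeout=3.0):
    """Time a TCP connect to host:port in milliseconds; None if it fails."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers timeouts, refused connections, and unreachable hosts.
        return None


def collect(host, port, count=60, interval_s=1.0):
    """Take `count` samples, one every `interval_s` seconds."""
    samples = []
    for _ in range(count):
        samples.append(tcp_rtt_ms(host, port))
        time.sleep(interval_s)
    return samples


def find_spikes(samples_ms, threshold_ms=100.0):
    """Return (index, value) for samples over the threshold.

    A failed probe (None) is treated as a spike too.
    """
    return [
        (i, s)
        for i, s in enumerate(samples_ms)
        if s is None or s > threshold_ms
    ]


# Example usage (hypothetical server address and port on another VLAN):
#   samples = collect("10.0.20.5", 445, count=60)
#   for index, value in find_spikes(samples):
#       print(index, value)
```

Running this once across the WG and once with the WG bypassed, against the same server, should show whether the spikes only appear on the routed path.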