V12.8 issues - firewall hangs after 5 - 10 days

This has been going on since the V12.8 beta.
Every 5-8 days or so, my T20 hangs or has very slow to no Internet access.
Symptoms: can't connect with the Web UI, FSM or create a CLI session - thus it is very hard to debug.
Web UI and Policy Manager access results in "Unable to add session"
Sometimes I have looked at FSM Status report prior to hangs, and see very low memory (MemFree, MemAvailable & Cached).
I've had a support case open for it but no update from support since 4/18, with a hang on 4/19 & another today - the longest time between reboots so far.
I reported at least 3 hangs during the beta and now 7 with 12.8 gold.
It was recommended to disable RED, but doing so has not resolved the issue.

Seems like GAV memory use could be an issue here, but I am reluctant to disable GAV on my policies.
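
For anyone who wants to track the memory figures mentioned above between reboots, here is a minimal Python sketch. It assumes the memory section of the status report can be copied out as Linux meminfo-style lines ("Name: value kB"); the 10% threshold, the function names and the sample numbers are all made up for illustration and are not WatchGuard values.

```python
import re

# Hypothetical threshold: flag when MemAvailable is under 10% of MemTotal.
LOW_FRACTION = 0.10

def parse_meminfo(text):
    """Return {field: kB} for lines shaped like 'MemFree:   123456 kB'."""
    values = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\w+):\s+(\d+)\s*kB", line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values

def is_low_memory(values, fraction=LOW_FRACTION):
    """True when MemAvailable is below the given fraction of MemTotal."""
    return values["MemAvailable"] < fraction * values["MemTotal"]

# Made-up sample numbers, not taken from a real Firebox.
sample = """
MemTotal:        2048000 kB
MemFree:           41000 kB
MemAvailable:     150000 kB
Cached:           120000 kB
"""
info = parse_meminfo(sample)
print(info["MemFree"], is_low_memory(info))
```

Saving a snapshot like this every day or so should show whether available memory falls steadily in the days before a hang.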

Anyone else seeing this?

And I know that scheduling a nightly reboot would be a workaround, but I prefer to get this issue properly understood and permanently resolved.

Case 01673241

Comments

  • Support is back on the case.
    I will post any useful info found.

  • Are you running Fireware 12.8 Update 1? There is no mention of this issue in its release notes, but it may not hurt to try it.

    I noticed the same thing on my T20-W and thought I was just losing my mind (more).

    Gregg Hill

  • Not running Update 1.
    I still think that this is related to the scand process (GAV).

    I have had a few Failed Assertion SCAND Fault Reports, which I have been told are memory issues resulting in scand being killed & restarted by some sort of free memory recovery process in Fireware.

  • Yep, this was happening with my T70 during the beta. However, I understand that it is not directly related to the beta and it is a known problem that they are working on. I was given this code to track -> FBX-22080.

    Adrian from Australia

  • Hmmmmm - this may explain a few things for me.....typical for WatchGuard these days. I opened a case three days ago....they never did take it out of "case reported" stage....then they wonder why we change security devices.

    But I have a number of T20s out there that seem to be having this issue. I was blaming the ISP. Will change to nightly reboot and see if that resolves my issues for the time being. The next step is to replace with Meraki.

  • Odd. I have 35 T20s running 12.8 U1 and so far I have not had issues. They are running Basic Security Suite.

  • edited May 2022

    Available memory seems to get to some critical level, and then something happens which causes relatively high CPU and very limited throughput, and waiting a while doesn't seem to resolve it.
    The scand process seems to be part of this.
    My T20 recently made it to 12 days before needing a reboot.
    I have not applied U1 yet. Neither the Release Notes nor support suggests that doing so will make a difference.

    I asked about FBX-22080 on my case and got this reply:
    Low memory condition leads to memory corruption and eventually a crash.

    The T20, T40 & T70 all have 2 GB of memory.

  • I use SD-WAN on nearly all my T20s to route http(s) traffic via my M370 cluster for scanning, so my smaller devices might not be using as many resources on scanning as yours.
    Only Office 365 and other very specific http(s) traffic is routed directly through the T20 with scanning.

  • V12.8.1 resolves this issue.
    Memory issues related to AV scans have been addressed which helps smaller memory firewall models.
