SSO Authentication gateway agent and pending user list
Hi
I am using the WG SSO Authentication Gateway together with the SSO Windows client and Firebox integration, with SSO enabled through VPN tunnels. I am on version 12.7.2 on both agent and client.
I have fairly tight firewall rules that only allow SSO traffic (tcp/4114) from my trusted networks where there is a VPN route to the SSO agent, and I have excluded many networks in the SSO configuration on the Firebox.
More and more often I see the authentication gateway agent using more and more CPU and memory. After a long time I discovered that it is due to the pending list on the SSO agent growing and growing.
This morning I had a pending list of 528000 entries, which caused the agent to use 40-50% CPU time. Eventually the agent stalls and hangs at 100% CPU usage if it is not stopped.
As far as I can see, every time it is the agent sending traffic on tcp/4116 (tcp/445 is allowed) to a remote SSO client through a VPN tunnel where the SSO client is no longer active, so the agent never gets a reply (verified with Traffic Monitor). When I look at the remote network's Firebox, the IP address is a valid address from a trusted network where a VPN route exists, but the Firebox DHCP server lease list does not show a lease for that specific IP (I have a 24-hour lease time on these networks as they are quite small).
Somehow the SSO agent keeps sending packets to an orphaned SSO client which may, or may not, have been connected to a remote network.
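For anyone who wants to check the same thing from the agent server, here is a minimal Python sketch that tests whether a client answers on tcp/4116. The function name, the timeout and the example address are my own placeholders and not part of any WatchGuard tooling. It only shows whether a TCP connection can be opened and says nothing about the SSO protocol itself.

```python
import socket


def sso_client_reachable(host, port=4116, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: 4116 is the SSO client port mentioned above;
    the name and the default timeout are illustrative only.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example call (192.0.2.10 is a documentation placeholder address):
# sso_client_reachable("192.0.2.10")
```

A result of False for an address that sits in the pending list would match what I see in Traffic Monitor: the agent sends, and nothing answers.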
I think so, at least. As we know, the SSO agent also sends successful SSO logins to the authentication list of every Firebox, so maybe that plays a role here as well.
Anyway, when the SSO agent sends traffic to a non-existent SSO client, instead of just retrying with the same pending list item, I bet the SSO agent adds a new entry for every call sent to the missing SSO client. That produces a huge pending list, which in turn causes very high CPU usage and a large memory footprint.
There is something wrong in the code that causes the pending list to grow forever and some clients never to time out.
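To illustrate what I mean, here is a small Python sketch of the two behaviours. It is purely hypothetical: I have no access to the agent's code, and the class names, the timeout value and the data structures are my own inventions. The first list adds an entry for every query (what I suspect happens), the second keeps one entry per IP and drops it after a timeout (what I would expect).

```python
import time


class AppendingPendingList:
    """Suspected behaviour (assumption): every query adds a new entry."""

    def __init__(self):
        self.entries = []

    def query(self, ip):
        self.entries.append((ip, time.monotonic()))

    def __len__(self):
        return len(self.entries)


class KeyedPendingList:
    """Expected behaviour (assumption): one entry per IP, purged after max_age seconds."""

    def __init__(self, max_age=60.0):
        self.max_age = max_age
        self.first_seen = {}

    def query(self, ip, now=None):
        now = time.monotonic() if now is None else now
        self.purge(now)
        # A retry for an IP that is already pending reuses the existing entry.
        self.first_seen.setdefault(ip, now)

    def purge(self, now=None):
        now = time.monotonic() if now is None else now
        expired = [ip for ip, t in self.first_seen.items() if now - t > self.max_age]
        for ip in expired:
            del self.first_seen[ip]

    def __len__(self):
        return len(self.first_seen)
```

With 1000 queries for the same missing client, the first list holds 1000 entries while the second holds one, and that one disappears once it is older than the timeout.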
Restarting the gateway agent service solves the issue and causes all Fireboxes to fail over to the standby gateway agent service.
Regards
Robert
Comments
Hi Robert,
The Agent shouldn't be querying unless it's getting requests from the firewall to identify traffic from that host. If it's not purging, it might be getting it from elsewhere? Or it's just a bug, like you're indicating.
If you're looking at the pending list already, you're in the logs -- at your convenience can you please open a case with those logs. We can look into it and get it bugged.
Also worth noting that if you haven't updated the agent recently, there was an update somewhere along 12.7 to add some opcodes, and in 12.7.2 to add an AD mode on/off feature. If you're running an older agent I'd suggest upgrading, it'll still work with the older sso clients.
-James Carson
WatchGuard Customer Support
@james.carson
Thank you. I am already on 12.7.2, so it does not get any newer.
The reason for this post was more for others, in case they have experienced the same issue, but I have had a case opened for this issue, 01639955.
I have uploaded debug logs and pictures to this case.
At some point I thought it could be caused by a client connecting to a guest Wi-Fi where SSO traffic to the SSO agent is not allowed. But after looking deeper into this, I have policies that only allow SSO traffic from trusted networks, and Windows GPO policies that do not allow client machines to connect to those Wi-Fi networks.
The case, 01639955, was opened a long time ago, and as I wrote in the case, it could be an issue I first saw after upgrading my devices to Fireware 12.6.x and newer. I think my setup was very stable when running Fireware 12.5.x, but this is a "feeling", so I am not sure about anything here.
If the traffic flow is client to firewall to SSO agent, then it could very well be an issue with stalled/orphaned connections from the remote firewalls.
Regards
Robert
Hi @rv@kaufmann.dk
I see that you closed the case in this instance -- I've asked the tech that's assigned the case to re-open it and get it escalated to the team that can get the bug investigated for you.
Thank you,
-James Carson
WatchGuard Customer Support
Thank you
I am seeing a similarly large pending IP list and high CPU. I am also seeing IP addresses in the Pending IP List in SSO Tools > Information > Status that are within the SSO Exclusions network ranges on the Firebox, which is even more confusing. I have an open case, 01980883, because ELM stopped working after the most recent update to the Firebox firmware, and I have been forced to roll out the SSO Client to as many clients as possible. Any updates on this? Using SSO agent 12.10.1.19656 and SSO Client
@NickDaGeek
If you made the same error as I did, setting up remote Fireboxes as SSO clients, then this is not supported. This was what caused my CPU to spike and the program to halt in the end, even when all required ports were open between the networks.
In fact, the WG SSO solution is a local Firebox solution only and does not work over VPN tunnels.
/Robert