Authentication failures with newer Watchguard Authentication Gateway (SSO eventlogmonitor + AD PCs)

Some months ago we upgraded the WatchGuard Authentication Gateway to 12.7.2 and then to 12.10. We also removed the service user from the "Domain Admins" group, but granted it permission to access the event logs of the PCs as described in the guides.

The SSO Agent is configured to use only Event Log Monitor and is installed on the primary domain controller. We use it to authenticate Firebox users from their Windows 10/11 Professional PCs joined to a local AD domain.

Now PCs are randomly unable to authenticate.

As you can see from the test below, all requests after the 1st one are failing:

giovanni@mypc:~$ telnet busitsrv01.mydomain.net 4114
Connected to busitsrv01.mydomain.net.
Escape character is '^]'.
EVENT 350 log info Connected to the WatchGuard Authentication Gateway SSO agent. Version Build 685243. Connected at:11/15/2023 12:36:53
 To log in to the SSO Agent, type your user credentials. Or, type "help" to see the list of available log in commands.
 After you log in, type "help" to see all of the commands available for the SSO Agent.
login admin xxxxxxx
User admin logged in
get user
443 N=1 IP= user="johnsmith" domain=mydomain.net server= group="CN=CERTSVC_DCOM_ACCESS," group="CN=Domain Users," group="CN=Photo View," group="CN=Netpon Users, group="CN=STEX Admin Group," group="CN=All people bus," group="CN=Users," group="CN=Video Internal,"
get user
45 N=1 IP= ERROR="unknown user"
get user
45 N=1 IP= ERROR="unknown user"
get user
45 N=1 IP= ERROR="unknown user"

Restarting the "Watchguard Authentication Event Log Monitor" service makes the 1st request work again, but all subsequent requests fail again.
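For anyone who wants to repeat the test above without typing into telnet, here is a rough Python sketch. It assumes only what the transcript shows: port 4114, a "login" command, and "get user <ip>" replies made of a numeric code followed by key=value fields. The CRLF line ending, the "wait until the socket goes quiet" read strategy, and the function names are my own assumptions, not documented agent behaviour.

```python
import re
import socket

# Field layout inferred from the transcript: key=value or key="value".
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S*)')


def parse_reply(line):
    """Split one agent reply into (code, fields).

    Keys that repeat, such as group=, are collected into lists.
    """
    code, _, rest = line.strip().partition(" ")
    fields = {}
    for key, value in FIELD_RE.findall(rest):
        fields.setdefault(key, []).append(value.strip('"'))
    return code, fields


def _exchange(sock, command, quiet=1.0):
    """Send one command (or nothing) and read until the socket goes quiet."""
    if command:
        # CRLF is an assumption based on how telnet sends lines.
        sock.sendall(command.encode("ascii") + b"\r\n")
    sock.settimeout(quiet)
    chunks = []
    try:
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    except socket.timeout:
        pass
    return b"".join(chunks).decode("utf-8", errors="replace")


def repeat_get_user(host, ip, password, attempts=4, port=4114, user="admin"):
    """Log in once, send 'get user <ip>' several times, return parsed replies."""
    results = []
    with socket.create_connection((host, port), timeout=10) as sock:
        _exchange(sock, None)  # discard the connection banner
        _exchange(sock, f"login {user} {password}")
        for _ in range(attempts):
            text = _exchange(sock, f"get user {ip}")
            lines = [ln for ln in text.splitlines() if ln.strip()]
            results.append(parse_reply(lines[-1]) if lines else ("", {}))
    return results
```

Calling repeat_get_user("busitsrv01.mydomain.net", "<client ip>", "<password>") and printing the returned list should make the pattern easy to see: if the bug is present, only the first entry carries a user field and the rest carry ERROR.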

Adding the user to the Domain Admins group and restarting the services did not solve it. Moving the gateway installation to a different server that is not a domain controller did not help either.

This is the error I find in eventlogmonitor.log when querying a different IP:

2023-11-15T11:33:48Z [tid:9380] INFO: [Thread Body]  [Thread: Response to Get User Command] send message: async-29099-000A- 79 N=0 IP= ERROR="Remote host "" in logoff status"
 (to socket[1])

... but the user is logged on and working at his PC.
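To see how often this happens, the log can be scanned for replies that carry an ERROR field. The sketch below assumes every relevant entry looks like the single line quoted above (ISO timestamp, [tid:...], "send message:", then ERROR="..."); I have not checked it against other ELM versions, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the one log line quoted above; other layouts will not match.
LINE_RE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)\s+\[tid:(?P<tid>\d+)\].*'
    r'send message:.*?ERROR="(?P<error>.*)"\s*$'
)


def find_errors(lines):
    """Return (timestamp, error text) for every 'send message' line with an ERROR."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\r\n"))
        if match:
            hits.append((match.group("ts"), match.group("error")))
    return hits


def summarize(hits):
    """Count how many times each distinct error text appears."""
    return Counter(error for _, error in hits)
```

Typical use would be opening eventlogmonitor.log, passing the file object to find_errors, and printing summarize(...) to see whether "in logoff status" dominates.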

I had to roll back to WG-Authentication-Gateway 12.5.4 and re-add the service user to the Domain Admins group to make AD SSO work.

Why does only the 1st "get user" succeed?


  • Options
    james.carson Moderator, WatchGuard Representative

    Hi @giox069
    If the older versions aren't working, there was likely a change somewhere else.
    -Were any OS updates installed during that time?
    -Have any security policies/group policies changed?

    For example, Event Log Monitor relies on pieces of windows file/print sharing to be able to access the event logs on remote machines. If this gets disabled due to a security policy change, ELM will not be able to pull logs. If this is the case, re-enabling, or using the SSO client should allow this to work.

    -James Carson
    WatchGuard Customer Support
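A quick way to rule out the file/print sharing point raised above is to check from the server running Event Log Monitor whether the client PC accepts TCP connections on the SMB port. This sketch assumes port 445 is the relevant one; it only proves the port is reachable, not that the service user is allowed to read the event log.

```python
import socket


def port_open(host, port=445, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, port_open("mypc.mydomain.net") (a hypothetical host name) returning False would point at a firewall or policy change rather than at the gateway version.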

  • Options
    edited November 2023

    Older versions are working (but require the user to be a PC administrator). Newer versions fail starting from the 2nd authentication attempt.

    Windows file/print sharing is obviously enabled, and the proof is that older versions of GW are working, and newer versions work at the 1st attempt after eventlogmonitor service restart.

    We would rather not distribute the SSO client: it's an extra management cost which we would like to avoid if possible. Spending a huge amount of money distributing extra software to manage, just to work around a misconfiguration or a bug, is not a wise choice.

  • Options
    james.carson Moderator, WatchGuard Representative

    @giox069 The best I can suggest here is to open a support case. Based on what I can see, there's an issue preventing the ELM from pulling the correct user logs (or from pulling them at all). Our support team should be able to sort out what's happening and assist.

    -James Carson
    WatchGuard Customer Support

  • Options

    @giox069, you have pretty much duplicated what I did, with exactly the same result. The problem became critical not long after I followed the tip not to have the service user as a member of Domain Admins and instead gave it permissions to access the logs. I may have solved it by putting the SSO Client on all our end-user machines. However, I also had the problem of high CPU usage and a massively growing pending IP list.

    I still cannot get LDAPS to work on the Firebox either. Again, this whole thing kicked off after I updated the SSO Agent and the Firebox to the latest version. The release notes for Firebox 12.10.1 seem to indicate a major change to SSO functionality to include Azure Active Directory as an SSO source, if I read them correctly.

  • Options

    @NickDaGeek I also gave the "Manage auditing and security log" user right to the service user. But now, because I had to downgrade WAG to 12.7.2, I made the service user a Domain Admin too. Maybe one day I will try removing "Manage auditing and security log" and retry.
    I also noticed that when the user is connected via RDP, all is working fine, the "get user xxx.xxx.xxx.xxx" never fails. It fails only when the user is connected to the console.
    We also use Atera as RMM, no other special software is installed on our PCs.
    I have a case open with WatchGuard, but they keep requesting the same info from me again and again (WAG version, activated options, OS versions, screenshots of WAG). What a waste of time, providing them the same info every 2-4 days.
