Hi,
One of my vCenters, VCSA 6.7U3b, has an issue where it "randomly" stops responding. When it happens:
* HTML5 client just hangs there after entering username/password
* SSH to the appliance hangs after entering root password; it does not return with authentication error
* the appliance's console has no error messages
Sometimes, if we wait long enough in the HTML5 client, it errors out with: "503 Service Unavailable (Failed to connect to endpoint: [N7Vmacore4Http16LocalServiceSpecE:0x00007f0ac0075b50] _serverNamespace = /websso action = Allow _port = 7080)".
/var/log/vmware/messages has frequent occurrences of:
2020-04-01T18:13:06.145359+00:00 vcenter lsassd[8349]: 0x7f3876565700:Failed to find user, group, or domain by name (name = 'user1@DOMAIN.COM', searched host = 'DC.domain.com') -> error = 40098, symbol = LW_ERROR_RPC_OPENPOLICY_FAILED
The same log file sometimes also has:
2020-04-01T18:13:06.145809+00:00 vcenter lsassd[8349]: 0x7f3876565700:Domain 'domain.com' is now offline
2020-04-01T18:13:06.148560+00:00 vcenter lsassd[8349]: 0x7f383ffff700:Detected domain 'domain.com' offline. Some group information from this domain might be missing.
2020-04-01T18:13:06.235264+00:00 vcenter lsassd[8349]: 0x7f383ffff700:Domain 'domain.com' is now online
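To see how often the domain flaps and how long each offline window lasts, a rough sketch like the one below could tally the lsassd lines quoted above. The regular expressions are inferred only from those sample lines, the function name summarize is made up for illustration, and it assumes Python 3.7 or later wherever the log is analyzed.

```python
import re
from collections import Counter
from datetime import datetime

# Patterns inferred from the sample lsassd lines above; other formats are skipped.
LSASSD_RE = re.compile(r"^(\S+)\s+\S+\s+lsassd\[\d+\]:\s+0x[0-9a-f]+:(.*)$")
STATE_RE = re.compile(r"Domain '([^']+)' is now (offline|online)")
SYMBOL_RE = re.compile(r"symbol = (\w+)")


def summarize(lines):
    """Return (error symbol counts, list of (domain, start, seconds offline))."""
    errors = Counter()
    offline_since = {}  # domain -> timestamp of the first unmatched "offline"
    windows = []
    for line in lines:
        match = LSASSD_RE.match(line.strip())
        if not match:
            continue
        timestamp = datetime.fromisoformat(match.group(1))
        message = match.group(2)

        state = STATE_RE.search(message)
        if state:
            domain, status = state.groups()
            if status == "offline":
                offline_since.setdefault(domain, timestamp)
            elif domain in offline_since:
                start = offline_since.pop(domain)
                windows.append((domain, start, (timestamp - start).total_seconds()))
            continue

        symbol = SYMBOL_RE.search(message)
        if symbol:
            errors[symbol.group(1)] += 1
    return errors, windows
```

It could be run against the log with something like `errors, windows = summarize(open("/var/log/vmware/messages", errors="replace"))`. Comparing the timestamps of long offline windows with the times of the hangs might show whether the two line up.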
The vCenter is on the same network as the three domain controllers it uses, so latency should not be causing this. I've bumped the resources up to 16 vCPUs and 48 GB RAM, but it still hangs. When it happened, resources did not look stressed at the VCSA VM level.
Has anyone had similar issues, or does anyone have any suggestions? Thanks,