Sometimes PAM displays the following message on the main page:
"Synchronization Error! One or more cluster member databases is unsynchronized."
Unless an LDAP user group or a CSV import is in progress, this error indicates database synchronization issues.
Users frequently ask what this message means and how to fix it. How should we proceed?
If a cluster is down because a member node has failed, PAM reports various errors.
Could this be caused by the cluster configuration?
How can I get more information about this condition? How can I analyse the possible reasons why a node failed? How can we re-enable the synchronization?
Unless an LDAP user group or a CSV import is in progress, this error indicates database synchronization issues. To determine which database contains the correct data, export the database together with the system log from each cluster member and contact Support for assistance. You can also check the database status, which shows the Password Authority status.
Once the reference (good) database is identified, the cluster should be stopped and its members reordered so that the cluster member with the reference database IS first in the member list. The cluster can then be restarted.
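The reordering rule above can be sketched in a few lines of Python. This is only an illustration of the ordering logic, not a PAM tool or API: the function name and the member names are hypothetical, and the actual reordering is done in the PAM cluster configuration while the cluster is stopped.

```python
def reorder_members(members, reference):
    """Return a new member list with the reference member first.

    `members` is the current cluster member list (hypothetical names);
    `reference` is the member that holds the known-good database.
    The relative order of the remaining members is preserved.
    """
    if reference not in members:
        raise ValueError(f"{reference!r} is not a cluster member")
    return [reference] + [m for m in members if m != reference]
```

For example, `reorder_members(["pam-a", "pam-b", "pam-c"], "pam-c")` returns `["pam-c", "pam-a", "pam-b"]`, which is the order to enter before restarting the cluster.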
Troubleshooting depends on which node is affected:
- The cluster primary member's Credential Management database is inactive.
This is a special case of an inactive Credential Management database. Choose a new primary, which should be an active cluster member. Stop the cluster. As above, reorder the cluster members so that the new primary, with an active Credential Management database, IS first in the list. The cluster can then be restarted. Note: Only the master node has permission to turn the cluster on.
- One or more Credential Management databases are inactive.
This condition can be associated with network connectivity problems, including latency, between Xsuite nodes. When a node becomes unreachable, its Credential Management database is marked as inactive. Deactivation prevents corruption of stored credentials. Restoring network connectivity allows the cluster member to participate using Credential Management data from other cluster members. To rejoin the Credential Management database, stop the cluster and check that the inactive database is NOT first in the member list. The cluster can then be restarted.
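Because this condition is tied to connectivity and latency between nodes, a quick reachability check from one node's network segment to the others can help narrow things down. The following is a minimal sketch using plain TCP connects from Python. It is not part of PAM, the host and port values you pass in are your own, and the 200 ms warning threshold is an arbitrary assumption, not a documented PAM limit.

```python
import socket
import time


def probe_node(host, port, timeout=3.0):
    """Return the TCP connect time in milliseconds, or None if unreachable."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        return None


def classify(latency_ms, warn_ms=200.0):
    """Label a probe result; warn_ms is an assumed, adjustable threshold."""
    if latency_ms is None:
        return "unreachable"
    return "slow" if latency_ms > warn_ms else "ok"
```

Calling `classify(probe_node(host, port))` for each cluster member gives a rough picture of which nodes are unreachable or slow to answer. A TCP connect only shows that the port accepts connections; it says nothing about the state of the database behind it.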
Synchronization can also fail when a member that was unreachable comes back online (either by restarting or by network connectivity being restored). The member attempts to rejoin the cluster, and you will need to rebuild the cluster.
You can also define the Database Replication Connection Timeout and the Database Replication Socket Timeout before turning on the cluster. These settings are defined on the master node.
Note: These two settings are only available when Cluster Tuning mode is on. Do not change these settings unless directed to do so by CA Support:
- Database Replication Connection Timeout: Set the time for a primary site member to wait to connect to a peer primary site database before failing. Timeouts can lead to deactivation of the member Credential Manager database.
- Database Replication Socket Timeout: Set the time for a primary site member to wait for a response from a peer primary site database before failing. Timeouts can lead to deactivation of the member Credential Manager database.
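The difference between the two timeouts is the general one between "waiting to connect" and "waiting for a response on an open connection". The sketch below illustrates that distinction with plain Python sockets. It is an analogy only and does not show how PAM implements replication; the function name and return labels are hypothetical.

```python
import socket


def query_peer(host, port, connect_timeout, socket_timeout):
    """Show where a connection timeout and a socket timeout each apply."""
    try:
        # Connection timeout: how long to wait for the peer to accept.
        sock = socket.create_connection((host, port), timeout=connect_timeout)
    except OSError:
        return "connect failed"
    with sock:
        # Socket timeout: how long to wait for data once connected.
        sock.settimeout(socket_timeout)
        try:
            data = sock.recv(1)
        except socket.timeout:
            return "socket timeout"
        except OSError:
            return "socket error"
        return "response" if data else "closed"
```

A peer that is down or filtered produces the first failure, while a peer that accepts the connection but answers too slowly produces the second. In the PAM case, either kind of timeout can lead to deactivation of the member database, as described above.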