A cluster lock scenario arises when one agent holds the cluster lock and does not release it, which prevents new agents from joining the cluster. Every engine needs a lock on the MasterId space to write the entry that announces it is joining the cluster, and the same applies when it leaves the cluster. If a joining agent cannot obtain a lock on the MasterId space, it keeps retrying to lock the cluster; this is what we term a cluster lock scenario. Typically, one of the running engines is holding the lock on the MasterId space (also referred to as the master space), so the joining member cannot obtain the lock and keeps waiting for it, even after the agent is killed and restarted.
In ASAdmin and ASMM, the stale agent is still listed. To release the lock held by a specific engine, try one of the following options:
Option #1: Connect to the cluster from as-admin and run "show space '<your MasterId_Space>' locks". The output indicates which member is holding the cluster lock. Restarting the agent that holds the lock has been seen to resolve the issue. If the cluster is under intense load, however, the restart may shift the lock to a different running agent, in which case you may need to move on to Option #2.
Option #2: Manually unlock the space by running the command "unlock space "<your masterId space>" all".
Option #3: Perform a clean cluster restart, which involves stopping all engines and then starting them again. This means a system outage for the entire restart period.
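The lock contention behind this scenario can be illustrated with a small toy model. The Python sketch below is not ActiveSpaces code and does not use any TIBCO API; the names MasterIdSpace, try_join, and force_unlock are hypothetical and exist only for illustration. It assumes a space whose lock can be held by one member at a time, and shows a joining agent exhausting its retries while another member holds the lock, then joining once the lock is cleared (analogous in spirit to Option #2).

```python
class MasterIdSpace:
    """Toy stand-in for the MasterId space: one member may hold its lock."""

    def __init__(self):
        self.holder = None  # name of the member currently holding the lock

    def try_lock(self, member):
        # Succeeds only if the lock is free or already held by this member.
        if self.holder is None or self.holder == member:
            self.holder = member
            return True
        return False

    def unlock(self, member):
        # A member can release only a lock that it holds itself.
        if self.holder == member:
            self.holder = None

    def force_unlock(self):
        # Hypothetical analogue of an administrative unlock of the space.
        self.holder = None


def try_join(space, members, agent, max_attempts=3):
    """Try to register `agent`; return the attempt number that succeeded,
    or None if every attempt failed because the lock was held elsewhere."""
    for attempt in range(1, max_attempts + 1):
        if space.try_lock(agent):
            members.append(agent)  # announce the join while holding the lock
            space.unlock(agent)
            return attempt
    return None


if __name__ == "__main__":
    space = MasterIdSpace()
    members = ["agent-A"]
    space.try_lock("agent-A")  # agent-A takes the lock and never releases it

    print("join while locked:", try_join(space, members, "agent-B"))
    space.force_unlock()
    print("join after unlock:", try_join(space, members, "agent-B"))
```

In this model the joining agent gives up after a fixed number of attempts so that the example terminates; per the description above, a real joining agent keeps retrying until the lock is released.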
Issue/Introduction
This article explains three options for releasing a lock held by a specific agent in the AS metaspace.