unrecoverable client-server communication problem
Article ID: KB0083751
Description
===========
Adding a secondary admin server to an existing file-based administrative domain results in the following error in administrator.log when the secondary server starts:
Error [TIBRepoEntityStore] AESDKJ-0000 [main] com.tibco.infra.repository.RepoException: Encountered unrecoverable client/server communication problem
at com.tibco.repo.RemoteRepoClient.a(RemoteRepoClient.java:1931)
at com.tibco.repo.RemoteRepoClient.a(RemoteRepoClient.java:1929)
at com.tibco.repo.RemoteRepoClient.a(RemoteRepoClient.java:1929)
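To confirm that a secondary server is hitting this particular error, the log can be searched for the message text. The following is a minimal Python sketch, not a TIBCO tool: the function name find_repo_errors is hypothetical, and the location of administrator.log must be supplied by the caller.

```python
from pathlib import Path

# Message text taken from the stack trace above.
ERROR_TEXT = "Encountered unrecoverable client/server communication problem"


def find_repo_errors(log_path):
    """Return (line_number, line) pairs in the log that contain ERROR_TEXT."""
    hits = []
    # errors="replace" keeps the scan going if the log has undecodable bytes.
    with Path(log_path).open(encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if ERROR_TEXT in line:
                hits.append((number, line.rstrip("\n")))
    return hits
```

An empty result means the message was not found in the file that was scanned.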
Environment
Product: TIBCO Administrator
Version:
OS: all
--------------------
Resolution
Workaround
==========
If the directory structure on the primary server and the secondary server is the same:
1). Stop the primary admin server and all hawkagents in the domain.
2). On the secondary machine, create a backup of the folder <tibco_home>/administrator/domain/<domain name>/data. Then copy the entire data folder from the same location on the primary machine over to the secondary machine.
3). Restart the primary and secondary admin servers along with the hawkagents.
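Step 2 above (back up, then replace the data folder) can be sketched in Python as follows. This is an illustration under assumptions: the function name replace_data_folder is hypothetical, the primary's data folder is assumed to be reachable from where the script runs (for example through a mounted share or a copy already transferred to the secondary machine), and the servers and hawkagents are assumed to be stopped as described in step 1.

```python
import shutil
import time
from pathlib import Path


def replace_data_folder(primary_data, secondary_data):
    """Back up secondary_data beside itself, then replace it with a copy
    of primary_data. Returns the backup path, or None if there was no
    existing folder to back up."""
    primary_data = Path(primary_data)
    secondary_data = Path(secondary_data)
    if not primary_data.is_dir():
        raise FileNotFoundError(f"primary data folder not found: {primary_data}")

    backup = None
    if secondary_data.exists():
        stamp = time.strftime("%Y%m%d-%H%M%S")
        backup = secondary_data.with_name(f"{secondary_data.name}.bak-{stamp}")
        # Move rather than copy so the target path is free for the fresh copy.
        shutil.move(str(secondary_data), str(backup))

    shutil.copytree(primary_data, secondary_data)
    return backup
```

Moving the old folder aside instead of deleting it means the previous state of the secondary can be restored by renaming the backup folder back to data.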
If the directory structure on the primary server and the secondary server is different:
1). Add the secondary server to the domain.
2). If the stack trace above appears some time after the secondary server starts, let the server run for 20-30 minutes so that it has sufficient time to try to sync the other repositories, i.e., the application repositories and the AUTH_<domainname>.dat repository.
3). Stop the secondary server and the hawkagent that was started on this machine. Stop hawkagents on all machines that are part of the domain and do not perform any activity such as deployment/undeployment on this admin server.
4). Use the RepoExport utility to export the SYS_<domain name>.dat from the primary server.
5). Stop the primary server, then edit the secondary server's .tra file (tibcoadmin_<domain name>.tra) and set repo.isMaster=true.
6). Start the secondary server and, once it is running, perform a RepoImport of the SYS_<domain name> repository. When the import is complete, manually edit the commitCount value in the SYS_<domain name>.dat file created on the secondary server so that it matches the value in the SYS_<domain name>.dat file on the primary (master) server. After that, the secondary servers should be able to synchronize on their own.
Note: Each dat file in the data directory will have its own commitCount value and they can be different.
7). After the import is complete and the commitCount value has been modified, stop the secondary server and revert the change made to repo.isMaster.
8). Start the primary and secondary server along with the hawkagents.
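Steps 5 and 7 above both edit repo.isMaster in the secondary server's .tra file. The edit can be done by hand in any text editor; the Python sketch below only illustrates it. It assumes the .tra file is plain UTF-8 text with one key=value property per line and # for comments, and that appending the property is acceptable when it is not already present. The function name set_tra_property is hypothetical.

```python
from pathlib import Path


def set_tra_property(tra_path, key, value):
    """Set key=value in a .tra file, appending the line if the key is absent.

    Returns the previous value, or None if the key was not present.
    Assumes a plain key=value text file (see the note above).
    """
    path = Path(tra_path)
    lines = path.read_text(encoding="utf-8").splitlines()
    previous = None
    found = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Skip comments and lines that are not key=value pairs.
        if stripped.startswith("#") or "=" not in stripped:
            continue
        name, _, old = stripped.partition("=")
        if name.strip() == key:
            previous = old.strip()
            lines[i] = f"{key}={value}"
            found = True
    if not found:
        lines.append(f"{key}={value}")
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return previous
```

Keeping the returned previous value makes the revert in step 7 straightforward: call the function again with that value, or remove the line by hand if the function returned None.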
NOTE: Before performing this operation, create a full backup of the entire domain data. This ensures that the domain can be restored to the previous state.
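One way to take the backup mentioned in the note is to archive the domain folder before starting. The following Python sketch is an assumption-laden illustration: the function name archive_domain is hypothetical, the caller decides which folder holds the "entire domain data" (for example <tibco_home>/administrator/domain/<domain name>), and the backup directory must lie outside the folder being archived.

```python
import shutil
import time
from pathlib import Path


def archive_domain(domain_dir, backup_dir):
    """Write a timestamped .tar.gz of domain_dir into backup_dir.

    Returns the path of the archive that was created.
    """
    domain_dir = Path(domain_dir)
    backup_dir = Path(backup_dir)
    if not domain_dir.is_dir():
        raise FileNotFoundError(f"domain folder not found: {domain_dir}")
    backup_dir.mkdir(parents=True, exist_ok=True)

    stamp = time.strftime("%Y%m%d-%H%M%S")
    base = backup_dir / f"{domain_dir.name}-{stamp}"
    # root_dir/base_dir keep the domain folder name as the top-level
    # entry inside the archive.
    archive = shutil.make_archive(
        str(base), "gztar",
        root_dir=str(domain_dir.parent),
        base_dir=domain_dir.name,
    )
    return Path(archive)
```

Run it while the admin servers and hawkagents are stopped so that the files do not change during the copy.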