unrecoverable client-server communication problem

Article ID: KB0083751

Updated On:

Products: TIBCO Administrator
Versions: -

Description

Adding a secondary admin server to an existing file-based administrative domain results in the following error in administrator.log when the secondary server starts:

Error [TIBRepoEntityStore] AESDKJ-0000 [main] com.tibco.infra.repository.RepoException: Encountered unrecoverable client/server communication problem
           at com.tibco.repo.RemoteRepoClient.a(RemoteRepoClient.java:1931)
           at com.tibco.repo.RemoteRepoClient.a(RemoteRepoClient.java:1929)
           at com.tibco.repo.RemoteRepoClient.a(RemoteRepoClient.java:1929)
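
To confirm that a secondary server is hitting this particular error, the log can be searched for the exception message. The following is a minimal sketch, not a TIBCO utility: the function names are hypothetical, and the location of administrator.log is an assumption that depends on the installation.

```python
from pathlib import Path
from typing import List, Tuple

# Message text taken from the stack trace above.
ERROR_MARKER = "Encountered unrecoverable client/server communication problem"


def find_repo_errors(log_text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for every line containing the marker."""
    return [
        (number, line.strip())
        for number, line in enumerate(log_text.splitlines(), start=1)
        if ERROR_MARKER in line
    ]


def scan_log(log_path: str) -> List[Tuple[int, str]]:
    """Read a log file (the path is supplied by the caller) and report matches."""
    text = Path(log_path).read_text(encoding="utf-8", errors="replace")
    return find_repo_errors(text)
```

Calling `scan_log` with the path to the secondary server's administrator.log returns an empty list when the error is absent.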

Environment

Product: TIBCO Administrator
Version:
OS: all

Resolution

Workaround
==========
If the directory structure on the primary server and the secondary server is the same:

1). Stop the primary admin server and all hawkagents in the domain.

2). On the secondary machine, create a backup of the folder <tibco_home>/administrator/domain/<domain name>/data. Then copy the entire data folder from the same location on the primary machine over to the secondary machine.

3). Restart the primary admin server and the secondary admin servers along with the hawkagents.
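
The backup-then-copy in step 2 can be sketched as follows. This is an illustration only: it assumes the primary machine's data folder is reachable from the secondary machine (for example over a mounted share, or already transferred to a local staging directory), and both the function name and the `.bak-<timestamp>` naming are hypothetical. Run it only while the admin servers and hawkagents are stopped.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def replace_data_folder(primary_data: str, secondary_data: str) -> Optional[Path]:
    """Back up the secondary data folder, then replace it with a copy of the primary's.

    Returns the backup path, or None if the secondary folder did not exist.
    """
    source = Path(primary_data)
    target = Path(secondary_data)
    if not source.is_dir():
        raise FileNotFoundError(f"primary data folder not found: {source}")

    backup = None
    if target.exists():
        # Hypothetical naming scheme: data.bak-<timestamp>, next to the original.
        backup = target.with_name(f"{target.name}.bak-{time.strftime('%Y%m%d%H%M%S')}")
        shutil.move(str(target), str(backup))

    # Copy the whole tree so every .dat file arrives together.
    shutil.copytree(str(source), str(target))
    return backup
```

Moving the existing folder aside, rather than copying over it, keeps stale files from the secondary out of the new data folder and leaves the backup intact if the copy fails.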

If the directory structure on the primary server and the secondary server is different:

1). Add the secondary server to the domain.

2). If the stack trace above appears after the secondary server has started, let the server run for 20-30 minutes to give it sufficient time to try to sync the other repositories, i.e., the application repositories and the AUTH_<domainname>.dat repository.

3). Stop the secondary server and the hawkagent that was started on this machine. Stop hawkagents on all machines that are part of the domain and do not perform any activity such as deployment/undeployment on this admin server.

4). Use the RepoExport utility to export the SYS_<domain name>.dat from the primary server.

5). Stop the primary server, then edit the secondary server's .tra file (tibcoadmin_<domain name>.tra) and set repo.isMaster=true.

6). Start the secondary server and, once it is running, perform a RepoImport of the SYS_<domain name> repository. When the import is complete, manually edit the commitCount value in the SYS_<domain name>.dat file created on the secondary server so that it matches the value on the master. After that, the secondary servers should be able to do their own synchronization.

Note: Each .dat file in the data directory has its own commitCount value, and these values can differ from one file to another.

7). After the import is complete and the commitCount has been modified, stop the secondary server and revert the change made to repo.isMaster.

8). Start the primary and secondary server along with the hawkagents.
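
Steps 5 and 7 change and then revert a single property in the secondary server's .tra file. A minimal sketch of that edit is shown below, assuming the property is written as a plain `key=value` line; `set_tra_property` is a hypothetical helper, not part of any TIBCO tooling. Keeping a copy of the original file text is the simplest way to revert in step 7.

```python
import re


def set_tra_property(tra_text: str, key: str, value: str) -> str:
    """Return tra_text with key=value set.

    Replaces an existing uncommented entry for the key, or appends a new
    line if the key is not present. Commented-out lines are left untouched.
    """
    pattern = re.compile(
        rf"^([ \t]*){re.escape(key)}[ \t]*=[^\r\n]*", re.MULTILINE
    )
    new_line = f"{key}={value}"
    if pattern.search(tra_text):
        # Preserve any leading indentation of the existing entry.
        return pattern.sub(lambda match: match.group(1) + new_line, tra_text)
    separator = "" if tra_text.endswith("\n") or not tra_text else "\n"
    return f"{tra_text}{separator}{new_line}\n"
```

For example, reading the .tra file into a string, passing it through `set_tra_property(text, "repo.isMaster", "true")`, and writing the result back performs the change in step 5; writing the saved original text back undoes it.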

NOTE: Before performing this operation, create a full backup of the entire domain data. This ensures that the domain can be restored to the previous state.
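
One way to take such a backup is to archive the domain folder before starting. The sketch below assumes that the domain data lives under a single folder such as <tibco_home>/administrator/domain/<domain name>; what counts as the "entire domain data" in a given installation may include more than that. The function name and archive naming are hypothetical, and the backup directory must be outside the folder being archived.

```python
import shutil
import time
from pathlib import Path


def archive_domain(domain_dir: str, backup_dir: str) -> Path:
    """Create a timestamped zip of domain_dir inside backup_dir and return its path."""
    source = Path(domain_dir).resolve()
    if not source.is_dir():
        raise FileNotFoundError(f"domain folder not found: {source}")

    destination = Path(backup_dir).resolve()
    destination.mkdir(parents=True, exist_ok=True)

    # Archive name without extension; make_archive appends ".zip".
    base_name = destination / f"{source.name}-{time.strftime('%Y%m%d%H%M%S')}"
    archive = shutil.make_archive(
        str(base_name),
        "zip",
        root_dir=str(source.parent),  # paths in the archive are relative to this
        base_dir=source.name,         # only this folder is archived
    )
    return Path(archive)
```

Restoring then amounts to stopping the servers and hawkagents and unpacking the archive over the domain folder's parent directory.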