Article ID: KB0092675
Description
Resolution:
This problem occurs only occasionally, typically after a Grid Library (gridlib) release. The customer releases a gridlib and updates the version in the Service Type Registry to the new version. The Engines restart to load the updated library, and the error shown in the log below occurs.
Notice that the GridLibManager has not finished loading the gridlibs when the task starts executing:
12/06/06 10:52:45.523 Info: [GridLibManager] Added: VenusAnalytics-1.1.280
12/06/06 10:52:45.882 Info: [JSTaskRunnerPlugin] Logged in to the Broker, will start running tasks
12/06/06 10:52:45.960 Info: [ServiceEvent] Executing:EquityGenericPricer-460417840864027220-154
12/06/06 10:52:45.960 Severe: [TaskExecutor] while executing task: EquityGenericPricer-460417840864027220-154
Engine start failure: Could not launch Engine with specified JRE
    at com.livecluster.engine.TaskExecutor.checkForStartupFailure(TaskExecutor.java:222)
    at com.livecluster.engine.TaskExecutor.executeTask(TaskExecutor.java:77)
    at com.livecluster.engine.JSTaskRunnerPlugin$WorkerRunner.runTask(JSTaskRunnerPlugin.java:449)
    at com.livecluster.engine.JSTaskRunnerPlugin$WorkerRunner.run(JSTaskRunnerPlugin.java:398)
    at com.livecluster.util.threadpool.ReusableThreadPool$InternalThread.run(ReusableThreadPool.java:335)
12/06/06 10:52:46.023 Info: [ServiceEvent] Finished:EquityGenericPricer-460417840864027220-154
12/06/06 11:12:03.875 Info: [JSTaskRunnerPlugin] Engine got a wakeup from the scheduler
12/06/06 11:12:03.890 Info: [ServiceEvent] Executing:EquityGenericPricer-5944146634633863147-78
12/06/06 11:12:03.906 Info: [GridLib] Grid Library does not match this OS or compiler, so ignored: cppbridge-hpux64-4.2
12/06/06 11:12:03.906 Info: [GridLibManager] Skipped: cppbridge-hpux64-4.2
12/06/06 11:12:03.906 Info: [GridLib] Grid Library does not match this OS or compiler, so ignored: cppbridge-linux-gcc2-4.2
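To check whether other Engines hit the same failure, the log can be scanned for Severe TaskExecutor entries. The following is only a minimal sketch in Python: the line layout is inferred from the excerpt above, and the function name `find_failed_tasks` and the regular expression are illustrative assumptions, not part of the product.

```python
import re

# Line layout inferred from the log excerpt in this article (an assumption):
#   "12/06/06 10:52:45.960 Severe: [TaskExecutor] while executing task: <task id>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>\w+): \[(?P<component>[^\]]+)\] (?P<message>.*)$"
)

TASK_PREFIX = "while executing task: "


def find_failed_tasks(log_lines):
    """Return (timestamp, task_id) pairs for Severe TaskExecutor entries.

    Hypothetical helper: lines that do not match the assumed layout
    (for example stack-trace frames) are skipped.
    """
    failures = []
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        if match.group("level") != "Severe":
            continue
        if match.group("component") != "TaskExecutor":
            continue
        message = match.group("message")
        if message.startswith(TASK_PREFIX):
            failures.append((match.group("ts"), message[len(TASK_PREFIX):]))
    return failures
```

Passing the lines of an Engine log file to `find_failed_tasks` gives the tasks that failed and when, which can then be compared with the timestamps of the surrounding [GridLibManager] entries.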
The workaround is to increase the "Synchronization Timeout" on the Broker resource synchronization configuration page before synchronizing large resources.
Whether a Grid Library (GL) counts as large depends on the number of simultaneous downloads and on the download time, not on file size alone. For example, if a 100 MB GL needs 20 seconds to download over a WAN and a 200 MB GL needs only 10 seconds over a LAN, the first one is effectively the larger of the two.
As a rule of thumb, a resource that takes more than 120 seconds to download is considered a large GL, because the synchronization timeout is 300 seconds by default. Please note that the timeout value applies to a set of resources, which might be one of the following: (1) Engine updates, (2) all Grid Libraries, or (3) platform-specific resources.
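The rule of thumb above can be sketched as a small check. This is an illustration only: the 120-second threshold and 300-second default come from this article, while the function names and the assumption that the downloads in a set run one after another (so their times add up) are hypothetical and give a conservative estimate.

```python
# Values taken from this article.
LARGE_DOWNLOAD_SECONDS = 120          # rule-of-thumb threshold for a "large" GL
DEFAULT_SYNC_TIMEOUT_SECONDS = 300    # default Synchronization Timeout


def is_large_resource(download_seconds):
    """True if a single resource takes longer than the rule-of-thumb threshold."""
    return download_seconds > LARGE_DOWNLOAD_SECONDS


def set_exceeds_timeout(download_seconds_per_resource,
                        timeout_seconds=DEFAULT_SYNC_TIMEOUT_SECONDS):
    """True if a whole resource set is expected to exceed the timeout.

    Assumption (not from the product documentation): downloads within the
    set are sequential, so the individual times are simply summed.
    """
    return sum(download_seconds_per_resource) > timeout_seconds
```

For instance, a set of Grid Libraries measured at 150 and 200 seconds each would be flagged by `set_exceeds_timeout([150, 200])`, which suggests raising the Synchronization Timeout before synchronizing.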
Issue/Introduction
Grid Libraries are not fully loaded before the Engine starts executing a task.