EMS server high CPU utilization/hang problem after destroying consumers

Article ID: KB0091131

Updated On:

Products Versions
TIBCO Enterprise Message Service -
Not Applicable -

Description

Description:
= = = = = =
We noticed that the EMS server hung about 4 hours after consumers with a large number of unacknowledged messages were destroyed. At the same time, the EMS server was experiencing high CPU utilization. In the output of pstack <tibemsd pid>, the EMS server was busy in the following thread:

#0  0x00000000004a4d7b in ?? ()
#1  0x00000000004a53c0 in _emsdConsumer_DestroyWithList ()
#2  0x00000000004f9320 in _emsd_CloseConsumer ()
#3  0x00000000004dc555 in ?? ()
#4  0x00000000004dd0ad in _emsd_HandleMetaIOSuccess ()
#5  0x0000000000506f77 in ?? ()
#6  0x000000000050338a in _emsdPhasedMsgList_CatchUpHereReal ()
#7  0x00000000004dd32f in _emsd_StorageThread ()
#8  0x000000393640673d in start_thread () from /lib64/libpthread.so.0
#9  0x00000039358d40cd in clone () from /lib64/libc.so.6
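
When several pstack samples are collected over time, it can help to check them automatically for this call path. The following is a minimal sketch, assuming the pstack output has the frame layout shown above; the helper names parse_frames and is_consumer_destroy_stack are hypothetical and not part of any TIBCO tool.

```python
import re

# Matches one pstack/gdb-style frame line, for example:
# "#1  0x00000000004a53c0 in _emsdConsumer_DestroyWithList ()"
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-fA-F]+\s+in\s+(\S+)\s+\(")


def parse_frames(stack_text):
    """Return (frame_number, symbol) pairs for each frame line in stack_text."""
    frames = []
    for line in stack_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((int(match.group(1)), match.group(2)))
    return frames


def is_consumer_destroy_stack(stack_text):
    """True if the stack shows the storage thread inside the consumer-destroy path.

    The two symbol names are taken from the trace in this article.
    """
    symbols = {symbol for _, symbol in parse_frames(stack_text)}
    return {"_emsd_StorageThread", "_emsdConsumer_DestroyWithList"} <= symbols
```

If most samples taken a few seconds apart match, the storage thread is likely spending its time in consumer cleanup rather than being blocked elsewhere.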


1). We can reproduce the problem by simulating your production environment with an EMS 5.1.3 server:

1,000,000 pending messages on one queue test1.
500,000 pending messages on queue test1 and test2.
Start 4 consumers in CLIENT acknowledge mode and consume 100,000 messages without acknowledging them.
Close the 4 consumers.

The key point in reproducing the problem is to run multiple consumers and then close them while they have a high "delivered" message count.
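
To make the "delivered" count concrete, the following toy model shows what happens to delivered but unacknowledged messages when a CLIENT-acknowledge consumer is closed. It is only an illustrative sketch: the ToyQueue class and its methods are hypothetical, the message counts are scaled down, and it does not reflect how the EMS server is implemented internally.

```python
from collections import deque


class ToyQueue:
    """Toy model of a queue with CLIENT-acknowledge consumers (not EMS code)."""

    def __init__(self, name, pending_count):
        self.name = name
        # Message ids that are waiting to be delivered.
        self.pending = deque(range(pending_count))
        # Consumer id -> list of delivered but unacknowledged message ids.
        self.delivered = {}

    def receive(self, consumer_id, count):
        """Deliver up to `count` messages to a consumer without acknowledging them."""
        batch = self.delivered.setdefault(consumer_id, [])
        for _ in range(min(count, len(self.pending))):
            batch.append(self.pending.popleft())

    def close_consumer(self, consumer_id):
        """Close a consumer; its unacknowledged messages become pending again."""
        unacked = self.delivered.pop(consumer_id, [])
        # Put the messages back at the head of the queue in their original order.
        self.pending.extendleft(reversed(unacked))
        return len(unacked)


queue = ToyQueue("test1", 1000)
for consumer_id in range(4):
    queue.receive(consumer_id, 100)   # delivered, never acknowledged
recovered = [queue.close_consumer(c) for c in range(4)]
```

In this model, the work done at close time grows with the number of delivered, unacknowledged messages per consumer, which is the quantity the reproduction steps push to a high value.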

2). We cannot reproduce the same problem with EMS 6.3.0+

EMS 6.3.0 and later do not show this problem because EMS engineering improved the implementation for recovering unacknowledged messages, that is, messages that were delivered to the consumer client but not acknowledged before the consumer exited.

Resolution:
= = = = = = =
To resolve the hang issue, we suggest upgrading to EMS 7.0, or at least to EMS 6.3. Also, investigate why the application leaves so many pending messages and so many unacknowledged messages.

Issue/Introduction

EMS server high CPU utilization/hang problem after destroying consumers.