Article ID: KB0090995
Description
Resolution:
The following example shows how duplicate messages can arise in a distributed queue (DQ) implementation:
Consider a scenario with one certified messaging (CM) sender and 9 DQ members: 1 acts as the scheduler and the other 8 are DQ workers.
Assume the worker tasks setting is 1 for every worker, and that the CM sender has sent 8 messages, which the scheduler has dispatched in order (message 1 to worker 1, message 2 to worker 2, and so on).
All 8 workers send back their confirmations, but the confirmation from worker 4 is lost in the network. The scheduler then holds confirmations 5, 6, 7, and 8, and does not pass them on to the CM sender, until it receives confirmation 4. At this point, even though every worker has completed and confirmed its task, the last confirmation the CM sender has actually received is 3. Messages 1, 2, and 3 have therefore been deleted from the CM sender’s ledger, while messages 4, 5, 6, 7, and 8 remain in it. If the scheduler is then restarted for any reason, the CM sender is asked to retransmit messages 4, 5, 6, 7, and 8, and the workers receive messages they have already processed.
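The scenario above can be traced with a small simulation. This is only an illustrative sketch in plain Python, not Rendezvous API code: the function names (last_in_order_confirmation, simulate_scheduler_restart) and the data model (sets and lists of sequence numbers) are assumptions made for the example.

```python
def last_in_order_confirmation(received, total):
    """Highest sequence number n such that confirmations 1..n all arrived.

    Models the scheduler passing confirmations to the sender strictly in
    order: a gap (a lost confirmation) blocks everything after it.
    """
    last = 0
    for seq in range(1, total + 1):
        if seq not in received:
            break
        last = seq
    return last


def simulate_scheduler_restart(total=8, lost_confirmations=(4,)):
    """Hypothetical model of the example: returns
    (last confirmation seen by the sender, retransmitted messages,
    messages that end up processed twice)."""
    # Every worker has processed its message.
    processed = set(range(1, total + 1))
    # Confirmations that actually reached the scheduler.
    received = processed - set(lost_confirmations)

    last_confirmed = last_in_order_confirmation(received, total)

    # The sender deletes confirmed messages; the rest stay in its ledger.
    ledger = [seq for seq in range(1, total + 1) if seq > last_confirmed]

    # After a scheduler restart, everything still in the ledger is resent.
    retransmitted = list(ledger)
    duplicates = [seq for seq in retransmitted if seq in processed]
    return last_confirmed, retransmitted, duplicates


if __name__ == "__main__":
    last_confirmed, retransmitted, duplicates = simulate_scheduler_restart()
    print("last confirmation seen by sender:", last_confirmed)
    print("retransmitted after restart:", retransmitted)
    print("processed twice:", duplicates)
```

With the default arguments the sender's last confirmation is 3 and messages 4 through 8 are retransmitted, matching the description. With no lost confirmations the ledger is empty and nothing is resent.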
Additional notes:
- Retransmission should not occur when a worker (as opposed to the scheduler) is restarted, nor when the CM sender is restarted. Neither of those restarts should therefore cause duplication.
- Although certified messages always dispatch from the queue in order of sequence number, a program might still process them out of order. For example, if a program dispatches the queue from several threads, the thread processing message 14 might return from its callback function before the thread processing message 13.
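The out-of-order completion described in the last note can be reproduced with two threads. The sketch below uses only Python's standard threading module and does not call any Rendezvous API. The callback function and the event that makes message 13 finish later are assumptions introduced to force the ordering deterministically.

```python
import threading


def run_callbacks():
    """Start callbacks for messages 13 and 14 in sequence order and
    return the order in which they finish."""
    completed = []
    lock = threading.Lock()
    fourteen_done = threading.Event()

    def callback(seq):
        if seq == 13:
            # Stand-in for slower processing: message 13 cannot finish
            # until message 14 has finished.
            fourteen_done.wait(timeout=5)
        with lock:
            completed.append(seq)
        if seq == 14:
            fourteen_done.set()

    # Messages are handed to threads in sequence order: 13, then 14.
    threads = [threading.Thread(target=callback, args=(seq,)) for seq in (13, 14)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return completed


if __name__ == "__main__":
    print("completion order:", run_callbacks())
```

Although message 13 is dispatched first, its callback returns after the callback for message 14, so an application that depends on ordering needs to enforce it itself.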
Issue/Introduction
What can cause duplication of messages when using distributed queues? Please give an example.