Understanding thread crossings and queuing across container connections
Article ID: KB0075983
Description
Consider a Producer and a Consumer each running in separate containers.
The Producer sends tuples out via a set of streams received by the Consumer via cross-container connections.
In Case A, there is a single stream out of the Producer, and on receipt of a tuple, the Consumer examines various fields in the tuple to route it to the appropriate processing logic.
In Case B there are 5 streams out of the Producer, and the Consumer knows how to process each tuple by virtue of the stream by which it arrived.
Question 1: In Case A, if the Producer generates 10 tuples and sends them all out the same stream, will they all be marshaled across the thread/container boundary into the Consumer together when the Consumer is ready to receive them? Is this correct?
Resolution
The thread hops from the Producer's Main processing thread are to:
1. The Producer's Queue handler, which queues tuples until the Consumer signals that it is ready; and
2. The Consumer's Main processing thread, which takes the full queue as currently filled (the Producer's Queue handler then starts queuing a new set).
So, yes: when the Consumer is ready to dequeue tuples, all queued tuples for that stream are marshaled to the next thread in one packet, in one atomic operation.
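The batch handoff described above can be sketched as follows. This is an illustrative Python model of the pattern, not StreamBase's actual implementation; the class and method names (`BatchQueue`, `enqueue`, `drain`) are invented for this example.

```python
import threading

class BatchQueue:
    """Sketch of the batch handoff: the producer appends tuples to the
    current batch; the consumer takes the whole batch in one atomic swap."""

    def __init__(self):
        self._lock = threading.Lock()
        self._batch = []

    def enqueue(self, tup):
        # Producer side: queue tuples until the consumer is ready.
        with self._lock:
            self._batch.append(tup)

    def drain(self):
        # One "thread crossing": the consumer takes every queued tuple
        # at once, and the producer starts filling a fresh batch.
        with self._lock:
            batch, self._batch = self._batch, []
        return batch

q = BatchQueue()
for i in range(10):       # Producer emits 10 tuples on one stream
    q.enqueue(("tuple", i))
print(len(q.drain()))     # Consumer receives all 10 in one packet -> 10
```

The key point the sketch captures is that the consumer never dequeues one tuple at a time; it swaps out the entire accumulated batch in a single operation.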
And, regarding "In Case B there are 5 streams out of the Producer, and the Consumer knows how to process each tuple by virtue of the stream by which it arrived. ...Question 2: In Case B, if we had 2 tuples each in all 5 of the output streams, when the Consumer is ready to dequeue, will there be 5 separate thread boundary crossings, or can all 5 streams be dequeued with a single marshaling operation?"
That means five separate container connections, and a separate output queue for each output stream. There will be five separate thread boundary crossings, each with a packet of tuples from that output stream's queue. This is why order cannot be maintained across multiple parallel streams: the queues are independent, and the Consumer receives each input stream's next packet of tuples independently.
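Case B can be sketched the same way, with one independent queue per output stream. Again this is an illustrative Python model under assumed names (`enqueue`, `drain`), not the product's internals; it shows that draining 5 streams takes 5 separate crossings, and that no ordering is defined across the independent queues.

```python
import threading

# One independent batch per output stream, mirroring one container
# connection (and one output queue) per stream.
streams = {f"stream{i}": [] for i in range(5)}
lock = threading.Lock()  # one lock for brevity; per-queue in practice

def enqueue(stream, tup):
    with lock:
        streams[stream].append(tup)

def drain(stream):
    # Each call is one thread boundary crossing, for this stream only.
    with lock:
        batch = streams[stream]
        streams[stream] = []
    return batch

for name in streams:          # Producer: 2 tuples on each of 5 streams
    enqueue(name, (name, 1))
    enqueue(name, (name, 2))

crossings = 0
for name in streams:          # Consumer: 5 independent dequeues
    packet = drain(name)
    crossings += 1
print(crossings)              # -> 5, one crossing per connection
```

There is no single marshaling operation spanning all five queues, which is exactly why tuple order is only guaranteed within a stream, not across streams.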
In the normal/ideal case the Consumer is faster than the Producer, so the sets of queued tuples are small. Temporary backlogs are somewhat mitigated in that the set of queued tuples grows, but it is still sent in one thread crossing per packet per connection.