Thursday, June 27, 2019

Poison messages on Replication Queues



So what does it take for a replication queue to get blocked? Well… Just one bad item on the queue.

Yes. A single item that cannot be processed will completely block the queue.

This is because of the way the items in the queue get processed. The items are ordered and are processed strictly in first-in-first-out (FIFO) order.

This order needs to be preserved so that writes to the same content do not overlap and data integrity is maintained.

So what happens when a single item cannot be processed?

Simply put, it does not get removed from the replication queue and stays at the head of the queue, meaning it remains the next item to be processed. When the retry happens, the item fails again, so it stays at the head of the queue forever.

Until this item is processed and removed from the queue, the items behind it will not get a chance to be processed.
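The blocking behaviour can be illustrated with a small simulation. This is only a sketch of a generic strict-FIFO queue with retries, not the actual replication agent code; the item names, the `deliver` function, and the `max_attempts` limit are all made up for the illustration.

```python
from collections import deque


def drain(queue, deliver, max_attempts):
    """Process a strict FIFO queue: only the head item is ever attempted.

    Returns the delivered items and a retry count per item.
    """
    retries = {item: 0 for item in queue}
    delivered = []
    for _ in range(max_attempts):
        if not queue:
            break
        head = queue[0]
        if deliver(head):
            # Success: remove the head so the next item can be processed.
            queue.popleft()
            delivered.append(head)
        else:
            # Failure: the item stays at the head and is retried next time.
            retries[head] += 1
    return delivered, retries


def deliver(item):
    # Hypothetical delivery: one item always fails (the "poison" item).
    return item != "bad-page"


queue = deque(["page-1", "bad-page", "page-2", "page-3"])
delivered, retries = drain(queue, deliver, max_attempts=10)

print(delivered)    # items that made it through
print(list(queue))  # items still stuck in the queue
print(retries)      # only the poison item accumulates retries
```

Only the first item is delivered. Every remaining attempt is spent on the bad item, and the items behind it are never tried.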

You can confirm this by inspecting the queue items in the JCR. Look under the path /var/eventing/jobs/assigned to locate them.


The first item in a blocked queue will have undergone multiple retries.

All the subsequent items will have a retry count of 0, indicating that processing has never been attempted for them.
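That retry pattern can be turned into a simple check. The sketch below assumes you have already read the queue items, in queue order, into a list of (name, retry count) pairs. The function name, the data shape, and the threshold of 3 are assumptions for illustration and not part of any AEM API.

```python
def find_blocking_item(items, threshold=3):
    """Return the name of the head item if the queue looks blocked, else None.

    items: list of (name, retry_count) tuples in queue order.
    threshold: assumed number of retries after which the head is suspicious.
    """
    if not items:
        return None
    head_name, head_retries = items[0]
    # A blocked queue shows retries only on the head; the rest are untouched.
    rest_untouched = all(count == 0 for _, count in items[1:])
    if head_retries >= threshold and rest_untouched:
        return head_name
    return None


# Hypothetical snapshot of a queue: the head has been retried many times.
snapshot = [("bad-page", 12), ("page-2", 0), ("page-3", 0)]
print(find_blocking_item(snapshot))
```

For a healthy queue, where the head has few or no retries, the function returns None.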












