When we click the activate button, the activated item gets replicated
to all the publish instances that have an active replication agent
configured.
The status of the replication is shown as a yellow icon while the
replication is in progress, which subsequently turns green or red
depending on the final replication status.
But what happens internally during this process?
A sequence of steps happens on the sender side before an item gets
placed in the replication queue of each applicable replication agent.
The content then gets transferred to the receiver, where it gets
processed to complete the replication.
At a high level, the following steps are performed:
- Version creation
- Activation process on sender and placement of queue item
- Transfer of replicated content to receiver
- Processing replication on receiver
- Status update and retry if applicable
The high-level flow is depicted in the diagram below.
Version creation
The first step performed in activation is the creation of a
frozen version of the content being replicated. This ensures that subsequent
edits to the content while the replication is in progress do not impact the
content being replicated.
The frozen version thus created gets attached to the
replication process, and it is this frozen content that gets replicated.
Activation process on sender
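When the activation process kicks in, a sequence of steps
happens that results in an item getting placed in the replication queue of each
associated agent. These steps are configuration collation, permission check,
preprocessing, replication package creation, and queuing.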
Configuration collation
One or more replication agents could be configured as active
depending on the number of publish instances to replicate to. The configurations of
these replication agents could differ to the extent that the content to be
replicated for a given activation is different for each agent. For this reason, the
replication package is created separately for each configured replication
agent.
The first step in the activation process on the sender is the
identification of all the active replication agents that are configured. For each
agent identified, its configuration is collated and kept as a ReplicationOptions
object.
Permission check
Then a check is performed to validate that the user performing
the activate action has the required permission to replicate the content
selected for activation. Only the nodes on which the activating user has
replication permission get included for replication.
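The filtering described above can be pictured with a small sketch. It is not AEM code: ReplicationPermissionFilter and the canReplicate predicate are hypothetical names, and the predicate stands in for whatever access check the replication service actually runs for the activating user.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper, not an AEM class: keeps only the paths the user may replicate.
final class ReplicationPermissionFilter {

    // canReplicate is an assumed stand-in for the real permission check.
    static List<String> filterReplicable(List<String> selectedPaths,
                                         Predicate<String> canReplicate) {
        List<String> allowed = new ArrayList<>();
        for (String path : selectedPaths) {
            if (canReplicate.test(path)) {
                allowed.add(path); // permitted nodes move on to the next step
            }
            // nodes without replicate permission are left out of the activation
        }
        return allowed;
    }
}
```

Under this assumption, an activation request that mixes permitted and non-permitted nodes proceeds with only the permitted ones.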
Preprocessing
After collating the replication agent configuration and
performing the permission check, all the preprocessors, if any are
configured, get applied.
A custom preprocessor can be implemented by creating a class
that implements the Preprocessor interface. Providing an implementation for its
preprocess method in this class and registering the class as a service makes
it available as a preprocessor.
All the preprocessors that are configured this way get called
at this stage before proceeding to the next step.
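As a rough illustration of the preprocessing step, the sketch below shows one preprocessor and a loop that calls every registered preprocessor in turn. In an AEM project the class would implement the real Preprocessor interface from the replication API and be registered as an OSGi service. The Action and Preprocessor types here are simplified stand-ins so that the sketch compiles on its own, and ContentRootGuard with its path rule is purely hypothetical. The sketch also assumes that an exception thrown from preprocess aborts the activation.

```java
import java.util.List;

final class PreprocessorSketch {

    // Simplified stand-in for the replication action (hypothetical shape).
    static final class Action {
        final String type; // for example "ACTIVATE"
        final String path;

        Action(String type, String path) {
            this.type = type;
            this.path = path;
        }
    }

    // Stand-in for the Preprocessor contract: a single preprocess method.
    // An unchecked exception is used here only to keep the sketch short.
    interface Preprocessor {
        void preprocess(Action action);
    }

    // Illustrative preprocessor: refuses activations outside /content/.
    static final class ContentRootGuard implements Preprocessor {
        @Override
        public void preprocess(Action action) {
            if ("ACTIVATE".equals(action.type) && !action.path.startsWith("/content/")) {
                throw new IllegalStateException("Refusing to activate " + action.path);
            }
        }
    }

    // Calls every registered preprocessor in order; the first exception stops the run.
    static void applyAll(List<Preprocessor> preprocessors, Action action) {
        for (Preprocessor preprocessor : preprocessors) {
            preprocessor.preprocess(action);
        }
    }
}
```

Typical uses of such a hook would be validation or auditing before any package is built, since it runs ahead of replication package creation.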
Replication package creation
After applying all the preprocessors, the next step is the creation
of the replication package for the activated item.
A separate replication
package gets created for each replication agent depending on its configuration
(based on the ReplicationOptions object created in the previous steps),
using the serializer that matches the configured serialization type.
The replication package contains all the information needed
for the replication process to complete on the receiver end.
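The per-agent package creation can be sketched as a lookup of the serializer by type followed by one serialization per agent. This is only a model of the idea: AgentOptions is a hypothetical stand-in for the collated ReplicationOptions, the serializers are plain functions rather than AEM content serializers, and the byte array stands in for the real package content.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class PackageBuilderSketch {

    // Hypothetical per-agent settings, standing in for the collated options.
    static final class AgentOptions {
        final String agentId;
        final String serializationType;

        AgentOptions(String agentId, String serializationType) {
            this.agentId = agentId;
            this.serializationType = serializationType;
        }
    }

    // Builds one package per agent, using the serializer registered for that
    // agent's serialization type. Returns packages keyed by agent id.
    static Map<String, byte[]> buildPackages(String path,
                                             List<AgentOptions> agents,
                                             Map<String, Function<String, byte[]>> serializers) {
        Map<String, byte[]> packages = new LinkedHashMap<>();
        for (AgentOptions agent : agents) {
            Function<String, byte[]> serializer = serializers.get(agent.serializationType);
            if (serializer == null) {
                throw new IllegalArgumentException(
                        "No serializer for type " + agent.serializationType);
            }
            packages.put(agent.agentId, serializer.apply(path));
        }
        return packages;
    }
}
```

Because each agent gets its own package, two agents with different serialization types can receive very different payloads for the same activated item.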
Queuing (Persist in JCR)
The created replication package gets persisted in JCR along
with other metadata such as the item on which the activation is
performed, the type of action, the user id, and so on. It gets stored in JCR under the
node /var/eventing/jobs.
Once the replication package is persisted in JCR, the activation
steps on the sender side are complete and the package is ready for transport over
the network to the receiver.
Also at this stage, the item is visible in the queue
associated with the replication agent. The items in the queue are shown by
querying for pending items under /var/eventing/jobs for the queue id associated
with that replication agent.
Transfer to receiver
The responsibility of transferring the replication package
over the network lies with the sender AEM instance. The sling job monitors for
items that become available for replication and kicks off the process to
transfer them to the receiver.
The sling job for a queue is synchronous. It processes the
first item in the queue, and only when that is complete does it pick up the next item
for processing.
Processing on receiver
On receiving a replication package, the listener on the receiving
end deserializes the package and installs it to
get the content replicated onto itself.
A success status is sent back if all
the steps are successful on the receiver side. If any of the steps fails, a
failure status is returned, prompting the sender to retry.
Status update and retry
After transferring the replication package, the sling job on
the sender waits for the response from the receiver indicating the status of
the replication at the receiving end.
If the response is successful, it removes
the item from the queue (deletes the item from JCR), marking the status of
replication as successful.
In case the replication is not successful on the receiving
end, the item on the queue is retained for reprocessing. The sling job then
waits for the ‘Retry Delay’ duration to elapse before retrying to send the item
again to the receiver.
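The queue behaviour described in the last two sections (one item at a time, removal on success, retry after a delay on failure) can be modelled with a short sketch. It is not the Sling job implementation: ReplicationQueueSketch is a hypothetical class, the transport predicate stands in for the network call and the receiver's status, and maxAttempts is an assumption added only so that the example terminates.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

final class ReplicationQueueSketch {
    private final Deque<String> queue = new ArrayDeque<>();
    private final Predicate<String> transport; // true when the receiver reports success
    private final long retryDelayMillis;       // models the 'Retry Delay' setting

    ReplicationQueueSketch(List<String> items, Predicate<String> transport,
                           long retryDelayMillis) {
        this.queue.addAll(items);
        this.transport = transport;
        this.retryDelayMillis = retryDelayMillis;
    }

    // Works on the head of the queue only. A failing head is retried after the
    // delay and blocks the items behind it. Returns the number delivered.
    int drain(int maxAttempts) {
        int delivered = 0;
        while (!queue.isEmpty()) {
            String head = queue.peekFirst();
            boolean ok = false;
            for (int attempt = 1; attempt <= maxAttempts && !ok; attempt++) {
                ok = transport.test(head);
                if (!ok && attempt < maxAttempts) {
                    sleepQuietly(retryDelayMillis);
                }
            }
            if (!ok) {
                break; // the item is retained in the queue for reprocessing
            }
            queue.removeFirst(); // success: the item is removed from the queue
            delivered++;
        }
        return delivered;
    }

    int pending() {
        return queue.size();
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The sketch makes one consequence of the sequential design visible: an item that keeps failing holds up every item queued behind it for the same agent.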