When configuring replication agents, one of the options for
the serialization type setting is “Binary less”.
In this blog, we will see what “Binary less” replication means, how it behaves, the use cases where it is applicable, and an approach that can be used for configuring it.
What is Binary less replication?
Binary less replication means that the binaries are left out
of the content being replicated. When replicating an asset, for example, only
the metadata of the asset gets replicated. The binaries of the asset, which
comprise the original asset and all its renditions, are not included
in the replicated content.
When can it be used?
Binary less replication is useful when multiple AEM
instances share a common datastore. The binaries are available through the
common datastore, and hence there is no need to replicate them to all the instances.
How does it work?
It’s important to understand how binary less replication
works in order to configure it properly for your scenario.
While creating a replication package for the content being replicated, the
binaries are replaced with their hash code references. The package, with hash code references in place of the binaries, is then sent to the receiver. When the receiver
installs the received replication package, it tries to resolve each hash code
reference to a binary in its datastore.
Since the receiver shares the same datastore as the
sender, it resolves the hash code reference to the actual binary in the
datastore and links it wherever the hash code references appear in the replicated
content.
If the datastore is not configured properly, or if for some
reason the receiver is not able to resolve the binary from the hash
code reference, it falls back to the default replication mode and redoes the replication with the
binaries included in the replication package.
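To make this concrete, here is a minimal Python sketch of the receiver-side logic, assuming a FileDataStore-style layout where a binary's identifier is its content hash. The paths, entry format, and function names are illustrative, not AEM's actual implementation.

```python
import os

# Hypothetical shared datastore root; the layout mimics a FileDataStore,
# where a record with id "abcdef..." lives at <root>/ab/cd/ef/abcdef...
DATASTORE_ROOT = "/mnt/shared/datastore"

def resolve_binary(hash_ref):
    """Resolve a hash code reference to a binary file in the shared datastore."""
    path = os.path.join(DATASTORE_ROOT,
                        hash_ref[0:2], hash_ref[2:4], hash_ref[4:6],
                        hash_ref)
    return path if os.path.isfile(path) else None

def install_package(entries):
    """Install replicated entries; collect paths whose binaries could not
    be resolved so the replication can be redone with binaries included."""
    failed = []
    for content_path, hash_ref in entries:
        binary = resolve_binary(hash_ref)
        if binary is not None:
            # Success: the binary is linked in place of the hash reference
            # (AEM logs this case as "set using a reference").
            print(content_path, "-> set using a reference")
        else:
            failed.append(content_path)
    # Any failed paths trigger the fallback to default replication,
    # this time with the binaries included in the package.
    return failed
```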
The overall replication does not fail in this case. Whether
the binary less replication was successful can be checked in the logs.
Look for log statements with the text patterns “FAILED PATHS START” and “FAILED PATHS
END” for details on failed binary less replications, and the text pattern “set using
a reference” for successful binary less replications.
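Assuming a standard install where these statements land in error.log (the path below is an assumption to adjust for your environment), a small script like this hypothetical sketch can pull out both kinds of entries:

```python
# Hypothetical log scan for the patterns mentioned above; the error.log
# location is an assumption for an install under /opt/aem.
def scan_replication_log(log_path="/opt/aem/crx-quickstart/logs/error.log"):
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            if "set using a reference" in line:
                print("binary-less OK :", line.rstrip())
            elif "FAILED PATHS START" in line or "FAILED PATHS END" in line:
                print("fallback marker:", line.rstrip())

scan_replication_log()
```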
Use cases of Binary less replication
Binary less replication is useful in setups using a shared
datastore across instances. One common use case is when all AEM instances (author and all publishers) use a common
datastore. The replication agents from the author to
all the publishers can then be configured with the binary less serialization type, as the asset upload to the author already places the binaries in the datastore, which is shared by all the publish instances as well.
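As a rough sketch of how such an agent could be configured programmatically, the snippet below posts to the agent node via the Sling POST servlet. The agent path, host names, credentials, and property values are assumptions to verify against your AEM version.

```python
import requests

AUTHOR = "http://localhost:4502"
CREDENTIALS = ("admin", "admin")  # placeholder credentials

# Hypothetical agent node path; "publish" is the default agent name.
AGENT = AUTHOR + "/etc/replication/agents.author/publish/jcr:content"

# Appending binaryless=true to the transport URI enables binary-less
# replication; verify the property names against your AEM version.
response = requests.post(AGENT, auth=CREDENTIALS, data={
    "transportUri": "http://publish1:4503/bin/receive"
                    "?sling:authRequestLogin=1&binaryless=true",
    "serializationType": "durbo",
})
response.raise_for_status()
```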
Special cases - Approach for configuring Binary less replication
Replication configuration should be carefully thought out
when we have setups where all instances do not share the same datastore, but
distinct groups of servers instead share a datastore.
Some common setup configurations where this applies are
- A shared datastore for all publishers, but a separate datastore for the author
- Separate shared datastores for the Primary and DR environments
Let us take the first case, where the author has a separate
datastore and all publish instances share a common datastore.
This configuration is illustrated below.
In this case, configuring binary less replication to all publish instances from the author would cause the replication to fail and fall back to default
replication in an ad-hoc manner. The asynchronous nature of replicating to all
publishers simultaneously causes the number of failed binary less replications to
be high.
To overcome this, it is ideal to designate one publish
instance as a gateway instance in such scenarios. Configure the author to perform default replication to
this gateway publish instance. This step makes sure that the binaries get
replicated to that single publish instance and persisted in its datastore (which is also shared by the other publish instances).
Now configure the gateway instance to chain replicate the
content to the other publish instances in binary less mode. Chain
replication starts after the content has been successfully installed on the gateway
instance. This ensures that the binaries are replicated and persisted in the
shared datastore through the gateway instance before the binary less
replication kicks in for the other instances in the cluster.
This configuration is illustrated in the diagram below.
For the second case, with separate shared datastores for the Primary and DR environments, the author should be configured to perform default replication to the
gateway instances of both the Primary and DR clusters. Each gateway instance can then chain
replicate the content to the other instances within its cluster.
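Under the same assumptions as the earlier configuration snippet (Sling POST servlet, standard agent node layout, illustrative host names), wiring this topology could look roughly like this:

```python
import requests

CREDENTIALS = ("admin", "admin")  # placeholder credentials

def configure_agent(instance, agents_folder, agent, target, binaryless):
    """Create or update a replication agent via the Sling POST servlet.

    Host names and agent names are illustrative; agents live under
    agents.author on an author and agents.publish on a publisher.
    """
    uri = f"http://{target}/bin/receive?sling:authRequestLogin=1"
    if binaryless:
        uri += "&binaryless=true"
    requests.post(
        f"http://{instance}/etc/replication/{agents_folder}/{agent}/jcr:content",
        auth=CREDENTIALS,
        data={"transportUri": uri, "enabled": "true"},
    ).raise_for_status()

# Author -> gateway: default (binary-full) replication.
configure_agent("author:4502", "agents.author", "to-gateway",
                "gateway:4503", binaryless=False)

# Gateway -> remaining publishers: binary-less chain replication.
for target in ("publish2:4503", "publish3:4503"):
    configure_agent("gateway:4503", "agents.publish",
                    "chain-" + target.split(":")[0], target, binaryless=True)
```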
Limitations with this approach:
Using the approach discussed in the above section to
leverage binary less replication through gateway instances does come with a few
limitations that we need to be aware of and plan for.
Introduces delay in replication completion
It introduces a delay in replication
completion. During the interval between the replication to the gateway and the completion
of the chain replication, the content on the gateway instance and on the other instances
within the cluster would be out of sync.
Usually this completes within a few seconds, but it could take
longer depending on the load on the system and the number of concurrent replications being performed.
In cases
where a content mismatch even for such a short duration is not acceptable,
the gateway could be removed from the set of instances serving content to the
end users. This way, only the other publish instances, whose content is
in near real-time sync, would serve content to the end users.
Replication status reflected on the author:
Note that with this approach, the status turns green
(stating that replication was successful) as soon as the replication to the gateway
instance succeeds. This could send a wrong signal to the content
publishers, especially when there is a delay or issues are encountered in the chain
replication.
Gateway instance failure
Another aspect that must be planned for with this configuration
is gateway instance failure.
When a gateway instance fails for some
reason, another instance within the cluster should be promoted as the new
gateway, and the replication agents on the author and the new gateway reconfigured
to restore a working setup. Be ready with scripts that can be run to reconfigure the instances
in the event of a gateway failure.
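Such a script could be little more than re-running the topology configuration against the new gateway. A hypothetical sketch, reusing the configure_agent helper from the earlier snippet (host names remain illustrative):

```python
# Hypothetical failover: promote publish2 to gateway and rebuild
# the chain agents, using configure_agent from the sketch above.
NEW_GATEWAY = "publish2:4503"
REMAINING = ["publish3:4503"]

# Re-point the author's default-replication agent at the new gateway.
configure_agent("author:4502", "agents.author", "to-gateway",
                NEW_GATEWAY, binaryless=False)

# Recreate the binary-less chain agents on the new gateway.
for target in REMAINING:
    configure_agent(NEW_GATEWAY, "agents.publish",
                    "chain-" + target.split(":")[0], target, binaryless=True)
```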