Saturday, June 08, 2019

AEM Storage – Terms


When talking about AEM storage, the same handful of terms keeps coming up. TarMK and MongoMK are the basic options to choose from when architecting an AEM solution, but when we dig a little deeper, we hear terms like Nodestore, Segmentstore, Documentstore, Blobstore and Datastore. In this blog I try to define these terms as simply and clearly as possible.

Nodestore forms the basis of the AEM storage structure. We can choose to deploy AEM with just the nodestore. This is the default configuration, and in it all AEM content (which covers almost everything in AEM) is persisted in the nodestore. As of now, two types of nodestore implementation are supported – Segmentstore and Documentstore.

Segmentstore – is a type of nodestore implementation that uses the file system to store the content. It forms the basis of the TarMK implementation.

Documentstore – is another type of nodestore implementation that uses the document storage of NoSQL databases. MongoMK is one realization of the documentstore implementation.

For an AEM installation, a nodestore is mandatory, and we can choose between a Segmentstore (available as TarMK) or a Documentstore (available as MongoMK).

Now, instead of keeping all content in the nodestore, we can choose to store the binary data (technically, all content greater than a specified size limit) separately, outside of the nodestore. This storage is referred to as the Datastore.
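To picture the size-based split, here is a minimal sketch. The names `MIN_RECORD_LENGTH` and `choose_store`, and the 4096-byte limit, are assumptions made for illustration only; they are not AEM's API, and the real limit is whatever the installation is configured with.

```python
# Illustrative size limit in bytes. This value is an assumption for the
# sketch, not a statement about AEM's actual default.
MIN_RECORD_LENGTH = 4096


def choose_store(value: bytes, min_record_length: int = MIN_RECORD_LENGTH) -> str:
    """Return the name of the store a value would land in under a simple size rule."""
    # Content greater than the limit goes to the datastore;
    # everything else stays in the nodestore.
    if len(value) > min_record_length:
        return "datastore"
    return "nodestore"


print(choose_store(b"a short property value"))
print(choose_store(b"\x00" * 10_000))
```

The point of the sketch is only that the decision is driven by size: small values stay with the node, large ones are moved out and referenced.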

Datastore is simply a file system storage in which the binary files are stored and then referred to from the nodestore. The checksum value of the binary content is used as the file name, and the storage itself is organized into a folder structure based on the starting characters of this checksum value.

This way the file storage is organized for easy retrieval, and the files do not all end up in the same folder. It also avoids storing duplicate binary data when the same file is uploaded under multiple paths in AEM.
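The checksum-based layout described above can be sketched in a few lines. This is a toy model, not AEM's implementation: the class name `SimpleDataStore`, the use of SHA-256, and the three folder levels of two characters each are all assumptions chosen for illustration, and the content paths are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


class SimpleDataStore:
    """Toy content-addressed file store (a sketch, not AEM's implementation)."""

    def __init__(self, root: Path, levels: int = 3, chars_per_level: int = 2):
        self.root = Path(root)
        self.levels = levels                  # assumed folder depth
        self.chars_per_level = chars_per_level  # assumed characters per folder name

    def _path_for(self, checksum: str) -> Path:
        # Build nested folders from the starting characters of the checksum.
        n = self.chars_per_level
        parts = [checksum[i * n:(i + 1) * n] for i in range(self.levels)]
        return self.root.joinpath(*parts, checksum)

    def put(self, data: bytes) -> str:
        """Store the data and return its checksum, which serves as the reference."""
        checksum = hashlib.sha256(data).hexdigest()  # hash choice is an assumption
        target = self._path_for(checksum)
        if not target.exists():  # identical content is written only once
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(data)
        return checksum

    def get(self, checksum: str) -> bytes:
        """Read the binary back using the reference returned by put()."""
        return self._path_for(checksum).read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    store = SimpleDataStore(Path(tmp))
    image = b"the same binary uploaded twice"
    # A dict stands in for the nodestore: each (hypothetical) path keeps
    # only the reference, not the binary itself.
    nodestore = {
        "/content/dam/site-a/logo.png": store.put(image),
        "/content/dam/site-b/logo.png": store.put(image),
    }
    stored_files = [p for p in Path(tmp).rglob("*") if p.is_file()]
    print(len(stored_files))
```

Both paths end up holding the same reference, and only one file exists on disk, which is the de-duplication effect mentioned above.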

AEM currently supports 3 implementations of datastore:
  • FileDataStore – Places the datastore on the local file system or on a mapped volume
  • Amazon S3 – Places the datastore on S3 storage
  • Microsoft Azure – Places the datastore on Azure storage


Blobstore is just another name for datastore; both refer to the same storage mechanism for binary data.


