Magnolia 5.6 reached end of life on June 25, 2020. This branch is no longer supported; see End-of-life policy.
Magnolia can be configured to run in a clustered environment to provide high availability and load balancing.
If you are considering clustering, make sure the following conditions are true for your project and the environment it runs in:
When the above conditions are not met, it is in all likelihood better to avoid clustering and distribute data by other means instead.
(For instance, use REST services to write data directly from the client, or use Magnolia public instances as a proxy that writes to such an externalized service, which is accessible from all instances.)
We use Jackrabbit's clustering feature to share content between Magnolia instances. Clustering in Jackrabbit works on the principle that content is shared between all cluster nodes. This means that all Jackrabbit cluster nodes need access to the same persistent storage (persistence manager and data store). The persistence manager must be clusterable, for example a central database that allows concurrent access. Any data store (file or database) is clusterable by its very nature, as it stores content by unique hash IDs. However, each cluster node needs its own (private) file system and search index. For more details, see the Jackrabbit clustering documentation.
Individual workspaces can be mapped to different repositories. The repository that holds shared content can reside in clustered storage. This is useful for content that needs to be in sync across all instances.
In the diagram, each Magnolia instance uses its own persistent storage for storing the content of config workspaces. However, the comments workspace has shared content that is stored in a clustered storage, the database in the middle.
User generated content such as comments written by site visitors is a typical clustering case. Imagine that users John and Jane request the same web page. A load balancer redirects John to public instance A. When John leaves a comment on the page, his comment is stored in a workspace that resides in clustered storage. Now Jane requests the same page. The load balancer redirects her to public instance B. John's comment is immediately visible to Jane since both instances tap into the same clustered storage.
Other examples of shared content are user accounts, permissions of public users and forum posts. They need to be available regardless of the instance that serves the page.
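The John and Jane scenario can be sketched as a toy model. This is only an illustration of the idea of private versus shared storage: the classes `ClusteredStore` and `PublicInstance` are hypothetical names invented here and are not part of the Magnolia or Jackrabbit APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical classes, not Magnolia or Jackrabbit APIs.
class SharedWorkspaceSketch {

    /** Stands in for the clustered storage that every instance can reach. */
    static class ClusteredStore {
        private final Map<String, List<String>> commentsByPage = new HashMap<>();

        synchronized void add(String page, String comment) {
            commentsByPage.computeIfAbsent(page, k -> new ArrayList<>()).add(comment);
        }

        synchronized List<String> read(String page) {
            // Return a copy so callers cannot modify the shared state directly.
            return new ArrayList<>(commentsByPage.getOrDefault(page, new ArrayList<>()));
        }
    }

    /** A public instance with private config storage and shared comments storage. */
    static class PublicInstance {
        final String name;
        final Map<String, String> config = new HashMap<>(); // private to this instance
        private final ClusteredStore comments;               // shared by all instances

        PublicInstance(String name, ClusteredStore comments) {
            this.name = name;
            this.comments = comments;
        }

        void postComment(String page, String text) {
            comments.add(page, text);
        }

        List<String> commentsFor(String page) {
            return comments.read(page);
        }
    }

    public static void main(String[] args) {
        ClusteredStore shared = new ClusteredStore();
        PublicInstance a = new PublicInstance("A", shared);
        PublicInstance b = new PublicInstance("B", shared);

        // John is routed to instance A and leaves a comment.
        a.postComment("/news", "John: nice article");
        // A config change on A stays local to A.
        a.config.put("cache", "on");

        // Jane is routed to instance B and sees John's comment, but not A's config.
        System.out.println(b.commentsFor("/news"));
        System.out.println(b.config.containsKey("cache"));
    }
}
```

In the model, both instances hold a reference to the same store for comments, while each keeps its own `config` map. That mirrors the mapping of the comments workspace to a clustered repository and the config workspace to a private one.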
Cluster nodes write their changes to a journal that can become very large over time. By default, old revisions are not removed automatically, so that you can easily add new cluster nodes. Jackrabbit 1.5 introduced automatic cleaning of the database-based journal using a process called Janitor.
To enable Janitor, see Removing Old Revisions and Add new instances to your Jackrabbit cluster.
Note the following about using Janitor:
LOCAL_REVISIONS table should be removed manually.
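To illustrate why entries in the local revisions table matter for journal clean-up, here is a minimal in-memory sketch. It assumes the clean-up removes only revisions older than the oldest revision still referenced by any cluster node, and that the note above concerns entries of nodes that no longer exist. The class `JournalJanitorSketch` and its methods are hypothetical; this is not Jackrabbit's implementation.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: hypothetical class, not Jackrabbit's journal or Janitor code.
class JournalJanitorSketch {
    // Revision id -> change record (stands in for the journal table).
    final TreeMap<Long, String> journal = new TreeMap<>();
    // Cluster node id -> last revision the node has applied (stands in for LOCAL_REVISIONS).
    final Map<String, Long> localRevisions = new HashMap<>();
    private long nextRevision = 1;

    /** A node writes a change to the journal and is then up to date with it. */
    long append(String nodeId, String change) {
        long revision = nextRevision++;
        journal.put(revision, change);
        localRevisions.put(nodeId, revision);
        return revision;
    }

    /** A node catches up with the latest revision (0 if nothing was written yet). */
    void sync(String nodeId) {
        localRevisions.put(nodeId, nextRevision - 1);
    }

    /** Removes revisions older than the oldest referenced one; returns how many were removed. */
    int cleanUp() {
        if (localRevisions.isEmpty()) {
            return 0;
        }
        long oldestReferenced = Collections.min(localRevisions.values());
        // headMap is a live view of keys strictly below oldestReferenced.
        Map<Long, String> stale = journal.headMap(oldestReferenced);
        int removed = stale.size();
        stale.clear();
        return removed;
    }

    public static void main(String[] args) {
        JournalJanitorSketch cluster = new JournalJanitorSketch();
        cluster.sync("A");
        cluster.sync("B");
        cluster.sync("C"); // C is later decommissioned and never syncs again

        cluster.append("A", "change 1");
        cluster.append("A", "change 2");
        cluster.append("A", "change 3");
        cluster.sync("B");

        // C's stale entry still references revision 0, so nothing can be removed.
        System.out.println("removed with stale entry: " + cluster.cleanUp());

        // Manually removing C's entry lets the clean-up make progress.
        cluster.localRevisions.remove("C");
        System.out.println("removed after manual fix: " + cluster.cleanUp());
    }
}
```

In this model, a leftover entry for a node that no longer exists pins the journal at that node's last revision, which is why such entries need to be removed by hand.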