Magnolia CMS can be configured to run in a clustered environment to provide high availability and load balancing:
- High-availability clusters are also known as failover clusters. Their purpose is to ensure that content is served at all times. They rely on redundant instances that take over when a public instance fails. The most common size for a high-availability cluster is two public instances, which is the standard Magnolia CMS setup. In such a setup the redundant instance may even be dormant (not actively serving content) until it is called into service.
- Load-balancing clusters connect many instances together to share the workload. From a technical standpoint there are multiple instances but from the website visitor's perspective they function as a single virtual instance. A load balancer distributes requests from visitors to instances in the cluster. The result is a balanced computational workload among different instances, improving the performance of the site as a whole.
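The two cluster roles described above can be illustrated with a small sketch. This is not Magnolia CMS or load balancer code: `PublicInstance`, `RoundRobinBalancer`, and the `healthy` flag are hypothetical names, and round-robin is only one of several possible distribution strategies. The sketch assumes the balancer already knows which instances are healthy.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a public instance; not a Magnolia CMS class.
final class PublicInstance {
    final String name;
    boolean healthy = true;

    PublicInstance(String name) {
        this.name = name;
    }
}

// Minimal round-robin balancer that skips instances marked unhealthy.
final class RoundRobinBalancer {
    private final List<PublicInstance> instances;
    private int next = 0;

    RoundRobinBalancer(List<PublicInstance> instances) {
        this.instances = new ArrayList<>(instances);
    }

    // Returns the next healthy instance in rotation, or throws if none is available.
    PublicInstance route() {
        for (int i = 0; i < instances.size(); i++) {
            PublicInstance candidate = instances.get(next);
            next = (next + 1) % instances.size();
            if (candidate.healthy) {
                return candidate;
            }
        }
        throw new IllegalStateException("No healthy public instance available");
    }
}
```

With two healthy instances, requests alternate between them (load balancing). If one is marked unhealthy, every request goes to the remaining instance (failover).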
We use Jackrabbit's clustering feature to share content between Magnolia CMS instances. Clustering in Jackrabbit works on the principle that content is shared between all cluster nodes. This means that all Jackrabbit cluster nodes need access to the same persistent storage (persistence manager and data store). The persistence manager must be clusterable, for example a central database that allows concurrent access. Any data store is clusterable by its very nature because it stores content by unique hash IDs. However, each cluster node needs its own (private) file system and search index. For more details, see the Jackrabbit clustering documentation.
As discussed in Workspaces, individual workspaces can be mapped to different repositories. The repository that holds shared content can reside in clustered storage. This is useful for content that needs to be in sync across all instances.
In the diagram below, each Magnolia CMS instance uses its own persistent storage (gray databases) for storing the content of data workspaces. However, the comments workspace has shared content that is stored in clustered storage (blue database).
User-generated content, such as comments written by site visitors, is a typical clustering case. Imagine that users John and Jane request the same web page. A load balancer directs John to public instance A. When John leaves a comment on the page, his comment is stored in a workspace that resides in clustered storage. Now Jane requests the same page. The load balancer directs her to public instance B. John's comment is immediately visible to Jane since both instances tap into the same clustered storage.
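The John and Jane scenario can be modeled in a few lines. This is a simplified sketch, not the Jackrabbit or Magnolia CMS API: `Storage` and `InstanceModel` are hypothetical classes, and an in-memory map stands in for a real repository. It assumes each instance is told which workspaces are clustered and keeps everything else in its own private storage.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a repository: maps workspace name to stored items.
final class Storage {
    private final Map<String, List<String>> workspaces = new HashMap<>();

    void add(String workspace, String item) {
        workspaces.computeIfAbsent(workspace, k -> new ArrayList<>()).add(item);
    }

    // Returns a copy so callers cannot modify the stored list.
    List<String> read(String workspace) {
        return new ArrayList<>(workspaces.getOrDefault(workspace, Collections.emptyList()));
    }
}

// Hypothetical public instance: clustered workspaces use the shared storage,
// all other workspaces use storage private to this instance.
final class InstanceModel {
    private final Storage privateStorage = new Storage();
    private final Storage clusteredStorage;
    private final Set<String> clusteredWorkspaces;

    InstanceModel(Storage clusteredStorage, Set<String> clusteredWorkspaces) {
        this.clusteredStorage = clusteredStorage;
        this.clusteredWorkspaces = clusteredWorkspaces;
    }

    private Storage storageFor(String workspace) {
        return clusteredWorkspaces.contains(workspace) ? clusteredStorage : privateStorage;
    }

    void write(String workspace, String item) {
        storageFor(workspace).add(workspace, item);
    }

    List<String> read(String workspace) {
        return storageFor(workspace).read(workspace);
    }
}
```

If instances A and B are created with the same `Storage` object and `comments` as the clustered workspace, a comment written through A is readable through B, while content written to any other workspace on A stays invisible to B.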
Other examples of shared content are user accounts, permissions of public users and forum posts. They need to be available regardless of the instance that serves the page.
When updating a clustered installation, cluster nodes should be updated individually. Update the first cluster node and wait for it to finish before updating subsequent nodes.
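The one-node-at-a-time order can be sketched as a simple loop. This is an illustration only: `RollingUpdate` and the `updateAndWait` callback are hypothetical, and the callback is assumed to block until the node has finished updating and to report success or failure. Stopping at the first failure is an assumption of this sketch, not a documented Magnolia CMS behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class RollingUpdate {
    // Updates nodes strictly one after another. The callback must return only
    // after the node has finished updating. Stops at the first failure
    // (an assumption of this sketch) so that later nodes remain untouched.
    // Returns the names of the nodes that were updated successfully.
    static List<String> run(List<String> nodes, Predicate<String> updateAndWait) {
        List<String> updated = new ArrayList<>();
        for (String node : nodes) {
            if (!updateAndWait.test(node)) {
                break;
            }
            updated.add(node);
        }
        return updated;
    }
}
```

A caller would pass the cluster nodes in the desired order together with a callback that performs the actual update, then compare the returned list with the input to see whether every node was updated.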