The Synchronization module synchronizes a target Magnolia instance with a source instance. The module allows you to publish a large amount of content selectively. Only previously published content is transferred to the target instance. You can use the module to create new public instances without shutting down existing instances and impacting their ability to serve content.
The module traverses the node tree and publishes only previously published content. Content that was never published is not published during synchronization either. If content was versioned when it was published (Magnolia default behavior), the module publishes the last known version, making it possible to recover modified content.
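The selection rule described above can be sketched as a simple tree walk. This is a minimal illustration under stated assumptions, not the module's implementation: the `Node` class, its fields, and `select_for_sync` are hypothetical names, and the paths and version labels are made-up sample data.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a content node in the source instance."""
    path: str
    # Label of the last published version; None means "never published".
    last_published_version: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def select_for_sync(node: Node) -> Iterator[Tuple[str, str]]:
    """Walk the tree and yield (path, version) for previously published nodes only."""
    if node.last_published_version is not None:
        yield node.path, node.last_published_version
    for child in node.children:
        yield from select_for_sync(child)


# Sample tree: two published pages and one draft that was never published.
tree = Node("/news", "1.2", [
    Node("/news/launch", "1.0"),
    Node("/news/draft"),
])
selected = list(select_for_sync(tree))
```

In this sketch the draft page is skipped, and each published page is paired with its last published version rather than its current, possibly modified, state.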
The Synchronization module installs the Sync Instance app, which allows you to manually synchronize content between the author instance and one public instance.
The Synchronization module is not bundled with Magnolia. You can download it from our Nexus repository.
Synchronization is an Enterprise Edition module. It is typically not installed.
Create a backup of your system before you install a module. Uninstalling a module is not as simple as removing the .jar file. Modules add and change configurations and may change the content. Try new modules in a test environment first. A module consists of a JAR file and may include dependent JAR files. Modules are responsible for updating their own content and configurations across versions. Be sure to keep only one version of each module and its dependencies.
To install a module:
- Stop the application server.
- Copy the module JAR files into the WEB-INF/lib directory. The location of this directory depends on the application server.
- Restart the server.
- Go to the AdminCentral URL.
- Start the Web update process.
- Click Start up Magnolia.
Repeat the steps for each author and public instance.
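Because only one version of each module and its dependencies should be present, it can help to check the library folder for duplicates before restarting the server. The following is a minimal sketch, assuming JAR files follow the common `name-version.jar` naming pattern; the function name, the file names, and the version numbers are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Assumption: JAR files are named "<module-name>-<version>.jar",
# where the version starts with a digit.
JAR_PATTERN = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def find_duplicate_modules(jar_names: Iterable[str]) -> Dict[str, List[str]]:
    """Return modules that appear with more than one version."""
    versions = defaultdict(list)
    for jar in jar_names:
        match = JAR_PATTERN.match(jar)
        if match:
            versions[match.group("name")].append(match.group("version"))
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}


# Hypothetical listing of a WEB-INF/lib directory.
lib_listing = [
    "magnolia-module-synchronization-2.3.jar",
    "magnolia-module-synchronization-2.4.jar",
    "magnolia-module-scheduler-2.2.jar",
]
duplicates = find_duplicate_modules(lib_listing)
```

In practice the listing could come from `os.listdir()` on the actual WEB-INF/lib path of your application server.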
To uninstall a module, remove the module JAR file from the /WEB-INF/lib folder and restart Magnolia.
However, this is rarely enough. Modules add and modify configuration during installation. The use of a module also changes content. Removing all of these changes is difficult without knowing the installation tasks in detail.
To test a module, use the embedded Derby database and take a backup of your repositories folder. Install the module and try it. When you are done testing, remove the module JAR and restore the repositories folder from the backup. This way you can go back to square one.
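The backup-and-restore cycle for the repositories folder can be sketched as follows. This is an illustration only: the helper names are hypothetical, the demo runs on a temporary directory rather than a real Magnolia installation, and it assumes the application server is stopped while the folder is copied.

```python
import shutil
import tempfile
from pathlib import Path


def backup_repositories(repositories: Path, backup: Path) -> None:
    """Copy the whole repositories folder to a backup location."""
    shutil.copytree(repositories, backup)


def restore_repositories(repositories: Path, backup: Path) -> None:
    """Discard the current repositories folder and restore it from the backup."""
    shutil.rmtree(repositories)
    shutil.copytree(backup, repositories)


# Demo on a throwaway directory standing in for a real installation.
with tempfile.TemporaryDirectory() as tmp:
    repos = Path(tmp) / "repositories"
    repos.mkdir()
    (repos / "state.txt").write_text("before module install")

    backup = Path(tmp) / "repositories-backup"
    backup_repositories(repos, backup)

    # Simulate changes made while testing the module.
    (repos / "state.txt").write_text("changed by module")

    restore_repositories(repos, backup)
    restored_state = (repos / "state.txt").read_text()
```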
We also recommend that you segregate the development and production environments. Take regular backups for disaster recovery so you can revert to a prior state in a routine fashion.
Public instance failure
Imagine that you have two public Magnolia instances. Due to hardware failure one of them is out of operation. As you try to publish content during the outage, Transactional Activation tells you that the content cannot be activated because one of the public instances is down.
Since you really need to publish new content to the remaining public instance, you make a conscious decision to switch off the subscriber that suffered the hardware failure. Now you can publish the content while waiting for the failed hardware to be replaced.
A few days later, the hardware on the failed public instance has been replaced and the server is up again. You re-enable the subscriber so that all new content is activated to both public instances. But you still have a problem: what to do with all the content that was published while the instance was down. Your options are:
- Republish everything to both instances. This is a time-consuming process, generates high load, and slows down your site during publishing.
- If you kept a list of the published pages, you know exactly what to activate to get them on both instances. This works well for small deployments with infrequent updates and a single editor.
- Use the Synchronization module. Set the previously broken public instance as a synchronization subscriber and Magnolia will take care of synchronizing the missing content.
Public instance blackout
All public instances are corrupted, broken or compromised. No instances exist to serve content. A small site can deal with this by creating a new public instance and publishing all content to it. This is difficult in large deployments that have many editors, where content has already been modified since the blackout took effect, and where some pages are not yet ready to be published across the site. Use the Synchronization module to activate any previously published versions of content, even if the content was modified further, and skip any pages that were not previously published.
You have a sudden high load on your site. You need to add a new public instance to deal with the load.
- You cannot shut down any of the existing public instances because you need them to deal with the load. This prevents taking a snapshot for cloning.
- You cannot activate all content from the author instance to public instances since this would unnecessarily flush the cache on them and increase load when the servers are already busy.
The solution is to create a new empty public instance and use the Synchronization module to publish content only to that instance while the existing public instances keep serving content.
The Synchronization module is configured at Configuration >
Synchronization is controlled by the
BaseActivationCommand) registered at Configuration >
syndicator: Registers the syndicator class. SilentXASSyndicator performs synchronization of content without updating metadata.
subscriber: Synchronization configurations (see below).
The Sync Instance tool allows you to manually synchronize content between author and a single public instance. The operation is performed asynchronously.
To perform a manual synchronization:
- Select the workspace.
- Type the path to be activated.
- Optionally select the checkbox to activate all child nodes recursively.
- Click Start. Synchronization starts within one minute of execution.
- Click Refresh to see the current status of the synchronization.
All website pages.
All pages under
All admin level users.
Before synchronizing the data workspace, make sure the data type root path ( / ) is activated.
You can schedule synchronization jobs using the Scheduler module. The purpose of scheduling is not to synchronize an instance repeatedly, because this leads to unnecessary flushing of the cache and increases load. The aim is to schedule the sync to occur at a convenient later time such as during low traffic volume.
/modules/scheduler/config/jobs/demo node and edit its properties.
|cron|0 0 1 5 2 ? 2014|
|description|Synchronize news at midnight|
<job name>: Name of the job, sync-news in our example.
params: Parameters passed to the command. SynchronizationCommand expects/allows the following parameters:
path: Path to the content, for example
recursive: Set to true to synchronize the node and subnodes.
repository: Workspace where the content resides, for example
active: Set to true to enable the job.
catalog: Name of the catalog where the command resides. The synchronize command resides in the
command: Name of the command definition node,
cron: Schedule that indicates the execution time, written as a CRON expression. In our example 0 0 1 5 2 ? 2014 will run the job on February 5, 2014 at 01:00 am.
description: Job description.
Test the synchronize command unscheduled first. If it runs correctly, schedule it to publish to a new public instance within one minute (CRON expression: 0 * * * * ?, which fires at the start of every minute). If this works correctly too, point the subscriber configuration to the out-of-sync target instance and modify the CRON schedule as required.
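To make the two CRON expressions above easier to read, the sketch below splits an expression into named fields. It assumes the Quartz-style field order (seconds, minutes, hours, day of month, month, day of week, optional year), which matches the reading of the example given above; the function and constant names are hypothetical, and the code only labels fields without validating or evaluating the schedule.

```python
from typing import Dict

# Assumed Quartz-style field order; the seventh field (year) is optional.
QUARTZ_FIELDS = [
    "seconds", "minutes", "hours",
    "day_of_month", "month", "day_of_week", "year",
]


def split_cron(expression: str) -> Dict[str, str]:
    """Label each whitespace-separated field of a 6- or 7-field CRON expression."""
    parts = expression.split()
    if not 6 <= len(parts) <= 7:
        raise ValueError(f"expected 6 or 7 fields, got {len(parts)}")
    # zip() stops at the shorter sequence, so "year" is omitted for 6 fields.
    return dict(zip(QUARTZ_FIELDS, parts))


one_off = split_cron("0 0 1 5 2 ? 2014")   # the sync-news example
every_minute = split_cron("0 * * * * ?")   # the test schedule
```

Here `one_off` maps hours to 1, day_of_month to 5, month to 2, and year to 2014, while `every_minute` has wildcards for everything except seconds, so it fires at second 0 of each minute.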