The Backup Extended module uses Tasks to manage backups and provides an app for managing them. Both immediate and scheduled backups are possible. Groups can be configured for scheduled backups, multiple target configurations can be supplied in a light module, and a backup can go directly from any database type to any other database type.

Installation

Maven is the easiest way to install the module. Add the following dependencies to your bundle:

<dependency>
  <groupId>info.magnolia.backup</groupId>
  <artifactId>magnolia-module-backup-extended</artifactId>
  <version>${backupExtendedVersion}</version>
</dependency>
<dependency>
  <groupId>info.magnolia.backup</groupId>
  <artifactId>magnolia-module-backup-extended-ui</artifactId>
  <version>${backupExtendedVersion}</version>
</dependency>

Versions

Module version 0.4: Magnolia 6.2

Usage

The module has two options for creating backups: Backup now and Schedule backup. Both call the same action, because every backup ends up as a scheduled backup. The only difference is that Backup now adds a short, configurable delay before execution begins.

Create Backup

Initiate a backup on demand using the Backup now action. None of the fields are required. The default backup produces a copy of the current instance's magnolia repository in the /${magnolia.home}/backups directory using an H2 database.

The groups to be notified can be selected on the second tab. All selected groups will be able to see the backup data from the Pulse. Only members of the backup group can abort backups from the Pulse.

When you save the dialog, the backup starts after a short delay (10 seconds by default). You can still abort the task from the Pulse before the delay expires.

Schedule backup has the same fields as Backup now, plus a field for a future date and time and an option to set a frequency. When a frequency is set, each scheduled backup automatically schedules the next one as part of the job.
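
The relationship between the two actions can be sketched as follows. This is an illustration only, not the module's code: the function names are hypothetical, the 10-second value is the default delay quoted above, and modelling the frequency as a fixed interval is an assumption.

```python
from datetime import datetime, timedelta

# Default delay quoted on this page; the real value is configurable.
DEFAULT_DELAY = timedelta(seconds=10)


def effective_start(now, backup_date=None, delay=DEFAULT_DELAY):
    """Return the scheduled start: the explicit date, or now plus the delay."""
    if backup_date is not None:
        return backup_date
    return now + delay


def following_start(previous_start, frequency):
    """Assumed model: the next backup runs one fixed interval after the previous one."""
    return previous_start + frequency


now = datetime(2021, 8, 13, 21, 53, 0)
first = effective_start(now)                      # Backup now
print(first)                                      # 2021-08-13 21:53:10
print(following_start(first, timedelta(days=1)))  # 2021-08-14 21:53:10
```

Seen this way, Backup now is just a scheduled backup whose date is computed instead of entered.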

Create public backup

Each dialog offers the option to initiate a public instance backup from the author instance. Backing up a single public instance is typically enough, but it is possible to add more options to the dialogs.

In your light module, create a REST client which calls backupx. Make sure the Backup Extended core module is installed on the target instance.

magnoliaPublic.yaml
baseUrl: http://localhost:8080/magnoliaPublic
securitySchemes:
  basic:
    $type: basic
    username: superuser
    password: superuser
restCalls:
  backupx:
    headers:
      Content-Type: application/json; charset=UTF-8
    method: post
    entityClass: com.fasterxml.jackson.databind.JsonNode
    path: /.rest/commands/v2/backup/backupx
    securityScheme: basic

The name of the REST client must match the value of the radio button in the dialog. Decorate the Backup Extended dialogs to add more options.

Backup Tasks app

Tasks are used to schedule the backups. First a task is created to hold all the parameters of the backup. The end result of the backup (successful or failed) will be recorded to the task. All metadata is visible through the detail subapp including the content (the parameters submitted) and the results of execution.

Task Browser

The browser subapp shows all tasks. Find backup tasks by the date they were created. The name column will help to differentiate between backup tasks and all other tasks. 

 

Backup Details

Every backup, whether success or failure, will have a message with start and finish times in the Results section.

Groups

The module installs a new backup group for Backup Admins. The backup group is added to the superuser by default upon installation. Users in this group can access the Backup Tasks app and see the scheduled backups in the Pulse. They can abort scheduled backups from the Pulse.

Backupx Command

The Backup Extended command triggers backups without the UI. The command can execute on events such as workflow steps or REST calls. The backupx command is located in the backup catalog.

Properties:

directory

Optional. Default: ${magnolia.home}/backup/{epoch_second_count}

The file system directory where the backup can place files. If no directory is supplied the magnolia home directory is used.

targetConfig

Optional. Default: /backup-extended/target-confs/jackrabbit-bundle-h2-search.xml

The repository configuration file for the backup. This can be a file in a light module or a file reachable through an http address. The default is an H2 backup since this typically produces the quickest backups. 

comment

Optional. Default: "Backup triggered through command."

A comment regarding the backup, such as why or how it was created.

repository

Optional. Default: magnolia

The repository to be backed up. Typically this is the magnolia repository but backups could be made more efficient by splitting the repository into content and configuration repos. With separate repos they can be backed up separately at different rates and/or times. See Jackrabbit Repository Splitting.   

delay

Optional. Default: 10 seconds

The delay applies when no backupDate is supplied. It simply adds a short delay before the backup executes. All backups end up as scheduled tasks.

groupIds

Optional. Default: backup group

A list of group IDs that can see and/or administer the backup from the Pulse. In some cases, users in these groups can abort the scheduled backup.

backupDate

Optional. Default: now plus the delay

A concrete date and time to initiate the backup.
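
Since every property is optional, a caller only needs to send the values it wants to override. The sketch below builds a JSON body from the property names listed above and leaves unset properties out, so the command can apply its own defaults. The helper name is hypothetical, the date format is copied from the REST examples on this page, and how delay and groupIds are encoded is not documented here, so those values are passed through unchanged.

```python
import json
from datetime import datetime

# Property names documented above; anything else is rejected.
KNOWN_PROPERTIES = {
    "directory", "targetConfig", "comment", "repository",
    "delay", "groupIds", "backupDate",
}


def build_backupx_payload(**properties):
    """Return a JSON body containing only the properties that were set."""
    unknown = set(properties) - KNOWN_PROPERTIES
    if unknown:
        raise ValueError(f"Unknown backupx properties: {sorted(unknown)}")
    payload = {}
    for name, value in properties.items():
        if value is None:
            continue  # leave unset so the command applies its own default
        if isinstance(value, datetime):
            # Same shape as the dates in the REST examples: 2021-08-13 21:53:00.000
            millis = value.microsecond // 1000
            value = value.strftime("%Y-%m-%d %H:%M:%S.") + f"{millis:03d}"
        payload[name] = value
    return json.dumps(payload)


print(build_backupx_payload(comment="Nightly backup", repository="magnolia"))
```

Rejecting unknown names early catches typos such as targetconfig before the request reaches the instance.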

REST based backup

By default the module will make the backupx command available within the rest-services module. Backups can be created via REST for a variety of reasons.

Example cURL calls:

# Back up the author instance with all defaults
curl http://localhost:8080/magnoliaAuthor/.rest/commands/v2/backup/backupx \
-H "Content-Type: application/json" \
-X POST --user superuser:superuser

# Back up the public instance with an explicit directory, target configuration and date
curl http://localhost:8080/magnoliaPublic/.rest/commands/v2/backup/backupx \
-H "Content-Type: application/json" \
-X POST --user superuser:superuser \
--data \
'{
  "comment": "Backed up via REST",
  "directory": "/opt/backups/todaysDate",
  "targetConfig": "/backup-extended/target-confs/jackrabbit-bundle-derby-disabled-search.xml",
  "backupDate": "2021-08-13 21:53:00.000"
}'

# Back up the author instance using a target configuration served over HTTP
curl http://localhost:8080/magnoliaAuthor/.rest/commands/v2/backup/backupx \
-H "Content-Type: application/json" \
-X POST --user superuser:superuser \
--data \
'{
  "comment": "Backed up via REST",
  "directory": "/opt/backups/todaysDate",
  "targetConfig": "http://localhost:8080/magnoliaAuthorBackup/docroot/jackrabbit-bundle-h2-search.xml",
  "backupDate": "2021-08-13 21:53:00.000"
}'
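
The same calls can be made from a script. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and the URL, credentials and payload are the placeholder values from the cURL examples above. It builds the request without sending it, so it can be inspected first.

```python
import base64
import json
import urllib.request


def backupx_request(base_url, user, password, payload=None):
    """Build (but do not send) a POST request for the backupx command."""
    url = base_url.rstrip("/") + "/.rest/commands/v2/backup/backupx"
    # No body at all when no payload is given, like the first cURL call.
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",  # HTTP basic auth, as with --user
        },
    )


request = backupx_request(
    "http://localhost:8080/magnoliaAuthor",
    "superuser",
    "superuser",
    {"comment": "Backed up via REST"},
)
# To send it against a running instance:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```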

Autoscaling

Multiple target configurations are possible with this module, and any reachable resource can potentially serve as the configuration for a target instance. Supply the configuration files in a light module. Using backups for autoscaling is only practical in the most optimized deployments with small to medium sized repositories.

Change database types

Using this module you can migrate from one database type to another. For example, go from MySQL to PostgreSQL or vice versa. Install the module into the current system and create a target configuration for the desired system.

Warnings

  • This module is at INCUBATOR level.
  • This module is designed for use with small to medium sized repositories.
  • This module uses the RepositoryCopier from Jackrabbit. Taking backups in a running system may never be 100% reliable. This module should be used as another layer of backup protection.
  • Backups created by the module should be routinely tested for reliability.
  • When using a file system datastore, there should be enough file descriptors available to copy the entire datastore. See Too Many Open Files.
  • Backed up repositories will need to be reindexed. Comment out the indexing configuration of the target to avoid generating an index during the backup. An example is provided in the module: /backup-extended/target-confs/jackrabbit-bundle-derby-disabled-search.xml
  • Failed backups should be aborted to avoid a rerun on the next startup.
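
The file descriptor warning above can be checked roughly before a backup is started. The sketch below compares the number of files in a datastore directory with the soft open-file limit of the current process. It is a heuristic under stated assumptions: the function names are hypothetical, the `resource` module is Unix only, the limit reported is that of the Python process rather than the Magnolia JVM, and how many descriptors the copy actually holds open at once is not documented here.

```python
import os
import resource  # Unix only


def count_files(directory):
    """Count regular files below a directory, e.g. a file system datastore."""
    return sum(len(files) for _, _, files in os.walk(directory))


def descriptor_headroom(datastore_dir):
    """Return (soft open-file limit, files in the datastore) for a rough comparison."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft_limit, count_files(datastore_dir)
```

For example, `descriptor_headroom("/opt/magnolia/repositories/magnolia/repository/datastore")` uses the datastore path that appears in the Troubleshooting example. A file count far above the limit suggests raising the limit before relying on the backup.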

Best practices

Ideally the backup needs to complete as soon as possible. The process should be refined over time to achieve the quickest possible backup times. Statistics are kept on the task itself to track performance and success. Email notifications about backups can be integrated using Task Email Notifications.

  • Binaries: This type of data would best be handled by a third party asset provider. Storing binary files inside JCR is not efficient for fast backups.
  • Indexing: In the target configuration further reduce noise and overhead by commenting out the indexing configuration for the backup process.
  • Databases: Backing up directly to the target database type further reduces friction as a restore procedure is no longer required. 
  • Externalize: Try to reduce overall dependence on JCR when it makes sense. For example, delegate user management to a third party provider.
  • Optimize: Remove any unnecessary modules which might bloat the repository with unused workspaces.

Troubleshooting

  • Increase logging levels to see finer detail:

    <Logger name="info.magnolia.module.backup" level="ALL"/>
    <Logger name="info.magnolia.task.schedule" level="ALL"/>
  • Common errors (message, example log output, solution)

    Failed to copy content
    Caused by: org.apache.jackrabbit.core.data.DataStoreException: 
    Record 1973a5c3c21db3b70e903236d33a526709a49646ea6ad1d34eed094f2565819f does not exist

    Solution: See MGNLDAM-970

    Caused by: javax.jcr.RepositoryException: Could not read from stream
    Caused by: java.io.FileNotFoundException:
    /opt/magnolia/repositories/magnolia/repository/datastore/ae/1f/4e/
    ae1f4e5ecf56bf36299f92796f7c3e463ebc87c697f48d316e00945ebd2c40f9 
    (Too many open files)

    Solution: See Too Many Open Files.

    Configuration file could not be read
    Caused by: java.io.FileNotFoundException: 
    /jackrabbit-bundle-derby-disabled-search.xml (No such file or directory)

    Solution: Most likely the configuration file couldn't be copied due to a permissions issue.

    Configuration file syntax error
    Configuration file syntax error. (Line: 155 Column: 3)
    Caused by: org.xml.sax.SAXParseException: 
    The element type "Cluster" must be terminated by the matching end-tag "</Cluster>".

    Solution: The configuration file is not well formed. The error log should point directly to the line and column of the parsing issue.

    Could not load JDBC driver
    Caused by: com.mysql.jdbc.Driver

    Solution: You must install the driver for the target instance. When using multiple database types don't forget any extra drivers required by the config files. 
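
A target configuration can be checked for well-formedness before it is used, which catches the syntax error shown above without running a backup. This is a minimal sketch using the Python standard library. The function name is hypothetical, and it only checks that the XML is well formed, not that it is a valid Jackrabbit configuration.

```python
import xml.etree.ElementTree as ET


def find_syntax_error(xml_text):
    """Return None if the XML is well formed, else (line, column, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        line, column = error.position
        return line, column, str(error)
    return None


# An element that is never closed, similar to the "Cluster" example above.
broken = "<Repository>\n<Cluster>\n</Repository>"
print(find_syntax_error(broken))
```

To check a file, read it first, for example `find_syntax_error(open(path, encoding="utf-8").read())`, where `path` points to the target configuration.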

Changelog

  • Version 0.5 - beta
    • Dialog fields are optional with descriptions about default values
    • Initiate public instance backups from app
  • Version 0.4 - beta
    • Decouple the ui from the core functionality 
  • Version 0.3 - beta 
    • Frequency control for scheduled backup
    • Multiple repository support
    • Dynamic group configuration
  • Version 0.2 - beta
  • Version 0.1 - Initial beta release of the extensions version of the module.