Importing and exporting content and data is useful when migrating content from one system to another, taking a backup or bringing in a custom taxonomy. The default file format for content exports is JCR System View XML. The tools available range from a quick right-click export to scripting and writing custom import handlers, depending on the frequency of use.

Export file formats

JCR System View XML

Magnolia exports JCR data to JCR System View XML format by default. Imported XML files must also adhere to the same format. The exported XML reflects the hierarchy of the data in the repository. File names match the path of the data in the repository, such as website.travel.tour.xml. When you export a node its children are also exported.
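As an illustration of the naming convention, the following minimal Java sketch derives an export file name from a workspace name and a node path. The class and method names are hypothetical and not part of Magnolia, and the handling of the root path is an assumption.

```java
public class ExportFileName {

    /**
     * Builds a file name such as website.travel.tour.xml from a workspace
     * name and a repository path. Hypothetical helper, not a Magnolia API.
     */
    static String exportFileName(String workspace, String path) {
        // Drop leading and trailing slashes, then turn path separators into dots.
        String trimmed = path.replaceAll("^/+|/+$", "");
        // Assumption: for the root path only the workspace name is used.
        String suffix = trimmed.isEmpty() ? "" : "." + trimmed.replace('/', '.');
        return workspace + suffix + ".xml";
    }

    public static void main(String[] args) {
        // prints website.travel.tour.xml
        System.out.println(exportFileName("website", "/travel/tour"));
    }
}
```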

YAML

You can export JCR data from the config workspace to YAML. This is useful for moving from repository-based configuration to file-based configuration.

YAML export is available for all definition items that can be configured in YAML. The Export to YAML action is disabled for other items.

When using YAML files originating from export, you might have to change the file name if you want to use it within a module; for instance the name of the YAML file defining an app must reflect the app name with the pattern <app-name>.yaml.

Import and export actions

You can import and export nodes from most workspaces with the Import and Export actions. Export and Import actions are available in the action bar and in the context menu (right-click).

When you import a file the imported nodes become children of the selected parent node. 

The Import and Export actions use the import and export commands to export and import XML. The commands are configured in the ui-admincentral module and implemented in ImportCommand and ExportCommand classes.

To export XML:

  1. Select a node to export the node and its children.
  2. Click Export.

To import XML:

  1. Select a parent node under which you want to import the nodes.
  2. Click Import.

By default new UUIDs are generated for nodes that have the same ID as an existing node in the repository. You can change this behavior by using the Import tool.

Import and export tools

The import and export tools in the JCR Tools app allow you to operate on data in all Magnolia workspaces, including those where the Export and Import actions are not available in the workspace-specific app, for example in the Security app.  

To export:

  1. Select the Workspace where the content resides.
  2. In Base path, type the path to the node to export.
  3. Select Format XML if you want the exported XML to be indented for readability.
  4. Select the type of Compression: XML (no compression), ZIP or GZIP.
  5. Click Execute.
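
If you choose GZIP compression and later want to inspect the export outside Magnolia, you can decompress it with the JDK. The following is a minimal sketch, assuming the export is UTF-8 encoded XML; the class and method names are hypothetical, and the main method compresses a small string itself so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipExport {

    /** Compresses XML text with GZIP. */
    static byte[] gzip(String xml) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer, so read the buffer afterwards.
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Decompresses GZIP data back to XML text. */
    static String gunzip(byte[] compressed) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><sv:node/>";
        // Round trip: the decompressed text equals the original.
        System.out.println(gunzip(gzip(xml)).equals(xml));
    }
}
```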

To import:

  1. Select the Workspace into which the content should be imported.
  2. In Base path, type the path into which content should be imported.
  3. Browse to the file to import.
  4. Select how to handle conflicting UUIDs. These options only apply when an identical UUID already exists in the repository.
    • Generate a new id for imported nodes will result in a new UUID being generated for nodes being imported.
    • Only import if no existing node (Magnolia 5.4.6+) will result in the import being performed only if the node does not already exist.
    • Remove existing nodes with the same id will result in nodes with the same UUID as those imported being deleted before the import.
    • Replace existing nodes with the same id will result in nodes with the same UUID as those imported being replaced with the imported nodes.
  5. Click Execute.
     

Example: Import behavior

You can see node UUIDs in the JCR Browser by selecting the Display system properties option.

To observe the import behavior options replicate this example:

  1. In the Pages app create a new page, test-page, at the root level, and export the page to XML (website.test-page.xml). Note the jcr:uuid property in the XML. 

     website.test-page.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <sv:node sv:name="test-page" xmlns:sv="http://www.jcp.org/jcr/sv/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <sv:property sv:name="jcr:primaryType" sv:type="Name">
        <sv:value>mgnl:page</sv:value>
      </sv:property>
      <sv:property sv:name="jcr:uuid" sv:type="String">
        <sv:value>cdca6945-2e8b-438a-b299-be3db5e718dc</sv:value>
      </sv:property>
      <sv:property sv:name="hideInNav" sv:type="Boolean">
        <sv:value>false</sv:value>
      </sv:property>
      <sv:property sv:name="jcr:createdBy" sv:type="String">
        <sv:value>admin</sv:value>
      </sv:property>
      <sv:property sv:name="mgnl:created" sv:type="Date">
        <sv:value>2017-01-18T07:38:38.654+02:00</sv:value>
      </sv:property>
      <sv:property sv:name="mgnl:createdBy" sv:type="String">
        <sv:value>superuser</sv:value>
      </sv:property>
      <sv:property sv:name="mgnl:lastModified" sv:type="Date">
        <sv:value>2017-01-18T07:38:38.654+02:00</sv:value>
      </sv:property>
      <sv:property sv:name="mgnl:lastModifiedBy" sv:type="String">
        <sv:value>superuser</sv:value>
      </sv:property>
      <sv:property sv:name="mgnl:template" sv:type="String">
        <sv:value>mtk:pages/basic</sv:value>
      </sv:property>
      <sv:property sv:name="title" sv:type="String">
        <sv:value>Test Page</sv:value>
      </sv:property>
    </sv:node> 
  2. Test these options in the Import tool.
    1. Generate a new id for imported nodes

      This is the default behavior used by the Import action in apps. 

      1. In the XML, change the sv:name attribute of the root <sv:node> element to test-page-newID to avoid same-name siblings.
      2. In the import tool:
        1. Set:
          1. Workspace: website.
          2. Path: /.
          3. Behavior: Generate a new id for imported nodes.
        2. Upload the file and execute the import.
      3. In the JCR Browser browse the website workspace: 
        1. Original test-page node still exists.
        2. There is a new page node, test-page-newID, with a new jcr:uuid property. 
    2. Only import if no existing node:
      1. In the XML, change the sv:name attribute of the root <sv:node> element back to test-page. The file is now in the same state as when originally exported. Both the node name and the UUID exist in the repository.
      2. In the import tool:
        1. Set:
          1. Workspace: website
          2. Path: /
          3. Behavior: Only import if no existing node.
        2. Upload the file and execute the XML import.
        3. The import fails because the node already exists. 
           
    3. Remove existing nodes with the same id / Replace existing nodes with the same id:
       Note: These options produce the same result, but what happens under the hood differs. The first option deletes the existing node and then imports the new one, while the second option overwrites the existing node.
      1. In the XML from the previous example, change any property. For example, change the title from Test Page to Test Page Remove and Replace. Both the node name and the UUID exist in the repository.
      2. In the import tool:
        1. Set:
          1. Workspace: website.
          2. Path: /.
          3. Behavior: Remove existing nodes with the same id or Replace existing nodes with the same id.
        2. Upload the file and execute the XML import.
      3. In the JCR Browser browse the website workspace:
        1. Original test-page node has been removed/replaced.
        2. New test-page node has been created with the same UUID as the original page.
        3. title property is Test Page Remove and Replace.
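
Before choosing an import behavior, it can help to check which node name and UUID an export file carries. The following is a minimal sketch that reads both from a System View XML document using only the JDK's XML parser. The class and method names are hypothetical, and the embedded sample is a shortened version of the export shown above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SystemViewInspector {

    static final String SV_NS = "http://www.jcp.org/jcr/sv/1.0";

    // Shortened version of website.test-page.xml.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<sv:node sv:name=\"test-page\" xmlns:sv=\"http://www.jcp.org/jcr/sv/1.0\">"
        + "<sv:property sv:name=\"jcr:primaryType\" sv:type=\"Name\">"
        + "<sv:value>mgnl:page</sv:value></sv:property>"
        + "<sv:property sv:name=\"jcr:uuid\" sv:type=\"String\">"
        + "<sv:value>cdca6945-2e8b-438a-b299-be3db5e718dc</sv:value></sv:property>"
        + "</sv:node>";

    /** Parses the XML with namespace support and returns the root sv:node element. */
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a parsable XML document", e);
        }
    }

    /** Returns the value of the sv:name attribute of the given sv:node. */
    static String nodeName(Element svNode) {
        return svNode.getAttributeNS(SV_NS, "name");
    }

    /** Returns the first value of a property that is a direct child of the node, or null. */
    static String propertyValue(Element svNode, String propertyName) {
        for (Node child = svNode.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (!(child instanceof Element)) {
                continue;
            }
            Element element = (Element) child;
            if (SV_NS.equals(element.getNamespaceURI())
                    && "property".equals(element.getLocalName())
                    && propertyName.equals(element.getAttributeNS(SV_NS, "name"))) {
                NodeList values = element.getElementsByTagNameNS(SV_NS, "value");
                return values.getLength() > 0 ? values.item(0).getTextContent() : null;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Element root = parseRoot(SAMPLE);
        System.out.println(nodeName(root) + " " + propertyValue(root, "jcr:uuid"));
    }
}
```

Only direct child properties are inspected, so UUIDs of nested child nodes are not reported by this sketch.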

Importing from the file system

You can use the Content Importer module to import content from the file system into the JCR. The module adds bootstrapping capabilities for light modules. See Content Importer module for instructions on how to use the module.

Importing ZIP files

You can upload a ZIP file in the Assets app.

To import a ZIP file:

  1. Click Upload ZIP archive.
  2. Browse to the file.
  3. In Encoding, select UTF-8 (Windows) or CP437 (Mac) depending on what system the ZIP file was created on.
  4. Click Save.

Content Translation app

In the Content Translation app you can import and export page content in Excel, CSV, ZIP and Google Spreadsheet formats. See Content Translation app for instructions on how to use the app.

Importing and exporting with Groovy scripts

You can export content from Magnolia using a Groovy script. This example exports the about page and its children. The script is equivalent to selecting the about page and using the Export action.

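import info.magnolia.importexport.DataTransporter

// Resolve the website workspace and the node to export.
hm = ctx.getHierarchyManager('website')
aboutRoot = hm.getContent('/travel/about')
xmlFileOutput = new FileOutputStream('C:/test/export/about.xml')
try {
  // Arguments: output stream, keepVersionHistory, format, session,
  // base path, repository, file extension.
  DataTransporter.executeExport(xmlFileOutput, false, false,
    hm.getWorkspace().getSession(), aboutRoot.getHandle(), 'website',
    DataTransporter.XML)
} finally {
  // Close the stream even if the export fails.
  xmlFileOutput.close()
}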

Similarly, you can import content with a Groovy script. This example imports the XML for the careers page and its children. The script is equivalent to selecting the parent page about and using the Import command.

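import info.magnolia.importexport.DataTransporter
import javax.jcr.ImportUUIDBehavior

// Resolve the parent node under which the content is imported.
hm = ctx.getHierarchyManager('website')
aboutRoot = hm.getContent('/travel/about')
xmlFile = new File('C:/test/export/about.xml')
// Arguments: file, repository, base path, keepVersionHistory, UUID behavior,
// saveAfterImport, createBasepathIfNotExist.
DataTransporter.importFile(xmlFile, 'website', aboutRoot.getHandle(), false,
  ImportUUIDBehavior.IMPORT_UUID_CREATE_NEW, true, true)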

For more information on Groovy, see Groovy module. The Javadoc of the DataTransporter utility class explains the parameters for the executeExport and importFile methods.

Exporting and importing programmatically

The import and export functionality is implemented in the info.magnolia.importexport package. This implementation is mostly contained in the DataTransporter and PropertiesImportExport classes. You can invoke methods in these classes from your own class.

Here is an example of calling the executeExport method:

// folder, xmlName, repository and node are provided by the calling code.
File xmlFile = new File(folder.getAbsoluteFile(), xmlName);
FileOutputStream fos = new FileOutputStream(xmlFile);
try {
  DataTransporter.executeExport(fos, false, false,
      MgnlContext.getHierarchyManager(repository).getWorkspace().getSession(),
      node.getHandle(), repository, DataTransporter.XML);
} finally {
  IOUtils.closeQuietly(fos);
}

These classes will not complete the import for any UUIDs that are identical to existing UUIDs.

Use cases

Here are cases when importing and exporting is useful.

Site migration

You can accomplish site migration in a number of ways.

  • For smaller sites (fewer than 300 pages), simply copy the page content and paste it into the editor.
  • For larger sites, scripting is better than copying and pasting. You can use the script examples above to export from one site and import into another. The script can also add Magnolia-specific metadata such as whether a page should be visible in navigation.
  • If you are importing non-Magnolia content, store the content in XML files that adhere to the JCR System View XML Mapping format. If the XML file does not adhere to this format, convert it first. You can do this with a conversion script. The conversion script should identify content types in the file and transform them into the format that Magnolia can import.
  • For importing data, you can create a content app to manage structured data that is independent from page content, such as addresses, employees and client references.
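
The conversion step for non-Magnolia content can be sketched as follows. This is a minimal example, not a complete converter: it turns one flat record (a node name plus String properties) into a JCR System View XML document. The class and method names are hypothetical, all properties other than jcr:primaryType are assumed to be of type String, and the node type, template and property names are taken from the exported test page shown earlier on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SystemViewConverter {

    /** Escapes the characters that are special in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Renders a single-valued sv:property element. */
    static String property(String name, String type, String value) {
        return "  <sv:property sv:name=\"" + escape(name) + "\" sv:type=\"" + type + "\">\n"
             + "    <sv:value>" + escape(value) + "</sv:value>\n"
             + "  </sv:property>\n";
    }

    /** Renders one node with its primary type and String properties. */
    static String toSystemView(String nodeName, String primaryType,
                               Map<String, String> stringProperties) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<sv:node sv:name=\"").append(escape(nodeName))
           .append("\" xmlns:sv=\"http://www.jcp.org/jcr/sv/1.0\">\n");
        xml.append(property("jcr:primaryType", "Name", primaryType));
        for (Map.Entry<String, String> entry : stringProperties.entrySet()) {
            xml.append(property(entry.getKey(), "String", entry.getValue()));
        }
        xml.append("</sv:node>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        // A record from a hypothetical source system.
        Map<String, String> properties = new LinkedHashMap<>();
        properties.put("mgnl:template", "mtk:pages/basic");
        properties.put("title", "Q&A");
        System.out.println(toSystemView("faq", "mgnl:page", properties));
    }
}
```

A real conversion script would also need to handle child nodes, multi-valued properties and non-String types such as Date and Boolean.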

Backup

You can back up content by exporting it to XML and storing the files in a disaster recovery system. The file name is the path of the exported data, making identification easier.

The Backup module is an alternative to file system and database backup solutions. With the module you can take manual and scheduled backups.

Importing a taxonomy

How to import tags depends on the size and format of the taxonomy. It also depends on whether you need to do it once or repeatedly. If the taxonomy does not need to be added repeatedly and its size is reasonable, create the tags manually in the Categories app. If the taxonomy is large, import the tags as mgnl:category content type into the category workspace and use the Categories app to manage them, or create your own content app and content types.

Taxonomy size | Import frequency | Recommendation
Small | Once | Create the taxonomy by hand. Use the existing mgnl:category content type in the Categories app. If that content type does not work for you, register a new content type in your module descriptor. While you are at it, register a new workspace too.
Large | Once | Write a Groovy script.
Large | Repeatedly | Write a Groovy script and create a command that executes it so that editors can run the process at will or you can schedule it.

Copying production data to a test environment

Copying production data to a test or development environment is a task you may need to do regularly. You should test new templates and features with realistic content before releasing them to production. Here are strategies for prod-to-test exporting.

Option 1: Clone the production instance

Transfer the data and the JCR Datastore (all binaries in the file system) to the test instance.

In production:

  1. Dump the SQL database.
  2. Copy the JCR Datastore folder.
  3. If needed, copy the repository folder to preserve the Apache Lucene search index. This is generally not required because the index and the repository folder are recreated on startup, but it saves time.

In test:

  1. Load the database dump file.
  2. Copy the JCR Datastore folder to the configured place.
  3. Replace the repository folder.
  4. Start the instance.

Pros:
  • 1:1 copy of all data. Everything is identical to production.
  • Very fast. A SQL dump is much faster than JCR export.
  • No running instance is needed as the data is loaded directly into the SQL database.

Cons:
  • It is not possible to get only a part of the data.
  • Data could be too big for test. All data is usually needed in a staging environment but not in test.
  • Configuration is also an identical copy so it needs to be changed after startup. For example, subscribers still point to the production instance. This is usually solved with an additional Magnolia module that is specific to the test environment. Add such a module JAR to the WEB-INF/lib directory or even the Tomcat lib folder. The module JAR changes all configurations needed for the test environment.

Option 2: Use the backup and restore JSP scripts

Use the Backup and restore JSP scripts. This option is a quick win as you can use it without any development and without having to restart the system. Export only the data you need in test. Define the exported content in the JSP, for example:

backup.jsp
public void run() {
    MgnlContext.setInstance(MgnlContext.getSystemContext());
    try {
        backupChildren(ContentRepository.WEBSITE, "/travel");
        backupChildren("dms", "/");
        backupChildren("data", "/products");
        backupChildren(ContentRepository.USERS, "/admin");
        backupChildren(ContentRepository.USERS, "/system");
        backupChildren(ContentRepository.USER_GROUPS, "/");
        backupChildren(ContentRepository.USER_ROLES, "/");
        // backupChildren(ContentRepository.CONFIG, "/modules");
        // backupChildren(ContentRepository.CONFIG, "/server");
    } catch (Exception e) {
        logMsg("can't backup", e);
    } finally {
        // nothing to do here
    }
    logMsg("backup completed");
}

In production:

  1. Place the backup JSP script in, for example, docroot.
  2. Execute the script by requesting its URL. The script creates JCR XML exports into the webapp/backup folder.
  3. Use for example a shell script to copy the exports from the production server to the test server.

In test:

  1. Place the restore JSP script in docroot.
  2. Execute the script by requesting its URL. The script imports all XML files automatically.

Pros:
  • Can easily be configured to export only parts of the content or only certain workspaces.
  • Is a very stable option as the script can call the garbage collector explicitly and import smaller export files.
  • Easy to use since the JSP can be executed by its URL.
  • Easy to extend for more specific needs.
  • Creates order files for importing the content in the right order again.

Cons:
  • Needs a running system.
  • Much slower than a database dump.
  • It is a JSP script.

Option 3: Use Magnolia's export command

Magnolia's own export command is more flexible but still very comparable to the Jackrabbit JCR import/export tool. You can also use the tool in combination with the Scheduler module or trigger it from any other Java process.

In production:

  1. Trigger the export command regularly with the scheduler and export into the file system. Extend the command so that it transfers the exported data to the test machine.

In test:

  1. Trigger the ImportCommand, which imports the transferred XML files.

Pros:
  • The command is available out of the box and well tested since we use it extensively in many places.
  • Can be used in various combinations such as with scheduler or with workflow.
  • More flexible than the backup JSP as it is fully integrated and aware of the Magnolia system.

Cons:
  • Slower than a database dump.
  • Needs development.