Magnolia 5.3 reached end of life on June 30, 2017. This branch is no longer supported; see End-of-life policy.

Importing and exporting content and data is useful when migrating content from one system to another, taking a backup or bringing in a custom taxonomy. The default file format for content exports is JCR System View XML. The available tools range from a quick right-click export to scripting and custom import handlers. Which one fits depends on how often you need to run the import or export.

File format

The XML that Magnolia understands adheres to JCR System View XML Mapping. Magnolia exports into this format by default. Imported XML files must also adhere to it. When used to export data, the XML structure reflects the hierarchy of the data in the repository. The name of the export file reflects the path of the data in the repository such as website.demo-project.about.xml. When used to export a node, the node's child nodes are also included.
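
The naming rule above can be sketched in a few lines of Java. This is an illustration only, not Magnolia's own implementation: the class and method names are hypothetical, and the handling of the root path "/" is an assumption.

```java
final class ExportFileNames {

    private ExportFileNames() {
    }

    /**
     * Builds a file name such as website.demo-project.about.xml from a
     * workspace name and an absolute node path.
     */
    static String fileNameFor(String workspace, String nodePath) {
        // Drop leading and trailing slashes so the root path "/" yields no path part.
        String trimmed = nodePath.replaceAll("^/+|/+$", "");
        if (trimmed.isEmpty()) {
            // Assumption: a root export is named after the workspace only.
            return workspace + ".xml";
        }
        // Each path segment becomes one dot-separated part of the file name.
        return workspace + "." + trimmed.replace('/', '.') + ".xml";
    }

    public static void main(String[] args) {
        // Prints website.demo-project.about.xml
        System.out.println(fileNameFor("website", "/demo-project/about"));
    }
}
```

Because the path is encoded in the name, a folder of export files shows at a glance which part of which workspace each file contains.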

Tools

Import and export commands

You can import and export nodes from most workspaces with the Import and Export commands. They are available in the action bar and in the context menu (right-click).

The commands export and import XML. When you import a file, the imported nodes become children of the selected parent node. The commands are configured in the ui-admincentral module and implemented in the ImportCommand and ExportCommand classes.

To export XML:

  1. Select a node to export the node and its children.
  2. Click Export.

To import XML:

  1. Select a parent node under which you want to import the nodes.
  2. Click Import and browse to the XML file.

If a node in the incoming file has an ID (UUID) that is already used in the repository, the imported UUID will be changed.

Zip files

You can upload a ZIP file in the Assets app. See the Export tool for how to export content to a ZIP file.

To import a ZIP file:

  1. Click Upload ZIP archive.
  2. Browse to the file.
  3. In Encoding, select UTF-8 (Windows) or CP437 (Mac) depending on what system the ZIP file was created on.
  4. Click Save.
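
The encoding matters because file names inside a ZIP archive are stored as bytes, and the reader has to know which character set to decode them with. The following Java sketch shows the idea using only the JDK. It is not how Magnolia reads the upload; the class and method names are hypothetical, and the sample archive is built in memory with UTF-8 names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ZipEntryNames {

    private ZipEntryNames() {
    }

    /** Lists the entry names of a ZIP archive, decoding them with the given charset. */
    static List<String> list(byte[] zipBytes, Charset charset) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes), charset)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    /** Builds a one-entry archive in memory whose entry name is encoded as UTF-8. */
    static byte[] sampleZip(String entryName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes, StandardCharsets.UTF_8)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write("demo".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // \u00e9 is a non-ASCII character that only survives with the right charset.
        byte[] zip = sampleZip("images/caf\u00e9.txt");
        System.out.println(list(zip, StandardCharsets.UTF_8));
    }
}
```

If an archive was written with a different character set, pass that charset to list instead. Whether a charset such as CP437 is available depends on the Java runtime in use.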

Import and export tool

Use the Import and Export tools to import and export XML in any workspace, including workspaces that are not available through the context menu.

To export:

  1. Go to Tools > Export.
  2. Select the Repository (workspace) where the content resides.
  3. In Base path, type the path to the node to export.
  4. Select Format XML if the XML should be formatted. This is relevant only if Keep versions is selected. Otherwise Format XML has no effect because the XML is automatically formatted into the JCR System View XML Mapping format.
  5. Select the type of Compression: XML (no compression), ZIP or GZIP.
  6. Click Export.

To import:

  1. Go to Tools > Import.
  2. Select the Repository into which the content should be imported.
  3. In Base path, type the path into which content should be imported.
  4. Browse to the file to import.
  5. Select how to handle conflicting UUIDs. These options only apply when an identical UUID already exists in the repository.
    • Generate a new id for imported nodes will result in a new UUID being generated for nodes being imported.
    • Remove existing nodes with the same id will result in nodes with the same UUID as those imported being deleted before the import.
    • Replace existing nodes with the same id will result in nodes with the same UUID as those imported being replaced with the imported nodes.
  6. Click Import.
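
The difference between the three options is easiest to see on a toy model. The Java sketch below is not Magnolia or JCR code: it models a workspace as a map from UUID to a hypothetical Node (path plus title) and applies each option to one conflicting import. The placement rules (a removed node is re-added under the import path, a replaced node keeps the position of the existing node) follow the behaviour that the JCR ImportUUIDBehavior constants describe, which this page's options are assumed to map to.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

final class UuidConflictDemo {

    enum Behavior { CREATE_NEW, REMOVE_EXISTING, REPLACE_EXISTING }

    /** Minimal stand-in for a repository node: where it lives and what it holds. */
    static final class Node {
        final String path;
        final String title;

        Node(String path, String title) {
            this.path = path;
            this.title = title;
        }

        @Override
        public String toString() {
            return path + " (" + title + ")";
        }
    }

    /** Returns a copy of the workspace after importing one node under parentPath. */
    static Map<String, Node> importNode(Map<String, Node> workspace, String uuid, String name,
                                        String title, String parentPath, Behavior behavior) {
        Map<String, Node> result = new LinkedHashMap<>(workspace);
        Node incoming = new Node(parentPath + "/" + name, title);
        Node existing = result.get(uuid);
        if (existing == null) {
            // No conflict: the options do not apply and the node keeps its UUID.
            result.put(uuid, incoming);
            return result;
        }
        switch (behavior) {
            case CREATE_NEW:
                // The existing node stays; the imported node gets a fresh UUID.
                result.put(UUID.randomUUID().toString(), incoming);
                break;
            case REMOVE_EXISTING:
                // The existing node is deleted, then the imported node is added
                // under the import path with the original UUID.
                result.remove(uuid);
                result.put(uuid, incoming);
                break;
            case REPLACE_EXISTING:
                // The imported content takes over the existing node's position.
                result.put(uuid, new Node(existing.path, title));
                break;
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Node> workspace = new LinkedHashMap<>();
        workspace.put("uuid-1", new Node("/demo-project/about", "About (old)"));
        for (Behavior behavior : Behavior.values()) {
            System.out.println(behavior + " -> "
                    + importNode(workspace, "uuid-1", "about", "About (new)", "/imported", behavior));
        }
    }
}
```

Generating new IDs is the safest choice when the same content may be imported more than once, but references that point to the old UUID will not follow the new node.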

Content Translation app

In the Content Translation app you can import and export page content in Excel, CSV, ZIP and Google Spreadsheet formats. When you open the exported Google Spreadsheet in Google Drive, it is machine-translated automatically.

Scripting

You can export content from Magnolia using a Groovy script. This example exports the news-overview page and its children. The script is equivalent to selecting news-overview page and using the Export command.

import info.magnolia.importexport.DataTransporter
// Look up the page to export in the website workspace.
hm = ctx.getHierarchyManager('website')
newsRoot = hm.getContent('/demo-project/news-and-events/news-overview')
xmlFileOutput = new FileOutputStream('C:/test/export/news.xml')
try {
  // The two booleans are "keep version history" and "format XML".
  DataTransporter.executeExport(xmlFileOutput, false, false,
    hm.getWorkspace().getSession(), newsRoot.getHandle(), 'website',
    DataTransporter.XML)
} finally {
  // Close the stream even if the export fails.
  xmlFileOutput.close()
}

Similarly, you can import content with a Groovy script. This example imports the XML for the news-overview page and its children. The script is equivalent to selecting the parent page news-and-events and using the Import command.

import info.magnolia.importexport.DataTransporter
import javax.jcr.ImportUUIDBehavior
hm = ctx.getHierarchyManager('website')
newsRoot = hm.getContent('/demo-project/news-and-events')
xmlFile = new File('C:/test/export/news.xml')
DataTransporter.importFile(xmlFile, 'website', newsRoot.getHandle(), false, 
  ImportUUIDBehavior.IMPORT_UUID_CREATE_NEW, true, true) 

For more information on Groovy, see Groovy module. The DataTransporter utility class documents the parameters of the executeExport and importFile methods.

Programmatically

The import and export functionality is implemented in the package info.magnolia.importexport. This implementation is mostly contained in the DataTransporter and PropertiesImportExport classes. You can invoke methods in these classes from your own class.

Here is an example of calling the executeExport method:

File xmlFile = new File(folder.getAbsoluteFile(), xmlName);
FileOutputStream fos = new FileOutputStream(xmlFile);
try {
  DataTransporter.executeExport(fos, false, false,
    MgnlContext.getHierarchyManager(repository).getWorkspace().getSession(),
    node.getHandle(), repository, DataTransporter.XML);
} finally {
  IOUtils.closeQuietly(fos);
}

These classes will not complete the import for any UUIDs that are identical to existing UUIDs.

Use cases

Here are cases when importing and exporting is useful.

Site migration

You can accomplish site migration in a number of ways.

  • For smaller sites (fewer than 300 pages), you can simply copy the page content and paste it into the editor.
  • For larger sites, scripting is better than copying and pasting. The script examples in the Scripting section export content from one site and import it into another. The script can also add Magnolia-specific metadata such as whether a page should be visible in navigation.
  • Import non-Magnolia content. Store the content in XML files that adhere to the JCR System View XML Mapping format. If the XML file does not adhere to this format, convert it first. You can do that with a conversion script. The conversion script should identify content types in the file and transform them into the format that Magnolia can import.
  • Import data. Create a content app to manage structured data that is independent from page content, such as addresses, employees and client references.
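
A conversion script for non-Magnolia content mostly comes down to writing each record as an sv:node element with sv:property children. The Java sketch below shows that shape for a single node with string properties. It is a starting point under stated assumptions, not a complete converter: the class, the method names and the title property are hypothetical, the mgnl:category type is borrowed from the taxonomy use case on this page, and the target content type decides which properties a real import needs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SystemViewXmlSketch {

    /** Namespace URI of the JCR System View XML vocabulary. */
    static final String SV_NS = "http://www.jcp.org/jcr/sv/1.0";

    private SystemViewXmlSketch() {
    }

    /** Escapes the characters that are special in XML text and attribute values. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Renders one single-valued property. */
    static String property(String name, String type, String value) {
        return "  <sv:property sv:name=\"" + escape(name) + "\" sv:type=\"" + type + "\">"
                + "<sv:value>" + escape(value) + "</sv:value></sv:property>\n";
    }

    /** Renders one node without children; all extra properties are treated as strings. */
    static String node(String name, String primaryType, Map<String, String> stringProperties) {
        StringBuilder xml = new StringBuilder();
        xml.append("<sv:node xmlns:sv=\"").append(SV_NS).append("\" sv:name=\"")
                .append(escape(name)).append("\">\n");
        // jcr:primaryType is written before any other property.
        xml.append(property("jcr:primaryType", "Name", primaryType));
        for (Map.Entry<String, String> entry : stringProperties.entrySet()) {
            xml.append(property(entry.getKey(), "String", entry.getValue()));
        }
        xml.append("</sv:node>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        Map<String, String> properties = new LinkedHashMap<>();
        properties.put("title", "Sports & leisure"); // hypothetical property name
        System.out.print(node("sports", "mgnl:category", properties));
    }
}
```

Child nodes would be nested sv:node elements inside the parent, mirroring the hierarchy the content should have in the repository.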

Backup

You can back up content by exporting it to XML and storing the files in a disaster recovery system. The file name is the path of the exported data, which makes the files easier to identify.

Importing a taxonomy

How to import tags depends on the size and format of the taxonomy. It also depends on whether you need to do it once or repeatedly. If the taxonomy does not need to be added repeatedly and its size is reasonable, create the tags manually in the Categories app. If the taxonomy is large, import the tags as mgnl:category content type into the category workspace and use the Categories app to manage them, or create your own content app and content types.

Taxonomy size | Import frequency | Recommendation
Small | Once | Create the taxonomy by hand. Use the existing mgnl:category content type in the Categories app. If that content type does not work for you, register a new content type in your module descriptor. While you are at it, register a new workspace too.
Large | Once | Write a Groovy script.
Large | Repeatedly | Write a Groovy script and create a command that executes it so that editors can run the process at will or you can schedule it.

Copying production data to a test environment

Copying production data to a test or development environment is a task you may need to do regularly. You should test new templates and features with realistic content before releasing them to production. Here are strategies for prod-to-test exporting.

Option 1: Clone the production instance

Transfer the data and the JCR Datastore (all binaries in the file system) to the test instance.

In production:

  1. Dump the SQL database.
  2. Copy the JCR Datastore folder.
  3. If needed, copy the repository folder to preserve the Apache Lucene search index. This is generally not needed because the index and the repository folder are recreated on startup, but copying the folder saves time.

In test:

  1. Load the database dump file.
  2. Copy the JCR Datastore folder to the configured place.
  3. Replace the repository folder.
  4. Start the instance.

Pros:

  • 1:1 copy of all data. Everything is identical to production.
  • Very fast. A SQL dump is much faster than JCR export.
  • No running instance is needed as the data is loaded directly into the SQL database.

Cons:

  • It is not possible to copy only part of the data.
  • Data could be too big for test. All data is usually needed in a staging environment but not in test.
  • Configuration is also an identical copy so it needs to be changed after startup. For example, subscribers still point to the production instance. This is usually solved with an additional Magnolia module that is specific to the test environment. Add the module JAR to the WEB-INF/lib directory or even the Tomcat lib folder. The module changes all configuration needed for the test environment.

Option 2: Use the backup and restore JSP scripts

Use the Backup and restore JSP scripts. This option is a quick win because you can use it without any development and without restarting the system. Export only the data you need in test. Define the exported content in the JSP, for example:

backup.jsp
public void run() {
   MgnlContext.setInstance(MgnlContext.getSystemContext());
   try {
      backupChildren(ContentRepository.WEBSITE, "/demo-project");
      backupChildren("dms", "/");
      backupChildren("data", "/products");
      backupChildren(ContentRepository.USERS, "/admin");
      backupChildren(ContentRepository.USERS, "/system");
      backupChildren(ContentRepository.USER_GROUPS, "/");
      backupChildren(ContentRepository.USER_ROLES, "/");
      // backupChildren(ContentRepository.CONFIG, "/modules");
      // backupChildren(ContentRepository.CONFIG, "/server");
   } catch (Exception e) {
      logMsg("can't backup", e);
   } finally {
      // nothing to do here
   }
   logMsg("backup completed");
}

In production:

  1. Place the backup JSP script in, for example, docroot.
  2. Execute the script by requesting its URL. The script creates JCR XML exports into the webapp/backup folder.
  3. Use, for example, a shell script to copy the exports from the production server to the test server.

In test:

  1. Place the restore JSP script in docroot.
  2. Execute the script by requesting its URL. The script imports all XML files automatically.

Pros:

  • Can easily be configured to export only parts of the content or only certain workspaces.
  • Is a very stable option as the script can call the garbage collector explicitly and import smaller export files.
  • Easy to use since the JSP can be executed by requesting its URL.
  • Easy to extend for more specific needs.
  • Creates order files for importing the content in the right order again.

Cons:

  • Needs a running system.
  • Much slower than a DB dump.
  • It is a JSP script.

Option 3: Use Magnolia's export command

Magnolia's own export command is more flexible than the Jackrabbit JCR import/export tool but still very comparable to it. You can also use the command in combination with the Scheduler module or trigger it from any other Java process.

In production:

  1. Trigger the export command regularly with the scheduler and export into the file system. Extend the command so that it transfers the exported data to the test machine.

In test:

  1. Trigger the ImportCommand, which imports the transferred XML files.

Pros:

  • The command is available out of the box and well tested since we use it extensively in many places.
  • Can be used in various combinations such as with scheduler or with workflow.
  • More flexible than the backup JSP as it is fully integrated and aware of the Magnolia system.

Cons:

  • Slower than a database dump.
  • Needs development.