Magnolia 5.3 reached end of life on June 30, 2017. This branch is no longer supported, see End-of-life policy.

PHPCR is an adaptation of the JCR standard which can be used to connect a PHP application with a JCR-compliant repository. Magnolia is a popular open-source, enterprise-grade Java CMS that uses a JCR repository to store web content. This article demonstrates how PHPCR can be used to easily integrate a PHP front-end with the Magnolia content repository without needing special Java knowledge or training.

Introduction

Relational databases are great for storing and retrieving strongly-typed, structured data, and they’re extremely popular data storage containers for web applications. However, for application data that isn’t strongly typed or rigidly structured, the selection of an appropriate data storage container requires deeper thought. In this situation, most developers typically head to XML, which is easy to understand, highly flexible and widely supported in most programming languages.

XML and relational databases aren’t the only two options any longer, though. Consider the evolving PHP Content Repository project, aka PHPCR, which aims to “combine the best of document-oriented databases (weakly-structured data) and of XML databases (hierarchical trees).” The project, which provides a 100% PHP implementation of the Java Content Repository (JCR) standard, is rapidly attracting followers as a viable alternative to traditional data storage containers.

In this article, I’ll introduce you to PHPCR and demonstrate how it can be used to connect a PHP application with a JCR-compliant repository. The repository in this case belongs to Magnolia, a popular open-source, enterprise-grade Java CMS. The examples in this article will demonstrate how PHP can be used to add and update content in the Magnolia repository and have those changes reflected in the Magnolia user interface.

Understanding JCR and PHPCR

The PHP Content Repository is “an adaptation of the Java Content Repository (JCR) standard, an open API specification defined in JSR-283...[it] defines how to handle hierarchical semi-structured data in a consistent way”. It was originally ported from Java to PHP by Karsten Dambekalns with the help of others for the typo3/flow3 project and is currently maintained by David Buchmann. It is freely available under the Apache License v2.0 at http://phpcr.github.com/.

How does it work? In their simplest forms, JCR and PHPCR are APIs to access and manipulate content. This content is stored hierarchically using a tree structure with each node of the tree representing a single content fragment. Each node has properties that are used to store information about the node; this might include the node value, node type, node status, node identifier, and so on.

To better understand this, take a look at the repository model diagram from the JCR 2.0 specification by Day Software AG (see Related URLs), which illustrates what the JCR tree structure looks like. Then consider Figure 1, which shows a fully-realized Jackrabbit implementation of this tree structure from the Magnolia JCR browser.

Nodes can be accessed either by path or by identifier. Each node may itself expose a collection of child nodes, and PHPCR and JCR define API methods for node traversal using these parent-child relationships. There also exist API methods to add or remove nodes from the tree. Here are some examples of these API methods:

  • The SessionInterface->getNode() and SessionInterface->getNodeByIdentifier() methods return a node either by path or by identifier.
  • The NodeInterface->addNode() method adds a new child node.
  • The NodeInterface->getNodes() method returns all the children of a particular parent node.
  • The NodeInterface->getPropertyValue() method returns the value of a specified node property.
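
As a rough sketch of how these methods fit together, the helper below fetches a node by path and reports its identifier and the names of its direct children. The function name summarizeNode() is made up for this illustration and is not part of PHPCR; it assumes $session is an open PHPCR session of the kind created in the listings later in this article.

```php
<?php
// Hypothetical helper, not part of PHPCR: summarize one node of the tree.
// $session is assumed to be an open PHPCR session.
function summarizeNode($session, $path)
{
    // look the node up by its absolute path
    $node = $session->getNode($path);

    // collect the names of its direct children
    $children = array();
    foreach ($node->getNodes() as $child) {
        $children[] = $child->getName();
    }

    return array(
        'path'       => $node->getPath(),
        'identifier' => $node->getIdentifier(),
        'children'   => $children,
    );
}
```

Calling summarizeNode($session, '/demo-project/about') against the Magnolia demo content should return the path, the node's UUID and an array of child page names.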

See Related URLs for links to more information on the JCR 2.0 API and the PHPCR API.

Each node in a JCR tree is associated with a node type. JCR specifies a rich and configurable typing system for nodes. Available types include primitives, such as Booleans and strings, as well as types that are relevant in a hierarchical context, such as path. User-specified node types are also supported.

For example, the JCR "nt:file" node type is used to represent a file and includes a mandatory property for the file creation date. Similarly, the Magnolia-specific "mgnl:user" node type is used to represent a Magnolia user and so includes properties for user name, email address, and password, as well as child nodes for the user’s groups and roles.

Both PHPCR and JCR also support the concept of “workspaces”, each having its own node tree. Think of workspaces like branches in a version control system: they can be used independently, but they can also be merged, moved, copied and cloned. Workspaces make it possible to logically separate content, yet have it reside within the same physical repository.
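
To see which workspaces a repository exposes, something like the following sketch can help. The helper name listWorkspaces() is an assumption made for this example; it relies only on the getWorkspace(), getName() and getAccessibleWorkspaceNames() methods of the PHPCR session and workspace interfaces.

```php
<?php
// Hypothetical helper: list the workspaces visible to the current user.
// $session is assumed to be an open PHPCR session.
function listWorkspaces($session)
{
    $workspace = $session->getWorkspace();

    // names of all workspaces this user may log in to
    $names = $workspace->getAccessibleWorkspaceNames();
    sort($names);

    return array(
        'current'   => $workspace->getName(),
        'available' => $names,
    );
}
```

Against a Magnolia instance, the “website” and “dam” workspaces used later in this article would be expected to appear in the 'available' list.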

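It’s important to note at this point that both PHPCR and JCR merely define an implementation standard; they are not implementations themselves. There are several implementations of each. This article uses Apache Jackrabbit, the JCR implementation behind the Magnolia repository, and Jackalope, a PHPCR implementation that can connect to Jackrabbit.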

Installing and configuring required components

With the basics out of the way, let’s get started with some practical examples. In terms of software, this tutorial assumes that you have a properly configured PHP and Java development environment, with the required tools available in your system path.

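On the PHP end of things, you’ll need to download and install PHPCR and Jackalope with its Jackrabbit transport; both are fetched by the Composer steps that follow.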

The easiest way to install PHPCR and its dependencies is by using Composer, the popular dependency manager for PHP. Create a new working directory for the project, change to it, and then run the following command at the console to download Composer:

shell> curl -sS https://getcomposer.org/installer | php

Within your working directory, create a file named composer.json and fill it with the following content:

composer.json
{
    "require": {
        "phpcr/phpcr": "*",
        "jackalope/jackalope-jackrabbit": "*"
    }
}

Then, use Composer to download the necessary components using the console command below:

shell> php composer.phar install 

Note that the download process might take a while, so this is a good time to grab a cup of coffee and a slice of toast.

On the Java end of things, you’ll need to download and install Magnolia and the Jackrabbit DavEx module.

You’ll find detailed instructions on how to perform the Magnolia installation in the “Installing” section of its documentation or in the beginner tutorial at http://www.webreference.com/authoring/MagnoliaCMS/Setup/. Note that the Magnolia download includes a bundled version of Apache Tomcat.

Once you’ve downloaded and installed Magnolia, copy all the JAR files from the Jackrabbit DavEx module to the $MAGNOLIA_INSTALL_PATH/apache-tomcat-x.y.z/webapps/magnoliaAuthor/WEB-INF/lib/ directory. DavEx is WebDAV with JCR extensions, and the DavEx module is necessary to enable access to the content in the JCR repository over HTTP.

Once the module files are copied, restart Magnolia by running the following command at the console:

shell> ./magnolia_control.sh start 

Next, browse to the URL http://localhost:8080 and log in with the user name “superuser” and password “superuser”. You’ll be prompted to update the Magnolia installation with the new DavEx module. Do so, and once the process completes, you should end up at the Apps Launcher, which looks like Figure 2.

With the addition of the DavEx module, Magnolia should already be configured to allow DavEx access to the Jackrabbit repository. To check this, select the Configuration App in the Apps Launcher, and then navigate your way to the node at /server/IPConfig/allow-all/methods. Select the node value and check that it contains the following values: GET,POST,PROPFIND,PUT,DELETE,REPORT,HEAD,SEARCH. If these values are not present, update the node by double-clicking it and entering them.

Figure 3 shows what the result should look like.

Getting started

Back in the Apps Launcher, select the Tools category and the JCR App. Navigate your way to the node /demo-project/about/history. Pay attention to the value of the node’s “title” property (Figure 4), which holds the title for the corresponding webpage.

Now, try doing the same thing with PHP. As noted previously, a PHPCR implementation can connect to any JCR-compliant repository and both read and write data to it. Listing 1 demonstrates this by logging in to the Magnolia JCR repository with PHPCR, opening a session to Magnolia’s “website” workspace, navigating to the node above, and retrieving the value of its “title” property.

Listing 1 begins by setting up the Composer auto-loader, which takes care of loading PHP classes as needed. Next, Jackalope’s RepositoryFactoryJackrabbit::getRepository() method is used to obtain a repository object that points at the Magnolia repository. The repository object’s login() method is then used to log in with the “superuser” credentials and create a JCR session for the “website” workspace. This session object serves as the primary object for all subsequent repository operations.

Listing 1
<?php
try {
  // set up Composer auto-loader
  $loader = require_once (__DIR__.'/vendor/autoload.php');

  // initialize PHPCR session to Magnolia's 'website' repository
  $parameters = array(
    'jackalope.jackrabbit_uri' => 'http://localhost:8080/magnoliaAuthor/.davex/'
  );
  $creds = new \PHPCR\SimpleCredentials('superuser','superuser');
  $repository = \Jackalope\RepositoryFactoryJackrabbit::getRepository($parameters);
  $website = $repository->login($creds, 'website');

  // get a node from the repository using its path
  // and read its 'title' property
  $node = $website->getNode('/demo-project/about/history');
  $title = $node->getPropertyValue('title');
  echo $title;

  // clean up
  unset($website);
} catch(Exception $e) {
    echo $e->getTraceAsString();
}  
?>

Once a JCR session has been established, it’s quite easy to navigate the JCR tree using the PHPCR API. This listing illustrates the getNode() method, which returns a node object representing the corresponding JCR node. It’s now possible to use the node object’s getPropertyValue() method to return the value of any node property, as illustrated in Figure 5. Once you’re done interacting with the repository, it’s a good idea to clean up by unsetting the session object and closing the connection to the repository.

The steps above make up a standard process for interacting with any JCR-compliant repository... although, as you’ll shortly see, PHPCR allows you to do much more than simply reading property values.
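
Since every listing in this article repeats the same connection steps, they could be wrapped in a small helper. The sketch below is one possible shape, not something PHPCR or Jackalope provides: withSession() is a hypothetical name, the URL and credentials are the tutorial defaults from Listing 1, and the try/finally construct assumes PHP 5.5 or later. It uses the session's logout() method to release the session explicitly.

```php
<?php
// Load the Composer auto-loader if it is present (as in Listing 1).
if (is_file(__DIR__ . '/vendor/autoload.php')) {
    require_once __DIR__ . '/vendor/autoload.php';
}

// Hypothetical helper: open a session, run $callback, always log out.
// URL and credentials are the tutorial defaults used in Listing 1.
function withSession($workspace, $callback)
{
    $parameters = array(
        'jackalope.jackrabbit_uri' => 'http://localhost:8080/magnoliaAuthor/.davex/'
    );
    $creds = new \PHPCR\SimpleCredentials('superuser', 'superuser');
    $repository = \Jackalope\RepositoryFactoryJackrabbit::getRepository($parameters);
    $session = $repository->login($creds, $workspace);
    try {
        // hand the open session to the caller's code
        return call_user_func($callback, $session);
    } finally {
        // release the session even if the callback throws
        $session->logout();
    }
}

// usage (requires a running Magnolia instance):
// echo withSession('website', function ($session) {
//     return $session->getNode('/demo-project/about/history')
//                    ->getPropertyValue('title');
// });
```

The listings in this article keep the connection code inline so that each one can be read and run on its own.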

Retrieving nodes and properties

Every node in a JCR repository has a unique identifier, and so, just as you can retrieve a node by path with the getNode() method, you can also retrieve a node by identifier with the getNodeByIdentifier() method. Consider Listing 2, which produces output equivalent to the previous one using this method.

Listing 2
<?php
try {
  // set up Composer auto-loader
  $loader = require_once (__DIR__.'/vendor/autoload.php');

  // initialize PHPCR session to Magnolia's 'website' repository
  $parameters = array(
    'jackalope.jackrabbit_uri' => 'http://localhost:8080/magnoliaAuthor/.davex/'
  );
  $creds = new \PHPCR\SimpleCredentials('superuser','superuser');
  $repository = \Jackalope\RepositoryFactoryJackrabbit::getRepository($parameters);
  $website = $repository->login($creds, 'website');

  // get a node from the repository using its UUID
  // and read a property from it
  // note: this UUID might change in future releases of Magnolia 
  $node = $website->getNodeByIdentifier('674469eb-31c2-4dc3-aaa9-935469790539');
  $title = $node->getPropertyValue('title');
  echo $title;

  // clean up
  unset($website);
} catch(Exception $e) {
    echo $e->getTraceAsString();
}  
?>
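
Because the UUID in Listing 2 is hard-coded and may differ between installations, it can be safer to look it up at run time. The following is a minimal sketch under that assumption; identifierForPath() is a hypothetical helper name, and it relies on the session's nodeExists() method to avoid an exception for missing paths.

```php
<?php
// Hypothetical helper: resolve a path to the node's identifier at run time,
// so the UUID does not have to be hard-coded.
function identifierForPath($session, $path)
{
    // nodeExists() avoids an exception for paths that are not present
    if (!$session->nodeExists($path)) {
        return null;
    }
    return $session->getNode($path)->getIdentifier();
}

// usage with the session from Listing 2:
// $uuid = identifierForPath($website, '/demo-project/about/history');
// $node = $website->getNodeByIdentifier($uuid);
```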

The getPropertyValue() method described above is useful when you need to retrieve a specific value. However, you can also retrieve all the properties of a particular node as a PHP associative array, with the node object’s getPropertiesValues() method. Listing 3 has an example.

Listing 3
<?php
try {
  // set up Composer auto-loader
  $loader = require_once (__DIR__.'/vendor/autoload.php');

  // initialize PHPCR session to Magnolia's 'website' repository
  $parameters = array(
    'jackalope.jackrabbit_uri' => 'http://localhost:8080/magnoliaAuthor/.davex/'
  );
  $creds = new \PHPCR\SimpleCredentials('superuser','superuser');
  $repository = \Jackalope\RepositoryFactoryJackrabbit::getRepository($parameters);
  $website = $repository->login($creds, 'website');

  // get a node from the repository
  // and read all properties from it
  $node = $website->getNode('/demo-project/about/history');
  $properties = $node->getPropertiesValues();
  print_r($properties);

  // clean up
  unset($website);
} catch(Exception $e) {
    echo $e->getTraceAsString();
}  
?>

Figure 6 is a side-by-side comparison of the node properties in the Magnolia JCR repository and the properties returned by Listing 3.

Retrieving binary node content

The Magnolia repository also includes a “dam” workspace to store binary content. A node in this workspace represents a binary file and, as per the JCR specification, includes a child node named “jcr:content” whose “jcr:data” property holds the actual binary data for the file. With PHPCR, it’s possible to access and retrieve this binary data from PHP in the same way as one would retrieve any other node content.

Consider Listing 4, which illustrates by retrieving a PDF file from Magnolia’s repository and prompting the user to download it. This listing creates a JCR session to the “dam” workspace, which is where Magnolia stores binary data, such as images, PDF documents and videos. It then retrieves the node at /demo-project/downloads/Magnolia_Flyer_4-0, which contains a PDF file with an advertisement for Magnolia.

This node contains only basic information about the file, such as its name and last modified date. The real meat is found in its “jcr:content” child node, which holds the binary content for the file in its "jcr:data" property. With PHPCR’s node traversal methods, it’s possible to access the child node, retrieve the binary data for the file via the "jcr:data" property, and send the output to the client (browser) as a binary stream.

By accessing other properties of the “jcr:content” node, it’s also possible to derive the file name, extension and MIME type, and send the client the appropriate response headers to force it to perform a download rather than displaying the binary content directly in the browser.

Listing 4
<?php
try {
  // set up Composer auto-loader
  $loader = require_once (__DIR__.'/vendor/autoload.php');

  // initialize PHPCR session to Magnolia's 'dam' repository
  $parameters = array(
    'jackalope.jackrabbit_uri' => 'http://localhost:8080/magnoliaAuthor/.davex/'
  );
  $creds = new \PHPCR\SimpleCredentials('superuser','superuser');
  $repository = \Jackalope\RepositoryFactoryJackrabbit::getRepository($parameters);
  $dam = $repository->login($creds, 'dam');

  // get a binary data node from the repository
  // as a stream
  // and have the browser download it as a file
  $node = $dam->getNode('/demo-project/downloads/Magnolia_Flyer_4-0');
  $properties = $node->getNode('jcr:content')->getPropertiesValues(); 
  header('Content-type: ' . $properties['jcr:mimeType']); 
  header('Content-disposition: attachment; filename="' . $properties['fileName'] . '.' . $properties['extension'] . '"');
  echo stream_get_contents($properties['jcr:data']);    

  // clean up
  unset($dam);
} catch(Exception $e) {
    echo $e->getTraceAsString();
}  
?>
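
Listing 4 assumes that the “jcr:mimeType”, “fileName” and “extension” properties are always present and safe to place in a header. A slightly more defensive variant might look like the sketch below. The helper name downloadHeaders() and the fallback values are assumptions made for this illustration; the property names are the ones used in Listing 4.

```php
<?php
// Hypothetical helper: build response headers from the property array
// returned by getPropertiesValues() on the "jcr:content" node.
function downloadHeaders(array $properties)
{
    // fall back to a generic type if the node has no MIME type
    $mime = isset($properties['jcr:mimeType'])
        ? $properties['jcr:mimeType']
        : 'application/octet-stream';

    $name = isset($properties['fileName']) ? $properties['fileName'] : 'download';
    if (!empty($properties['extension'])) {
        $name .= '.' . $properties['extension'];
    }
    // drop characters that could break out of the quoted header value
    $name = preg_replace('/[\r\n"\\\\\/]/', '', $name);

    return array(
        'Content-Type: ' . $mime,
        'Content-Disposition: attachment; filename="' . $name . '"',
    );
}

// usage inside Listing 4, before echoing the stream:
// foreach (downloadHeaders($properties) as $line) {
//     header($line);
// }
```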

Creating nodes and setting properties

Just as you can read data from Magnolia’s repository, so too can you write data. Consider Listing 5, which illustrates the process. In this example, the node object’s setProperty() method is used to create a new property, as well as modify an existing property, of the node. The changes are then saved back to the repository with the session object’s save() method.

Listing 5
<?php
try {
  // set up auto-loader
  $loader = require_once (__DIR__.'/vendor/autoload.php');

  // initialize PHPCR session to Magnolia's 'website' repository
  $parameters = array(
    'jackalope.jackrabbit_uri' => 'http://localhost:8080/magnoliaAuthor/.davex/'
  );
  $creds = new \PHPCR\SimpleCredentials('superuser','superuser');
  $repository = \Jackalope\RepositoryFactoryJackrabbit::getRepository($parameters);
  $website = $repository->login($creds, 'website');

  // get a node from the repository
  $node = $website->getNode('/demo-project/about/history');

  // create a new property on it
  $node->setProperty('foo', 'bar');

  // also modify an existing property
  $node->setProperty('title', 'Company History');

  // save changes
  $website->save();

  // clean up
  unset($website);
} catch(Exception $e) {
    echo $e->getTraceAsString();
}  
?>

Figure 7 shows the impact of the change in Magnolia.
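
Listing 5 writes both properties unconditionally. Where unnecessary writes matter, one option is to compare against the stored values first and save only when the session reports pending changes. The sketch below illustrates that idea; updateProperties() is a hypothetical helper, and the strict comparison it uses is a simplification that may treat equal-looking values of different PHP types as changed.

```php
<?php
// Hypothetical helper: write only the properties whose values differ,
// and save only when the session has something to persist.
function updateProperties($session, $node, array $values)
{
    $changed = array();
    foreach ($values as $name => $value) {
        // a missing property is treated as null
        $current = $node->hasProperty($name)
            ? $node->getPropertyValue($name)
            : null;
        if ($current !== $value) {
            $node->setProperty($name, $value);
            $changed[] = $name;
        }
    }
    if ($session->hasPendingChanges()) {
        $session->save();
    }
    // names of the properties that were actually written
    return $changed;
}

// usage with the session and node from Listing 5:
// updateProperties($website, $node, array('foo' => 'bar', 'title' => 'Company History'));
```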

Listing 6 has another example, this one adding a folder to Magnolia’s “dam” workspace. In this example, the addNode() method is used to add a new child node to the “dam” workspace. The new node is assigned the type "mgnl:folder", which is Magnolia’s custom type for folders; using this type ensures it shows up in Magnolia’s Assets App. Next, the node title is set using the setProperty() method discussed previously, and the changes are saved back to the repository.

Listing 6
<?php
try {
  // set up auto-loader
  $loader = require_once (__DIR__.'/vendor/autoload.php');

  // initialize PHPCR session to Magnolia's 'dam' repository
  $parameters = array(
    'jackalope.jackrabbit_uri' => 'http://localhost:8080/magnoliaAuthor/.davex/'
  );
  $creds = new \PHPCR\SimpleCredentials('superuser','superuser');
  $repository = \Jackalope\RepositoryFactoryJackrabbit::getRepository($parameters);
  $dam = $repository->login($creds, 'dam');

  // add a folder node to the repository
  $node = $dam->getNode('/demo-features');
  $folderNode = $node->addNode('test-folder', 'mgnl:folder');
  $folderNode->setProperty ("title", "Test Folder");
  
  // save changes
  $dam->save();
  
  // clean up
  unset($dam);  
} catch(Exception $e) {
  echo $e->getTraceAsString();
}  
?>

If you now examine the repository using Magnolia’s Assets App, you’ll see the newly-added folder (Figure 8).
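
Running Listing 6 a second time is likely to fail, because a child named “test-folder” already exists. A small guard using the node's hasNode() method makes the operation repeatable. This is only a sketch: ensureFolder() is a hypothetical helper name, and it leaves the call to save() to the caller.

```php
<?php
// Hypothetical helper: return the child folder, creating it only if missing.
// The caller is still responsible for calling save() on the session.
function ensureFolder($parent, $name, $title)
{
    if ($parent->hasNode($name)) {
        // folder already exists; reuse it unchanged
        return $parent->getNode($name);
    }
    $folder = $parent->addNode($name, 'mgnl:folder');
    $folder->setProperty('title', $title);
    return $folder;
}

// usage with the session from Listing 6:
// ensureFolder($dam->getNode('/demo-features'), 'test-folder', 'Test Folder');
// $dam->save();
```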

Traversing and searching the node tree

PHPCR also provides methods for iterating over a collection of nodes using standard loop constructs. Consider Listing 7, which iterates over all the children of a specified node and returns their abstracts. In this case, the node object’s getNodes() method is used to retrieve a collection of child nodes from the Magnolia repository. PHPCR makes it possible to iterate over this node collection using a standard “foreach” loop and perform operations or calculations on each child node. In this example, it checks if each node exposes an “abstract” property, and if it does, it prints the node path and property value.

Listing 7
<?php
try {
  // set up auto-loader
  $loader = require_once (__DIR__.'/vendor/autoload.php');

  // initialize PHPCR session to Magnolia's 'website' repository
  $parameters = array(
    'jackalope.jackrabbit_uri' => 'http://localhost:8080/magnoliaAuthor/.davex/'
  );
  $creds = new \PHPCR\SimpleCredentials('superuser','superuser');
  $repository = \Jackalope\RepositoryFactoryJackrabbit::getRepository($parameters);
  $website = $repository->login($creds, 'website');

  // iterate over the nodes which have abstracts
  // and print them
  $node = $website->getNode('/demo-project/about/subsection-articles');
  foreach ($node->getNodes() as $child) {
    if ($child->hasProperty('abstract')) {
        // read from the child, not from the parent node
        $abstract = $child->getProperty('abstract');
        echo "<h1>" . $child->getPath() . '</h1>';
        echo $abstract->getValue();
    }
  }
  
  // clean up
  unset($website);
} catch(Exception $e) {
    echo $e->getTraceAsString();
}
?>

Figure 9 shows the output.
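
Listing 7 only looks at direct children. To cover a whole subtree, the same getNodes() call can be applied recursively, as in the sketch below. The helper name collectAbstracts() is an assumption for this example. Note that each level may trigger further requests to the repository, so this approach is probably best kept to small subtrees.

```php
<?php
// Hypothetical helper: walk the subtree below $node and collect every
// "abstract" property, keyed by node path.
function collectAbstracts($node)
{
    $found = array();
    foreach ($node->getNodes() as $child) {
        if ($child->hasProperty('abstract')) {
            $found[$child->getPath()] = $child->getPropertyValue('abstract');
        }
        // descend into grandchildren and below
        $found += collectAbstracts($child);
    }
    return $found;
}

// usage with the session from Listing 7:
// print_r(collectAbstracts($website->getNode('/demo-project/about')));
```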

Like JCR, PHPCR also makes it possible to perform custom searches for nodes across the JCR repository via its support for JCR-SQL2. Part of the JCR 2.0 specification, JCR-SQL2 makes it possible to retrieve a result set of nodes matching specific criteria using SQL-like syntax. You’ll find more information at http://www.day.com/specs/jcr/2.0/6_Query.html.

To illustrate, consider Listing 8, which builds on the previous one to retrieve a listing of all page nodes in the “website” workspace under the /demo-project/about branch. This example uses the query manager object, which makes it possible to define a JCR-SQL2 query and execute it against the current workspace. As the query string illustrates, the syntax is very similar to SQL, with selectors, clauses, filters, and functions. The result of query execution is a collection of node objects, which can be processed using a regular “foreach” loop, in the same way as the previous example.

Listing 8
<?php
try {
  // set up auto-loader
  $loader = require_once (__DIR__.'/vendor/autoload.php');

  // initialize PHPCR session to Magnolia's 'website' repository
  $parameters = array(
    'jackalope.jackrabbit_uri' => 'http://localhost:8080/magnoliaAuthor/.davex/'
  );
  $creds = new \PHPCR\SimpleCredentials('superuser','superuser');
  $repository = \Jackalope\RepositoryFactoryJackrabbit::getRepository($parameters);
  $website = $repository->login($creds, 'website');
  $qm = $website->getWorkspace()->getQueryManager();

  // get all the page nodes under /demo-project/about
  // with a JCR-SQL2 query, and print their titles
  $sql = "SELECT * FROM [mgnl:page] WHERE ISDESCENDANTNODE('/demo-project/about')";
  $query = $qm->createQuery($sql, 'JCR-SQL2');

  $queryResult = $query->execute();
  echo '<ul>';
  foreach ($queryResult->getNodes() as $node) {
    if ($node->hasProperty('title')) {
      $title = $node->getProperty('title');
      echo '<li>' . $title->getValue()  . ': ' . $node->getPath() . '</li>';
    }
  }
  echo '</ul>';
  
  // clean up
  unset($website);
} catch(Exception $e) {
    echo $e->getTraceAsString();
}
?>

Figure 10 shows the output.
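
When the path in a query like the one in Listing 8 comes from a variable rather than a literal, it should be escaped before being placed inside the query string. The sketch below shows one way to do that, relying on the JCR-SQL2 rule that a single quote inside a string literal is written as two single quotes. The helper name descendantQuery() is hypothetical, and the node type argument is assumed to come from trusted code rather than user input.

```php
<?php
// Hypothetical helper: build a JCR-SQL2 statement for all nodes of one type
// below a path, escaping the path so it is safe inside a string literal.
function descendantQuery($nodeType, $path)
{
    // JCR-SQL2 string literals escape a single quote by doubling it
    $safePath = str_replace("'", "''", $path);
    return sprintf(
        "SELECT * FROM [%s] WHERE ISDESCENDANTNODE('%s')",
        $nodeType,
        $safePath
    );
}

// usage with the query manager from Listing 8:
// $sql = descendantQuery('mgnl:page', '/demo-project/about');
// $query = $qm->createQuery($sql, 'JCR-SQL2');
```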

Example: page abstract editor

From the previous examples, it’s clear that PHPCR offers PHP developers some key capabilities: searching for JCR nodes using pre-defined criteria, reading their properties and writing new values to them. These capabilities make it possible to create PHP-based web applications that can directly interact with content in a JCR-powered CMS like Magnolia, without needing any special Java knowledge or training.

To illustrate this, consider Listing 9: a PHP-based editor that allows users to directly edit the abstracts of webpages inside Magnolia. This script is essentially a giant "if" test keyed on the presence or absence of the $_POST['submit'] variable, which is used to check whether the web form in the script has been submitted or not. Here’s how it works:

  1. The script begins by loading all required classes and opening a connection to the Magnolia repository. It also creates a session object for the “website” workspace. It then checks to see if the web form has been submitted by checking for the $_POST['submit'] variable.
  2. If it hasn’t, it initializes and executes a JCR-SQL2 query to return up to five pages under the /demo-project path that have abstracts. It then presents these abstracts as editable fields within a web form. Each field is named after the corresponding node’s unique identifier. The user can now edit the abstracts and submit the form once done.
  3. Once the form has been submitted, the second half of the conditional test is invoked. Here, the submitted abstracts and accompanying node identifiers are used to retrieve the corresponding nodes from the Magnolia repository via the getNodeByIdentifier() method, and the setProperty() method is used to update their “abstract” properties to the user-submitted values. The changes are saved back to the repository via the session object’s save() method, and immediately become visible in the Magnolia instance.

Listing 9
<?php
try {
  // set up auto-loader
  $loader = require_once (__DIR__.'/vendor/autoload.php');

  // initialize PHPCR session to Magnolia's 'website' repository
  $parameters = array(
    'jackalope.jackrabbit_uri' => 'http://localhost:8080/magnoliaAuthor/.davex/'
  );
  $creds = new \PHPCR\SimpleCredentials('superuser','superuser');
  $repository = \Jackalope\RepositoryFactoryJackrabbit::getRepository($parameters);
  $website = $repository->login($creds, 'website');
?>
<html>
  <head></head>
  <body>
    <h1>Magnolia Page Abstract Editor</h1>
    <form method="post" action="">    
    <?php
    if (!isset($_POST['submit'])) {
      $qm = $website->getWorkspace()->getQueryManager();

      // get 5 page nodes with abstracts
      // use a JCR-SQL2 query
      $sql = "SELECT * FROM [mgnl:page] WHERE ISDESCENDANTNODE('/demo-project') AND abstract IS NOT NULL ORDER BY [jcr:uuid]";
      $query = $qm->createQuery($sql, 'JCR-SQL2');
      $query->setLimit(5);
      $queryResult = $query->execute();

      // for each node, display a form input field
      // with node UUID as hidden field
      foreach ($queryResult->getNodes() as $node) {
        $abstract = $node->getProperty('abstract');
    ?>
      <p>
        <strong><?php echo htmlspecialchars($node->getPath()); ?></strong> <br/>
        (uuid: <?php echo $node->getIdentifier(); ?>) <br/>
        <textarea rows="5" cols="50" name="ids[<?php echo $node->getIdentifier(); ?>]"><?php echo htmlspecialchars(trim($abstract->getValue())); ?></textarea>
      </p>
    <?php
      }
    ?>
            <input type="submit" name="submit" />
    <?php
    } else {
      // get node by UUID and write new abstract
      foreach ($_POST['ids'] as $key => $value) {
        $node = $website->getNodeByIdentifier($key);
        $node->setProperty('abstract', $value);
      }
      $website->save();
      echo 'Node content saved successfully';
    }
    ?>
    </form>
  </body>
</html>
<?php
} catch(Exception $e) {
    echo $e->getTraceAsString();
}
?>

Figure 11 illustrates the abstract editor in action, and the resulting change in Magnolia page content.
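
Listing 9 passes the submitted array keys straight to the repository. In anything beyond a demo, it would be sensible to validate them first. The following sketch keeps only entries whose key has the usual UUID shape and whose value is a string. The helper name filterAbstracts() is hypothetical, and the check assumes identifiers of the form shown in Listing 2.

```php
<?php
// Hypothetical helper: keep only form entries whose key looks like a UUID
// and whose value is a string, before passing keys to getNodeByIdentifier().
function filterAbstracts($posted)
{
    $clean = array();
    if (!is_array($posted)) {
        return $clean;
    }
    $uuid = '/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i';
    foreach ($posted as $id => $abstract) {
        if (is_string($abstract) && preg_match($uuid, (string) $id)) {
            $clean[$id] = trim($abstract);
        }
    }
    return $clean;
}

// usage in the "else" branch of Listing 9:
// $ids = isset($_POST['ids']) ? $_POST['ids'] : array();
// foreach (filterAbstracts($ids) as $key => $value) { ... }
```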

As the concluding example demonstrates, PHPCR provides a full-featured and robust implementation of the JCR specification. With PHPCR, PHP developers can easily build and integrate PHP-based web applications and frontends with JCR-powered Java applications, without needing special Java knowledge or training. It’s truly the best of both worlds...so what are you waiting for? Fork a copy of PHPCR today, and start coding!


Credits:

  • Vaswani, Vikram: Integrating PHP Web Applications with JCR and Magnolia. This article was first published in the October 2012 issue of php|architect magazine. Vikram Vaswani is the founder and CEO of Melonfire, a consulting services firm with special expertise in open-source tools and technologies. He is also the author of the books Zend Framework: A Beginner's Guide and PHP: A Beginner's Guide.

  • Kahwe Smith, Lukas: Code examples in this article. Liip AG.
  • Day Software AG: Repository diagram. Day JCR License.