Search
Magnolia CMS search functionality is provided by the Jackrabbit repository. An indexer based on Apache Lucene extracts text from content nodes and properties. Content of Web pages and documents is included in the index. To search the index, you can write queries in a query language supported by the JCR repository. You can test the queries in AdminCentral and execute them in code. The Standard Templating Kit includes a complete example of site search. This article explains how the STK default search implementation works and walks you through the example.

Indexing
There are two processes involved in making content searchable: indexing and querying. Indexing collects and parses Web pages and documents and stores the data in an index to make information retrieval fast and accurate. Querying searches the data in the index and returns results.
Magnolia CMS search is based on the default Jackrabbit search implementation. Jackrabbit uses an Apache Lucene-based indexer to process the data stored in the Java Content Repository. An index makes it faster to retrieve requested portions of the data. Node names and property values are indexed immediately as they are stored in the repository. Text from documents is extracted in a background process, which makes document content searchable after a short delay.
You can find the physical index folders and files in the webapps installation directory at repositories/magnolia/workspaces/*/index. See the Jackrabbit Search wiki to learn how to configure the search indexing and options available with the implementation. The workspace.xml file mentioned on the wiki is under the repositories/magnolia/workspaces/<name of workspace> directory.
Each Magnolia CMS instance has its own repository and its own index. This means that the author instance index is different from public instance indexes. Any content that has not been activated to a public instance cannot be found when running a search on that public instance.
Querying
Use a query to search the index. You can write the query in SQL-2 (grammar, examples). A query returns a result set which you can display on a Web page.
Tools > JCR Queries is a good place to test queries. When you get the result set you want, you can implement the query in code. Select the workspace to search from the dropdown.

Here are example queries written in SQL-2.
Pages that have the word "article".
SELECT * from [mgnl:page] AS t WHERE
ISDESCENDANTNODE([/demo-project]) AND
contains(t.*, 'article')
Modules that provide commands. This query finds folders named commands in the config workspace and returns the full path to the child nodes.
select * from [mgnl:content] as t where
ISDESCENDANTNODE([/modules]) and
name(t) = 'commands'
Folders and documents under /demo-docs/sheet-music in the dms workspace.
select * from [nt:base] as t where
([jcr:primaryType] = 'mgnl:contentNode' OR
[jcr:primaryType] = 'mgnl:content') AND
ISDESCENDANTNODE([/demo-docs/sheet-music])
order by [t].title asc
See JCR Query Cheat Sheet for more examples.
STK search
The Standard Templating Kit (STK) is a set of common templates and functionality. It also includes a complete example of search. Below is a walkthrough of the STK search, starting from the search box and ending on the results page. We recommend the STK search as a best practice over the non-STK search functionality. It will get you started faster.
Search box
Start exploring the STK search example from the user interface. In the demo-project and demo-features example sites you can find the search box in the top right corner.

The search area script in Templating Kits > Templates (/templating-kit/pages/global/search) renders the box on the page:
<div id="search-box">
    <h6>${i18n['accessibility.header.search']}</h6>
    <form action="${stkfn.searchPageLink(content)!}" >
        <div>
            <label for="searchbar">${i18n['accessibility.header.searchFor']}</label>
            <input required="required" id="searchbar" name="queryStr" type="text" value="${ctx.queryStr!?html}" />
            <input class="button" type="submit" value="${i18n['button.label.search']}" />
        </div>
    </form>
</div>
When a user types a search term into the box and submits the form, the term is assigned to the queryStr parameter.
Notice how the box is prefilled with the previously run search term. The template script reads it from the queryStr context attribute, available through the ctx templating support object.
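The ?html built-in in the value attribute escapes the echoed term before it is written back into the page, so a term containing markup cannot break out of the attribute. As a rough illustration of what that escaping involves, the Java sketch below does a comparable replacement by hand. The names SearchTermEscaper and escapeHtml are hypothetical and are not part of Magnolia or FreeMarker. In a template script you would rely on the built-in instead.

```java
final class SearchTermEscaper {

    // Hypothetical helper: replaces the characters that are significant in HTML
    // so that a search term can be echoed inside an attribute value.
    static String escapeHtml(String term) {
        if (term == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(term.length());
        for (char c : term.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A term that contains markup is neutralized before it is echoed.
        System.out.println(escapeHtml("\"><script>alert(1)</script>"));
    }
}
```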
Query execution in the model
The information sent in the form is processed in the SearchResultModel Java class (Git). The class gets the search term from the request using the getQueryStr method.
public String getQueryStr() {
    return MgnlContext.getParameter("queryStr");
}
The search term is then embedded into a hard-coded SQL query pattern. The value of the jcr:path parameter is set to the site root node.
select * from nt:base where
jcr:path like ''{0}/%'' and
contains(*, ''{1}'')
order by jcr:path
The model class executes the query and gets results back from the JCR repository. It stores the results in an array named results which is available to the template script for rendering them on the page.
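The doubled single quotes and the {0} and {1} placeholders in the pattern appear to follow the conventions of java.text.MessageFormat, where '' produces one literal single quote. The sketch below shows how such a pattern could be filled in with the site root and the search term. It is an illustration, not the actual SearchResultModel code: the class name QueryPatternExample, the buildQuery method and the quote-doubling step are assumptions.

```java
import java.text.MessageFormat;

final class QueryPatternExample {

    // The query pattern quoted in the text, joined into one string.
    static final String PATTERN =
        "select * from nt:base where jcr:path like ''{0}/%'' and contains(*, ''{1}'') order by jcr:path";

    // Assumption: single quotes in the term are doubled so that they cannot
    // terminate the string literal inside the query.
    static String buildQuery(String siteRoot, String term) {
        String safeTerm = term.replace("'", "''");
        // {0} receives the site root, {1} the search term.
        return MessageFormat.format(PATTERN, siteRoot, safeTerm);
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("/demo-project", "article"));
    }
}
```

The sketch only handles single quotes. Other characters that have a meaning in full-text search expressions are passed through unchanged.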
Displaying the results
The stkSearchResult component displays the result to the user. The modelClass property in the component definition is set to the SearchResultModel. This makes the results of the search execution available to the template script. You can find the component definition in Templating Kit > Template Definitions > /components/features/stkSearchResult.

The component definition references a FreeMarker script, /templating-kit/components/features/searchResult.ftl. The script loops through the result set, rendering each result as a list item.
[#assign result = model.result!]
[#list result as item]
    [#-- Macro: Item Assigns --]
    [@assignItemValues item=item/]

    [#-- Rendering: Item rendering --]
    <li>
        <h2><a href="${itemLink}" >${itemTitle}</a></h2>
        [#if hasDate || hasAuthor || hasCategory]
            <div class="text-meta" role="contentinfo">
                <ul class="text-data">
                    [#if hasDate]
                        <li class="date">${itemDate?date?string.medium}</li>
                    [/#if]
                    [#if hasAuthor]
                        <li class="author">${itemAuthor!}</li>
                    [/#if]
                    [#if hasCategory]
                        <li class="cat">${i18n['search.category']} ${itemCategory!}</li>
                    [/#if]
                </ul>
            </div><!-- end text-meta -->
        [/#if]
        <p>${itemText!}</p>
    </li>
[/#list]
The script renders the following details about each search hit:
- Title of the page, rendered as a link
- Date last modified
- Author
- Category
- Snippet of item text with the search term highlighted

Non-STK search
The search functionality described below does not depend on the Standard Templating Kit. You can find it in all Magnolia CMS editions regardless of whether the STK is installed.
QueryManager
QueryManagerImpl is a utility class that allows you to execute queries in code. To execute a search, first assign the query statement to a variable. Then get an instance of QueryManager from the JCRSession.
Here is an example of using QueryManager in a Groovy script. The query finds pages that have the word "Article" in their title. To execute the statement, get a JCR session for the website workspace. Pass the query statement and the language used (JCR-SQL2) as parameters. You will get a QueryResultImpl object in return. The Groovy code iterates through the result set.
queryString = "select * from [mgnl:page] as p where contains([p].title,'Article')"
q = ctx.getJCRSession("website").getWorkspace().getQueryManager().createQuery(queryString, "JCR-SQL2")
queryResult = q.execute()
queryResult.nodes.each { println it.path }
The ctx templating support object is a shortcut for MgnlContext and stands for Magnolia context. It is an abstraction layer that represents the current process such as a request for a Web page. The context query is recommended as a best practice for executing queries programmatically when not using the STK.
simpleSearch templating function
The simpleSearch templating function allows you to run a search from a template script without the need for a custom model class. The results are available to the script in a collection variable of your choice. The function belongs to the CMS TemplatingFunctions class which is exposed to template scripts as cmsfn.
The function takes four parameters:
- workspace: such as website or dms.
- statement: a set of labels the target has to contain. Insert the labels as one string, separated by commas.
- returnItemType: node type to return, such as mgnl:page.
- startPath: path to search. For results without limits, set it to a forward slash (/).
[#assign results = cmsfn.simpleSearch("website", "interesting,article", "mgnl:page", "/demo-project") /]
[#list results as result]
    <p>${result!}</p>
[/#list]
Multisite
All sites that run on the same Magnolia CMS instance store their content in the same repository. When you execute a query in the website workspace you will get results from all sites.
In order to limit the search to a specific site, add the jcr:path parameter to the query and set its value to the root node of the site. The example below searches the demo-project website only.
select * from nt:base where jcr:path like '/demo-project/%' order by jcr:path
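When the site root is only known at run time, the path restriction can be added while the statement is assembled. The Java sketch below builds a statement in the SQL-2 form used in the Querying section, with ISDESCENDANTNODE limiting the search to one site. The class name SiteScopedQuery and the method forSite are hypothetical, and the resulting statement has not been run against a repository here.

```java
final class SiteScopedQuery {

    // Builds a SQL-2 statement limited to pages below the given site root.
    static String forSite(String siteRoot, String term) {
        if (siteRoot == null || !siteRoot.startsWith("/")) {
            throw new IllegalArgumentException("site root must be an absolute path: " + siteRoot);
        }
        // Assumption: doubling single quotes is enough to keep the term inside its literal.
        String safeTerm = term.replace("'", "''");
        return "select * from [mgnl:page] as t where "
            + "ISDESCENDANTNODE([" + siteRoot + "]) and "
            + "contains(t.*, '" + safeTerm + "')";
    }

    public static void main(String[] args) {
        // One statement per example site mentioned in this article.
        for (String site : new String[] {"/demo-project", "/demo-features"}) {
            System.out.println(forSite(site, "article"));
        }
    }
}
```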
Multilanguage
If you use the single-tree approach to enter multilanguage content, the system stores all language variants under the same page node. This means there is no separate hierarchy for each language. Site visitors will get search results from all language variants at the same time.
This may not be what you want. If you need to return results from one language only:
- Maintain each language tree separately so that you can limit the search to a particular path.
- Index the content on the public instance using an external search implementation such as Google Custom Search. Using a language identifier in the URL such as example.com/de/article.html, the external search can be configured to return results from that path only.
- If you are using the single-tree approach, customize the search query by adding a language parameter. In the JCR repository, language variants are stored in nodes whose names have a language suffix, such as subtitle_de.
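For the single-tree option, one way to add a language parameter is to search only the names that carry the language suffix. The Java sketch below assembles such a SQL-2 statement. It is a sketch under assumptions: the class LanguageQuery, the method forLanguage and the choice of title and subtitle as searched names are illustrative, and the statement has not been run against a repository here.

```java
import java.util.ArrayList;
import java.util.List;

final class LanguageQuery {

    // Builds a SQL-2 statement that searches only the given language variant
    // of each name, e.g. title_de and subtitle_de for language "de".
    static String forLanguage(String language, String term, String... properties) {
        if (properties.length == 0) {
            throw new IllegalArgumentException("at least one property name is required");
        }
        // Assumption: doubling single quotes is enough to keep the term inside its literal.
        String safeTerm = term.replace("'", "''");
        List<String> conditions = new ArrayList<>();
        for (String property : properties) {
            conditions.add("contains(t.[" + property + "_" + language + "], '" + safeTerm + "')");
        }
        return "select * from [mgnl:page] as t where " + String.join(" or ", conditions);
    }

    public static void main(String[] args) {
        System.out.println(forLanguage("de", "Artikel", "title", "subtitle"));
    }
}
```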
Special characters
Jackrabbit stores all character data (node names and values) in Unicode. This ensures that special characters such as accents and umlauts are indexed and can be used in search. If you experience issues with special characters, this is likely due to character set conversion problems in the application server. See URI encoding in Tomcat.
Security
Search within Magnolia CMS is access controlled. Search results include only content the user has permission to access. Permissions are controlled through security.
When you execute a query in Magnolia context (MgnlContext), contextual factors such as the current user's permissions are taken into account. If you do not have permission to access the items you are querying, they will not show up in the results. Contrast this with running the same query in SystemContext, which provides full access.
External search
The default Jackrabbit search works well for finding documents that contain a given term or properties that have a particular value. However, this approach has its limits when you consider how a dynamic page is put together. Data from many sources is aggregated on the page. There is a good chance that the search term occurs in data that is stored in the Data module, for example. It will only be present on the rendered page. A search query that looks for the term in the website workspace would not find it. Take a web shop, for example. Product descriptions and images will most likely come from nodes in the data workspace. It is not too complicated to execute two queries and aggregate their results, but how do you sort those results in a relevant way?
External Indexing module
The External Indexing module gets around the limitations of the built-in JCR search and makes the content available for external third-party indexers. The content can come from different workspaces (website, dms, commenting, data) and can be of different types (Web pages, documents, forum threads, shop products). It is provided to a third-party search implementation in a uniform way such as plain text, an object or a URL. The module allows you to integrate semantic search implementations for features such as suggesting other pages that are related to the page the visitor is currently on. Read more in Opening the door to semantic search.