Apache Solr and Magnolia Solr module
Apache Solr is a standalone enterprise search server with a REST-like API. Documents are sent to it for indexing as JSON, XML, CSV or binary over HTTP. Search queries are sent via HTTP GET, and the results are returned as JSON, XML, CSV or binary.
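As a rough illustration of this request/response pattern, the following Python sketch builds an indexing request and a search URL against Solr's standard `/update` and `/select` handlers. The base URL `http://localhost:8983/solr` and the collection name `magnolia` are assumptions made for the example and are not taken from the Magnolia Solr module. Adjust them to your own installation.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

SOLR_BASE = "http://localhost:8983/solr"  # assumption: a local Solr instance
COLLECTION = "magnolia"                   # hypothetical collection name


def build_update_request(docs, base=SOLR_BASE, collection=COLLECTION):
    """Build an HTTP POST that sends documents to Solr as a JSON array."""
    url = f"{base}/{collection}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(url, data=body, headers=headers, method="POST")


def build_select_url(query, rows=10, base=SOLR_BASE, collection=COLLECTION):
    """Build an HTTP GET URL for a search query that asks for JSON results."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base}/{collection}/select?{params}"


def search(query, rows=10):
    """Run the query against a live Solr server and return the matching documents."""
    with urlopen(build_select_url(query, rows)) as response:
        return json.load(response)["response"]["docs"]
```

With a Solr server running, `urlopen(build_update_request([{"id": "1", "title": "Hello"}]))` would index one document, and `search("title:hello")` would return the matching documents as a list of dictionaries. The two builder functions work without a server, so you can inspect the URL and body before sending anything.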
Apache Solr uses the Lucene library and is:
- Scalable – Solr scales by distributing work (indexing and query processing) to multiple servers in a cluster.
- Ready to deploy – Solr is open source, is easy to install and configure, and provides a preconfigured example to help you get started.
- Optimized for search – Solr is fast and can execute complex queries in under a second, often in only tens of milliseconds.
- Large volumes of documents – Solr is designed to deal with indexes containing many millions of documents.
- Text-centric – Solr is optimized for searching natural-language text, like emails, web pages, resumes, PDF documents, and social messages such as tweets or blogs.
(from Chapter 1 of Solr in Action)
Magnolia's Solr-based search capability is provided by the Solr module (full name Magnolia Solr Search Provider module), which consists of the following three key submodules:
- Content Indexer - indexes Magnolia workspaces. It can also crawl a published website.
- Search Provider - provides templates for displaying Solr search results on a site and faceted search components.
- Solr Workbench - provides a Solr container for list, search and thumbnail views in content apps.
For installation and configuration information, as well as the Solr module release notes, see the Solr module page tree.
Creating a sitemap with Solr
To create a custom sitemap with Solr in four steps, see the page "Step-by-step integrating Solr in Magnolia and generating custom sitemap" on the Magnolia Community Wiki.