Magnolia CMS employs a web cache to store server responses so that future requests for the same content can be served faster. The chief benefit of using cache is a reduction in information load on the network, meaning:
- Less bandwidth used
- Reduced processing
- Improved responsiveness
Cache is a community module bundled with Community and Enterprise Editions and comes installed by default. To restore the default configuration, delete the /modules/cache node and restart your server.
Although bundled as a separate module, the Cache module is integral to Magnolia CMS and many other modules depend on it. It is not advisable to uninstall it. If necessary, it is possible to disable caching by adding an enabled node set to "false" in Configuration:
How caching works
- Content not modified: If a client has the latest version of content (i.e. not modified), the browser cache policy instructs the filter to respond with "304 Not Modified".
- Modified content: If content has been modified or does not exist in the cache, the filter passes the request to the server cache policy. Server cache policy then analyses the request and replies with the expected behavior. The filter then invokes the appropriate executor. This mechanism allows you to add, remove and use executors by changing the current cache policy.
- Content not available: If the content is not available, the filter passes the request on to Magnolia CMS. On the return trip, the filter reads the content from the response and stores it in the cache store for future use. Flush policy is completely independent from this chain and reacts on content changes rather than content served.
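The "Content not modified" case above can be sketched in a few lines. This is a minimal illustration, assuming the filter compares the content's last-modified time with the client's If-Modified-Since value; the class and method names are hypothetical and are not part of the Magnolia API.

```java
import java.time.Instant;

/** Hypothetical helper for illustration only; not a Magnolia class. */
final class NotModifiedCheck {

    /**
     * Returns 304 when the client's copy is at least as new as the content,
     * otherwise 200. A null ifModifiedSince means the client has no copy.
     */
    static int statusFor(Instant lastModified, Instant ifModifiedSince) {
        if (ifModifiedSince != null && !lastModified.isAfter(ifModifiedSince)) {
            return 304; // client copy is current: "304 Not Modified"
        }
        return 200; // content must be served from cache or re-rendered
    }
}
```

A client that sends no If-Modified-Since value always falls through to the server cache policy in this sketch.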
Cache configurations are defined in /modules/cache/config/configurations/. Within each configuration you can define:
- what to cache,
- when to flush the cache,
- what header data to pass to browsers, and
- specific implementations of tasks.
To select one of the cache configurations, set the cacheConfigurationName parameter in the Cache filter. The chosen configuration is read into a JavaBean using the Content2Bean mechanism, making it dynamically available to your own module code.
Caching behavior for each configuration is defined with policies.
- Server cache policy: This policy defines whether the requested content should be cached or not. The decision to cache relies on voters, which are used whenever configuration values are not assigned at startup but depend on rules. Voters evaluate a rule such as "should content residing at this URL be cached" and return a positive or negative response. By default, all content on public instances is cached except the AdminCentral UI at /.magnolia. Server cache policy is configured in /modules/cache/config/configurations/default/cachePolicy. The default implementation checks if the content exists in the Ehcache store and requests caching if the content is not found.
- Client (browser) cache policy: This policy defines how long the browser may cache a document. The time is passed to the browser in the response header. The default FixedDuration option instructs the browser to cache the document for 30 minutes. Never instructs the browser to do nothing. Client cache policy is configured in
- Flush policy: The Flush policy defines when to flush the cache. The default configuration observes changes (activation, import, edit) in a repository and flushes the cache if new or modified content is detected. Cache can be flushed completely, partially or not at all. Each module can register its own flush policy (or multiple policies) and receive notification about new or modified content in each repository. Flush policies are informed about changes in observed workspaces. The list of observed workspaces can be defined per policy under the repositories sub node of each policy.
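The server and client policies described above can be sketched as two small functions. This is only an illustration under stated assumptions: the class is hypothetical, the voter is reduced to a single URL-prefix rule, and expressing the fixed duration as a Cache-Control max-age value is an assumption rather than a description of the actual FixedDuration implementation.

```java
/** Hypothetical sketch; these are not Magnolia classes. */
final class CachePolicySketch {

    /** Voter-style rule: cache everything except AdminCentral under /.magnolia. */
    static boolean shouldCache(String uri) {
        return !uri.startsWith("/.magnolia");
    }

    /** Builds a Cache-Control value for a fixed browser cache duration in minutes. */
    static String fixedDurationHeader(int minutes) {
        return "max-age=" + (minutes * 60); // header value is expressed in seconds
    }
}
```

With the default of 30 minutes, the header value in this sketch is max-age=1800.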
Registering a workspace for flush policy
The Cache module provides a RegisterWorkspaceForCacheFlushingTask install task that custom modules can use to register their workspace with the default FlushAll policy. When registered, any cached content originating from this repository will be flushed from the cache when a change to any content anywhere in the repository is detected. By default the configuration repository is not observed. This means that activated changes to configuration elements will not always be served in the public instance. This can be overcome by manually flushing the cache in Cache tools. Alternatively, you can add this repository to the list of observed repositories.
Executors are the actions taken once a caching decision has been made. There are three possible actions:
- useCache: Retrieves the cached item from the cache and streams it to the client.
- store: Stores the response in the cache for future use.
- bypass: Skips caching. This is useful for content that cannot or should not be cached.
Executors are configured in /modules/cache/config/configurations/executors. Each of the executors is also responsible for configuring expiration headers.
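The interplay between the policy decision and the three executors can be sketched as follows. This is a simplified model, not Magnolia code: the class, the enum and the in-memory map that stands in for the cache store are all hypothetical, and the policy is reduced to a prefix check plus a lookup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical model of the decision and executor chain; not a Magnolia class. */
final class CacheFlowSketch {

    enum Action { USE_CACHE, STORE, BYPASS }

    // Stand-in for the real cache store.
    private final Map<String, String> store = new HashMap<>();

    /** Policy step: decide which executor should handle the request. */
    Action decide(String uri) {
        if (uri.startsWith("/.magnolia")) {
            return Action.BYPASS; // content that should not be cached
        }
        return store.containsKey(uri) ? Action.USE_CACHE : Action.STORE;
    }

    /** Executor step: act on the decision; renderer stands in for page rendering. */
    String handle(String uri, Supplier<String> renderer) {
        switch (decide(uri)) {
            case USE_CACHE:
                return store.get(uri); // serve the cached copy
            case STORE:
                String rendered = renderer.get();
                store.put(uri, rendered); // keep for future requests
                return rendered;
            default:
                return renderer.get(); // bypass: render every time
        }
    }
}
```

Because the decision and the action are separate steps, swapping the policy changes which executor runs without touching the executors themselves.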
Magnolia CMS uses Ehcache for its back-end cache functionality. Ehcache is a robust, proven and full-featured cache product which has made it the most widely-used Java cache. Ehcache has its own configuration options that can be set in
You can use a different cache library as long as you implement Java interfaces that allow you to configure caching behavior from AdminCentral. The library can be changed by implementing info.magnolia.module.cache.CacheFactory interfaces. A cache factory is an interface that wraps the functionality and hides the configuration mechanism of the library you choose.
| ||10000|| Instructs Ehcache to wait the specified time in milliseconds before attempting to cache the request. Create the
| ||120||Number of seconds between runs of the disk expiry thread.|
| ||false||Determines if disk store persists between restarts of the Virtual Machine.|
| ||30||Size to allocate to DiskStore for a spool buffer. Writes are made to this area and then asynchronously written to disk. Default: 30MB. Each spool buffer is used only by its cache. If you get OutOfMemory errors, you may need to lower this value. To improve DiskStore performance, consider increasing it. Trace level logging in the DiskStore will show if put back ups are occurring.|
| ||true||If elements are set to eternal, timeouts are ignored and the element is never expired.|
| ||10000|| Sets maximum number of objects that will be created in memory.
| ||10000000||Sets maximum number of objects maintained in the DiskStore. The default value of zero means unlimited.|
| ||LRU|| Policy is enforced upon reaching the maxElementsInMemory limit. Available policies:
| ||true||Permits elements to overflow to disk when the memory store has reached the maxInMemory limit.|
| ||0|| Optional attribute. Sets max idle time between accesses for an element before it expires. Only used if the element is not eternal. A value of
| ||0||Sets lifespan for an element. Only used if the element is not eternal. Optional attribute. A value of 0 means an Element can live for infinity.|
Compression is a simple and effective way to save bandwidth and speed up your site. It is a common practice used by Google and Yahoo! for example. (How to Optimize Your Site with GZIP Compression is a great general introduction to the topic.) Compression is performed in the gzip filter, configured in /server/filters/gzip. When a client requests a resource such as index.html, Magnolia CMS delivers it zipped. A typical HTML page is compressed to 20% of its original size. So if your page is 100 kB uncompressed, it is 20 kB compressed. To improve performance further, zipped content is streamed from the repository to the client rather than read into memory first.
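To get a feel for the size reduction, the following sketch compresses a string with the JDK's own java.util.zip classes. It is independent of Magnolia's gzip filter, and the class name is hypothetical; the actual ratio depends on how repetitive the markup is.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Standalone illustration of gzip compression; not Magnolia's gzip filter. */
final class GzipSketch {

    /** Compresses text as UTF-8 bytes in gzip format. */
    static byte[] compress(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray(); // stream is closed, so the gzip trailer is written
    }

    /** Restores the original text from gzip data. */
    static String decompress(byte[] data) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Comparing compress(html).length with the byte length of the original page gives the ratio for your own content.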
text/css: Cascading Style Sheets
To add more content types, such as XML, create a numbered data node under allowed. Use the Internet media type (MIME type) as value. Here are some common media types:
application/xhtml+xml: XHTML
text/csv: Comma-separated values
text/plain: Textual data
text/xml: Extensible Markup Language
application/pdf: Portable Document Format
Accept-Encoding: gzip. Note that Magnolia CMS does not cache big binaries.
Internet Explorer 6
Note that while all modern browsers support compression, some older browsers do not, notably Internet Explorer 6 before Service Pack 2. To get around this, Magnolia uses a userAgent voter that rejects compression and delivers uncompressed content if the browser identifies itself as IE6 in the User-Agent field in request headers.
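The rule applied by such a voter can be sketched as a single boolean check. This is an assumption-laden illustration: the class is hypothetical, and matching on the "MSIE 6" substring is a simplification chosen for the example, not the pattern the real userAgent voter is configured with.

```java
/** Hypothetical sketch of a compression decision; not the Magnolia voter. */
final class CompressionVoterSketch {

    /** Returns true only if the client accepts gzip and does not identify as IE6. */
    static boolean shouldCompress(String acceptEncoding, String userAgent) {
        boolean clientAcceptsGzip =
                acceptEncoding != null && acceptEncoding.toLowerCase().contains("gzip");
        // Assumed pattern: IE6 identifies itself with "MSIE 6" in the User-Agent header.
        boolean isIe6 = userAgent != null && userAgent.contains("MSIE 6");
        return clientAcceptsGzip && !isIe6;
    }
}
```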
To test your compression configuration, use a tool such as Web-Sniffer that lets you easily change the User-Agent header sent with the request.
Here's what the headers look like when the Magnolia CMS demo site home page is submitted to the sniffer.
GET /demo-project.html HTTP/1.1
Host: demopublic.magnolia-cms.com
Connection: close
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept-Encoding: gzip
Accept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7
Cache-Control: no
Accept-Language: de,en;q=0.7,en-us;q=0.3
Referer: http://web-sniffer.net/

Status: HTTP/1.1 200 OK
Date: Fri, 23 Jul 2010 07:45:10 GMT
Server: Apache/2.2.9
X-Magnolia-Registration: Registered
Cache-Control: max-age=900
Last-Modified: Thu, 01 Jul 2010 14:03:12 GMT
Content-Encoding: gzip
Vary: Accept-Encoding
Content-Length: 3852
Connection: close
Content-Type: text/html;charset=UTF-8
Advanced strategies
Advanced caching strategies are available in a separate Advanced Cache module.
Cache related commands are in the
flushAll: Completely flushes all caches that are configured in /modules/cache/config/configurations/default/flushPolicy/policies/flushAll/repositories. Note that the imaging workspace is not included in the flush by default.
flushByUUID: Completely flushes all entries related to given UUID from all available caches. This command expects
Tools > Cache tools in AdminCentral provides the following operations:
- Caches information: Tells you the number of elements currently in the cache.
- Flush from caches by UUID: Allows you to flush a single item by specifying the UUID and repository.
- Flush a cache by name: Allows you to flush a named cache.
- Flush all caches: Flushes all cached elements.
The Cache tools page is installed as part of the Cache module.
By default, the following URLs are cached:
- On public instance everything except /.magnolia/* which is AdminCentral.
- On author instance all static resources, if the magnolia.develop property is set to false.
magnolia.develop property to
true in the default
magnolia.properties file. For more complex configurations, you need to adjust the configuration under the
Flushing the cache
- Shut down Magnolia CMS, delete the cache directory and restart.
- Enable Java Management Extensions (JMX). (It is enabled by default on some application servers.) Connect to the server using jconsole or use your server's own JMX administration interface if provided. Locate the net.sf.ehcache.CacheManager bean and invoke the flush() operation of the default instance of the cache.
- Any activation from author instance to public automatically flushes the public cache.
- Use Tools > Cache tools in AdminCentral.
Excluding content from cache
There are various reasons why you may wish to exclude content from cache. For example, you may have components that query an external data source dynamically. The rendered HTML changes even if the content of the Magnolia CMS page has not changed.
When we say cached content we mean the rendered output generated by Magnolia CMS itself, the actual content of the page. When you exclude a page from cache you tell Magnolia CMS that it should re-render that content every time the page is requested by a user.
Configuring an exclusion
The first option for excluding content from cache is to configure an exclusion in the cache policy. The example below excludes all pages whose URL starts with /.magnolia. This means that AdminCentral pages are not cached.
Implementing a custom cache policy
To implement a custom cache policy, extend the CachePolicy class and override the methods you wish to customize. The default policy:
...determines if a requested page should be cached, retrieved from the cache or not cached at all. It is called for every request and takes care of any expiration policy, that is if the page should be re-cached. The CacheFilter (or any other client component) can determine its behavior based on the return CachePolicyResult, which holds both the behavior to take and the cache key to use when appropriate.
Configure your custom cache policy class in Configuration >
Cache header negotiation
Cache header negotiation is a mechanism that allows templates and components to influence whether the content should be cached and for how long. This mechanism can be used when it is too late to configure an exclude but you do not wish for a page to be cached. Excludes are typically configured before it is clear what kinds of pages editors will add and what kind of content those pages will have. Cache header negotiation allows page components to influence whether the page should be cached.
You can use cache header negotiation for:
- Live, dynamic data. A component that displays dynamic data can indicate that the page the component is on should be cached only for a short duration: 5 minutes, 1 minute etc.
- Personalized content. Components that display personalized content can indicate that the page should not be cached at all.
- Error resolution. If a component fails to read data from an external source and outputs a message to say there is a problem, you may not want to cache the error message, at least not for long. Instead, it is better if the component makes another attempt at getting the data when the next visitor requests the page.
A template can influence the decision in the same way. A template might want to cache the page for 10 minutes because the page displays real-time weather updates. If a component on the page wants to be cached for a maximum of 15 minutes, the template's instructions win because they are stricter. The page is cached for 10 minutes.
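The "strictest instruction wins" rule amounts to taking the minimum of all requested durations. The sketch below models only that arithmetic; the class is hypothetical, and treating 0 as "do not cache" is an assumption made for the example.

```java
import java.util.List;

/** Hypothetical model of cache duration negotiation; not a Magnolia class. */
final class CacheNegotiationSketch {

    /**
     * Returns the strictest (smallest) requested cache duration in minutes.
     * Assumption for this sketch: 0 means "do not cache at all".
     * An empty list returns Integer.MAX_VALUE, meaning no participant set a limit.
     */
    static int negotiate(List<Integer> requestedMinutes) {
        int strictest = Integer.MAX_VALUE;
        for (int minutes : requestedMinutes) {
            strictest = Math.min(strictest, minutes);
        }
        return strictest;
    }
}
```

For the weather example, a template asking for 10 minutes and a component asking for 15 minutes negotiate to 10 minutes.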
The following options are not best practices but they may help you during testing. Don't use them as a long-term production strategy.
- Dummy URL parameter: The simplest way to exclude content is to link to the relevant page with a dummy query parameter in the URL such as http://www.example.com?a=1. A more subtle solution is to add a bypass to the cache filter. This ensures that no cache filter is executed on particular URLs.
- Deny URLs in cache policy: To exclude a URL from caching, add the URL to the deny list of the cachePolicy. Entries on the deny list are not cached by Magnolia CMS but are taken through the entire filter chain, meaning that other policies such as BrowserCachePolicy can still be applied. In effect, this solution 'switches off' caching for the URL in question.
- Regular cache flush: Flush a page from the cache at regular intervals. This involves reconfiguring the underlying cache engine (Ehcache).
Setting cache headers
Cache header negotiation uses standard cache response headers. Cache needs to be enabled for the site or cache headers have no effect on the server side cache. The mechanism is built into the Cache module and requires no extra modules. Cache header negotiation is being introduced to Magnolia CMS in two phases:
- Code: Developers set the cache headers in the rendering model of a component or a template. This is the current implementation.
Example 2: Setting a cache header in a JSP template script.
<% response.setHeader("Cache-Control", "no-cache"); %>