Magnolia 4.5 reached end of life on June 30, 2016. This branch is no longer supported, see End-of-life policy.
The RSS Aggregator module displays external feed content on a Magnolia page (feed aggregation) and generates feeds from Magnolia content (feed syndication). Feed syndication increases your content exposure. This in turn generates more traffic and backlinks that improve your site rank. Displaying aggregated feeds provides continuous fresh content on your site, which encourages regular crawling by search engines. RSS feeds can also add a sense of timeliness and community involvement to your site.
Magnolia 4.5.10+ / RSS Aggregator 1.4.2+: Planet feeds enhance standard feeds by collecting feed statistics. Planet Magnolia is an example site that uses the module to collect blog posts about Magnolia.
A feed aggregator displays external feed content on a Magnolia page. This provides continuous fresh content and encourages regular crawling by search engines.
In this example we collect two individual RSS feeds into one aggregate feed:
Both have a common topic – search engine optimization (SEO) – so they are good candidates for aggregation. A reader interested in one feed is likely to be interested in the other too.
To create an RSS aggregator:
Enter a name, for example SEOBlogs. This is the internal name of the aggregator.
To import the feed data:
See also: Scheduling an automatic feed import.
To display the aggregated feed on the site:
/demo-features/aggregation-paragraphs/rss-aggregation.
content area.
Importing feed data manually is not a long-term solution. You should configure an automatic import schedule. Choose the update frequency depending on how often the feed has new content. For example, if a blog gets one post a day, then importing once a day at 6 a.m. is enough. For more, see Importing data.
The rssaggregator importer is configured in /modules/data/config/importers/rssaggregator.
To schedule automatic feed import in /automatedExecution:
Set the enabled property to true.
Set the cron property to a cron pattern. For example, 0 0 6 * * * imports data daily at 6 a.m. See Cron Maker for help.
Properties:
targets: Defines where the imported data is stored.
  main: Name of the target.
    class: SimpleImportTarget stores at the path configured in targetPath.
    targetPath: Path in the workspace. Typically the root /. The workspace name is configured in the repository property below.
automatedExecution: An import can happen either manually or automatically. This is the automatic import schedule.
  cron: Cron job pattern for scheduled execution of the import handler.
  enabled: Enables and disables scheduled import.
feedFetcher:
  class: FastRSSFeedFetcher retrieves RSS feed content over HTTP. Fetches all feed channels defined in an aggregate feed.
activateImport: Allows imported data to be activated (published) automatically after a successful import. Default is false.
backup: Backs up feed content automatically if you are importing feeds automatically at set intervals. Default is true.
class: RSSFeedImportHandler imports RSS and Atom feeds over HTTP for aggregate feeds. You can optionally configure a feedFetcher that executes the actual import.
deleteOldData: Deletes and deactivates data that is no longer found in the external system. Default is false.
repository: Target workspace for the imported data. Default is data.
You can use SimpleRSSFeedFetcher as an alternative to FastRSSFeedFetcher. This simple, single-threaded fetcher reduces server load and is suitable when you don't have many feed channels and you don't import often.
Feed data is stored in the data workspace. View the data in Data > JCR Browser.
Here's the SEOBlogs example.
Structure:
<RSS aggregator name>
  importState: Internal feed properties.
  feeds: Feed information you entered into the dialog when you created the aggregator.
    <feed number>
      img: Image displayed in planet components.
      link: Feed URL.
      title: Title displayed in RSS components.
  data: Data that was retrieved from the Internet.
    channel-<number>: A channel is created for each RSS feed.
      description: Feed properties retrieved from the Internet.
      entry-<number>: One entry in the feed such as one blog post.
        author
        channelTitle
        content
Content is displayed in the feed components only after it has been imported. Unless the /modules/data/config/importers/activateImport property is set to true, you must activate the aggregator in Data > RSS Aggregator to display feed content on the public instance. This activation also activates the feed data.
There are two standard feed components:
STKCombinedFeed: Combines all feeds in a channel and renders them sequentially. You can set the number, sort order and character limit of the entries.
STKSingleFeed: Displays a defined number of entries for each feed in the channel. The external feed title is used if no internal title is set. You can set the number, sort order and character limit of the entries.
For non-STK users, the RSS Aggregator module provides equivalent components, CombinedFeedParagraph and feedListParagraph. The two sets of components are essentially identical.
The component definitions are configured in STK > Template Definitions /components/teasers/stkCombinedFeed and /stkSingleFeed.
Here's the stkCombinedFeed definition:
The model class and template script are the important properties that determine content in the different components. All RSS model classes extend AbstractFeedModel, which provides the business logic to retrieve a defined feed and its data from the data workspace and supports:
Entry sorting by title or publication date.
Ascending and descending entry sorting.
Maximum results property. Default is 20.
Search capabilities.
Any custom model class should extend AbstractFeedModel.
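The sorting and limiting behaviour listed above can be sketched in a few lines. This is a minimal illustration in Python, not the AbstractFeedModel implementation: the FeedEntry type, the sort_entries function and its parameter names are hypothetical. Only the sort keys (title, publication date), the two sort directions and the default maximum of 20 come from the list above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FeedEntry:
    # Hypothetical stand-in for one imported entry node.
    title: str
    pub_date: datetime


def sort_entries(entries, sort_by="pub_date", descending=True, max_results=20):
    """Sort entries by title or publication date and cap the result size."""
    if sort_by not in ("title", "pub_date"):
        raise ValueError("sort_by must be 'title' or 'pub_date'")
    ordered = sorted(entries, key=lambda e: getattr(e, sort_by), reverse=descending)
    return ordered[:max_results]


entries = [
    FeedEntry("Link building basics", datetime(2013, 3, 1)),
    FeedEntry("Anchor text tips", datetime(2013, 3, 5)),
    FeedEntry("Meta tags revisited", datetime(2013, 2, 20)),
]
# Newest entries first, using the default limit of 20.
newest_first = sort_entries(entries)
# Alphabetical by title, limited to two entries.
by_title = sort_entries(entries, sort_by="title", descending=False, max_results=2)
```

A custom model would apply the same idea to the entry nodes it reads from the data workspace.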
Magnolia 4.5.10+ / RSS Aggregator 1.4.2+: Planet feeds have all the features of standard RSS feeds but also store additional data that is used to create feed statistics. For example, a planet feed will tell you the number of posts by an author. You can use this information in a Planet Statistics component.
To create a planet feed, check the Planet Feed box in the aggregator dialog when creating or editing a feed. A feed can be marked as a planet feed at any time. When you change a standard feed to a planet feed, re-import the feed data before generating the additional planet data (described below).
The module includes two custom commands that generate planet data. These are configured in /modules/rssaggregator/commands/planet/generatePlanetData and /collectPlanetStatistics.
The PlanetDataGenerator command is executed when planet data needs to be updated. Here's what the command does:
author, channelTitle, title, content and description properties. Posts without these properties are not stored in the archive.
The CollectStatisticsCommand command is executed when planet statistics need to be updated. Here's what the command does:
Use the Scheduler module to schedule planet updates. The update schedules are stored as standard scheduler jobs in /modules/scheduler/config/jobs/generatePlanetData and /collectPlanetStatistics. The jobs execute the planet commands that do the actual work.
To schedule a planet update:
Set the active property to true.
Set the cron pattern. For example, 0 0 6 * * * runs the job daily at 6 a.m. See Cron Maker for help.
There is no way to manually generate the planet data. You can use the pattern 0 0/5 * 1/1 * to generate the data five minutes after changing the settings.
To display planet components on a site:
Make the components available in any area. You can make them available in a specific page template or globally for all pages in the template prototype. The planet components are not available by default.
In Website edit a page where the components are available.
Add the components to the page.
Here are the planet components using our SEOBlogs example data.
Planet Feed: Displays feed posts, including images. Editors can define the length of the text, and pagination is available.
Planet Statistics: Shows a list of authors with statistics (number of posts).
Planet Authors: Shows a list of sites with individual feed subscriptions.
Planet data is used to generate content for planet components. The data is stored in the data workspace. View the data in Data > JCR Browser.
Here's the SEOBlogs example after marking it as a planet feed, re-importing the data, and running the planet commands.
Properties:
<aggregator name>
  planetData: The GeneratePlanetData command reads feed data and stores it here.
    posts-<number>: All posts from all feeds. First all entries from the first feed, newest entry first. Then all posts from the second feed, and so on.
      entry-<number>: One entry in the feed such as one blog post.
        author
        authorLink
        channelTitle
        checksum1/2: The module uses checksums to handle duplicate entries. Because feed data is deleted and recreated on every run, there is a high probability that a subsequent run will include entries that were contained in a previous run. Some entries may only have changed slightly, for example a different publication date. To avoid duplicates, the PlanetDataGenerator command uses checksums for each entry. Two checksum properties are generated. If an archive node with one of the checksums exists, no data is stored for the new item and an INFO level entry is written in the logs.
        description
        hidden: Set to true to hide the entry from the planet feed. Useful for hiding spam entries.
        link
        pubDate
        rssLink
        title
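The duplicate handling described for the checksum1/2 properties can be sketched as follows. This is a rough Python illustration and not the PlanetDataGenerator code. The text only says that two checksums are generated and that an entry is skipped if either one already exists in the archive. Which fields feed each checksum and the hash algorithm (SHA-1 here) are assumptions, and all function names are hypothetical.

```python
import hashlib
import logging

log = logging.getLogger("planet")


def checksum(*parts):
    # Hypothetical choice: SHA-1 over the joined field values.
    return hashlib.sha1("\n".join(parts).encode("utf-8")).hexdigest()


def entry_checksums(entry):
    # Assumption: the fields covered by each checksum are not documented here.
    checksum1 = checksum(entry["title"], entry["content"])
    checksum2 = checksum(entry["author"], entry["link"])
    return checksum1, checksum2


def store_if_new(entry, archive, seen):
    """Append entry to archive unless one of its checksums is already known."""
    sums = entry_checksums(entry)
    if any(s in seen for s in sums):
        log.info("Skipping duplicate entry: %s", entry["title"])
        return False
    seen.update(sums)
    archive.append(entry)
    return True


archive, seen = [], set()
post = {"title": "SEO tips", "content": "Body", "author": "Ann", "link": "http://example.com/1"}
# The same post seen again on a later run, differing only in publication date.
repost = dict(post, pubDate="2013-03-02")
first = store_if_new(post, archive, seen)
second = store_if_new(repost, archive, seen)
```

Because neither assumed checksum includes the publication date, the re-imported post is recognised as a duplicate and skipped.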
Planet statistics are generated from planet data and stored in the data workspace. View the data in Data > JCR Browser.
Here's the data for the SEOBlogs example.
Properties:
<aggregator name>
  statistics: The CollectPlanetStatistics command extracts statistics from the /planetData node. This node is deleted and recreated on every run of the command.
    authors: All authors from all feeds.
      author-<number>: Each author is allocated a number.
        author
        blogLink
        feedLink
        postCount: Number of posts by this author in the aggregate feed.
        counted-posts: Each child node is a reference to a post in the feed.
          <post UUID>
Planet data is generated and stored for the last 3 months by default. You can configure the time period for which the data is retained in /modules/rssaggregator/config/planetOptions/lastMonthsIncluded.
Properties:
planetOptions
  lastMonthsIncluded: Number of months. Default is 3.
Planet component definitions are configured in /modules/rssaggregator/templates/components/.
The Planet Feeds component is an enhanced version of the STK Combined Feed component and can only be used with planet feeds. The component:
The component definition is in /modules/rssaggregator/templates/components/planetFeeds.
The Planet Statistics component uses the planet statistics data. The component:
The component definition is in /modules/rssaggregator/templates/components/feedStatistics.
The Planet Authors component allows users to subscribe directly to the external feeds on a Magnolia page. The links take the user to the external sites.
The component definition is in /modules/rssaggregator/templates/components/feedSubscriptions.
Feed syndication increases your content exposure. This in turn generates more traffic and backlinks that improve your site rank.
Feed generators generate RSS feeds from Magnolia content and imported content stored in the data workspace.
Four feed generators are registered in /modules/rssaggregator/config/feedGenerators.
rss: RSSModuleFeedGenerator generates a SyndFeed based on standard aggregation feeds.
planet: PlanetFeedGenerator generates a SyndFeed based on planet aggregation feeds.
category: CategorySyndicator is registered by the Categorization module. It generates a SyndFeed based on articles tagged with categories.
templateContent: PageSyndicator is registered by the Standard Templating Kit module. It generates a SyndFeed based on template categories and subcategories.
Custom generators should extend the convenience base class AbstractSyndFeedGenerator. Subclasses need to implement the template methods loadFeedEntries() and setFeedInfo(SyndFeed).
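The template-method arrangement described above can be illustrated as follows. This is a language-neutral sketch written in Python, not the Magnolia Java API. The method names mirror loadFeedEntries() and setFeedInfo(SyndFeed) from the text. The generate() driver, the dictionary used in place of a SyndFeed and the StaticGenerator subclass are hypothetical.

```python
from abc import ABC, abstractmethod


class AbstractFeedGenerator(ABC):
    """Base class: fixes the order of the steps, subclasses fill them in."""

    def generate(self):
        # A plain dict stands in for a SyndFeed in this sketch.
        feed = {"entries": self.load_feed_entries()}
        self.set_feed_info(feed)
        return feed

    @abstractmethod
    def load_feed_entries(self):
        """Return the list of entries for the feed."""

    @abstractmethod
    def set_feed_info(self, feed):
        """Set feed-level metadata such as title and link."""


class StaticGenerator(AbstractFeedGenerator):
    # Hypothetical subclass that returns fixed sample data.
    def load_feed_entries(self):
        return [{"title": "First post"}, {"title": "Second post"}]

    def set_feed_info(self, feed):
        feed["title"] = "SEOBlogs"
        feed["link"] = "http://example.com/rss"


feed = StaticGenerator().generate()
```

The base class owns the overall flow, so a custom generator only decides where entries come from and how the feed is labelled.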
FeedSyndicationServlet writes an XML feed to the response. Based on the request parameters, the feedGenerators configuration (above) is resolved and used to generate the XML feed. The content of the feed is written to the response with the appropriate character encoding.
The servlet is registered in /server/filters/servlets/FeedSyndicationServlet.
The syndication components use virtual URI mappings to redirect the generated feeds. The mappings use regular expressions and are called by the feed generator classes to render appropriate content in the XML feed.
/modules/rssaggregator/virtualURIMappings/rssFeeds and /planetFeeds.
contentFeeds mapping in /modules/standard-templating-kit/virtualURIMapping/contentFeeds.
categoryFeeds mapping in /modules/categorization/virtualURIMapping/categoryFeeds.
Three modules provide syndication components: Standard Templating Kit, Categorization and RSS Aggregator. All components rely on the functionality of the RSS Aggregator module.
The STK module includes the stkExtrasContentTypeRSSFeed component. The component renders an RSS subscription icon on the page. Editors can define feeds that aggregate pages based on the Article, News or Events templates, and select a parent page.
The component definition is in STK > Template Definitions /components/extras/stkContentTypeRSSFeed:
Properties:
modelClass: ContentTypeSyndicateModel creates an STK renderable definition and returns the appropriate content.
templateScript: syndicate.ftl (Git) renders the RSS icon in the component and itemLinks provided by the model class.
The contentFeeds URI mapping calls the templateContent feed generator class (PageSyndicator) that generates the XML feed for content based on templates with a category property equal to content, and a subCategory property equal to the selection in the dialog. Here's the generated feed for all article pages in the demo-project/about/subsection-articles section.
The Categorization module includes the categoryRSSFeed component. The component renders an RSS subscription icon on the page. The generated feed includes all pages tagged with specified categories. Editors can define the categories and root page for the feed.
The component definition is in /modules/categorization/templates/components/categoryRSSFeed.
Properties:
modelClass: The CategorySyndicateModel model class provides the business logic to select the relevant entries for the feed.
templateScript: syndicate.ftl (Git) renders the RSS icon in the component and itemLinks provided by the model class.
The categoryFeeds URI mapping calls the category feed generator class (CategorySyndicator) that generates the XML feed for content based on the categories and root page selected in the dialog. The URL for the generated feed of content in the demo-project/about section, tagged with the family category, is similar to http://localhost:8080/magnoliaAuthor/rss/?generatorName=category&categories=ab9437db-ab2c-4df5-bb41-87e55409e8e1&siteRoot=/demo-project/about/subsection-articles. Compare the feed URL to the categoryFeeds virtual URI mapping. The long number sequence is the UUID of the family category set up in Data > Category.
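To see which request parameters the example feed URL carries, it can be taken apart with a few lines of Python. This sketch only parses the URL quoted above using the standard library. It says nothing about how FeedSyndicationServlet reads the parameters internally, and the variable names are chosen for illustration.

```python
from urllib.parse import urlsplit, parse_qs

url = (
    "http://localhost:8080/magnoliaAuthor/rss/"
    "?generatorName=category"
    "&categories=ab9437db-ab2c-4df5-bb41-87e55409e8e1"
    "&siteRoot=/demo-project/about/subsection-articles"
)

# parse_qs returns a list per parameter; each parameter occurs once here.
params = {name: values[0] for name, values in parse_qs(urlsplit(url).query).items()}

generator_name = params["generatorName"]  # name of the registered feed generator
category_uuid = params["categories"]      # UUID of the category node
site_root = params["siteRoot"]            # root page selected in the dialog
```

The generatorName value matches one of the names registered under feedGenerators, which is how the request is tied to a generator.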
The RSS Aggregator module includes the planet feedSyndication component that renders subscription Atom and RSS icons and links. Editors can add a title and select a planet feed.
The component definition is in /modules/rssaggregator/templates/components/feedSyndication.
Properties:
templateScript: The feedSyndication.ftl (Git) script renders the icons and links, and the PlanetFeedGenerator generator class renders the feed.
The anonymous user does not have permissions to the data workspace on the author or public instance by default. Public users cannot see the component content. You can check this by logging out of the public instance.
The RSS Aggregator module creates the rss-aggregator-base role with the following permissions:
Workspace | Permission | Scope | Path |
---|---|---|---|
Data | Read only | Selected and sub nodes | /rssaggregator |
To give anonymous access to the RSS components, assign the rss-aggregator-base role to the anonymous user on the public instance.