Magnolia 4.5 reached end of life on June 30, 2016. This branch is no longer supported. See End-of-life policy.
The Google Sitemap module creates an XML sitemap file that lists URLs for each page. Sitemaps are used to tell search engines which pages they should index. This improves search engine optimization (SEO) by ensuring that all site pages are found and indexed. This is particularly important for sites that use dynamic access to content such as Adobe Flash and for sites that have JavaScript menus that do not include HTML links. Where navigation is built with Flash, a search engine will probably find the site homepage automatically, but may not find subsequent pages unless they are provided in a Google Sitemap format.
Note that using Google Sitemaps does not guarantee that all links will be crawled, and even crawling does not guarantee indexing. Nevertheless, a Google Sitemap is still the best insurance for visibility in search engines. Webmasters can include additional information about each URL, such as when it was last updated, how often it changes, and how important it is in relation to other URLs in the site. Google Sitemaps adhere to the Sitemaps protocol and are ready to be submitted to search engines.
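To make the per-URL information concrete, here is a minimal sketch that builds a sitemap document with the Python standard library. It is not the module's own code. The `build_sitemap` helper and the example URL are hypothetical; the element names, the namespace, the allowed `changefreq` values and the 0.0 to 1.0 priority range come from the Sitemaps protocol.

```python
import xml.etree.ElementTree as ET

# Namespace and allowed values as defined by the Sitemaps protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
CHANGEFREQ_VALUES = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def build_sitemap(entries):
    """Hypothetical helper: return sitemap XML as a string for a list of entry dicts."""
    ET.register_namespace("", SITEMAP_NS)  # serialize without an "ns0:" prefix
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for entry in entries:
        if entry["changefreq"] not in CHANGEFREQ_VALUES:
            raise ValueError(f"invalid changefreq: {entry['changefreq']!r}")
        if not 0.0 <= entry["priority"] <= 1.0:
            raise ValueError(f"priority out of range: {entry['priority']!r}")
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}").text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


# Example entry; the URL and date are placeholders.
sitemap_xml = build_sitemap([
    {
        "loc": "http://www.example.com/contact-us.html",
        "lastmod": "2015-06-18",
        "changefreq": "weekly",
        "priority": 0.5,
    }
])
print(sitemap_xml)
```

Validating `changefreq` and `priority` before writing the entry keeps malformed values out of the file that search engines read.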
The Google Sitemap module is not bundled with the Community or Enterprise Editions. Download the module from the Magnolia Store or the Nexus repository.
To install the module, follow the general module installation instructions.
See the general module uninstalling instructions and advice.
The Google Sitemap module is configured in /modules/google-sitemap.
To create a sitemap:
standard and mobile. Google recommends that you use separate sitemaps for different content types. Mobile sitemaps comply with the mobile-specific tag and namespace requirements.
dialog property to the Sitemap template if the dialog does not open. For earlier versions see below.
The root node of the selection will not be included in the sitemap. Assume you have the following trees: /a/b/c and /a/b/d. If you select /a/b as the root of the sitemap, only pages under c and d will be included in the map. The root node b will not be included.
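The root exclusion rule can be sketched as a simple path filter. This is an illustration only, not the module's implementation: the `pages_in_sitemap` function is hypothetical, and it assumes that c and d themselves are listed along with their descendants.

```python
def pages_in_sitemap(all_paths, root):
    """Hypothetical helper: keep pages strictly below root, never root itself."""
    prefix = root.rstrip("/") + "/"
    return [path for path in all_paths if path.startswith(prefix)]


# Example tree from the text, plus a sibling (/a/x) and a grandchild (/a/b/c/e).
paths = ["/a", "/a/b", "/a/b/c", "/a/b/c/e", "/a/b/d", "/a/x"]
selected = pages_in_sitemap(paths, "/a/b")
print(selected)
```

Matching on the prefix with a trailing slash is what excludes the root node b and unrelated siblings such as /a/x.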
Sitemap links are generated using the protocol that is defined in your site definition. The default protocol is HTTP. If you want HTTPS, define the protocol in the domain mapping.
For versions up to 4.5.13, the Google sitemap page does not have a properties dialog and mobile sitemaps are not supported. To exclude the sitemap page from site navigation, the options are:
dialog property in the siteMapsConfiguration template definition. Set the value to the generic standard-templating-kit:generic/master/basePageProperties dialog. Then open the dialog and check the Hide in navigation box.
hideInNav property under the sitemap page node and set it to true.
To define properties for the entries, click Edit properties:
priority in XML Sitemap protocol.
always, hourly, daily, weekly, monthly, yearly and never. Use the value always for pages that change each time they are accessed. Use never for archived pages that will never change. See changefreq in XML Sitemap protocol.
hideIngoogleSiteMap property is stored in the page itself. This means you need to activate the page. Activating only the sitemap is not enough.
To include virtual URIs in your sitemap, add the VirtualUriComponent to the page. No dialog is associated with this component. The component directly renders the virtual URIs defined in this instance. Virtual URI mappings are a Magnolia method of redirecting requests and shortening URLs. The Google Sitemap module reads all virtual URI mappings from the system and lists them here. Set properties as required. The entries display as a list (as opposed to a tree), and you can set the same properties that are available for pages, except Hide children, which does not apply.
Hide default mappings defined in the adminInterface module, such as those for accessing AdminCentral. Public users will not access AdminCentral, so these URLs do not need to appear in the sitemap. Also, hide mappings that use regular expressions in the toURI property. These are not understood by search engines as regular expressions.
Activate the sitemap page to the public instance to ensure that it is accessible to the search engines. You also need to activate any pages you excluded from the sitemap.
You can view the XML sitemap on the author or public instance at /<CATALINA_HOME>/<contextPath>/<sitemap name>.xml, for example, http://localhost:8080/magnoliaPublic/sitemap.xml. Note that a filter mechanism removes duplicate URLs.
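How the module removes duplicates is not described here. As a rough sketch of the idea only, the following hypothetical `remove_duplicate_urls` function drops repeated URLs while keeping the order of first appearance; the example URLs are placeholders.

```python
def remove_duplicate_urls(urls):
    """Hypothetical helper: drop repeated URLs, keeping first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(urls))


collected = [
    "http://localhost:8080/magnoliaPublic/contact-us.html",
    "http://localhost:8080/magnoliaPublic/about.html",
    "http://localhost:8080/magnoliaPublic/contact-us.html",
]
unique_urls = remove_duplicate_urls(collected)
print(unique_urls)
```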
Here's the rendered XML for a standard and mobile sitemap for the demo-project site. Note the use of the mobile tags.
The siteMapsConfiguration page template renders the sitemap. The configuration is at /modules/google-sitemap/templates/pages/siteMapsConfiguration:
modelClass: SiteMapModel is the main model class for sitemap templates.
templateScript: main.ftl includes two alternative scripts, mainXml.ftl and mainConfiguration.ftl, that render text or XML content depending on the URL extension.
dialog: google-sitemap:pages/googleSitemapProperties.
Add the following line to your robots.txt file. Include the full URL to the sitemap:
Sitemap: http://www.example.com/sitemap.xml
Submit the sitemap to major search engines via the webmaster tools of each engine or wait for the engines to find the sitemap on their own.
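Before submitting, it can help to confirm that the Sitemap line in robots.txt is readable by a standard parser. The sketch below uses `urllib.robotparser` from the Python standard library (`site_maps()` requires Python 3.8 or later). The robots.txt content is an assumed example, parsed from a string rather than fetched from a server.

```python
from urllib.robotparser import RobotFileParser

# Assumed example content; only the Sitemap line comes from the text above.
robots_txt = """\
User-agent: *
Allow: /

Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns the listed sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```

To check a live site instead, `set_url()` followed by `read()` fetches the file over the network before `site_maps()` is called.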
1 Comment
James Allingham
Hi,
When I generate a sitemap, it contains vwqwertyp, de and default in the URLs. For example, the two URLs below are present in one sitemap:
1. http://localhost:8080/vwqwertyp/default/de/contact-us
2. http://localhost:8080/vwqwertyp/default/contact-us
I don't want vwqwertyp, default and de in the URLs. I only want true URLs in the sitemap. So in the above example, only one URL should be included in the sitemap. It should be:
http://localhost:8080/contact-us because this is how the contact-us page is accessed by end users on the public site. We do not have a German site, so I don't know where de comes from.
I also generated a sitemap using the demo site, and even that contains de. Is there some kind of configuration to include or exclude these from sitemap.xml?
ex:
<url>
<loc>http://travel-demo.magnolia-cms.com/meta/privacy.html</loc>
<lastmod>2015-06-18</lastmod>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
<url>
<loc>http://travel-demo.magnolia-cms.com/de/meta/privacy.html</loc>
<lastmod>2015-06-18</lastmod>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
I would appreciate any advice you can give me.