Google Sitemap

The Google Sitemap module creates an XML Sitemap file that lists URLs for each page. Sitemaps are used to tell search engines which pages they should index. This improves search engine optimization (SEO) by ensuring that all site pages can be found. This is particularly important for sites that use dynamic access to content such as Adobe Flash and for sites that have JavaScript menus that do not include HTML links. Where navigation is built with Flash, a search engine will probably find the site homepage automatically, but may not find subsequent pages unless they are provided in a Google Sitemap format.

Note that using Google Sitemaps does not guarantee that all links will be crawled, and even crawling does not guarantee indexing. Nevertheless, a Google Sitemap is still the best insurance for visibility in search engines. Webmasters can include additional information about each URL, such as when it was last updated, how often it changes, and how important it is in relation to other URLs in the site. Google Sitemaps adhere to the Sitemaps protocol and are ready to be submitted to search engines.
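
To make the per-URL information concrete, here is a minimal sketch of a Sitemaps-protocol document built with Python's standard library. It is not the module's own implementation. The `build_sitemap` helper, the example URLs, and the date are hypothetical. The element names (`urlset`, `url`, `loc`, `lastmod`, `changefreq`, `priority`) and the namespace come from the Sitemaps protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return a Sitemaps-protocol XML string for a list of entry dicts.

    Hypothetical helper for illustration only.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # <loc> is the only required child of <url>.
        ET.SubElement(url, "loc").text = entry["loc"]
        # lastmod, changefreq and priority are optional.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical example data.
example = build_sitemap([
    {
        "loc": "http://www.example.com/",
        "lastmod": "2013-01-15",
        "changefreq": "weekly",
        "priority": 1.0,
    },
    {"loc": "http://www.example.com/about"},
])
print(example)
```

The second entry carries only a `loc`, which shows that the extra properties are optional hints, not requirements.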

Download

The Google Sitemap module is not part of the standard Community or Enterprise Edition bundles. You can download it from the Magnolia Store or the Nexus repository.

Installing

To install the module, follow the standard module installation instructions. The installation process adds a GoogleSiteMap page template and two component templates: SiteComponent, which adds physical pages to the Sitemap, and VirtualUriComponent, which adds virtual URIs.

Uninstalling

See the general module uninstalling instructions and advice.

Creating a Google Sitemap page

Create a new page on the site and assign the GoogleSiteMap template to it. You can place the page anywhere in the website tree. Multiple Sitemaps are supported: if you plan to have Sitemaps for more than one site, place each Sitemap page under the root node of its site.

Tip

You might want to hide this page from navigation. Open the Properties dialog and check the Hide in navigation box.

Including pages in the Sitemap

Add the SiteComponent to the page to include pages in the Sitemap. The component allows you to select a site root or branch. All pages under that branch will be listed in the Sitemap.

Note

The root node of the selection is not included in the Sitemap. Assume you have the following trees: /a/b/c and /a/b/d. If you select /a/b as the root of the Sitemap, only c and d, along with any pages beneath them, are included. The root node b itself is not.
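
The exclusion of the root node can be sketched in a few lines of Python. This is an illustration of the rule, not the module's code. The `pages_in_sitemap` function and the extra paths in `tree` are hypothetical. It assumes that c and d themselves are listed, as the Best Practice section on this page states.

```python
def pages_in_sitemap(all_paths, root):
    """Return the paths strictly below root; the root itself is excluded.

    Hypothetical helper for illustration only.
    """
    prefix = root.rstrip("/") + "/"
    return sorted(p for p in all_paths if p.startswith(prefix))


# Hypothetical website tree, flattened to paths.
tree = ["/a", "/a/b", "/a/b/c", "/a/b/d", "/a/x"]
print(pages_in_sitemap(tree, "/a/b"))
```

Matching on the prefix `/a/b/` rather than `/a/b` is what keeps the root node, and unrelated siblings such as a hypothetical `/a/bx`, out of the result.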

The SiteComponent displays the pages under the selected root node as a list.

The following properties define how often the page changes and its importance in relation to the other pages on the site. The search engine will consider the properties as it decides how often to re-index the site.

| Property | Description |
| --- | --- |
| Changefreq | Suggested frequency for search engines to crawl the page. Valid values are: always, hourly, daily, weekly, monthly, yearly and never. Use always for pages that change each time they are accessed. Use never for archived pages that will never change. |
| Priority | Priority of the page relative to other pages on the site. Values range from 0.0 (low) to 1.0 (high). Set the priority of your most important page to 1.0. Setting all pages to 1.0 does not increase the rank of your site in search results, because the value is only a relative measure among pages of the same site. A search engine may, however, use the value to rank the page higher than other pages of the site. |
| Hide in Sitemap | If selected, this page is excluded from the Sitemap XML. Its subpages are still included. |
| Hide all children | If selected, this page is included in the Sitemap XML. Its subpages are not included. |
| Both checkboxes selected | Neither this page nor its subpages are included in the Sitemap XML. |
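
The way the two checkboxes combine can be sketched as follows. This is a simplified model, not the module's implementation. The `Page` class and `visible_paths` function are hypothetical, and the sketch assumes that "Hide all children" removes the whole subtree below the page, not just its direct children.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical stand-in for a page listed by the SiteComponent."""
    path: str
    hide_in_sitemap: bool = False  # "Hide in Sitemap" checkbox
    hide_children: bool = False    # "Hide all children" checkbox
    children: list = field(default_factory=list)


def visible_paths(page):
    """Collect the paths that would appear in the Sitemap XML."""
    # "Hide in Sitemap" drops only the page itself.
    paths = [] if page.hide_in_sitemap else [page.path]
    # "Hide all children" prunes everything below the page (assumption).
    if not page.hide_children:
        for child in page.children:
            paths.extend(visible_paths(child))
    return paths


# Hypothetical pages, one per checkbox combination.
news = Page("/news", hide_in_sitemap=True, children=[Page("/news/2013")])
archive = Page("/archive", hide_children=True, children=[Page("/archive/old")])
private = Page("/private", True, True, [Page("/private/notes")])

for page in (news, archive, private):
    print(page.path, "->", visible_paths(page))
```

In this model `/news` contributes only its subpage, `/archive` contributes only itself, and `/private` contributes nothing, which matches the three checkbox rows in the table.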

Best Practice

Best practice is to hide the default mappings defined in the adminInterface module, such as those for accessing AdminCentral. Public users do not access AdminCentral, so these URLs do not need to appear in the Sitemap. Also hide mappings that use regular expressions in the toURI property, because search engines do not interpret them as regular expressions. Note also that if you have the trees a/b/c and a/b/d and select a/b as the root, only c and d are displayed. The root node b is not displayed.

Including virtual URIs in the Sitemap

No dialog is associated with this component. The component directly renders the virtual URIs defined in this instance. Virtual URI mappings are a Magnolia CMS method of redirecting requests and shortening URLs. The Google Sitemap module reads all virtual URI mappings from the system and lists them here. Set properties as required. Note that the properties dialog does not have a 'Hide all children' option.

Activating the page

Activate the Google Sitemap page to the public instance. To test whether the Sitemap is served correctly, request its URL on the public instance. Assuming you placed the Sitemap page under the demo-project site root, the URL is:

http://localhost:8080/magnoliaPublic/demo-project/sitemap

Displaying the Sitemap in XML

To see the XML of the Sitemap, request the page with the .xml extension instead of .html. Note that a filter mechanism removes duplicate URLs.
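
One way to check the XML output is to parse it and list the URLs it contains. The sketch below is an illustration under stated assumptions: the helper names `extract_locs` and `fetch_locs` and the sample document are hypothetical, the XML is assumed to use the standard Sitemaps namespace, and the commented-out URL assumes the demo-project page from the previous section with .xml appended.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Namespace defined by the Sitemaps protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_locs(xml_text):
    """Return the <loc> values of a Sitemap document, in document order."""
    root = ET.fromstring(xml_text)
    return [(loc.text or "").strip() for loc in root.findall("sm:url/sm:loc", NS)]


def fetch_locs(url):
    """Download a Sitemap and return its <loc> values (needs a running server)."""
    with urlopen(url) as response:
        return extract_locs(response.read())


# Hypothetical sample standing in for the module's XML output.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/a</loc></url>
  <url><loc>http://www.example.com/b</loc></url>
</urlset>
""".encode("utf-8")

locs = extract_locs(SAMPLE)
print(locs)
# A Sitemap without duplicates has as many unique URLs as entries.
print(len(locs) == len(set(locs)))

# Against a running public instance (assumed URL):
# locs = fetch_locs("http://localhost:8080/magnoliaPublic/demo-project/sitemap.xml")
```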

Specifying the Sitemap location in your robots.txt file

Add the following line to your robots.txt file. Include the full URL of the Sitemap:

Sitemap: http://www.example.com/sitemap.xml
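
To confirm that a crawler-style parser picks up the Sitemap line, you can run the file through Python's standard `urllib.robotparser` (the `site_maps()` method requires Python 3.8 or later). This is a sketch only. The `User-agent` and `Allow` lines are illustrative additions and not part of the instructions above.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt; only the Sitemap line comes from this page.
robots_txt = """\
User-agent: *
Allow: /
Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns the list of Sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```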

Submitting a Sitemap to search engines

Submit the Sitemap to the major search engines through each engine's webmaster tools, or wait for the engines to find the Sitemap on their own.