The Google Sitemap app is installed by the Google Sitemap module.

The Google Sitemap app creates an XML sitemap file that lists URLs for each page. Sitemaps are used to tell search engines which pages they should index. This improves search engine optimization (SEO) by ensuring that all site pages are found and indexed. This is particularly important for sites that use dynamic access to content such as Adobe Flash and for sites that have JavaScript menus that do not include HTML links. Where navigation is built with Flash, a search engine will probably find the site homepage automatically, but may not find subsequent pages unless they are provided in a Google Sitemap format.

Note that using Google Sitemaps does not guarantee that all links will be crawled, and even crawling does not guarantee indexing. Nevertheless, a Google Sitemap is still the best insurance for visibility in search engines. Webmasters can include additional information about each URL, such as when it was last updated, how often it changes, and how important it is in relation to other URLs in the site. Google Sitemaps adhere to the Sitemaps protocol and are ready to be submitted to search engines.
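To make the per-URL metadata concrete, here is a minimal Python sketch that builds a document in the Sitemaps protocol format. It is not the module's rendering code (the module renders the sitemap with a FreeMarker template); the function name build_urlset and the example.com URL are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(entries):
    """Return a <urlset> document as a string.

    Each entry is a dict with a required "loc" key and optional
    "lastmod", "changefreq" and "priority" keys.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit the child elements in a fixed, readable order.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_urlset([
        {
            "loc": "http://www.example.com/about.html",  # hypothetical page
            "lastmod": "2015-06-25",
            "changefreq": "weekly",
            "priority": 0.5,
        }
    ]))
```

Only loc is mandatory in the protocol; the other three elements are hints that a search engine may or may not honor.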

Configuration

The app is based on the content app framework, so its configuration is typical of any content app. The app is configured at /modules/google-sitemap/apps/siteMaps.

Node name

modules
  google-sitemap
    apps
      siteMaps

Workspace

The Google Sitemap app operates on the googleSitemaps workspace that stores the sitemaps.

Node types

The Google Sitemap module registers a custom mgnl:siteMap node type. The Google Sitemap app operates on nodes of this type and on folders.

Creating a sitemap

Sitemaps are created in the Google Sitemaps app.

To create a new sitemap:

  • Click Add sitemap. You can arrange your sitemaps in folders if you like.
  • Site Map Properties:
    • Name: The internal name of the sitemap. It is used in the URL that renders the sitemap. See Viewing the sitemap.
    • URI: Optional. Virtual URI that redirects visitors to the actual sitemap page. Useful for shortening a long URI.
    • Include virtual URIs: Select to include any defined virtual URIs. Virtual URI mappings are a Magnolia method of redirecting requests and shortening URLs. The app reads all virtual URI mappings from the system and lists them in the Virtual URIs subapp.
    • Site map type: Two sitemap types are available, Standard and Mobile. Google recommends that you use separate sitemaps for different content types. Mobile sitemaps comply with the mobile-specific tag and namespace requirements.
  • Site Selection:
    • Sites: Select the relevant site in the Pages chooser. You can also select subpages as the root node to, for example, create different sitemaps for site sections.

      The root node of the selection will not be included in the site map. Assume you have the following trees: /a/b/c and /a/b/d. If you select /a/b as the root of the sitemap, only pages under c and d will be included in the map. The root node b will not be included.

      Sitemap links are generated using the protocol that is defined in your site definition. The default protocol is HTTP. If you want HTTPS, define the protocol in the domain mapping.

  • Default Value Selection:
    • Change frequency: Select the default value of Change frequency to use in the current site map.
    • Priority: Select the default priority value to use in the current site map.

The default values displayed in the Default Values Selection tab are configured in /modules/google-sitemap/config.

Node name                Value

modules
  google-sitemap
    config
      changeFrequency    weekly
      priority           0.5
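The root-node rule described under Site Selection can be sketched in Python as follows. This is an illustration only, not the module's implementation. The helper name pages_in_sitemap is hypothetical, and the sketch assumes that every descendant of the selected root is included.

```python
def pages_in_sitemap(root, page_paths):
    """Return the page paths strictly below `root`.

    The selected root node itself is excluded, mirroring the rule that
    the root of the selection does not appear in the sitemap.
    """
    prefix = root.rstrip("/") + "/"
    return [path for path in page_paths if path.startswith(prefix)]


if __name__ == "__main__":
    pages = ["/a", "/a/b", "/a/b/c", "/a/b/d"]
    # With /a/b selected as root, only /a/b/c and /a/b/d remain.
    print(pages_in_sitemap("/a/b", pages))
```

Matching on the prefix with a trailing slash keeps a sibling such as /a/bc from being mistaken for a child of /a/b.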

Editing sitemap entries

To edit the individual sitemap entries, click Edit site map entries. The site pages display in an expandable tree and you can set properties for each page.

To define properties for the entries, click Edit entry properties:

  • Priority: Priority of the page relative to other pages on the site. Values range from 0.0 (low) to 1.0 (high). Default is 0.5. Set the priority of your most important page to 1.0. Setting all pages to 1.0 does not increase the rank of your site in search results since the importance is a relative measure among pages of the same site. A search engine may choose to rank the page higher than other pages of the site based on the value, however. See priority in XML Sitemap protocol.
  • Change frequency: Suggested frequency for search engines to crawl the page. Valid values are: always, hourly, daily, weekly, monthly, yearly and never. Use the value always for pages that change each time they are accessed. Use never for archived pages that will never change. See changefreq in XML Sitemap protocol.
  • Hide: Excludes a page from the sitemap. Child pages are not excluded automatically. Warning: The hideIngoogleSiteMap property is stored in the page itself. This means you need to activate the page. Publishing the sitemap only is not enough.
  • Hide children: Excludes child pages from the sitemap. To exclude both a parent and its children, check both boxes.
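The value ranges listed above for Priority and Change frequency can be checked with a short Python sketch such as the one below. The function name validate_entry is an assumption made for illustration and is not part of the module. The defaults mirror the values 0.5 and weekly mentioned on this page.

```python
# Valid changefreq values as listed in the Sitemaps protocol.
VALID_CHANGEFREQ = (
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
)


def validate_entry(priority=0.5, changefreq="weekly"):
    """Validate the two per-entry hints and return them as a dict.

    Raises ValueError if priority is outside 0.0..1.0 or if changefreq
    is not one of the values the protocol allows.
    """
    if not 0.0 <= priority <= 1.0:
        raise ValueError(f"priority must be between 0.0 and 1.0, got {priority}")
    if changefreq not in VALID_CHANGEFREQ:
        raise ValueError(f"unknown changefreq: {changefreq!r}")
    return {"priority": priority, "changefreq": changefreq}


if __name__ == "__main__":
    print(validate_entry())                     # defaults
    print(validate_entry(1.0, "daily"))         # most important page
```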

Editing virtual URI entries

If you included virtual URIs in the sitemap, you can edit their properties. Click Edit virtual URI entries to open the Virtual URIs subapp. The pages display as individual entries (as opposed to a tree) and you can set the same properties that are available for pages, except Hide children, which does not apply.

Hide default mappings defined in the ui-admincentral module, such as those for accessing AdminCentral. Public users will not access AdminCentral, so these URLs do not need to appear in the sitemap. Also, hide mappings that use regular expressions in the toURI property. These are not understood by search engines as regular expressions.

Publishing

Publish the sitemap to the public instance to ensure that it is accessible to the search engines.

Viewing the sitemap

You can view the XML sitemap on the author or public instance at /<CATALINA_HOME>/<contextPath>/sitemaps/<sitemap name>.xml, for example, http://localhost:8080/magnoliaPublic/sitemaps/standard-google-sitemap.xml. Note that a filter mechanism removes duplicate URLs.
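As a rough illustration of the URL pattern above, the following Python sketch assembles the sitemap address from a host, a context path and a sitemap name. It also shows one simple way duplicate URLs could be filtered. Both helper names are hypothetical, and the order-preserving de-duplication is an assumption, since this page does not describe how the module's filter works.

```python
def sitemap_url(base_url, context_path, sitemap_name):
    """Build <base>/<contextPath>/sitemaps/<sitemap name>.xml."""
    parts = [base_url.rstrip("/")]
    context = context_path.strip("/")
    if context:  # an empty context path means the webapp is deployed at the root
        parts.append(context)
    parts.extend(["sitemaps", f"{sitemap_name}.xml"])
    return "/".join(parts)


def remove_duplicates(urls):
    """Drop repeated URLs while keeping the first occurrence of each."""
    return list(dict.fromkeys(urls))


if __name__ == "__main__":
    print(sitemap_url("http://localhost:8080", "/magnoliaPublic",
                      "standard-google-sitemap"))
    print(remove_duplicates(["/a.html", "/b.html", "/a.html"]))
```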

Exporting the sitemap

The Export option in the Action bar of the Google Sitemap app exports the sitemap itself (a <urlset>) as <sitemap name>.xml. The JCR configuration for such a sitemap can be exported from the JCR Tools app, in which case the XML file is named googleSitemaps.<sitemap name>.xml.

Sitemap template

The siteMapsConfiguration page template renders the sitemap.

The definition is configured in /modules/google-sitemap/templates/pages/siteMapsConfiguration:

Node name                        Value

google-sitemap
  templates
    pages
      siteMapsConfiguration
        i18nBasename             info.magnolia.module.googlesitemap.messages
        modelClass               info.magnolia.module.googlesitemap.model.SiteMapModel
        name                     GoogleSiteMap
        renderType               freemarker
        templateScript           /sitemap/pages/main.ftl
        title                    GoogleSiteMap
        visible                  false

Properties of siteMapsConfiguration:

  • modelClass: SiteMapModel is the main model class for site map templates.
  • templateScript: main.ftl includes two alternative scripts, mainXml.ftl and mainConfiguration.ftl, that render text or XML content depending on the URL extension.

Virtual URI mapping

The virtual URI mapping configuration is in /modules/google-sitemap/virtualURIMapping/siteMaps. SiteMapVirtualUriMapping compares the source URI to the names of sitemaps available in the googleSitemaps workspace and prepends the prefix.

Node name                Value

google-sitemap
  virtualURIMapping
    siteMaps
      class              info.magnolia.module.googlesitemap.config.SiteMapVirtualUriMapping
      prefix             redirect:/sitemaps

Properties of virtualURIMapping:

  • class: SiteMapVirtualUriMapping compares the source URI to the names of sitemaps available in the googleSitemaps workspace and prepends the prefix.
  • prefix: Prefix prepended by the class.

Adding to robots.txt file

Add the following line in your robots.txt file. Include the full URL to the sitemap:

Sitemap: http://www.example.com/sitemap.xml
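To confirm that a robots.txt file declares the sitemap as intended, a small Python check along these lines can help. It uses the standard library's urllib.robotparser (its site_maps() method requires Python 3.8 or later). The helper name sitemaps_declared and the sample file content are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_declared(robots_txt):
    """Return the list of Sitemap URLs declared in robots.txt content."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    sample = (
        "User-agent: *\n"
        "Disallow:\n"
        "Sitemap: http://www.example.com/sitemap.xml\n"
    )
    print(sitemaps_declared(sample))
```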

Submitting to search engines

Submit the sitemap to major search engines via the webmaster tools of each engine or wait for the engines to find the sitemap on their own.

11 Comments

  1. We see a template is available. Is it also possible to generate sitemap.html which we can display on our site?

    1. Hi Michaël-

      It’s possible, but it does not do this out of the box. You could hot fix the template to respond in HTML if the request comes in with a dot html extension. So I would create a mainHTML.ftl to generate the HTML the way I need it. Then hot fix main.ftl to respond with mainHTML.ftl if the extension is dot html. Hope this is clear.

      Cheers

      Rich

  2. Hi,

    I have a question regarding sitemap.xml page we generate via the sitemap app. in the xml file we have two urls, one from author instance and one from public instance. how do we get only public instance urls rather than having both urls in xml sitemap?

    Cheers

     

    1. Hi James-

      I'm sorry but I do not follow you. Is it possible for you to paste the xml output to Pastebin or something similar? Or perhaps you could describe the issue using our demo? I went to the demo and tried creating a sitemap from the travel site but didn't see anything wrong there.

      Thanks

  3. Hi

    Thanks for your reply. 

    this is from author instance url which is from the xml page

    <lastmod>2016-12-15</lastmod><loc>http://localhost:8080/mgnl-cms-war/default/de/money/login</loc><priority>0.5</priority></url>

    this is same url but from public instance

    <lastmod>2016-12-15</lastmod><loc>http://localhost:8080/mgnl-cms-war/default/money/login</loc><priority>0.5</priority></url>

    how to generate the xml sitemap without author instance url?

    can you see my point?

     

    1. Ok, I think I see a little better what you are talking about.

      In your above example you have one for the German site (author) and one for the English site (public). But I do see what you mean. I think what is happening here is that you have your author and public deployed the same way. So the URLs that are generated end up being the same. 

      During my testing for the travel site I noticed that the generated URLs take the configuration at my site definition.

      <url><loc>http://travel-demo.magnolia-cms.com/meta/about-demo.html</loc><lastmod>2015-06-25</lastmod><changefreq>weekly</changefreq><priority>0.5</priority></url>
      1. Sorry about the confusion. Just to make it clear we have magnolia deployed on tomcat under vwqwertyp path. So all the author URL's contain vwqwertyp e.g.
        http://localhost:8080/vwqwertyp/.magnolia/admincentral#shell:applauncher:; Public URLs do not have vwqwertyp.

        When I generate a sitemap it contains vwqwertyp, de and default in the URL e.g. The below 2 urls are present in one sitemap:

        1. http://localhost:8080/vwqwertyp/default/de/contact-us
        2. http://localhost:8080/vwqwertyp/default/contact-us

        I don't want vwqwertyp, default and de in the URLs. I only want true URLs in the sitemap. So in the above example only one URL should be included in the sitemap; it should be:
        http://localhost:8080/contact-us because this is how the contact-us page is accessed by end users on the public instance. We do not have a German site so I don't know where de comes from.

        I also generated a sitemap using the demo site, even that contains de. Is there some kind of configuration to include and exclude these from the sitemap.xml


        Thanks

        1. In the community edition, URLs generated by the sitemap app take the default base URL configured in the /server/defaultBaseUrl property as the context path of all URLs.

          It would be nice if Magnolia included this detail in this documentation page.

          Hope this helps

  4. Hi all,

    in this documentation page you say:

    The hideIngoogleSiteMap property is stored in the page itself. This means you need to activate the page. Publishing the sitemap only is not enough.

    It would be nice if you specified the same for virtual URI mappings: they also need to be published, because publishing only the sitemap is not enough.

    Also, it is not only the hide property that needs this publishing step but the other sitemap entry properties too, and the same goes for virtual URI mappings, so this should be stated in the documentation.

    Thanks

  5. Hi,
    We have a site that generates pages dynamically for the products. Is there a way to create sitemap using the module for dynamic pages? We have multiple sites running on the instance. and if I need the sitemap xml with the same name as sitemap.xml on the root of the site, how do I create one for each site?
    Thanks

    1. I did this by extending the class SiteMapService, overriding the method getSiteMapBeans and configuring it as a component.

      	<components>
      		<id>main</id>
      		<component>
      			<type>info.magnolia.module.googlesitemap.service.SiteMapService</type>
      			<implementation>com.myproject.module.sitemap.CustomSiteMapService</implementation>
      			<scope>singleton</scope>
      		</component>
      	</components>