This is a geolocation tutorial. It aims to lower the barrier to entry for new Magnolia developers who want to write modules, and it answers a popular question: how to detect a visitor's geographical location using a public geo-API.
Display Localized Content with Magnolia and a Custom Geoservice Module
Peter Wayner
By taking a user's location into account when displaying content, web sites can add value to the user experience and increase their overall relevance and utility. This article explains how to add geo-location support to Magnolia by building a custom module that combines information from the public IPInfoDB geo-location API with Magnolia's JCR search features and bundled jQuery support to produce responsive web pages displaying localized content.
Did a hurricane sweep through your town thirty years ago? Is there going to be a concert in your town hall in a few weeks? Is there a new business opening up on the town square? Wherever you go, you always want the news to follow you.
There's no reason why a web site needs to present the same information to everyone, and one way to choose the articles to present is by looking at the visitor's location. A site might pick only local articles for the front page, or it might fill a box in the sidebar with specific local content.
Magnolia's customization mechanism is flexible enough to accommodate all of these choices, and this article discusses one solution that injects into the output web page a list of local stories containing the name of the user's city. The server sends the user's IP address off to a location service and then uses the result to do a keyword search on the content. Magnolia then bundles up the results so that they appear as a block in the output page.
This article examines how to build a Magnolia geo-location plugin by describing the key steps involved in the process:

- Identifying the visitor's location from the IP address
- Converting the address into a place name with a public geo-API, and caching the responses
- Searching the content repository for the location keywords
- Rendering the results with a template
- Loading the localized block after the main page arrives
Accomplishing this with a Magnolia plugin is relatively straightforward. The following sections approach each of these topics in turn.
The first task is choosing a good way to identify the location of the user. There are two main ways that a user can supply the information, one voluntary and the other involuntary. The voluntary way is to ask directly: modern browsers offer a geolocation API that requests the user's permission and then reports coordinates, which can be quite accurate on GPS-equipped phones. The involuntary way is to look up the visitor's IP address in a database that maps address ranges to geographic regions.

There are several problems with the voluntary solution. First, asking the user for permission is polite but intrusive: it breaks the flow and slows down interaction with a web site. Second, the data is often not available. Desktop browsers might ask permission and then report that they don't know the location, so the intrusion is all for naught.
After some debate, I chose the IP address because it is more likely to be useful for desktop users. If I were building a site mainly for mobile users, I might rely on the more accurate GPS data, but the IP-based approach is simpler and more general for the end user.
There are several options for converting an IP address into a location, ranging from locally installed databases to free and commercial web services.

I chose the free service from IPInfoDB because it is the easiest for readers who want to experiment. IPInfoDB also offers an upgrade path to a commercial service from IP2Location that provides better accuracy and reliability.
Converting an IP address is as simple as requesting a URL in this format:

http://api.ipinfodb.com/v3/ip-city/?key=<your_api_key>&ip=74.125.45.100
The API key is assigned to each user and you must register for one before the service will work.
Here's what it returns:
OK;;74.125.45.100;US;UNITED STATES;GEORGIA;ATLANTA;30301;33.809;-84.3548;-05:00
This particular API returns plain semicolon-delimited text rather than XML or JSON, although other geo-APIs do use those more typical formats.
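To make the field positions concrete, here is a minimal sketch of pulling the useful values out of that response. The field order is simply read off the sample line above, and the variable names are my own:

// Field order, read off the sample response above:
// status;message;ip;countryCode;countryName;region;city;zip;lat;lon;tz
String response = "OK;;74.125.45.100;US;UNITED STATES;GEORGIA;ATLANTA;30301;33.809;-84.3548;-05:00";
String[] parts = response.split(";");
String country = parts[4]; // "UNITED STATES"
String region  = parts[5]; // "GEORGIA"
String city    = parts[6]; // "ATLANTA"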
To fetch this information, I chose the Java library JSoup, a tool for scraping web sites that can request a URL and then parse the response. It has more functionality than this application requires, but its ability to understand XML and HTML might be useful in other applications. To add the library, I just included this dependency in the Maven build file pom.xml:
<dependency>
    <!-- jsoup HTML parser library @ http://jsoup.org/ -->
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.6.3</version>
</dependency>
The business logic for Magnolia is placed in a separate Java object that is attached to a template and queried for information. If the business logic is complex, it makes sense to use good programming practices to split up the Java code into multiple classes. This version is kept in one class file for simplicity.
The Java code does four things:

- Finds the visitor's IP address
- Converts the address into a location by calling the geo-API, caching the responses
- Searches the content repository for the resulting location keywords
- Returns the matching content as a Collection for the template
Let's look at each of these aspects below.
The IP address of the user is not hard to find, but it's not obvious. Magnolia keeps most of the information about the request out of sight of the Java model object unless the object requests it.
Here's the method used to find the IP address. The HttpServletRequest object is obtained from the WebContext object, which is in turn found by calling the static method MgnlContext.getWebContext(). HttpServletRequest extends the ServletRequest interface, which offers getRemoteAddr(), the method that extracts the IP address we want.
String getIPAddress() {
    WebContext c = MgnlContext.getWebContext();
    HttpServletRequest request = c.getRequest();
    String addr = request.getRemoteAddr();
    if (CheckForLocalHostDuringDebugging) {
        // Replace loopback addresses with one the geo-API can locate.
        addr = cleanseIPAddress(addr);
    }
    return addr;
}
The code includes an extra feature that helps with debugging. If you're running the code on the same desktop you're using to test it, the IP address is likely to be 127.0.0.1, and that loopback address can't be located by the API.

The following routine substitutes a debugging IP address that the API can find. It should be switched off in production.
String cleanseIPAddress(String s) {
    if (s.equalsIgnoreCase("127.0.0.1")) {
        return ReplacementIPAddress;
    }
    return s;
}
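One related caveat the sample code does not address: if Magnolia runs behind a proxy or load balancer, getRemoteAddr() returns the proxy's address rather than the visitor's. Here is a minimal sketch of the usual workaround, assuming your proxy sets the standard X-Forwarded-For header (the method name is my own, and the header can be absent or spoofed, so verify this for your setup):

import javax.servlet.http.HttpServletRequest;

// Behind a proxy, the original client is usually the first
// entry in the X-Forwarded-For header.
String getClientIPAddress(HttpServletRequest request) {
    String forwarded = request.getHeader("X-Forwarded-For");
    if (forwarded != null && !forwarded.isEmpty()) {
        return forwarded.split(",")[0].trim();
    }
    return request.getRemoteAddr();
}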
Caching the results is very useful because calling a distant API is often slow and can be expensive: many of the commercial APIs count the number of requests and bill accordingly. So, caching the responses makes plenty of sense.
The cache used here is very simple: it just uses a HashMap object to store the responses from the API. There's no logic for removing items, even when they get too old. This is probably not a practical problem because IP address assignments seem to be relatively stable, but a full-featured implementation would do a better job.
public String[] getPartsWithCache(String addr) {
    String[] ans = ipLocationCache.get(addr);
    if (ans == null) {
        // Not found locally; go to the web.
        ans = getParts(addr);
        ipLocationCache.put(addr, ans);
    }
    return ans;
}
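If you want simple eviction without pulling in a cache library, one option (a sketch, not part of the module) is to build the map on java.util.LinkedHashMap, which can drop its least recently used entry automatically:

import java.util.LinkedHashMap;
import java.util.Map;

// A size-bounded, access-ordered map: once the cache grows past
// MAX_ENTRIES, the least recently used entry is evicted.
private static final int MAX_ENTRIES = 10000;

private final Map<String, String[]> ipLocationCache =
    new LinkedHashMap<String, String[]>(16, 0.75f, true) {
        @Override
        protected boolean removeEldestEntry(Map.Entry<String, String[]> eldest) {
            return size() > MAX_ENTRIES;
        }
    };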
The location API from ipinfodb.com takes an IP address as a parameter and returns a String of fields separated by semicolons. Here's the routine that calls the JSoup library:
public String[] getParts(String addr) {
    String fetchMe = urlBase + addr;
    try {
        Document doc = Jsoup.connect(fetchMe).get();
        String text = doc.text();
        return text.split(splittingCharacter);
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}
You'll notice that the JSoup routine does little more than fetch a URL; the result is split with the standard String split method. Most of JSoup's power to parse the results is left untapped here, but it may be useful if you work with an API that returns a more complex response.
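One refinement worth considering, since a remote call can stall: JSoup's Connection supports an explicit timeout, so a slow geo-lookup cannot hang page rendering indefinitely. The fetch line in getParts could become (a sketch; three seconds is an arbitrary choice):

Document doc = Jsoup.connect(fetchMe)
        .timeout(3000) // give up after three seconds
        .get();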
When the location API returns the name of the city, state, and country associated with an IP address, the plugin must search the content for these keywords. Magnolia stores content in a Java Content Repository whose search, backed by the Lucene engine, follows the JCR standard.
There are two different ways to write your query for Magnolia. One syntax mimics SQL, and the other mimics the XPath format used to search XML. The general consensus is that the SQL structure is simpler when searching for keywords, while the XPath-like syntax is better when searching only the content under a limited path. The XPath syntax, though, is deprecated in the JCR standard. For this example, the SQL-like syntax is the better choice.
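For comparison, the same keyword search looks like this in each syntax (illustrative queries, not taken from the module):

SQL:    select * from mgnl:content where contains(*, 'ATLANTA')
XPath:  //element(*, mgnl:content)[jcr:contains(., 'ATLANTA')]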
This search method passes a query to the static utility class QueryUtil:
public Collection<Content> search(String s) {
    String q = "select * from mgnl:content where contains(*,'" + s + "')";
    Collection<Content> ans = QueryUtil.query("website", q);
    return ans;
}
The results come back as a Java Collection holding the matching Content objects.
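One caveat with this method: the keyword is concatenated straight into the statement, so a city name containing an apostrophe (Coeur d'Alene, say) would break the query. In JCR SQL a single quote inside a string literal is escaped by doubling it; a minimal hardening sketch:

public Collection<Content> search(String s) {
    // Double any single quotes so the literal stays valid JCR SQL.
    String escaped = s.replace("'", "''");
    String q = "select * from mgnl:content where contains(*,'" + escaped + "')";
    return QueryUtil.query("website", q);
}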
This simple routine collects the IP address, calls the API, searches the local content repository, and then returns the result as a Collection:
public Collection<Content> getTextsBasedUponIP() {
    String addr = getIPAddress();
    // Use the caching wrapper so repeat visitors skip the API call.
    String[] parts = getPartsWithCache(addr);
    Collection<Content> ans = fallBackSearch(parts);
    return ans;
}
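The fallBackSearch method belongs to the full module and isn't shown in this article. Here is a hypothetical sketch of the idea, under the assumption that it tries the most specific term first and falls back to broader ones (the field indexes follow the response format shown earlier):

Collection<Content> fallBackSearch(String[] parts) {
    if (parts == null || parts.length < 7) {
        return new java.util.ArrayList<Content>();
    }
    // Try city (6), then region (5), then country (4), returning
    // the first search that yields any results.
    int[] order = {6, 5, 4};
    for (int i : order) {
        Collection<Content> found = search(parts[i]);
        if (found != null && !found.isEmpty()) {
            return found;
        }
    }
    return new java.util.ArrayList<Content>();
}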
The collection returned by getTextsBasedUponIP() is visible to the template as the variable model.textsBasedUponIP: Magnolia connects the template's model variable to the getters and setters of the Java object used as the model.
The job of the template is to convert the information from the Collection of Magnolia Content objects into something readable by a user. The code uses one looping construct and a number of techniques for extracting data from each individual Content object.
Magnolia offers two different languages for writing a template. The first, classic JSP, is a good start, but the FreeMarker templating system is typically the simplest to use. This example takes the FreeMarker approach, although it should be reasonably straightforward to use the same model object with a JSP-based template.
Here's the code:
[#assign cms=JspTaglibs["cms-taglib"]]
[@cms.editBar /]
<ul>
[#list model.textsBasedUponIP as n]
  <li>
    <a href="${model.initialPath!}${n.@handle}">${n?node_type} -- ${n.title} -- ${n.metaData.creationDate} -- ${n.@name} -- ${n.@handle} -- ${n.@uuid}</a>
  </li>
  <li>
    <a href="${mgnl.createLink(n)}">${n?node_type} -- ${n.title} -- ${n.metaData.creationDate} -- ${n.@name} -- ${n.@handle} -- ${n.@uuid}</a>
    The current page: ${page.@handle} <br>
    The current node handle: ${n.@handle} <br>
    The current node name: ${n.@name}<br>
    The current node uuid: ${n.@uuid}<br>
    The current paragraph definition: ${def.name}<br>
    Paragraph model: ${model}<br>
    Action result: ${actionResult!'... no action result here'}<br>
    Current locale: ${ctx.locale}<br>
    Aggregation state: ${aggregationState}<br>
  </li>
[/#list]
</ul>
The template builds a bulleted list in HTML with the <ul> element and embeds each item in an <li> element. The loop is built with this syntax:
[#list model.textsBasedUponIP as n]
This construction loops through all the elements in a Collection, assigning each item to the variable 'n' in turn. The individual Content objects are constructed by Magnolia and use its internal format. The parts of each object can be extracted with the dollar-sign notation, like this: ${n.@name}.
The list should also include a clickable link in case a reader wants to go to the full page containing that information. The URL is constructed from two parts, ${model.initialPath} and ${n.@handle}; concatenating them creates a link that takes the user to the correct page.
The classic model of a dynamic web site requires that the server collect all of the information before assembling it into the page that is sent to the user. This is manageable when all of the information is available locally, but it can lead to slow performance when the data comes from different locations around the Web. In this example, the call that determines the location of the IP address is predictably slow because the API lives on a different server.
One solution is to create a web page that loads the localized links after the fact. The initial page contains a blank DIV that is filled by a subsequent call back to the main site. The initial page will arrive quickly because it can be the same for each user. It can be cached and served immediately. The second subpage can take longer to execute because the user is happily looking at the main, static information.
Magnolia bundles the jQuery library, which includes several good routines for loading blocks of data after the main page arrives.
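Here is a minimal sketch of that pattern in the page itself; the element id and the /localnews URL are placeholders for wherever your template renders the localized list, not part of the module:

<!-- Empty placeholder in the cached, user-independent main page -->
<div id="localNews"></div>

<script>
// Once the main page has loaded, fetch the localized list
// and inject it into the placeholder.
$(function() {
    $('#localNews').load('/localnews');
});
</script>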
The source code for this project can be found in the Magnolia Git repository under the project name Geo. The code includes a number of enhancements that make the tool easier to use in a working web site. The bulk of the code is in the class GeoserviceLocationParagraphModel; the other classes integrate the core code with configuration options available through Magnolia. Future articles will discuss how to add such configuration options to make your plugin more reliable and easier to configure for people who want to use it on a production web server. The code in this article is stripped down to make it easier to understand. For the complete code, see http://git.magnolia-cms.com/gitweb/?p=forge/geo.git;a=summary.
This approach creates a simple connection between a user's location and the content shown on the web page by searching the site for all content mentioning the name of the user's location. It places this information in a separate list that can be loaded independently to help speed up delivery of the main page.
This approach is just one of the ways that IP address information can be used with Magnolia. Some developers add a separate Filter to the chain of filters that processes requests: one filter turns the IP address into a location and stores the location name where a subsequent filter can use it. This works best when the IP-to-location conversion is implemented locally, so that it doesn't slow down the entire chain.
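A rough sketch of the filter idea, written against the plain servlet Filter interface to keep it self-contained (Magnolia supplies its own filter base classes, and the lookUpCity helper here is hypothetical):

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class GeoLocationFilter implements Filter {

    public void init(FilterConfig config) {}

    public void destroy() {}

    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain) throws IOException, ServletException {
        // Resolve the location once, early in the chain, and expose it
        // as a request attribute for everything downstream.
        request.setAttribute("geoCity", lookUpCity(request.getRemoteAddr()));
        chain.doFilter(request, response);
    }

    private String lookUpCity(String ip) {
        // Hypothetical local lookup; a local IP-to-location database
        // keeps the filter fast enough to sit in every request's path.
        return "UNKNOWN";
    }
}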
A common feature is to add the location to a cookie so subsequent requests don't need to look up the IP address at all, a solution even more efficient than the cache. The cookie can also record which articles have already been presented, allowing the site to rotate them. The cookie should expire relatively quickly, though, because it lives on the user's computer and the user may travel to a different location.
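A short sketch of the cookie idea (the cookie name, the one-day lifetime, and the rememberLocation helper are my own choices, not the module's):

import javax.servlet.http.Cookie;
import javax.servlet.http.HttpServletResponse;

void rememberLocation(HttpServletResponse response, String city) {
    // Cookie values cannot contain spaces or semicolons; replacing
    // spaces is a crude encoding -- a real implementation would
    // URL-encode the value.
    Cookie cookie = new Cookie("geoCity", city.replace(' ', '+'));
    cookie.setMaxAge(60 * 60 * 24); // expire after a day; travelers move
    cookie.setPath("/");
    response.addCookie(cookie);
}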
As this article illustrates, Magnolia provides a powerful framework to add a geo-location module to your web site. The module illustrated in this article was reasonably simple, but you can leverage Magnolia's framework to make it as complex as you wish. Try it out, and let me know what you think!