Sitemaps are one of the most reliable ways to help search engine bots discover and index all your website pages correctly. They also organize the web pages into logical categories and help people easily navigate your website.
As part of its out-of-the-box (OOTB) solution, AEM uses the Apache Sling Sitemap module to generate XML sitemaps. You must explicitly enable sitemap generation in AEM to take advantage of this feature. The module is included in the AEM as a Cloud Service (AEMaaCS) Software Development Kit (SDK). On AEM 6.5, you must also install Service Pack 11 (SP11) to use the sitemap generator.
How to Implement Sitemap Generation
It's important to keep your site's XML sitemap up to date. The popular ACS Commons Sitemap Generator is now deprecated. Instead, we recommend using the AEM OOTB Sitemap generation module, which provides developers and editors with a wide range of options.
The AEMaaCS SDK is bundled with the OOTB Apache Sling Sitemap module. However, if your current AEM version is older than 6.5 Service Pack 11, you must upgrade before you can use the OOTB sitemap functionality.
Approaches for OOTB Sitemap Generation
To enable the sitemap for AEM Sites, you can use the Apache Sling Sitemap module in one of two ways: on-demand generation or background generation. Let's first look at on-demand generation.
On-Demand Generation
With the on-demand approach, the sitemap is regenerated every time it is requested. This method works well for small websites. It is also a handy way to test the sitemap generator, as you don't have to wait for a scheduled job.
The image below shows the configuration change required to enable on-demand generation for the AEM SDK. For an AEM cloud environment, however, you must create the corresponding OSGi configuration.
You can then select the AEM Site tree for which the sitemap needs to be generated, as seen in the image below.
Background Generation
In this approach, the sitemap is generated at defined intervals as a background process. The generated sitemap is automatically cached and accessed when required. This is a useful and cost-effective approach for large enterprise sites.
To enable the background job that generates the XML sitemaps, you need to configure a SitemapScheduler instance. To do so for AEMaaCS, create an OSGi configuration for this PID:
org.apache.sling.sitemap.impl.SitemapScheduler
Here’s an example configuration.
Select the AEM Site tree for which the sitemap needs to be generated, as shown below.
Comparing Approaches
The table summarizes the advantages of each approach.
Criterion | On-Demand | Background |
---|---|---|
Best suited for | Small websites | Large websites |
Sitemap immediately available | Yes | No: requires waiting for a scheduled job |
Affects other processes | Yes | No: runs in the background |
Allows AEM to optimize sitemap generation and availability | No | Yes |
Auto-balances to reduce load | No | Yes: can configure limits on size and number of URLs |
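The trade-off in the table can be illustrated with a small, self-contained Java sketch. It does not use the Apache Sling Sitemap API: the class `SitemapServingSketch`, its nested types, and the `runScheduledJob` method are hypothetical names chosen for illustration, and the scheduler is simulated by calling that method directly.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical illustration only; none of these types belong to the Sling Sitemap module.
class SitemapServingSketch {

    /** On-demand: every request runs the (possibly expensive) generator again. */
    static final class OnDemand {
        private final Supplier<String> generator;

        OnDemand(Supplier<String> generator) {
            this.generator = generator;
        }

        String serve() {
            return generator.get();
        }
    }

    /** Background: a scheduled job refreshes a stored copy; requests only read it. */
    static final class Background {
        private final Supplier<String> generator;
        private volatile String stored; // stays null until the first job run

        Background(Supplier<String> generator) {
            this.generator = generator;
        }

        /** Stands in for the scheduler firing at its configured interval. */
        void runScheduledJob() {
            stored = generator.get();
        }

        /** Empty until the first scheduled run has finished. */
        Optional<String> serve() {
            return Optional.ofNullable(stored);
        }
    }

    public static void main(String[] args) {
        AtomicInteger runs = new AtomicInteger();
        Supplier<String> generator = () -> "<urlset/><!-- run " + runs.incrementAndGet() + " -->";

        OnDemand onDemand = new OnDemand(generator);
        onDemand.serve();
        onDemand.serve();
        System.out.println("Generator runs after two on-demand requests: " + runs.get());

        Background background = new Background(generator);
        System.out.println("Available before first job: " + background.serve().isPresent());
        background.runScheduledJob();
        background.serve();
        background.serve();
        System.out.println("Generator runs after one job and two requests: " + runs.get());
    }
}
```

In this sketch the on-demand variant pays the generation cost on every request, while the background variant pays it once per job run but has nothing to serve until the first run completes.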
Custom Sitemap Generation
The following service interfaces can be implemented when you need to limit the content of a sitemap:
If the default implementations don't work, or the extension points are not flexible enough, you can implement a custom sitemap generator to take complete control of the generated sitemap content.
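To make the idea of limiting and controlling sitemap content more concrete, here is a minimal, self-contained Java sketch. It is not built on the Apache Sling Sitemap interfaces: `CustomSitemapSketch`, `generate`, and `escapeXml` are hypothetical names, the filter is a plain `Predicate`, and the `example.com` URLs are placeholders. Only the `urlset`/`url`/`loc` structure and namespace come from the public sitemaps.org protocol.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; a real AEM implementation would use the module's own interfaces.
class CustomSitemapSketch {

    /** Escapes the five characters that must be entity-encoded in XML text. */
    static String escapeXml(String value) {
        return value.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    /** Builds a sitemap containing only the URLs accepted by the filter. */
    static String generate(List<String> pageUrls, Predicate<String> include) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (String url : pageUrls) {
            if (include.test(url)) {
                xml.append("  <url><loc>").append(escapeXml(url)).append("</loc></url>\n");
            }
        }
        xml.append("</urlset>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        List<String> pages = List.of(
                "https://example.com/en/products",
                "https://example.com/en/internal/drafts",
                "https://example.com/en/search?q=a&lang=en");
        // Assumed business rule: keep internal pages out of the sitemap.
        System.out.print(generate(pages, url -> !url.contains("/internal/")));
    }
}
```

Keeping the filter separate from the XML writer mirrors the distinction the text draws between narrowing the default output and replacing the generator entirely.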
AEM Dispatcher Configuration Changes
The AEM Dispatcher doesn't allow the sitemap or sitemap index selectors by default, so you must change its configuration to allow sitemap-related endpoints. The important dispatcher configuration changes are described below.
The Dispatcher Allow Filter Rule
This rule allows HTTP requests for the sitemap index and sitemap files, as shown in the image below.
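The decision such a rule encodes can be sketched in plain Java. This is not Dispatcher configuration syntax, and it simplifies how the Dispatcher actually parses requests: the class `SitemapFilterSketch`, the `/content/` path prefix, and the selector names `sitemap` and `sitemap-index` are assumptions made for illustration.

```java
import java.util.Set;

// Hypothetical illustration of an allow rule; not the Dispatcher's own matching logic.
class SitemapFilterSketch {

    // Assumed selector names for sitemap and sitemap index requests.
    private static final Set<String> ALLOWED_SELECTORS = Set.of("sitemap", "sitemap-index");

    /** Allows only /content/... requests of the form name.selector.xml with an allowed selector. */
    static boolean isAllowed(String requestPath) {
        if (!requestPath.startsWith("/content/")) {
            return false;
        }
        String lastSegment = requestPath.substring(requestPath.lastIndexOf('/') + 1);
        String[] parts = lastSegment.split("\\.");
        // Expect exactly three parts: resource name, one selector, extension.
        if (parts.length != 3) {
            return false;
        }
        return ALLOWED_SELECTORS.contains(parts[1]) && "xml".equals(parts[2]);
    }

    public static void main(String[] args) {
        String[] samples = {
                "/content/mysite/en.sitemap.xml",
                "/content/mysite/en.sitemap-index.xml",
                "/content/mysite/en.infinity.json",
                "/content/mysite/en.html"
        };
        for (String sample : samples) {
            System.out.println(sample + " allowed by sitemap rule: " + isAllowed(sample));
        }
    }
}
```

The sketch only models this one allow rule, so a result of `false` means "not matched by the sitemap rule", not that the request is denied by every other rule in a real filter set.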
Apache Web Server Rewrite Rule
This rule ensures that *.xml sitemap HTTP requests are routed to the correct underlying AEM page. If you don't use URL shortening, or if you implement URL shortening with Sling Mappings, this configuration is not required. The figure below shows an example of this rule.
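The mapping that such a rewrite performs can be sketched in Java as follows. This is not Apache `mod_rewrite` syntax: the class `SitemapRewriteSketch`, the method `toContentPath`, the content root `/content/mysite`, and the `.sitemap.xml` / `.sitemap-index.xml` suffixes are assumptions for illustration only.

```java
// Hypothetical illustration of mapping a shortened sitemap URL to its AEM content path.
class SitemapRewriteSketch {

    /**
     * Prefixes shortened sitemap URLs with the content root.
     * Anything that is not a sitemap request, or is already under /content/, is returned unchanged.
     */
    static String toContentPath(String uri, String contentRoot) {
        boolean isSitemap = uri.endsWith(".sitemap.xml") || uri.endsWith(".sitemap-index.xml");
        if (!isSitemap || uri.startsWith("/content/")) {
            return uri;
        }
        return contentRoot + uri;
    }

    public static void main(String[] args) {
        String contentRoot = "/content/mysite"; // assumed site root
        System.out.println(toContentPath("/us/en.sitemap.xml", contentRoot));
        System.out.println(toContentPath("/content/mysite/us/en.sitemap.xml", contentRoot));
        System.out.println(toContentPath("/us/en.html", contentRoot));
    }
}
```

The early return for paths that already start with `/content/` reflects the case where no URL shortening is in place and nothing needs to be rewritten.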
Conclusion
You can explicitly enable sitemap generation using the Apache Sling Sitemap module. It is designed to cover various use cases from small sites serving sitemaps on-demand to large sites generating them in the background. You can also use this module for sites that collect third-party data and include dynamically rendered pages. Another advantage of this module is that you can customize sitemap content according to business needs.