A sitemap file contains information about all URLs on the website[1]. It can also include additional data for each URL, such as the date of the last modification, the priority of the address, how often the page changes, and links to other language versions.
Google often ignores the last-modification date[2], and the same applies to the priority value[3].
For small websites consisting of a few hundred URLs, a sitemap.xml file[4] is not required. If the website’s structure is clear and every page is linked internally, search engines will be able to find all of the pages. A sitemap is recommended for big websites with complex structure and navigation, since it makes pages easier to find. It is also recommended for new, big sites which don’t have many backlinks yet.
A basic sitemap file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2018-09-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
There are plenty of sitemap and sitemapindex examples. It is enough to look at big websites, such as:
https://www.bizdb.co.uk/sitemap.xml - the sitemapindex here links to many sitemap files.
A single sitemap file can contain no more than 50,000 URLs and can’t be bigger than 50 megabytes (uncompressed).
For bigger websites, the best solution is to split the URLs across several sitemap files and list those files in a sitemapindex file. Sitemapindex file example:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap1.xml</loc>
    <lastmod>2018-10-01</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap2.xml</loc>
    <lastmod>2018-09-01</lastmod>
  </sitemap>
</sitemapindex>
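The splitting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page URLs, the `sitemapN.xml` naming scheme and the helper names `split_urls` and `build_sitemap_index` are hypothetical, and the sketch checks only the 50,000 URL limit, not the file size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # URL limit per sitemap file


def split_urls(urls, chunk_size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into chunks that each fit in one sitemap file."""
    return [urls[i:i + chunk_size] for i in range(0, len(urls), chunk_size)]


def build_sitemap_index(sitemap_urls):
    """Return sitemapindex XML (bytes) that lists the given sitemap file URLs."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = sitemap_url
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)


# Hypothetical site with 120,000 pages (assumed URLs, for illustration only).
urls = [f"https://www.example.com/page/{n}" for n in range(120_000)]
chunks = split_urls(urls)

# One index entry per chunk; the file naming scheme is an assumption.
index_xml = build_sitemap_index(
    [f"https://www.example.com/sitemap{i}.xml" for i in range(1, len(chunks) + 1)]
)
```

Each chunk would then be written to its own sitemap file, and only the index needs to be announced to search engines.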
Ideally, the sitemap file is generated by the website itself and regenerated on a regular basis, so that it always reflects the current URL structure.
One of the tools available from Google can be used to generate the sitemap file[5]. For websites based on the WordPress CMS, installing the Yoast SEO plugin is recommended - sitemap.xml will be created automatically.
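For a site without a ready-made plugin, generating the file can be as simple as the following Python sketch. It assumes the site can supply its pages as a list of dictionaries; the function name `build_sitemap` and the sample entries are hypothetical and not part of any particular CMS.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML (bytes) for dicts with 'loc' and optional extra fields."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit only the fields that are present, in the usual order.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


# Assumed sample data; a real site would read this from its database or router.
pages = [
    {"loc": "https://www.example.com/", "lastmod": "2018-09-01"},
    {"loc": "https://www.example.com/offer?id=1&lang=en"},
]
sitemap_xml = build_sitemap(pages)
```

ElementTree escapes characters such as `&` in URLs automatically, which is one reason to build the file with an XML library rather than by joining strings.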
Once the sitemap.xml file is ready and includes the URLs of all pages, it is time to inform the search engines. The default file name is “sitemap.xml” or “sitemap.xml.gz”. Search engines will probably find the file on their own as long as it is placed in the root directory of the domain and is available at:
https://www.example.com/sitemap.xml
There are a few ways to help search engines find the sitemap file. The easiest one is to add an extra line to the robots.txt file. Robots.txt should be available at:
https://www.example.com/robots.txt
Add this line to robots.txt:
Sitemap: sitemap URL address
For example:
Sitemap: https://www.example.com/sitemap.xml
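To confirm that the line is picked up, the robots.txt content can be parsed with Python's standard `urllib.robotparser` module (the `site_maps()` method needs Python 3.8 or newer). This is a minimal sketch: the helper name `sitemaps_in_robots` and the sample robots.txt content are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt):
    """Return the sitemap URLs declared in the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


# Assumed robots.txt content for illustration.
robots_txt = """\
User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap.xml
"""
declared = sitemaps_in_robots(robots_txt)
```

An empty result would mean the Sitemap line is missing or misspelled.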
Another way to inform search engines is to add the sitemap in Google Search Console or Bing Webmaster Tools. After logging in, add your sitemap and press the “send” button.
Once the information has been forwarded, Google should visit the website soon.
It is highly recommended that the sitemap file reflect the linking structure of the website. URLs missing from sitemap.xml may delay indexing, and so may outdated or unavailable URLs left in the sitemap. It is also worth checking that the sitemap file lists https:// URLs for encrypted pages. A common mistake is to leave the http:// versions of URLs in the sitemap after the website has been moved to HTTPS.
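The https:// check can be automated with a short Python script. The sketch below is an illustration only: the helper name `insecure_urls` and the sample sitemap are assumptions, and it handles a plain urlset file rather than a sitemapindex.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def insecure_urls(sitemap_xml):
    """Return every <loc> value in a sitemap that does not start with https://."""
    root = ET.fromstring(sitemap_xml)
    locs = [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]
    return [url for url in locs if not url.startswith("https://")]


# Assumed sample sitemap with one leftover http:// address.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>http://www.example.com/old-page</loc></url>
</urlset>"""

leftovers = insecure_urls(sample)
```

Any address returned by the function should be updated to its https:// version or removed from the sitemap.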
Sitemaps are not required, but there are situations in which they prove useful. They are definitely recommended for big and complex websites. Quoting Google: "in most cases, your site will benefit from having a sitemap, and you'll never be penalized for having one."
Sources:
Jacek Wieczorek is the co-founder of Pulno. Since 2006, he has been optimizing and managing websites that generate traffic counted in hundreds of thousands of daily visits.
23-12-2018