Generating sitemaps with Jekyll

It’s always a good idea to include a sitemap.xml file for your site to help search engines like Google crawl it and suggest the right pages in search results. There are a couple of Jekyll plugins that generate this file for you automatically (here and here), but I didn’t like the output I was getting.

If you’re looking for a simple way to generate a Jekyll sitemap without relying on plugins, here is a solution I borrowed from Joel Glovier:

---
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset
      xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
            http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">

{% for post in site.posts %}
<url>
    <loc>{{ site.url }}{{ post.url | remove: 'index.html' }}</loc>
    <changefreq>weekly</changefreq>
</url>
{% endfor %}
{% for page in site.pages %}
<url>
    <loc>{{ site.url }}{{ page.url | remove: 'index.html' }}</loc>
    <changefreq>weekly</changefreq>
</url>
{% endfor %}
</urlset>

The beauty of this solution is that it uses empty YAML front matter (the two `---` lines) to trigger Jekyll to process the file. Simply paste this into your empty sitemap.xml file and enjoy.
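One thing worth checking after a build: the template prefixes every entry with `site.url`, so if `url` is not set in `_config.yml` the `<loc>` values come out as bare paths. Here is a minimal Ruby sketch for spotting that. The helper names `sitemap_locs` and `relative_locs` are my own, not part of Jekyll, and the regex-based extraction is only meant for the simple output this template produces, not for arbitrary XML.

```ruby
# Hypothetical helper: pull every <loc> value out of a sitemap string.
# Assumes the simple layout produced by the template above.
def sitemap_locs(xml)
  xml.scan(%r{<loc>\s*(.*?)\s*</loc>}m).flatten
end

# Hypothetical helper: return the entries that are not absolute URLs.
# A non-empty result usually means `url` is missing from _config.yml.
def relative_locs(locs)
  locs.reject { |loc| loc.start_with?("http://", "https://") }
end
```

Assuming the default `_site` output directory, you could run it as `relative_locs(sitemap_locs(File.read("_site/sitemap.xml")))` and expect an empty array when everything is configured properly.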


It’s best practice to let Google know about your sitemap file, and this can be done using Webmaster Tools, robots.txt or via HTTP. As your sitemap changes, you’ll want to resubmit it (read this).
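For the robots.txt route, the sitemap protocol defines a `Sitemap:` line that points crawlers at the file. A small sketch of generating one in Ruby follows. The `robots_txt` helper and the example.com address are placeholders of mine, and the permissive `Allow: /` rule is an assumption about what you want crawlers to see.

```ruby
# Hypothetical helper: build a permissive robots.txt that advertises
# the sitemap. The URL passed in should be the full, absolute address.
def robots_txt(sitemap_url)
  <<~ROBOTS
    User-agent: *
    Allow: /

    Sitemap: #{sitemap_url}
  ROBOTS
end
```

Something like `File.write("robots.txt", robots_txt("https://example.com/sitemap.xml"))` would then drop the file into your site source, with your own domain swapped in.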

Again, you can do this using Webmaster Tools; or, if you opt for the HTTP method, I’ve written a relatively small Ruby script that will attempt to resubmit your sitemap for you. Most of the bulk is there because I wanted the terminal output to look pretty. If you don’t like it, feel free to strip it out.

#!/usr/bin/env ruby

# Use this script to automatically submit a valid sitemap file to
# the Google Webmaster Tools endpoint.
#
# @params {String}: sitemap_url


require "cgi"  # lowercase: "CGI" fails to load on case-sensitive filesystems
require "net/http"

# COLORS
cClear = "\033[0m"  # base
cFail = "\033[37;41m"  # white on red background
cSuccess = "\033[32;1m"  # green;bold
cWarning = "\033[33m"  # yellow
cInfo = "\033[34m"  # blue
cUrl = "\033[35;4m"  # purple;underline
cMisc = "\033[36;3m"  # cyan;italic

# GET ARGV
if ARGV[0]
  sitemap_url = ARGV[0]
else
  puts "A #{cInfo}sitemap url#{cClear} is a #{cFail}required#{cClear} argument!\n\n"
  exit 1  # non-zero status signals the missing argument to the caller
end

# URLS
base_url = "http://www.google.com/webmasters/tools/ping?sitemap="
webmaster_url = "https://www.google.com/webmasters/tools"

url = URI(base_url + CGI.escape(sitemap_url))

puts "Updating sitemap on Webmaster Tools...\n\n"

res = Net::HTTP.get_response(url)
code = res.code

case res
when Net::HTTPSuccess
  puts "#{cUrl}#{sitemap_url}#{cClear} was submitted #{cSuccess}successfully#{cClear}!\n\n"
else
  puts "#{cUrl}#{sitemap_url}#{cClear} submit #{cFail}failed#{cClear} with code #{cWarning}#{code}#{cClear}.\n\n"

  # Count down for three seconds before the retry
  [3, 2, 1].each do |n|
    puts "\t\tretrying in #{cMisc}#{n} second#{n == 1 ? "" : "s"}#{cClear}..."
    sleep(1)
  end

  puts "\n"

  res2 = Net::HTTP.get_response(url)
  code2 = res2.code

  case res2
  when Net::HTTPSuccess
    puts "#{cUrl}#{sitemap_url}#{cClear} was re-submitted #{cSuccess}successfully#{cClear}!\n\n"
  else
    puts "#{cUrl}#{sitemap_url}#{cClear} re-submit #{cFail}failed#{cClear} again with code #{cWarning}#{code2}#{cClear}.\n\n"
    puts "\nTry re-submitting manually via Webmaster Tools at #{cInfo}#{webmaster_url}#{cClear}.\n"
  end
end