Sitemaps are XML files that tell search engines which pages to crawl and how frequently to check each page for changes.

 

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site. 

Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site.

The Sitemap protocol format consists of XML tags. The file must be encoded in UTF-8.

The sitemap must begin with an opening <urlset> tag and end with a closing </urlset> tag, and the namespace must be specified on the <urlset> tag. A <url> parent tag must be included for each URL, and each <url> must contain a <loc> child element (the URL of the page). Optionally, each <url> parent tag may also include the child elements <lastmod> (the date the page was last modified, in YYYY-MM-DD format), <changefreq> (how frequently the page is likely to change), and <priority> (the priority of this URL relative to other URLs on your site).

A sample XML Sitemap that contains a single URL.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://www.bloggingdeveloper.com/</loc>
      <lastmod>2007-09-06</lastmod>
      <changefreq>monthly</changefreq>
      <priority>0.8</priority>
   </url>
</urlset>

For more information: http://www.sitemaps.org/protocol.php

In November 2006, Google, Yahoo, and Microsoft announced that they would all support the same Sitemap protocol, described at Sitemaps.org, to index sites across the web. (For more information: http://www.techcrunch.com/2006/11/15/google-yahoo-and-microsoft-agree-to-standard-sitemaps-protocol/)

In April 2007, they announced support for specifying Sitemap locations in robots.txt, allowing webmasters to share their Sitemaps. To do this, simply add the following line to your robots.txt file. (For more information: http://www.ysearchblog.com/archives/000437.html)

Sitemap: http://www.bloggingdeveloper.com/sitemap.xml

 

All requests to IIS are handled through Internet Server Application Programming Interface (ISAPI) extensions. ASP.NET registers its own extension to ensure its pages are processed appropriately. By default, the ASP.NET ISAPI extension (aspnet_isapi.dll) handles only ASPX, ASMX, and the other non-display file formats used by .NET and Visual Studio. However, this extension can be registered with other file extensions in order to handle requests to those file types too, but that will be covered later.

Every request flows through a number of HTTP modules, which cover various areas of the application (e.g. authentication and session information). After passing through each module, the request is assigned to a single HTTP handler, which determines how the system will respond to the request. Once the handler completes, the response flows back through the HTTP modules to the user.

HTTP modules are executed before and after the handler and provide a way to interact with the request. Custom modules must implement the System.Web.IHttpModule interface. Modules typically subscribe to events of the System.Web.HttpApplication class (the same events that can be handled in the Global.asax.cs or .vb file).

HTTP handlers process the request and are generally responsible for initiating the business logic tied to the request. Custom handlers must implement the System.Web.IHttpHandler interface. Additionally, a handler factory can be created that analyzes a request to determine which HTTP handler is appropriate. Custom handler factories implement the System.Web.IHttpHandlerFactory interface.
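As an illustration (not from the original post), here is a minimal sketch of a custom HTTP module that subscribes to an HttpApplication event; the class name and the header it writes are made up for the example:

using System;
using System.Web;

// Minimal custom HTTP module skeleton. Modules are registered in Web.config
// under <httpModules> (or <system.webServer>/<modules> on IIS 7 integrated mode).
public class SampleModule : IHttpModule
{
    public void Init(HttpApplication application)
    {
        // Subscribe to a pipeline event; this runs before any handler executes.
        application.BeginRequest += delegate(object sender, EventArgs e)
        {
            HttpApplication app = (HttpApplication)sender;
            app.Context.Response.AppendHeader("X-Sample-Module", "hit");
        };
    }

    public void Dispose()
    {
        // Nothing to clean up in this sketch.
    }
}

The handler side of the pipeline is what this article builds in Step 1 below.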

For more information:
http://geekswithblogs.net//flanakin/articles/ModuleHandlerIntro.aspx

Overview

 

Here is a brief overview of how the SiteMap HttpHandler will work: a request for sitemap.axd will be intercepted and passed to our SiteMap HttpHandler, which will generate the sitemap XML on the fly.

Step 1: Create HttpHandler

 

Inside the App_Code folder, create SitemapHandler.cs.

(Screenshot: Add App_Code Folder)

 

(Screenshot: Add SitemapHandler.cs)

 

Here is the code for the ASP.NET sitemap handler implementing the IHttpHandler interface.

(Code listing image: SitemapHandler.cs)
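The listing above survives only as an image in the original post; as a sketch (not the author's original code), a handler along these lines would produce the sitemap, with the home-page entry hard-coded and a commented-out loop where your own pages would be added. GetPages() is a hypothetical method standing in for web.sitemap, a database, or another provider:

using System;
using System.Web;
using System.Xml;

// Minimal sketch of a sitemap HttpHandler. The URLs and metadata below are
// placeholders; wire in your own page source (web.sitemap, database, etc.).
public class SitemapHandler : IHttpHandler
{
    public bool IsReusable
    {
        get { return true; }
    }

    public void ProcessRequest(HttpContext context)
    {
        context.Response.ContentType = "text/xml";

        XmlTextWriter writer = new XmlTextWriter(context.Response.Output);
        writer.Formatting = Formatting.Indented;

        writer.WriteStartDocument();
        writer.WriteStartElement("urlset", "http://www.sitemaps.org/schemas/sitemap/0.9");

        // Home page entry
        WriteUrlEntry(writer, "http://www.bloggingdeveloper.com/", DateTime.Now, "daily", "1.0");

        // Loop over your own pages here (GetPages() is hypothetical):
        // foreach (Page page in GetPages())
        // {
        //     WriteUrlEntry(writer, page.Url, page.LastModified, "monthly", "0.8");
        // }

        writer.WriteEndElement(); // urlset
        writer.WriteEndDocument();
        writer.Flush();
    }

    private static void WriteUrlEntry(XmlWriter writer, string url,
        DateTime lastModified, string changeFreq, string priority)
    {
        writer.WriteStartElement("url");
        writer.WriteElementString("loc", url);
        writer.WriteElementString("lastmod", lastModified.ToString("yyyy-MM-dd"));
        writer.WriteElementString("changefreq", changeFreq);
        writer.WriteElementString("priority", priority);
        writer.WriteEndElement();
    }
}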

 

I commented out the loop that adds pages. You may get the URLs of your pages from the web.sitemap file, a database, or another sitemap provider.

Step 2: Modify Web.config

 

Add the following section inside <system.web> in your Web.config:

<httpHandlers>
   <add verb="*" path="sitemap.axd"
        type="SitemapHandler" validate="false"/>
</httpHandlers>
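The <httpHandlers> section applies to IIS 6 and to IIS 7 running in classic mode. If your site runs under IIS 7 or later in integrated pipeline mode, the handler also needs to be registered under <system.webServer>; a sketch (the name attribute is arbitrary):

<system.webServer>
   <handlers>
      <add name="SitemapHandler" verb="*" path="sitemap.axd"
           type="SitemapHandler" resourceType="Unspecified"/>
   </handlers>
</system.webServer>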

To test your sitemap, browse to the sitemap.axd file in your application.

Step 3: Use robots.txt to announce Sitemap to Search Engines

 

Create a text file in the root of your application and name it robots.txt.

Insert the following line, replacing bloggingdeveloper.com with your own domain name:

Sitemap: http://www.bloggingdeveloper.com/sitemap.axd
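A complete robots.txt that allows all crawlers and announces the sitemap could look like this (a sketch; adjust the crawl rules to your own site):

User-agent: *
Disallow:

Sitemap: http://www.bloggingdeveloper.com/sitemap.axd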

For an online demo: http://www.bloggingdeveloper.com/sitemap.axd

Download Visual Studio Project demo


Comments


October 31. 2009 03:27

Just curious why you use a handler and not just a regular web form -- sitemap.aspx ?

Neil


November 13. 2009 15:35

Good post!

Another way to announce the sitemap to search engines is to ping them when your site is updated. I wrote a post about that a while ago here: joelabrahamsson.com/.../...n-your-site-is-updated.

Joel Abrahamsson
