Indexing Robots and Your Site
Written by Eric Smith, Northstar Computer Systems LLC
Most search engines "spider" your site when they visit. They read a page, collect every link on it, and follow each link in turn, so eventually they reach every page that is linked from somewhere on your site. Depending on how your site is set up, there may be directories you do not want indexed, such as a utility library of include files. To ask robots to stay out of them, create a file called "robots.txt" in the root of your web site with instructions that look like this:
User-agent: *
Disallow: /includes
Disallow: /cgi-bin
Disallow: /pics
Disallow: /images
In theory, this file keeps search engines out of the includes, cgi-bin, pics, and images directories. The "User-agent: *" line applies the rules to every robot, and each "Disallow" line names a path prefix that robots should not request. The major search engines (and most of the smaller ones) read and follow these instructions, but compliance is voluntary: nothing forces a robot to obey the file, and the file itself is publicly readable. You should assume that any file on your web server can be found by a search engine or a user intent on finding it.
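To see how a well-behaved robot would interpret the rules above, here is a minimal sketch using Python's standard urllib.robotparser module. The robot name "ExampleBot" and the sample paths are made up for illustration; only the rules themselves come from the file shown earlier.

```python
from urllib.robotparser import RobotFileParser

# The same rules as the sample robots.txt in this article.
ROBOTS_TXT = """\
User-agent: *
Disallow: /includes
Disallow: /cgi-bin
Disallow: /pics
Disallow: /images
"""


def build_parser(robots_text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def is_allowed(path, user_agent="ExampleBot"):
    """Return True if a compliant robot named user_agent may request path.

    "ExampleBot" is a hypothetical robot name used only for this sketch.
    """
    return build_parser(ROBOTS_TXT).can_fetch(user_agent, path)


# Hypothetical paths, chosen to exercise each kind of rule.
for sample in ("/index.asp", "/includes/header.inc", "/includes.html", "/images/logo.gif"):
    verdict = "allowed" if is_allowed(sample) else "disallowed"
    print(sample, "->", verdict)
```

Note that each Disallow value is matched as a prefix, so "Disallow: /includes" also covers a page such as /includes.html, not just the directory. The check only tells you what a compliant robot would do; it does nothing to stop a robot that ignores the file.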
Keywords: Uncategorized ASP Tips
Publication Date: 11/1/1999