About Search Engines and Web Site Promotion: A Whitepaper

Introduction

Finding information on the huge and ever-growing World-Wide Web can come down to finding the proverbial needle in a haystack. The high volume of information coupled with the absence of a central organizing authority can be a roadblock to finding the highest-quality, most appropriate information for the task at hand. Search engines have been developed to help with the task. Some popular sites that permit searching are AltaVista, Yahoo, Excite, Infoseek, Lycos, WebCrawler, OpenText, Magellan, and HotBot. Between them they have indexed more than 80 million web pages.

But the process through which information providers' sites become known to the search engines is not entirely straightforward. This document describes the issues and makes recommendations of interest to information providers.

Factors Affecting Site Listing and Ranking in Response to Search Requests

A site's listing and ranking in search responses is governed by a number of factors:

  1. When "www.mysite.org" comes into existence, a site representative may submit its address to the search engine's site for indexing. Alternatively (or additionally) the search engine may discover the site's address during ongoing link-by-link explorations of the Web.
  2. Keywords and other information about www.mysite.org, either manually submitted or automatically extracted from the site's pages, are registered in the search engine’s database.
  3. When people make search requests that relate to www.mysite.org, search engines use their pre-stored information to retrieve and arrange search results. Listed sites will appear, including www.mysite.org. Results are presented in an order determined by the search engine's design. This order may be alphabetical or chronological, but it may also represent an automatic ranking of relevancy in light of parameters in the requested search.
  4. As newer sites come along, they may supplant www.mysite.org and appear higher in search responses.
  5. If www.mysite.org changes address (its URL), search engines may be unable to find it. This will result in a broken link in the engine's listing. If site content has changed, and especially if there are new keywords, the site may no longer be properly searchable. Search engines revisit sites on an ongoing basis to see if they are still reachable or if contents have changed. Accordingly, they may reindex the site. Because of the size of the Web, content providers cannot depend on the timeliness of automatic reindexing.
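
To make steps 2 and 3 concrete, here is a toy sketch in Python of a keyword index and a relevancy ranking. It is an illustration only: the page texts, the extra example.com and example.net addresses, and the scoring rule (count how often the query words occur) are all assumptions made for the example, and no real search engine is claimed to work this way.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to a Counter of {url: occurrences} (a tiny inverted index)."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word][url] += 1
    return index


def search(index, query):
    """Return URLs ranked by total occurrences of the query words, best first."""
    scores = Counter()
    for word in tokenize(query):
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Sort by descending score, then alphabetically so ties are deterministic.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical pages standing in for text extracted from indexed sites.
PAGES = {
    "http://www.mysite.org/": "Organic farming news and organic farming advice",
    "http://www.example.com/": "Farming equipment for sale",
    "http://www.example.net/": "Recipes and cooking tips",
}

index = build_index(PAGES)
results = search(index, "organic farming")
print(results)
```

Pages that contain none of the query words are simply absent from the results, which is the sense in which a poorly indexed site is invisible to searchers.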

The factors above raise questions for information providers:

Of course, these issues are relevant to information consumers as well. A site that provides the best information on a particular subject but cannot be found by search engines is nearly invisible.

Relevancy Rankings

Search engines often use proprietary methods to determine the relevancy of a site with respect to keywords. To make it difficult for Web site owners to manipulate the rankings of their sites in search results, the ranking methods are sometimes kept secret. Search engines might take some or all of the following into account when assigning relevancy:

In addition, the following aspects of page design can cause problems for search engines, or even prevent them from discovering pages to index:

What Information Providers Need to Do to Optimize Searching

There are two things that information providers can do to maximize the likelihood that search sites index them correctly and rank them highly. These are (1) tune sites explicitly for the search engines, and (2) regularly resubmit sites to the search engines.

Tuning Your Site for Search Engines

User-agent: *
Disallow: /private/

This file would instruct any robot conforming to the robots exclusion standard not to visit any documents in the /private URL space (or any of its subdirectories).
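
One way to check what such a file permits is Python's standard urllib.robotparser module. The sketch below feeds it the two lines shown above and asks about two addresses; the robot name "ExampleBot" and the page names are made up for the illustration.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt from the text, as a list of lines.
ROBOTS_TXT = [
    "User-agent: *",
    "Disallow: /private/",
]


def build_parser(lines):
    """Return a parser loaded with the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a made-up robot name; the "*" record applies to any robot.
private_ok = parser.can_fetch("ExampleBot", "http://www.mysite.org/private/notes.html")
public_ok = parser.can_fetch("ExampleBot", "http://www.mysite.org/index.html")
print("private page allowed:", private_ok)
print("public page allowed:", public_ok)
```

Keep in mind that the file is only a request: it has an effect only on robots that choose to honor the standard.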

<META name="description" content="A description of the site or page.">
<META name="keywords" content="a comma-separated series of keywords by which you want your Web site to be located">

Note that Excite and some other search engines ignore <META> tags because they have been abused by webmasters trying to attract more hits. If you do use <META> tags, choose your description and keywords carefully and accurately.
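
Before submitting a page, it can help to see what an indexing program would read from its <META> tags. The following is a minimal sketch using Python's standard html.parser module; the sample page and its contents are invented for the example, and real search engines may read the tags differently or not at all.

```python
from html.parser import HTMLParser


class MetaTagReader(HTMLParser):
    """Collect the content of named <META> tags such as description and keywords."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META> arrives as "meta".
        if tag != "meta":
            return
        fields = dict(attrs)
        name = fields.get("name")
        content = fields.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


def read_meta(html):
    """Return a dict mapping each META name to its content string."""
    reader = MetaTagReader()
    reader.feed(html)
    reader.close()
    return reader.meta


# A hypothetical page used only for this illustration.
SAMPLE_PAGE = """<HTML><HEAD>
<TITLE>Example Farm Cooperative</TITLE>
<META name="description" content="News and prices for a hypothetical farm cooperative.">
<META name="keywords" content="farming, cooperative, grain prices">
</HEAD><BODY>Welcome.</BODY></HTML>"""

meta = read_meta(SAMPLE_PAGE)
keywords = [k.strip() for k in meta.get("keywords", "").split(",") if k.strip()]
print(meta["description"])
print(keywords)
```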

For more information about fine-tuning your web site for search engines (as well as tips on using search engines for your own searches), see the Search Engine Watch pages.

Resubmitting Your Site

If your site has changed addresses or its content has changed significantly, you should resubmit site information to the major search engines. Some search engines have web forms for submitting or resubmitting a page, and some of them require submissions to be made by email. Some search engines accept a root URL for a site and automatically index the site by following links from the root. Others, including Infoseek, accept only single pages--sites with multiple pages must submit these via email as a list of URLs.
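
For engines that want a list of individual pages, the list can be assembled the same way a link-following engine would discover them: start at the root and follow links that stay on the site. The sketch below does this in Python over a small in-memory stand-in for a site, so that it runs without a network connection. The page names and the fetch function are assumptions for the illustration; a real run would retrieve pages over HTTP instead.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href values of <A> tags in one page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def site_urls(root, fetch):
    """Walk breadth-first from root, staying on the root's host.

    fetch(url) must return the page's HTML, or None if the page is missing.
    Returns the URLs of the pages that were actually found, in visit order.
    """
    host = urlparse(root).netloc
    seen = {root}
    found = []
    queue = deque([root])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # broken link: nothing to list
        found.append(url)
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            # Resolve relative links and drop any "#fragment" part.
            target, _fragment = urldefrag(urljoin(url, href))
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return found


# Hypothetical site contents, keyed by URL.
FAKE_SITE = {
    "http://www.mysite.org/": '<A HREF="about.html">About</A> '
                              '<A HREF="/news/">News</A> '
                              '<A HREF="http://www.example.com/">Elsewhere</A>',
    "http://www.mysite.org/about.html": '<A HREF="/">Home</A>',
    "http://www.mysite.org/news/": '<A HREF="../about.html#staff">Staff</A> '
                                   '<A HREF="old.html">Old</A>',
}

urls = site_urls("http://www.mysite.org/", FAKE_SITE.get)
print("\n".join(urls))
```

Links to other hosts are skipped, and a link to a page that no longer exists (old.html here) is left out of the list rather than submitted.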

Resubmission of site information is only a one-time fix for updating search engine indices. It doesn't guarantee that search services will revisit the site by any particular date.

Reindexing sites takes time. The average time between submitting your URL and getting it into the database seems to be 5-8 weeks. Lycos and Excite both say that it takes 2-4 weeks for a submitted site to appear in their indices. AltaVista says that a home page should be listed in one or two days, with the rest of the site being indexed "over time." Different services use different reindexing methods and decide differently how frequently to revisit a site to look for changes. If it takes a new site some weeks to make it into an index, a follow-up visit may take even longer. The decision to revisit a site may be influenced by how many hits the site gets, or how often it turns up in search queries.

Submission to Yahoo and some other search sites is a bit more involved. As a categorization service rather than a search service per se, Yahoo requires the submitter to specify which category or categories apply to a site. There can be multiple possibilities. Here are some possibilities for agricultural sites in Yahoo:

Non-profit or grant-making organizations might belong in some of the following categories:

Web site owners should select categories carefully, as Yahoo and some other categorization services allow only a limited number of links to a given site.

Submission Services

There are services that will manage submissions and resubmissions for you. Some of these are Postmaster (http://www.netcreations.com/postmaster/), Submit It! (http://www.submit-it.com/), and WebPromote (http://www.webpromote.com/av.shtml). The prices of these services range from free (for a limited number of search engines) to several hundred dollars.