Saturday, 8 October 2011

How a Search Engine Works



We all know that paging through hundreds of pages in a directory is an obsolete way of searching for a product, service or information. Directories also have their own limitations with respect to content and ease of use. That is how the need for online search and search engines arose.
Before a search engine can tell you where a file or document is, it must be found. To find information on the hundreds of millions of Web pages that exist, a search engine employs special software robots, called spiders, to build lists of the words found on Web sites. When a spider is building its lists, the process is called Web crawling. (There are some disadvantages to calling part of the Internet the World Wide Web -- a large set of arachnid-centric names for tools is one of them.) In order to build and maintain a useful list of words, a search engine's spiders have to look at a lot of pages.
How does any spider start its travels over the Web?

The usual starting points are lists of heavily used servers and very popular pages. The spider will begin with a popular site, indexing the words on its pages and following every link found within the site. In this way, the spidering system quickly begins to travel, spreading out across the most widely used portions of the Web.
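The spreading-out behaviour described above can be sketched as a breadth-first walk over links. This is only an illustration, not how any particular search engine does it: the `TOY_WEB` dictionary, the `crawl` function and the example URLs are all made up for this post, and a real spider would fetch pages over the network instead of reading a dictionary.

```python
from collections import deque

# Hypothetical in-memory "web": page URL -> URLs that page links to.
TOY_WEB = {
    "popular.example/": ["popular.example/a", "popular.example/b"],
    "popular.example/a": ["popular.example/b", "other.example/"],
    "popular.example/b": ["popular.example/"],
    "other.example/": [],
}


def crawl(seeds, links, max_pages=100):
    """Visit pages breadth-first from the seeds, following each link once."""
    queue = deque(seeds)   # pages waiting to be visited
    seen = set(seeds)      # pages already queued, so loops are not revisited
    visited = []           # pages in the order the spider reached them
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


order = crawl(["popular.example/"], TOY_WEB)
print(order)
```

Starting from one popular page, the walk reaches the pages it links to first and only then moves on to sites further away, which is the "spreading out" effect.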

Spiders take a Web page's content and create key search words that enable online users to find pages they're looking for.

When a search engine's spider looks at an HTML page, it takes note of two things:

      The words within the page

      Where the words were found

Words occurring in the title, subtitles, meta tags and other positions of relative importance are noted for special consideration during a subsequent user search. The spiders are built to index every significant word on a page, leaving out the articles "a," "an" and "the." Different spiders take different approaches.
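As a rough sketch of the two things a spider notes (the word and where it was found), the snippet below builds a tiny word index that skips the articles. The names `index_page` and `ARTICLES`, the sample URL and the choice of just two fields ("title" and "body") are assumptions made for illustration; real spiders differ in what they record.

```python
import re
from collections import defaultdict

# Words left out of the index, as described above.
ARTICLES = {"a", "an", "the"}


def index_page(url, title, body, index):
    """Record (url, field, position) for every non-article word on a page."""
    for field, text in (("title", title), ("body", body)):
        words = re.findall(r"[a-z0-9']+", text.lower())
        for position, word in enumerate(words):
            if word not in ARTICLES:
                index[word].append((url, field, position))
    return index


index = defaultdict(list)
index_page(
    "example.com/spiders",                   # hypothetical page address
    "The Web Spider",                        # page title
    "A spider follows the links on a page",  # page body text
    index,
)
print(index["spider"])
```

Because each entry keeps the field name, a later search could give a word found in the title more weight than the same word found in the body.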
