A web search engine like Google has three pieces. The first is an automated program that roams the Web, downloading everything it finds. This program (often known by more picturesque names like spider, robot, bot, or crawler) eventually stumbles across your site and copies its contents.
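To make the idea concrete, here's a minimal crawler sketch in Python. It isn't how Google's crawler actually works; it just illustrates the "follow links and download pages" loop. It assumes the third-party `requests` library is installed, and the starting URL is only a placeholder.

```python
# A toy crawler: download a page, pull out its links, and queue them up.
import re
import requests

def crawl(start_url, max_pages=10):
    to_visit = [start_url]
    seen = set()
    pages = {}                      # url -> raw HTML
    while to_visit and len(pages) < max_pages:
        url = to_visit.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            html = requests.get(url, timeout=5).text
        except requests.RequestException:
            continue                # skip pages that fail to download
        pages[url] = html
        # Queue every absolute link found on the page for a later visit.
        to_visit.extend(re.findall(r'href="(https?://[^"]+)"', html))
    return pages

pages = crawl("https://example.com")
```

A real crawler also respects robots.txt, spreads requests out over time, and revisits pages to pick up changes, but the basic cycle is the same.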
The second piece is an indexer that chews through web pages and extracts a bunch of meaningful information, including the page's title, description, and keywords. The indexer also records a great deal of more esoteric data. For example, a search engine like Google keeps track of the words that crop up most often on a page, what other sites link to your page, and so on. The indexer inserts all this digested information into a giant catalog (technically, a database).
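Here's a toy version of that step, building on the crawler sketch above. It pulls out each page's title and counts how often every word appears, then stores the results in an ordinary Python dictionary standing in for the catalog. Real indexers record far more (links, descriptions, page structure), so treat this only as an illustration of the idea.

```python
# A toy indexer: extract the title and word frequencies for each page.
import re
from collections import Counter

def index_page(url, html):
    title_match = re.search(r"<title>(.*?)</title>", html,
                            re.IGNORECASE | re.DOTALL)
    title = title_match.group(1).strip() if title_match else url
    text = re.sub(r"<[^>]+>", " ", html)          # strip tags, keep visible text
    words = re.findall(r"[a-z]+", text.lower())
    return {"title": title, "word_counts": Counter(words)}

# `pages` comes from the crawler sketch above.
catalog = {url: index_page(url, html) for url, html in pages.items()}
```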
The search engine's final task is the part you are probably most familiar with - the front end, or search home page. You enter the keywords you are hunting for, and the search engine scans its catalog looking for suitable pages. Different engines have different ways of choosing pages, but the basic idea is to make sure the best and most relevant pages turn up early in the search results. (The best pages are those the search engine ranks as highly popular and well linked; the most relevant pages are those that most closely match the search keywords.)
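A bare-bones query function against the toy catalog above might look like the sketch below. It scores each page simply by how often the search keywords appear and lists the highest-scoring pages first; real engines combine relevance with popularity signals such as incoming links, which this example ignores.

```python
# A toy query: score pages by keyword frequency and return the best matches.
def search(catalog, query, limit=10):
    keywords = query.lower().split()
    scored = []
    for url, entry in catalog.items():
        score = sum(entry["word_counts"][word] for word in keywords)
        if score > 0:
            scored.append((score, entry["title"], url))
    scored.sort(reverse=True)                 # highest-scoring pages first
    return scored[:limit]

for score, title, url in search(catalog, "green tea"):
    print(f"{score:4d}  {title}  ({url})")
```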
Due to the complex algorithms search engines use, a slightly different search (say, "green tea health" instead of just "green tea") can get you a completely different set of results.