Sunday, 18 October 2009

The way search engines work

Web search engines work by storing information about many web pages, which they retrieve from the World Wide Web. These pages are retrieved by a web crawler, an automated web browser that follows every link it sees. The contents of each page are then analyzed to determine how the page should be indexed (for example, words are taken from the title, headings, or special fields called meta tags). Data about web pages is stored in an index database for use in later searches. Some search engines, such as Google, store all or part of the source page (called a cache) as well as information about the web page itself.
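The crawl-and-index step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "web" here is a hypothetical in-memory dictionary called PAGES rather than real HTTP requests, and the function name crawl_and_index is made up for this example. A real crawler would also handle politeness rules, errors, and far larger data.

```python
import re
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (title, body text, outgoing links).
# A real crawler would fetch these pages over HTTP instead.
PAGES = {
    "a.example": ("Search engines", "how search engines index pages",
                  ["b.example", "c.example"]),
    "b.example": ("Web crawlers", "a crawler follows every link",
                  ["a.example"]),
    "c.example": ("Meta tags", "meta tags describe pages", []),
}


def crawl_and_index(start):
    """Crawl breadth-first from start and build an inverted index.

    Returns a mapping word -> set of URLs whose title or body contains it.
    """
    index = defaultdict(set)
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        title, body, links = PAGES[url]
        # Index every word found in the title and the body.
        for word in re.findall(r"[a-z]+", f"{title} {body}".lower()):
            index[word].add(url)
        # Follow each link that has not been visited yet.
        for link in links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


index = crawl_and_index("a.example")
```

After the crawl, index["pages"] holds the two pages that mention the word "pages", which is the kind of lookup a later search relies on.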

When a user visits a search engine and enters a query, typically by typing keywords, the engine looks up its index and returns a list of the web pages that best match the criteria, usually with a short summary containing the document's title and sometimes part of its text.
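As a rough sketch of that lookup, the Python below answers a keyword query from a small, hand-written index. The names INDEX, SUMMARIES, and search are assumptions made for this example, and requiring every keyword to match is just one simple matching rule; actual engines use more elaborate ones.

```python
# Hypothetical index: word -> set of URLs containing that word.
INDEX = {
    "search": {"a.example", "b.example"},
    "engine": {"a.example"},
    "crawler": {"b.example", "c.example"},
}

# Hypothetical short summaries (title plus a bit of text) per page.
SUMMARIES = {
    "a.example": "Search engines: how a search engine works",
    "b.example": "Crawlers: how a crawler feeds search",
    "c.example": "Links: what a crawler follows",
}


def search(query):
    """Return (url, summary) pairs for pages containing every keyword."""
    words = query.lower().split()
    if not words:
        return []
    # One set of URLs per keyword; unknown words match nothing.
    postings = [INDEX.get(word, set()) for word in words]
    matches = set.intersection(*postings)
    return [(url, SUMMARIES[url]) for url in sorted(matches)]


results = search("search crawler")
```

Here only b.example contains both "search" and "crawler", so it is the single result returned along with its summary.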

There is another kind of search engine: the real-time search engine, such as Orase. Engines like this do not use an index. The information the engine needs is collected only when a new search is made. Compared with index-based systems such as Google, real-time systems are superior in several respects: the information is always up to date, there are (almost) no dead links, and fewer system resources are required. (Google uses nearly 100,000 computers, Orase only one.) There is also a disadvantage: a search takes longer to complete.

The usefulness of a search engine depends on the relevance of the results it returns. Although there may be millions of web pages containing a particular word or phrase, some pages may be more relevant, popular, or authoritative than others. Most search engines employ methods to rank the results so that the "best" ones are shown first. How an engine decides which pages are the best matches, and in what order the results are shown, varies widely from one engine to another. The methods also change over time as Internet usage changes and new techniques evolve.
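To make the idea of ranking concrete, here is a deliberately simple sketch that orders pages by how often the query words occur in them. This is not how any particular search engine ranks results; counting word occurrences is just one easy stand-in for "relevance", and the DOCS data and the rank function are invented for the example.

```python
from collections import Counter

# Hypothetical documents: URL -> page text.
DOCS = {
    "a.example": "search engines rank search results",
    "b.example": "a crawler follows links",
    "c.example": "results of a search",
}


def rank(query, docs):
    """Order matching URLs by total occurrences of the query words."""
    terms = query.lower().split()
    scored = []
    for url, text in docs.items():
        counts = Counter(text.lower().split())
        score = sum(counts[term] for term in terms)
        if score > 0:  # skip pages that match no query word
            scored.append((score, url))
    # Highest score first; ties are broken alphabetically by URL.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored]


ranked = rank("search results", DOCS)
```

With this data, a.example scores 3 and c.example scores 2, so a.example is listed first and b.example is left out because it matches nothing.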

Most web search engines are commercial ventures supported by advertising revenue, and as a result some follow the controversial practice of allowing advertisers to pay to have their pages ranked higher in search results.
