In the early days of the web, Yellow Pages-style directories (e.g. Yahoo) were the primary way to locate information. Then came spiders that collected web pages into repositories searchable by keyword (Lycos, AltaVista). These search engines did not take advantage of the connected nature of the web. Google changed that by ranking a page based on the quantity and quality of the pages linking to it.
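To make the "quantity and quality" idea concrete, here is a minimal sketch of link-based ranking in the spirit of PageRank. It is an illustration, not Google's actual implementation: the function name `rank_pages`, the damping factor of 0.85, the iteration count, and the toy link graph are all assumptions made for the example.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Score pages by repeatedly passing each page's score along its links.

    `links` maps a page name to the list of pages it links to.
    The damping value and iteration count are illustrative assumptions.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally important.
    scores = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * scores[page] / n
                for other in pages:
                    new_scores[other] += share
        scores = new_scores
    return scores


# Hypothetical toy web: "endorsed" and "other" each have exactly one inbound link,
# but the link to "endorsed" comes from a page that is itself heavily linked.
toy_web = {
    "fan1": ["popular"],
    "fan2": ["popular"],
    "fan3": ["popular"],
    "popular": ["endorsed"],
    "obscure": ["other"],
    "endorsed": [],
    "other": [],
}

scores = rank_pages(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(f"{page:10s} {scores[page]:.3f}")
```

In this toy graph, "endorsed" outranks "other" even though both have a single inbound link, because its link comes from a page that many others point to. That is the "quality" half of the idea; the three links into "popular" are the "quantity" half.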
What is next is still an open question. The most common answer is clustering. Teoma (bought by Ask Jeeves) was one of the earliest search engines to use clustering. Vivisimo and its sister search engine, Clusty, are two more examples. Now Yahoo is jumping on the bandwagon with Mindset.
The advantage of clustering is the possibility of teasing out semantic information by grouping pages based on their content and how they link to each other. If someone searches for "Lincoln", does she want a biography of Abraham Lincoln, information about Lincoln, Nebraska, or tickets for Lincoln Center? Clustering lets the search engine present some basic categories so that the user can easily refine her search. In the Lincoln example, the user could limit her search to pages in the Lincoln Center cluster to get directions and concert times. Yahoo's Mindset is simpler in that it restricts the clusters to shopping or informational groupings. Right now, clustering search engines do not look like a threat to Google's dominance.
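As a rough illustration of the Lincoln example, the sketch below groups result snippets by shared vocabulary. It is not how Teoma, Vivisimo, or Mindset work: the greedy single-pass grouping, the Jaccard word-overlap measure, the 0.2 threshold, the stopword list, and the sample snippets are all assumptions chosen for the example, and it uses only page content, not link structure.

```python
import re

# Words too common to signal a topic. The query term itself ("lincoln")
# is included because every result contains it.
STOPWORDS = {"a", "an", "and", "at", "for", "in", "of", "the", "to", "lincoln"}


def tokens(text):
    """Return the set of lowercase topic words in a snippet."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def jaccard(a, b):
    """Overlap between two word sets: shared words divided by all words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_results(snippets, threshold=0.2):
    """Greedily assign each snippet to the most similar existing cluster.

    A snippet starts a new cluster when no existing cluster reaches
    the similarity threshold (an illustrative value, not a tuned one).
    """
    clusters = []
    for snippet in snippets:
        words = tokens(snippet)
        best, best_score = None, threshold
        for cluster in clusters:
            score = jaccard(words, cluster["words"])
            if score >= best_score:
                best, best_score = cluster, score
        if best is None:
            clusters.append({"words": set(words), "members": [snippet]})
        else:
            # Grow the cluster's vocabulary with the new member's words.
            best["words"] |= words
            best["members"].append(snippet)
    return [cluster["members"] for cluster in clusters]


# Hypothetical result snippets for the query "Lincoln".
results = [
    "Abraham Lincoln biography: the sixteenth president",
    "Lincoln, Nebraska city guide and visitor information",
    "Lincoln Center concert tickets and directions",
    "President Abraham Lincoln and the Civil War",
    "Concert times at Lincoln Center",
    "Nebraska state capital: city of Lincoln",
]

groups = cluster_results(results)
for number, members in enumerate(groups, start=1):
    print(f"Cluster {number}:")
    for member in members:
        print(f"  {member}")
```

With these snippets the results fall into three groups: the president, the city, and the performing arts center. A user could then narrow her search to just one of them.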
Comments
A big news item in the search engine world lately is the claim that Google is using humans to refine its index. The full story is here. I haven't seen it hit the major tech news sources yet. I think it is just another way to defeat the spammers and search engine optimizers, and not as big a news item as some would like to make it out to be.
Posted by: CJ Costello on Saturday, June 11, 2005