Comparison of search engines 2008

They solve problems, they answer questions, and hell, they even make money. Here is an in-depth analysis of the “big three” search engines, how they compare to other search engines, and how they work behind the scenes.

Everyone has a favorite search engine. Google is used by the majority, next is Yahoo! for those that think different, and then (for those that can’t figure out how to change browser settings) there is Microsoft’s Live. Most people use search engines without much thought as to how they return results, but each has its own method for ranking them, largely dependent on how its crawler works. First let’s look at the obvious favorite, Google.

It has become a household name, even a verb: to Google something. Begun as a research project in 1996 (the same year Ask.com was founded, no less) and incorporated in 1998, Google has grown exponentially in both users and profit, thanks largely to its search-driven offerings and services. Its web crawler, Googlebot, has indexed billions of pages on the internet, which are ranked using an internal technology called “PageRank”. PageRank mainly works in terms of popularity, meaning a page with thousands of links pointing to it will get ranked higher than a page with only hundreds. On the other hand, if a page has hundreds of links from regular blogs, and a similar page gets linked to from Yahoo! (or an equally popular site), page #2 just earned a higher rank, since the site linking to it has more links itself (quickly becoming a very complex math equation, of a kind studied since the early days of hypertext).
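To make the “a link from a popular site counts for more” idea concrete, here is a minimal sketch of the textbook PageRank iteration. It is not Google’s production algorithm; the damping factor of 0.85, the iteration count, and the tiny example web (pages named A, B, hub, and so on) are all assumptions made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank.

    links maps each page name to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical web: A gets three links from small blogs,
# B gets a single link from a hub that ten other pages link to.
web = {
    "blog1": ["A"],
    "blog2": ["A"],
    "blog3": ["A"],
    "hub": ["B"],
}
for i in range(10):
    web[f"fan{i}"] = ["hub"]

ranks = pagerank(web)
print("A:", round(ranks["A"], 4))
print("B:", round(ranks["B"], 4))
```

In this made-up web, B ends up with the higher score even though A has three times as many inbound links, because B’s single link comes from a page that is itself heavily linked.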

The downside to this shows up when, say, two pages cover the topic of kangaroos. Page #1 only skims the surface with general facts, yet has many links to it. Page #2 covers the topic in depth, obviously using the word “kangaroo” many times, but not many pages link to it, since for the most part the internet at large has become less information-centric. Page #1 will be ranked very high in a Google search for “kangaroo”, given its content and linkage, while page #2 will be located maybe on the second or third results page.

This is where Yahoo! comes in. Yahoo!’s Slurp crawler, while also somewhat concerned with linkage, often treats the actual content itself as a higher priority than simple link popularity. I have found this to be true with startup websites, which are more often than not ranked high on Yahoo! initially, while on Google it takes a while to achieve rank (since Google also takes site age into consideration in addition to the related link references). So Yahoo! is more likely to show an obscure website with content and keywords related to the search query than a popular website with only some of the query content, a stark contrast to Google’s PageRank popularity contest.
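The difference between the two approaches can be sketched with a toy scoring function that mixes keyword density with link popularity. Neither engine publishes its real formula, so everything here is an assumption for illustration: the `score` function, the weights, the sample texts, and the link counts are all hypothetical, chosen to mirror the two kangaroo pages described earlier.

```python
import math
import re


def term_frequency(text, term):
    """Fraction of the words in text that match term (case-insensitive)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    return words.count(term.lower()) / len(words)


def score(page, term, content_weight, link_weight):
    """Blend keyword density with the log of the inbound link count."""
    content = term_frequency(page["text"], term)
    popularity = math.log1p(page["inbound_links"])
    return content_weight * content + link_weight * popularity


def rank_pages(pages, term, content_weight, link_weight):
    """Return page names ordered from best to worst score."""
    ordered = sorted(
        pages,
        key=lambda page: score(page, term, content_weight, link_weight),
        reverse=True,
    )
    return [page["name"] for page in ordered]


pages = [
    {
        "name": "shallow",  # page #1: few facts, many links
        "text": "Kangaroo facts for kids, plus more fun animal facts",
        "inbound_links": 500,
    },
    {
        "name": "deep",  # page #2: in-depth, few links
        "text": "The kangaroo is a marsupial. A kangaroo hops, and a "
                "female kangaroo carries her joey in a pouch.",
        "inbound_links": 5,
    },
]

# Link-heavy weighting (the "popularity contest").
link_first = rank_pages(pages, "kangaroo", content_weight=1.0, link_weight=1.0)
# Content-heavy weighting (keywords matter most).
content_first = rank_pages(pages, "kangaroo", content_weight=100.0, link_weight=0.1)

print(link_first)
print(content_first)
```

With the link-heavy weights the shallow but popular page comes first; with the content-heavy weights the in-depth page wins. Real engines combine far more signals than these two, but the trade-off is the same one described above.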

This brings us to Windows Live (or MSN Search, or Microsoft Live, or just Live), which is the third most used search engine in North America (Baidu is also popular, and statistics place it ahead of Live, but its user base is mainly in China). As the default search engine in Microsoft Internet Explorer, on MSN.com, and in Windows in general (which happens to be installed on roughly 90% of desktop computers), it has a fairly decent user base, although not one comparable to Google’s or Yahoo!’s.

Live’s ranking system is more like (read: copied from) Google’s, mainly in a thinly veiled attempt to gain ground on Google (currently Microsoft’s primary competitor). Live does offer better regular-expression-like functionality in its search queries, something appreciated when you are really looking for something specific (e.g. something in ppt, txt, or doc format as opposed to HTML). However, Live search has come under fire for slightly censoring its search results compared to Google. A while back, I was doing a research paper on batch file viruses, so my search queries were… interesting to say the least. I noticed that Google and Live returned similar but not identical results for a group of keywords I searched for, the main difference being a specific site indexed on Google but not on Live. Because of that alone, I would refuse to use Live given a better alternative (we have several), since I may be looking for something that Microsoft, God forbid, does not want me to ‘see’.

From a webmaster’s SEO perspective, Yahoo! is the better indexer as far as content goes, with Google better as far as popularity goes (basically the yin and yang of internet searches). Live has its place, but I don’t find myself using it as often as the better two, a choice not helped by its lack of crawler documentation.

Two other search engines worth at least a look (for criticism) are Cuil and Ask.com. Ask.com was founded around the same time as Google, but does not come close to matching Google’s search quality despite matching Google’s maturity. Cuil is no better. A new engine on the scene, it was extremely hyped up as “the next Google”, but after many tried it with enthusiasm, it sharply fell to the status of “Google wannabe” (criticized most notably by John C. Dvorak), not helped by a slow web crawl that no doubt comes from lacking funds of Google’s size.

Ask.com’s standing as a search engine was hurt by its fairly subtle “anti-Google” campaign (information-revolution.org, which briefly held a higher ranking than Google in an Ask.com search for “Google” at the time). In addition to this, Ask.com suffered an identity crisis a while back as it bounced between askjeeves.com and the present ask.com in an attempt to stay “fresh” with searchers, although its popularity has fallen sharply of late, now that even Live has come onto the scene.

In conclusion, web search engines are almost the new flame war contestant: just as some still go on and on about Windows vs. Linux vs. Mac vs. X, some will fight just as hard about search engine dominance, when the engines clearly all have their ups and downs in comparison to each other, and they all pretty much leech off of Wikipedia for information hoarding. They are the best of the best, and others have a hard time establishing themselves in a world where favoritism comes with age (and hence with funds), where some have made millions indexing the internet for content and others have lost money trying to accomplish the same thing (or building services based on it). Whatever the case, they remain as crucial to the internet as access to the internet itself.



About Stephen:



Stephen (last name kept private) is currently a student at the University of South Carolina with a major in computer science. He is very knowledgeable when it comes to current as well as up-and-coming software technologies, and is renowned for his intuitive reviews of software products and services.
