Keeping an Eye on Search Engine Diversity
Wednesday, December 24th, 2008
Last week, I came across two fascinating stories that show some of the problems relating to search engine diversity. Chris Soghoian writes how Google’s restrictive Adwords policy stands in the way of informing searchers about public data about political donations. Paul Alan Levy comments on a story about a company called Lifestyle Lift that tries to lead users away from a critical website by optimizing its presence on the Google search result page.
Research shows that the majority of search engine users stay on the first page of results and usually click on the first few results only. This increases the importance of being at the top of the organic results. In addition to offering a competitive way to be found in the organic results, search engines allow organizations to bid on search terms, which results in sponsored results. This is the dominant search engine business model. Sponsored links typically show up on the right-hand side of the search result page, or at the top of the page if they perform above a certain threshold set by the search company. For practical purposes I will restrict my comments to Google.
What is clear is that the two sets of results represent two different information flows. The flow of organic results is the result of the crawling and ranking of sources on the Web. The sponsored results are the result of the Adwords auctioning system, or other forms of sponsoring, such as Google Grants or promotion by Google itself. Research shows that about half of Internet users might be unaware of a difference between organic and sponsored results.
Although it seems to make sense to require transparency between organic results and sponsored results, in reality the difference is complex and problematic. On a superficial level, the difference reflects ideas about transparency of advertising in the media. Listeners, viewers or internet users should be able to distinguish between editorial content and advertising (in the broad sense). But, in the context of search results, the difference between a sponsored and an organic result does not tell much about the amount of money that has been paid for the result to show up. It is also quite common for a website to show up both in the organic results and the sponsored results. The different processes that govern the ranking in the two sets of results are equally opaque for users. And finally, the sponsored results are the results that are in some ways more editorial in nature. My tentative conclusion is that the difference is not well understood and of limited value. What is more important is that we evaluate diversity and quality of search results pages in their totality.
Difference does not reflect payments
Both information flows of results, i.e. organic and sponsored, involve the spending of money to show up in the results as prominently as possible. The difference is who receives the payments, but it is clear that the user is not in a position to evaluate them. Whereas an organic result might show up for certain search terms without someone having invested any money into that, many organizations and companies use search engine optimization services to optimize their organic ranking and presence in search results. The example of Lifestyle Lift shows that sometimes complete websites are being set up to enhance presence and ranking.
It would be interesting to see how much is typically paid for organic optimization in comparison to sponsored optimization. Organic results might be seen as more trustworthy by internet users who think that they understand the difference between the sets of results. This increases the incentives to invest in organic optimization. If the ability to place sponsored links on search engine property were restricted, one can expect increased pressure on organic results. There are no editorial restrictions on advertising in organic results.
Duplication of results and sources is common
Especially for searches related to products, companies and services, it is common that some of the sponsored links are duplicates of the organic links. Duplication blurs the line between organic and sponsored results. An argument in favor of duplication is that it decreases noise for users who were looking for that source anyway. Duplication is problematic because it decreases the diversity of the results and the possibility that users are confronted with information that adds another perspective to a search. The fact that duplication is fairly common suggests that it is worthwhile for advertisers to push organic results off the first page and out of the attention of most searchers. This is especially the case if the duplication involves a sponsored result at the top of the page on the left, and the same organic result right below it. Duplication might be costly, so this is a typical case where the rich get richer because they manage to capture the attention of internet users.
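As a rough illustration of how duplication on a single result page could be measured, the sketch below compares the sources behind the sponsored and the organic links. It assumes that a "source" can be approximated by the host name of a URL, which is a simplification, and the function names and all URLs are made up for the example.

```python
from urllib.parse import urlparse


def source_of(url):
    """Return the host name of a URL, lower-cased and without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def duplication_report(sponsored_urls, organic_urls):
    """Summarize how much the sponsored and organic results overlap by source."""
    sponsored = {source_of(u) for u in sponsored_urls}
    organic = {source_of(u) for u in organic_urls}
    duplicated = sponsored & organic
    return {
        "duplicated_sources": sorted(duplicated),
        # Number of different sources on the page as a whole.
        "distinct_sources": len(sponsored | organic),
        # Share of sponsored sources that also appear organically.
        "sponsored_duplication_rate": len(duplicated) / len(sponsored) if sponsored else 0.0,
    }


# Hypothetical result page: every URL below is invented for illustration.
sponsored = ["https://www.example-clinic.com/offer", "https://reviews.example.org/"]
organic = [
    "https://example-clinic.com/",
    "https://critic.example.net/complaints",
    "https://news.example.com/story",
]
report = duplication_report(sponsored, organic)
print(report)
```

A page with a high duplication rate and few distinct sources would be a candidate for the kind of reduced diversity described above, although a real study would need a more careful notion of what counts as the same source.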
Ranking mechanisms of both sets of results are equally opaque
The ranking of organic search results by Google is notoriously opaque. The PageRank algorithm, which uses links to establish a global measure of relevance for websites, is only one of the many (200+) factors that Google uses to establish the ranking of organic results. If one wants to know about ranking algorithms, the place to look is the search engine optimization industry. These search engine experts have a pretty good idea of how search engines work and walk the fine line between accepted optimization and being punished. Ironically, this industry is also seen as the major reason for search engines not to be more transparent about their ranking algorithms.
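To give a feel for the link-based part of the ranking, here is a minimal sketch of the textbook version of PageRank, computed by repeated iteration over a tiny link graph. It is not Google's production ranking, which combines many other factors. The damping value of 0.85 is the conventional textbook choice, and the four-page graph is invented for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Every page passes an equal share of its rank to each page it links to.
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Hypothetical link graph: "a" and "b" link to "c", "c" links back to "a",
# and "d" has no links at all.
graph = {"a": ["c"], "b": ["c"], "c": ["a"], "d": []}
ranks = pagerank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph the page that receives the most links ends up with the highest score, which is the intuition behind using links as a global measure of relevance.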
The ranking of sponsored results is equally opaque. I have heard some experts explain that the ranking of sponsored links reflects the bids of the advertisers, but in reality it is more complex. Google places advertisements on the basis of its keyword auction and a quality score. The quality score is determined by Google. It includes the historical click-through rate and other, partly undisclosed, relevance factors. Thus, the highest placed sponsored link is not necessarily the one that involves the highest willingness to pay. In addition, some sponsored links are placed (and can be clicked on) without them having to make a payment. They are sponsored by Google.
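The effect of a quality score on the ordering can be sketched with a toy model. The example below assumes that ads are ordered by the product of bid and quality score, which is a common simplified description of such auctions and not a statement of Google's actual formula. The advertisers, bids and scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    advertiser: str
    bid: float            # maximum bid per click (hypothetical numbers)
    quality_score: float  # assumed to be assigned by the search engine


def rank_ads(ads):
    """Order ads by bid * quality_score, highest first (an assumed simplification)."""
    return sorted(ads, key=lambda ad: ad.bid * ad.quality_score, reverse=True)


ads = [
    Ad("high_bidder", bid=2.00, quality_score=3.0),
    Ad("relevant_ad", bid=1.20, quality_score=8.0),
    Ad("low_bidder", bid=0.50, quality_score=9.0),
]
ranking = [ad.advertiser for ad in rank_ads(ads)]
print(ranking)
```

Under these assumed numbers the advertiser with the highest bid does not get the top position, because a lower bid combined with a higher quality score outranks it. The toy model does not cover links that are placed without any payment.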
Quality control on sponsored links is stronger than for organic results
Interestingly, the sponsored results are subject to a detailed editorial policy that guarantees the quality of sponsored results. The policy of Google Adwords also includes the suppression of unauthorized usage of trademarks, as Chris Soghoian found out. There are several legal and non-legal reasons for this, e.g. the fact that search engines receive money for the placement of these results. Thus, sponsored links might end up being of higher (average) quality and more relevant for some keywords. The relative balance between the quality of organic and sponsored results is of major importance to search engines. If search engines can keep the overall quality of the results in their totality the same, while shifting the relevance of results towards the sponsored results, they make more money.
Keeping an eye on search engine diversity
My conclusion from these ideas and remarks is that we need to focus more on quality metrics of actual search result pages in their totality than on transparency between organic and sponsored results. This difference is of limited value to users. I have the impression that Google is committed to diversity of search results and have heard one Google engineer in Europe say this publicly, but more independent empirical research (Benkler 2008, p. 285) is needed in this direction. Search engines could also be more explicit about this commitment, for instance by being transparent about their commitments to diversity and implementing policies that prevent duplication of results.
