PhD Thesis Submitted on Search Engine Freedom

Tuesday, November 22nd, 2011

Ten days ago, I submitted my PhD thesis:

‘Search Engine Freedom: On the implications of the right to freedom of expression for the legal governance of Web search engines’, Ph.D. thesis, Institute for Information Law, (submitted, November 2011).

I plan to start writing here again, after a long absence. The European Court of Justice’s decision in SABAM v. Tiscali, due this coming Thursday, will be a good moment to pick things up. I also plan to rework the site a bit and make this blog less central to it, but that will take some time.

My (Long) Take on the ECJ’s AG Opinion in the Google Adwords Cases

Friday, October 2nd, 2009

In its long awaited opinion in the French Google Adwords cases about trade mark infringement by keyword advertising, the European Court of Justice’s Advocate General, the young Miguel Poiares Maduro, has concluded that the sale and use of keywords to trigger sponsored search results does not constitute infringement. In addition, the AG has tried to clarify the scope of the intermediary liability regime from the Directive on Electronic Commerce as regards keyword advertising, hyperlinks and search engines, differentiating between third party liability for sponsored and natural results. There are several more cases about keyword advertising awaiting treatment by the ECJ. Hence, this opinion is the first in a series.

What is at stake?

The AG starts by rephrasing the questions as referred to the Court by the French Cour de Cassation as a larger question relating to the ability of end-users to use search engines to find information online:

5. The answer [to the questions before the Court] will determine the extent to which keywords corresponding to trade marks can be used outside the control of the proprietors of the trade marks. To put it differently: when you enter a keyword which corresponds to a trade mark, what can be given and what can you find in cyberspace?

This perspective – how much control should trademark law provide to the proprietors of trademarks to control communication more generally – remains dominant in the AG’s treatment of the legal issues involved in the three cases. The AG’s concern about the general communicative ecosystem, his critical discussion of the possible justification for trade mark protection to include keyword advertising or a contributory infringement doctrine, and most importantly the negative consequences for freedom of expression and freedom of commerce of extending protection come as a welcome surprise to me.

The line of reasoning with regard to trademark law

To structure his analysis, the AG discerns two uses of keywords by Google which correspond to trade marks in the Google Adwords system. The first is allowing advertisers to select keywords (Use 1). The second is displaying the resulting ads alongside the natural results (Use 2). For both uses, the AG addresses whether they fulfill the additional cumulative criteria for trademark infringement under Article 5(1) of Directive 89/104, which are:

(i) that use takes place in the course of trade;

(ii) it relates to goods or services which are identical or similar to those covered by the trade marks; and

(iii) it affects or is liable to affect the essential function of the trade mark – which is to guarantee to consumers the origin of the goods or services – by reason of a likelihood of confusion on the part of the public.

The AG concludes that Use 1 does not fulfill criterion ii, which requires “that a sign corresponding to a trade mark is used ‘in such a way that a link is established between the sign … and the goods marketed or the services provided’”. According to the AG, such a link is not established when Google is offering advertisers the possibility to select keywords:

In the traditional example of a use in advertising, the link is established between the trade mark and the good or service sold to the general public. This happens, for example, when the advertiser sells a good under the trade mark. That is not the case with the use by Google consisting in allowing advertisers to select keywords so that their ads are presented as results. There is no good or service sold to the general public. The use is limited to a selection procedure which is internal to AdWords and concerns only Google and the advertisers.

After this conclusion, the opinion discusses Use 2 of the trademark. In short, the AG concludes that Use 2 does not constitute trade mark infringement either, since it only satisfies criteria i and ii, but not criterion iii. There is a lack of likelihood of confusion on the part of the public. Interestingly, the AG pays particular attention to the consequences of concluding otherwise. He states that it would be difficult to make a legal distinction under the Court’s case law between the use of keywords to trigger advertisements and the use of keywords to trigger natural results. This would mean that the use of trademarks to trigger natural search results would amount to trademark infringement as well. Poiares Maduro’s analysis shows he has thought carefully about Google’s search engine and the distinction between sponsored and natural results. This is a distinction which I have also argued to be limited and more complex than in editorial media like newspapers. The AG’s point is that the distinction is not one of whether or not there is exposure, but one of the degree of exposure:

72. [..] by associating ads with certain keywords through AdWords, Google provides the advertisers’ sites with added exposure. However, it should be remembered that such sites, even the counterfeit ones, could feature among the natural results of the same keywords (depending on their relevance as detected by the search engine’s automatic algorithms). It should also be remembered that ads and natural results have very similar characteristics: a short message and a link. Accordingly, the difference between ads and natural results lies not so much in whether or not ads provide exposure, but more in the degree of such exposure. [...]

Only after these initial considerations, addressed to the Court, does the opinion deal with the fulfillment of criteria i-iii, in which the AG also addresses the use of keywords to trigger natural results. Criteria i and ii are satisfied for both sponsored and natural results (par. 75-81). But criterion iii is not, as there is no likelihood of confusion. Of course, such confusion may be present because of a specific use of the trademark by the advertiser. The question the AG had to answer, however, was whether the use of keywords corresponding to a trademark would constitute infringement per se (par. 83). The AG argues that such confusion would only be present if users assumed that all the sponsored results presented in response to a search with a keyword corresponding to a trademark originate from the trademark proprietor or from economically linked undertakings. In his conclusion that this is not the case, the AG relies on the search experience as a whole, which includes the assessment of natural and sponsored results. In particular, he uses the argument that searching in a Web search engine is what I would call a polluted process anyway. End-users continuously have to deal with results they might not have expected.

86. By comparing ads with natural results, the parties assume that natural results are a proxy for ‘true’ results – that is to say, that they originate from the trade mark proprietors themselves. But they do not. Like the ads displayed, natural results are just information that Google, on the basis of certain criteria, displays in response to the keywords. Many of the sites displayed do not in fact correspond to the sites of the trade mark proprietors.

87. The parties are influenced by the belief to which I referred at the outset – that if an internet user seeks something in Google’s search engine, the internet user will find it. However, that is not a blind belief; internet users are aware that they will have to sift through the natural results of their searches, which often reach large numbers. They may expect that some of those natural results will correspond to the site of the trade mark proprietor (or an economically linked undertaking), but they will certainly not believe this of all natural results. Moreover, sometimes they may not even be looking for the site of the trade mark proprietor, but for other sites related to the goods or services sold under the trade mark: for example, they might not be interested in purchasing the trade mark proprietor’s goods but only in having access to sites reviewing those goods.

88. Google’s search engine provides help in sifting through natural results by ranking them according to their relevance to the keywords used. There may be an expectation on the part of internet users, based on their assessment of the quality of Google’s search engine, that the more relevant results will include the site of the trade mark proprietor or whatever site they are looking for. However, this is nothing more than an expectation. Confirmation only comes when the site’s link appears, its description is read, and the link is clicked on. Often the expectation will be disappointed, and internet users will go back and try out the next relevant result.

89. Google’s search engine is no more than a tool: the link that it establishes between keywords corresponding to trade marks and natural results, even the more relevant sites, is not enough to lead to confusion. Internet users only decide on the origin of the goods or services offered on the sites by reading their description and, ultimately, by leaving Google and entering those sites.

90. Internet users process ads in the same way as they process natural results. By using AdWords, advertisers are in fact attempting to make their ads benefit from the same expectation of being relevant to the search – that is why they are displayed alongside the more relevant natural results. However, even assuming that the internet users are searching for the site of the trade mark proprietor, there is no risk of confusion if they are also presented with ads.

91. As with natural results, internet users will only make an assessment as to the origin of the goods or services advertised on the basis of the content of the ad and by visiting the advertised sites; no assessment will be based solely on the fact that the ads are displayed in response to keywords corresponding to trade marks. The risk of confusion lies in the ad and in the advertised sites, but, as has already been pointed out, the Court is not being asked about such uses by third parties: it is being asked only about the use by Google of keywords which correspond to trade marks.

92. It must be concluded, therefore, that neither the display of ads nor the display of natural results in response to keywords which correspond to trade marks leads to a risk of confusion as to the origin of goods and services. Accordingly, neither AdWords nor Google’s search engine affects or is in danger of affecting the essential function of the trade mark.

Additional functions of the trade mark restricted by freedom of expression and freedom of commerce

This still leaves additional functions of the trade mark unaddressed, such as the reputational function of well-known brands, which others can harm or take unfair advantage of. Notably, this type of protection does not depend on a risk of confusion of the public, and is therefore independent of the essential function of trademarks (par. 94). The Court has stated that such other functions of the trade mark “include guaranteeing the quality of goods or services and those of communication, investment or advertising; it has also stated that such functions are not limited to trade marks which have a reputation but apply in the case of all trade marks.” The AG continues by arguing that these functions of the trade mark, which he places on a sliding scale, are linked to the promotion of innovation and investment.

After having made a short overview of the justification of the protection of other functions of the trade mark, the AG places these interests against other interests involved in the Google Adwords cases. The AG argues that these interests need to be balanced against freedom of expression and freedom of commerce. In other words, the AG construes the protection of trade marks as an interference with freedom of expression and freedom of commerce. The AG concludes that in the search engine and keyword advertising context, freedom of expression and freedom of commerce trump the interests of trade mark proprietors to control the use of their trade marks as keywords in the absence of confusion.

103. Those freedoms are particularly important in this context because the promotion of innovation and investment also requires competition and open access to ideas, words and signs. That promotion is always the product of a balance that has been struck between incentives, in the form of private goods given to those who innovate and invest, and the public character of the goods necessary to support and sustain the innovation and investment. That balance is at the heart of trade mark protection. Accordingly, despite being linked to the interests of the trade mark proprietor, trade mark rights cannot be construed as classic property rights enabling the trade mark proprietor to exclude any other use. The transformation of certain expressions and signs – inherently public goods – into private goods is a product of the law and is limited to the legitimate interests that the law deems worthy of protection. It is for this reason that only certain uses may be prevented by the trade mark proprietor, while many others must be accepted.

After giving two such examples where use of trade marks must be accepted because of freedom of commerce and freedom of expression (comparative advertising and use for descriptive purposes), the AG addresses

[t]he question [..] whether freedom of expression and freedom of commerce should also take precedence over the interests of the trade mark proprietors in the context of Google’s uses of keywords which correspond to trade marks. Those uses are not purely descriptive; nor do they constitute comparative advertising. However, in a manner comparable to such situations, AdWords creates a link to the trade mark for consumers to obtain information that does not involve a risk of confusion. It does so both indirectly, when it allows the selection of keywords, and directly, when it displays ads.

107. Google’s uses of keywords which correspond to trade marks are independent of the use of the trade mark in the ads displayed and on the sites advertised in AdWords; they are limited to conveying that information to the consumer. Google does so in a manner which can be said to intrude even less on the interests of the trade mark proprietors than purely descriptive uses or comparative advertising. As I shall develop shortly, that point emerges more clearly if one reflects how absurd it would be to allow sites to use a trade mark for purely descriptive uses or comparative advertising, but not to allow Google to display a link to those sites. I believe, therefore, that the same principle should apply: given the lack of any risk of confusion, trade mark proprietors have no general right to prevent those uses.

108. I am concerned that, if trade mark proprietors were to be allowed to prevent those uses on the basis of trade mark protection, they would establish an absolute right of control over the use of their trade marks as keywords. Such an absolute right of control would cover, de facto, whatever could be shown and said in cyberspace with respect to the good or service associated with the trade mark.

109. It is true that, in the present cases, the trade mark proprietors limit their claims to Google’s uses in AdWords. Nevertheless, once the notion of ‘confusion’ between ads and natural results is dispelled, this becomes a matter of perspective. Trade mark proprietors may also try to prevent the display of natural results alongside ads. The right of control that they claim covers all the results of keywords corresponding to their trade marks.

110. That absolute right of control would not take into account the particular nature of the internet and the role of keywords in it. The internet operates without any central control, and that is perhaps the key to its growth and success: it depends on what is freely inputted into it by its different users. (57) Keywords are one of the instruments – if not the main instrument – by means of which this information is organised and made accessible to internet users. Keywords are therefore, in themselves, content-neutral: they enable internet users to reach sites associated with such words. Many of these sites will be perfectly legitimate and lawful even if they are not the sites of the trade mark proprietor.

111. Accordingly, the access of internet users to information concerning the trade mark should not be limited to or by the trade mark proprietor. This statement does not apply only to search engines such as Google’s; by claiming the right to exert control over keywords which correspond to trade marks in advertising systems such as AdWords, trade mark proprietors could de facto prevent internet users from viewing other parties’ ads for perfectly legitimate activities related to the trade marks. That would, for instance, affect sites dedicated to product reviews, price comparisons or sales of second-hand goods.

112. It should be remembered that those activities are legitimate precisely because trade mark proprietors do not have an absolute right of control over the use of their trade marks. The Court played a determining role in establishing this, by holding that the interests of trade mark proprietors were not sufficient to prevent consumers from benefiting from a competitive internal market. It would be paradoxical if the Court were now to curtail the possibility for consumers to have access to those benefits, as internet users, via the use of keywords.

113. It should therefore be concluded that the uses by Google, in AdWords, of keywords which correspond to trade marks do not affect the other functions of the trade mark, namely guaranteeing the quality of the goods or services or those of communication, investment or advertising. Trade marks which have a reputation are entitled to special protection because of those functions but, even so, such functions should not be considered to be affected. Thus, the uses by Google may not be prevented even if they involve trade marks which have a reputation.

Contributory infringement, third party liability and the absence of a safe harbour

In the remainder of the opinion, the AG discusses the responsibility of Google for trademark infringements taking place in the context of the Adwords system. The AG explains that the plaintiffs are arguing for a contributory infringement standard under European trade mark rules, by claiming that Google should answer under trade mark law for the possible (and in fact realistic) third party infringements in the context of the Adwords system. He asserts that this standard is foreign to most European countries and should be discarded. Instead, third party infringements should be dealt with through general liability rules. He gives one additional argument against a contributory liability standard, based on its possible chilling effects:

121. The claims of the trade mark proprietors would create serious obstacles to any system for the delivery of information. Anyone creating or managing such a system would have to cripple it from the start in order to eliminate the mere possibility of infringements by third parties; as a result, they would tend towards overprotection in order to reduce the risk of liability or even of costly litigation.

122. How many words would Google have to block from AdWords in order to be sure that no trade mark was infringed? And, if the use of keywords can contribute to trade mark infringements, how far would Google be from having to block those words from its search engine? It is no exaggeration to say that, if Google were to be placed under such an unrestricted obligation, the nature of the internet and search engines as we know it would change.

123. That does not mean that the concerns of the trade mark proprietors cannot be addressed, only that they should be addressed outside the scope of trade mark protection. Liability rules are more appropriate, since they do not fundamentally change the decentralised nature of the internet by giving trade mark proprietors general – and virtually absolute – control over the use in cyberspace of keywords which correspond to their trade marks. Instead of being able to prevent, through trade mark protection, any possible use – including, as has been observed, many lawful and even desirable uses – trade mark proprietors would have to point to specific instances giving rise to Google’s liability in the context of illegal damage to their trade marks. They would need to meet the conditions for liability which, in this area, fall to be determined under national law.

124. It is in the context of possible liability that particular aspects of Google’s role – such as the procedure under which it allows advertisers to select keywords under AdWords – could be taken into account. For example, Google provides advertisers with optional information which can help them to maximise the exposure of their ads. As some of the parties have pointed out, it may be that information on keywords which correspond to trade marks will also yield (as related keywords) information on expressions denoting counterfeit. On the basis of that information, advertisers may decide to select those expressions as keywords in order to attract internet users. It is possible that, in so acting, Google may be contributing to internet users being directed to counterfeit sites.

125. In such a situation, Google may incur liability for contributing to a trade mark infringement. Even though an automated process is involved, there is nothing to prevent Google from making limited exclusions from the information which it provides to advertisers regarding associations with expressions clearly denoting counterfeit. The conditions under which Google might be liable are, however, a matter to be decided under national law. They are not covered by Directive 89/104 or Regulation No 40/94 and, accordingly, fall outside the scope of the present cases.

Unfortunately, current search engine liability standards, and intermediary liability standards more generally, could easily be just as chilling as the contributory liability standard the AG rejects. The safe harbour system for intermediaries in the EU, as introduced by the Directive on Electronic Commerce, was crippled from the start, as I have argued in ‘Legal space for innovative ordering’. And there is no reason to believe that the wide variety and unclear scope of third party liability standards for search engines throughout the European Union would provide robust protection against the risks identified above in par. 121-22.

Does the hosting safe harbour protect Google?

In the next section of the opinion, the AG addresses the question whether Google can assert protection under the safe harbours for third party liability, more specifically Article 14 of the Directive on Electronic Commerce. This is the first time the European Court of Justice will have to address the scope of these safe harbours. The first question is whether the activities of Google in the context of AdWords fall under the scope of the Directive, i.e. is Adwords to be considered an information society service. The second question is whether these activities fall under the scope of one of the intermediary liability provisions.

The AG rightly concludes that the first question should be answered in the affirmative. It is unclear to me why the Advocate General discusses Article 21 of the Directive in this context. Article 21.2 puts the European Commission under an obligation (which it has unfortunately refused to fulfill for the last 5 years) to analyse “the need for proposals concerning the liability of providers of hyperlinks and location tool services”. As everyone who has taken a close look at the legislative history of the safe harbours in the Directive should know, Articles 12, 13 and 14 provide safe harbours for mere conduit, caching (by ISPs), and hosting activities respectively. The lack of safe harbours for other types of activities, and the limited language of Article 14, has led some national courts to extend the safe harbours, but the directive is really limited to quite traditional ISP activity. The directive left open the possibility for member states to adopt additional safe harbours, such as for search engines. In fact, this is what some countries have explicitly decided to do (for instance Austria and Spain), what others have explicitly refused to do (Germany), what others have consulted the industry and public about (United Kingdom), and what some have ignored altogether (The Netherlands). In some member states, a safe harbour for search engines is basically read into general tort law principles for third party liability (The Netherlands), in line with case law about ISP liability in the 90s predating the safe harbours.

The AG ignores this legal and legislative reality and decides to come up with his own analysis. With reference to the wording of Article 14 – “AdWords features certain content – namely the text of ads and their links – which is both provided by the recipients of the service (the advertisers) and stored at their request” – he concludes that the conditions for hosting, as defined in Article 14 of Directive 2000/31, are nominally fulfilled. Notably, he misses half of the question, since third party liability for a service like Adwords does not only relate to illegal third party content in the references themselves, but also to illegal third party content on the websites the references link to.

When reaching the point about the ‘nature’ of the activities which are covered by the hosting safe harbour, the AG seems to disagree with the argument that hosting activities need to be of a purely technical nature to be covered. But the AdWords system stands out, in the view of the AG, because it is not ‘neutral’ enough.

142. To my mind, the aim of Directive 2000/31 is to create a free and open public domain on the internet. It seeks to do so by limiting the liability of those which transmit or store information, under its Articles 12 to 14, to instances where they were aware of an illegality. (71)

143. Key to that aim is Article 15 of Directive 2000/31, which prevents Member States from imposing on information society service providers an obligation to monitor the information carried or hosted, or actively to verify its legality. I construe Article 15 of that directive not merely as imposing a negative obligation on Member States, but as the very expression of the principle that service providers which seek to benefit from a liability exemption should remain neutral as regards the information they carry or host.

Maybe this is the classical trap of intermediary liability standards. If neutrality of intermediaries is a prerequisite for enjoying a distributor liability standard, those intermediaries will not be able to provide social benefits that require them to make ‘non-neutral’ / active choices. The fact that a book store or library should not be held liable in the absence of knowledge about the availability of illegal material does not imply that it should not try to have a collection of valuable books on offer.

The AG makes another side-step when he distinguishes between the natural and the sponsored results when it comes to the neutrality of their selection. Similarly to the analysis with regard to trade mark infringement, the AG seems preoccupied with the idea that natural results would not be protected by the safe harbour if he were to conclude that the Adwords system for sponsored references is not covered by the safe harbour. The best I can make of his argument is that there is room to include natural search results under a regime similar to Article 14 and Article 15 of the Directive. Whereas the AG put his faith in national third party liability rules a number of paragraphs before, he now seems to think those liability rules would be insufficient.

Its natural results are a product of automatic algorithms that apply objective criteria in order to generate sites likely to be of interest to the internet user. The presentation of those sites and the order in which they are ranked depends on their relevance to the keywords entered, and not on Google’s interest in or relationship with any particular site. Admittedly, Google has an interest – even a pecuniary interest – in displaying the more relevant sites to the internet user; however, it does not have an interest in bringing any specific site to the internet user’s attention.

The assertion that Google can make a determination of the relevance of websites and remain ‘neutral’ at the same time strikes me as utterly unconvincing. It doesn’t help that the notions of ‘relevance’ of websites and the ‘interest’ in giving specific prominence to websites are not defined. Google chooses to use certain ranking algorithms because it believes they end up delivering the ‘best’ results. There is no agreement on what ‘best’ means. Google adds value (and lowers value, depending on the perspective you take) by making value-based judgments about the use of its algorithms and other decisions about its search results. And if one takes a closer look, there are all sorts of sites that Google has a direct or indirect economic interest in ranking higher or lower (YouTube, AdSense network, Adwords clients, etc, etc…). The AG ignores this point completely. Returning to Adwords, he concludes that Google is not neutral enough in this context:

145. That is not the position as regards the content featured in AdWords. Google’s display of ads stems from its relationship with the advertisers. As a consequence, AdWords is no longer a neutral information vehicle: Google has a direct interest in internet users clicking on the ads’ links (as opposed to the natural results presented by the search engine).

This simply implies that third party liability for sponsored referencing systems like Adwords is governed by national law, which was the right answer in the first place, both for natural and sponsored results. The new European Commission will have to evaluate the situation of search engine liability (its evaluation is planned for 2009 and long overdue). Whether they will be able to clean up the ever growing mess of European intermediary liability remains to be seen. The (partly) European harmonisation of national general liability laws, which it would basically amount to, is an extremely ambitious project.

My conclusion

I find the AG’s analysis of the questions about trade mark infringement quite thoughtful, and freedom of expression and freedom of commerce are rightly brought into play to prevent complete control of keyword-based referencing and advertising tools by trade mark proprietors. But the AG’s conclusion, if followed by the Court, implies that national courts will fall back on national provisions for third party liability. These are maybe not as chilling for freedom of expression and freedom of commerce, but they can hardly be said to remove the incentives for internationally operating search engines to be very responsive to requests by trade mark proprietors to remove references.

The AG’s opinion about the scope of intermediary liability in the Directive on Electronic Commerce is confusing. The European regime of safe harbours was incomplete from the start. There are many good arguments to extend the safe harbours to information location tools and other intermediaries at the European level. Neutrality as the AG seems to understand it, however, should not be the overriding factor in making that choice, since it creates disincentives for intermediaries to add value through active governance of their platform.

Julie Cohen on the Changing Meaning of ‘Unauthorized Access’

Monday, June 8th, 2009

This is a really great lecture! Julie Cohen manages to touch upon almost everything I am interested in, in about half an hour.

A Trolling Professor Gagging on Google

Monday, May 4th, 2009

The respectable LSE professor Willem Buiter has ‘taken up’ the debate on regulating search and is all in favor. In fact, he proposes to regulate Google (not search), and more precisely to break it up and put it out of business if possible.

I must say that I do like his style of writing and I agree that Google’s treatment of privacy and copyright are important issues to discuss. But unfortunately, the content of the essay is not all of high quality: it’s a kind of Google bashing that could ultimately do more harm than good, because the debate about Google in Europe needs economists like Buiter to explain what’s going on or even better to lay out a vision for the policies and laws of the future.

Copyright and theft

I have a particular problem with Buiter’s claims about copyright and Google. He claims that some of Google’s services are (or should be) illegal under copyright law:

Google has been making available copyrighted material for download on its websites for years (books through Google Books, music through YouTube, newspaper material through Google News), often without obtaining prior consent of the copyright holder and generally without making any payments to the copyright holders. There is a word for that kind of behaviour: theft. Just because you steal using internet technology does not make it anything other than theft. As an author, this naturally concerns me.

It’s hard to maintain that YouTube is illegal altogether, simply because users can upload infringing videos. In addition, YouTube is more and more positioning itself as a partner for the audiovisual industry, because it seems to need that industry to monetize the service. It would be helpful to get an economic perspective on that.

The Google Book Search scanning program is more complicated. From Buiter, one would expect an analysis of the public welfare benefits of a comprehensive full-text book search service.

Finally, the word ‘theft’ obfuscates the nature of the protection of intellectual labor through legally enforced monopolies for a period of time. This protection can hardly be called property. It’s not unfair to profit from each other’s intellectual work. The whole idea of copyright protection is to make it profitable for society as a whole. A university professor and successful author should know that.

With these superficial remarks, Buiter does not add anything to the debate about copyright and Google, other than his name and some exaggerated qualifications in defense of an industry that opposes change but should be looking for answers instead. His claims are normative without economic foundation. If anything, the news, music and publishing industries probably need the platforms provided by companies like Google and Yahoo to retain some control over consumption of creative products.


Buiter’s complaints about privacy and the importance of default settings are more to the point. He is rightly concerned about the unprecedented collection of user data by companies like Google and Yahoo and the access to that information by government agencies. But I dislike and distrust his reference to the maltreatment of copyright in this context. Politically, these issues are of a completely different nature.

Can we trust Google not to abuse the information they collect? Of course not. This is a profit-seeking company. Its owners, CEO and top managers are typical amoral capitalists who want to make as much money as they can without ending up in jail. Their ruthless, unethical behaviour as regards copyright, Of course we cannot trust them. They must be regulated and restrained by law so we can sleep at ease even though we know we cannot trust them.

I do agree that Google and others should develop an anonymous search experience and use an opt-in for their behavioral targeting program, because I think that access to information and ideas should remain free (as in freedom). But default settings are hard to regulate, as an economist should know: there are so many different products and services, default settings are part of the innovation, it’s partly a matter of technological design and, legally speaking, contractual freedom poses a hurdle to reckon with. It’s too simple to compare this with H-bombs. This is precisely the type of ‘do no evil’ engineering ethics that makes it harder and not easier to debate the real issues.

Buiter’s opt-out?

Buiter finishes his rant by claiming that he will start deleting everything from Google. Maybe he should also ask the FT to remove his blog from Google search (by adding its directory to this file), remove his website and publications, and tell his agent to stop advertising his speaker qualities through AdWords. Or maybe Google does offer something valuable? I hope Buiter will reconsider and come up with some more realistic proposals.

(sidenote: I took the first part of the title from a reaction to his article at the FT site.)

GBS: From a Zombie Army of Orphans to Dead Souls

Saturday, April 18th, 2009

James Grimmelmann has been writing thoughtfully about the Google Book Search Settlement and has an essay on the orphan works perspective on the settlement. Now, Pamela Samuelson draws the analogy between Google and Gogol’s Chichikov. And here are the slides of a presentation she did about the merits and issues of the settlement, calling on others to study the settlement and join the conversation.

ICSR Report on Online Radicalisation

Tuesday, March 10th, 2009

The International Centre for the Study of Radicalisation and Political Violence (ICSR) released an interesting policy report, ‘Countering Online Radicalisation’. The report critically examines negative measures such as filtering, hiding and removal of material, addresses freedom of expression concerns and proposes a number of new positive measures to make the Internet less attractive as a platform for extremism and radicalisation.

Interestingly, the section in the report on negative measures contains a subsection on the strategy of hiding content on the Internet through the removal of material from search engines and the deployment of SEO strategies:

In general, the various tools that have been deployed by governments in recent years can be grouped into three categories: removing content from the web; restricting users’ access and controlling the exchange of information (filtering); and manipulating search engine results, so that undesirable content becomes more difficult to find (hiding).

The report forgets to mention the highly relevant co-regulatory frameworks in Germany and France that do precisely that. It does refer to China’s targeting of search engines and mentions that:

Though technically feasible, it is highly unlikely that Western governments would consider pursuing this course of action.

As governments and third parties are increasingly using the strategy of hiding content through the targeting of search engines, also in the Western world, it is unfortunate that the researchers did not develop their concerns in more detail. The deployment of SEO by governments to reduce the prominence of online extremist material is problematic and rather hypothetical in my opinion.

The report is less than enthusiastic about the use of any of these strategies, noting that removal of content amounts to fighting the symptoms and not the cause, and pointing to the negative externalities of negative measures, the technical imperfection of filtering, freedom of expression concerns and political controversy within certain communities. It does recommend that law enforcement strategically target illegal material for removal, while focusing on the perpetrators and not the material.

Above all, the report proposes a number of interesting positive measures that could help to make the Internet less attractive for extremists, namely empowering the online communities, reducing the appeal by strengthening media literacy and promoting positive messages. These proposals are sympathetic, but I feel ambivalent about the proposal to strengthen the role of end-users to regulate content. On the one hand, user empowerment is what the Internet and many successful online services are about. On the other hand, community empowerment might lead to the over-empowerment of ultra-sensitive users who are not part of the community but merely active to restrict others in their online communications. Most user-driven sites are far from homogeneous and that is a good thing. Promoting user-empowerment should go hand in hand with promoting tolerance.

Child Safety Online @ Berkman & EP Recommendation

Tuesday, February 3rd, 2009

In half an hour, the Berkman Center is hosting a talk on Child Safety Online. There is a live webcast and the video will be made available on their website. The Berkman Center has recently finished a big study on Online Child Safety (It participated in the Internet Safety Technical Task Force). The speakers include John Palfrey, danah boyd and Dena T. Sacco.

Earlier today, the European Parliament adopted a recommendation to the Council (here is the report), in the context of child abuse, exploitation and pornography. The recommendation calls for new measures that would affect criminal liability on the Internet and of online services. It asks for the:

criminalisation of providers of paedophile chat rooms or Internet paedophile fora

measures to ensure that the Member States, in the context of a comprehensive strategy of international diplomatic, administrative and law enforcement cooperation, take appropriate steps to have illegal child abuse materials taken offline at source, thereby giving victims maximum protection, and work with Internet providers to disable websites which are used to commit, or to advertise the possibility of committing, offences established in accordance with the Framework Decision;

allowing the national enforcement agencies to require Internet providers to block access to websites which are used to commit, or to advertise the possibility of committing, offences established in accordance with the Framework Decision and, if they fail to do so, to require the deletion of the registered domain names which are used for those purposes;

And finally, IAPP reports about the real life implications of (alleged) criminal liability of online service providers. Peter Fleischer, Google’s global privacy counsel, will appear in Italian court this week on criminal charges of defamation and failure to exercise control over personal data.

Search Engine Society by Alex Halavais

Friday, December 12th, 2008

Alexander Halavais has done the Web search research community a great favor with his book Search Engine Society published in the Digital Media and Society Series of Polity. The book is not only a comprehensive overview of the relevant literature about search engines from the last decade. It is also well written, concise and pleasingly balanced.

To a large extent, the book is a literature review and does an excellent job at it as well. So I will not try to write a review of that review, but simply make a few remarks and recommend everyone interested to get the book.

It’s worth noting that Halavais has not restricted himself to the social sciences, but has looked at computer science and legal scholarship as well. Although his analysis of legal and policy issues is not as advanced as his discussion of the social aspects, it is good reading for legal scholars as well, exactly because it is embedded in his discussion of the societal aspects of search engines.

Halavais makes some very interesting points about privacy and social search. He makes a connection between what we understand about the impact of search engines on society and the private (unshared) nature of the use of search engines. By keeping our searches private, only companies and others that have access to the logs by law or deals will get to know what we are searching for and how this might impact us in general. I have (amateurishly) thought about this aspect of search technology myself a few times. Asking questions to other people is a deep social process if you think about it. It expresses trust, curiosity, willingness to bond and a range of other fundamental social values. The reference to the predominantly anti-social nature of current search technology is a welcome and recurrent theme in the book.

There were a few points in the book where I had to disagree. For instance, Halavais describes the crawler as a simple piece of technology. In my understanding, crawling and technical crawling management have grown into one of the most complex parts of major search engines.

He also states that Google has started anonymizing the logs of users that are not logged in. Unfortunately, Google will do that only in two incomplete steps, one after 9 months and another after 18 months. Even after 18 months the logs can hardly be said to be anonymous if they remain organized with a unique identifier replacing IP address and cookie. The logs themselves simply contain too much information, as was shown by the AOL data release.
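The point about unique identifiers can be illustrated with a small sketch. It is not a description of how Google processes its logs; the log format, the `pseudonymize` and `group_by_user` helpers and the sample queries are all invented for illustration. It only shows that swapping IP address and cookie for a stable identifier leaves every query of the same user linked together.

```python
from collections import defaultdict
from itertools import count

# Hypothetical raw log entries: (ip_address, cookie, query). All values are invented.
RAW_LOG = [
    ("192.0.2.10", "cookie-a", "knee pain after running"),
    ("192.0.2.10", "cookie-a", "orthopedist in small town X"),
    ("198.51.100.7", "cookie-b", "cheap flights"),
    ("192.0.2.10", "cookie-a", "my own full name"),
]

def pseudonymize(log):
    """Replace (ip, cookie) with a stable unique identifier per user."""
    ids = {}
    counter = count(1)
    result = []
    for ip, cookie, query in log:
        key = (ip, cookie)
        if key not in ids:
            ids[key] = f"user-{next(counter)}"
        result.append((ids[key], query))
    return result

def group_by_user(pseudo_log):
    """Rebuild per-user search histories from the pseudonymized log."""
    profiles = defaultdict(list)
    for user_id, query in pseudo_log:
        profiles[user_id].append(query)
    return dict(profiles)

if __name__ == "__main__":
    for user_id, queries in group_by_user(pseudonymize(RAW_LOG)).items():
        print(user_id, queries)
```

The IP addresses and cookies are gone from the output, but the search history of each user is still intact, and the queries themselves may be enough to work out who that user is.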

Halavais also points to the danger that Germans, by using Google and Yahoo news services, will be exposed to English-language-centric or United States-oriented news. I am sure this will go down well in Europe, and he attributes this conclusion to German scholars Machill & Beiler, but it’s hard to subscribe to it since these providers have German-language news services, in which one finds German-language news from German newspapers.

Culturing Google to Copy Right – Lecture by David Nimmer

Thursday, November 20th, 2008

This Tuesday, I attended a lecture at Suffolk Law School by copyright law authority David Nimmer. Nimmer discussed the copyright implications and legal developments for text, image, video and book search. As the title suggests, he focused almost entirely on services provided by Google. His discussion of video search did not implicate ‘search’ in the strict sense, but addressed the Viacom v. YouTube lawsuit. The Google Book Search has taken a different legal path since the proposed settlement, which received preliminary approval this week. Below are my notes of professor Nimmer’s discussion of text and image search.

Netcom and pre-DMCA

Nimmer started his discussion with the famous Netcom case of 1995. In RTC v. Netcom, Scientology sued a Bulletin Board Service (BBS) for infringement of copyright of Scientology material. Users of the BBS had posted the material on the BBS. In a pragmatic decision, the court concluded there was no direct and no contributory infringement, because there was no volition and no knowledge of the infringing activity respectively.

Nimmer then referred to the WIPO discussions about copyright leading to the WIPO Internet Treaties, and the legislative efforts in the United States leading to the well known Digital Millennium Copyright Act (DMCA), which currently provides for safe harbors for online intermediaries, including search engines, in section 512. He pointed out that part of the intermediary copyright liability regime in the DMCA are two provisions that state that exempted intermediaries have to (1) adopt a policy for repeat infringers and (2) conform to standard technological measures to prevent infringing activity (in other words, adopt filtering measures if there is standard accepted filtering technology). He clarified that until now the latter provision has proven meaningless, because there are no such accepted filtering technologies yet. He did point to filtering technology companies such as Audible Magic, claiming they had improved over the last decade to the extent that the provision would start to be meaningful. Recently, this reasoning received a serious blow when a Belgian court lifted an order to use filtering technology because it considered it to be ineffective.

Field v. Google

Focusing on text search and search engine caching, Nimmer discussed the case Field v. Google. Before addressing that case he discussed an image search case, Kelly v. Arriba Soft. In that case, the 9th Circuit held that the use of thumbnails did not require authorization by the rights holders because of fair use. The court considered the use of thumbnails transformative and stressed the value of the service provided by Arriba. Nimmer did not address the issue of inline linking in Kelly.

Going back to Field, he first explained the background of the case. Field, an attorney, was the publisher of a website with various stories and essays in which he held copyrights. He waited for Google to crawl, index and make available cached copies of this website and then sued Google for copyright infringement. Field, probably not surprisingly, lost on all counts. The district court found no direct infringement. The court also held that Google had four defenses with regard to possible direct copyright infringement. First, it had an implied license for the publication of cached copies. Second, Field was estopped from asserting his copyright claim against Google. Third, the conduct of Google amounted to fair use. And fourth, Google was entitled to the caching safe harbor in Section 512(b) of the DMCA. It might be interesting to note that Field did not assert that the initial copying by Google as a result of crawling and indexing constituted direct infringement.

The court in Field relied on Netcom in its conclusion that there was no direct infringement by Google:

[...] when a user requests a Web page contained in the Google cache by clicking on a “Cached” link, it is the user, not Google, who creates and downloads a copy of the cached Web page. Google is passive in this process. Google’s computers respond automatically to the user’s request. Without the user’s request, the copy would not be created and sent to the user, and the alleged infringement at issue in this case would not occur. The automated, non-volitional conduct by Google in response to a user’s request does not constitute direct infringement under the Copyright Act. See, e.g., Religious Tech. Ctr., 907 F. Supp. at 1369-70 (direct infringement requires a volitional act by defendant; automated copying by machines occasioned by others not sufficient);[...] Summary judgment of non-infringement in Google’s favor is thus appropriate.

Nimmer asserted that this reference to volition in Netcom was wrong, since the legislative history had replaced the holding in Netcom with the DMCA and the DMCA did not codify the volition criterion. He explained how in the Judiciary Committee, the Playboy ruling had first been overruled and the Netcom ruling held in favour, but that later the current DMCA framework was adopted that replaced both rulings.

Robots.txt and implied license

Nimmer seemed to be more pleased with the court’s conclusion with regard to the implied license. The argument of the court is based on the availability of an opt-out for webmasters, in the form of robots.txt, that makes it possible for them to exclude their material from crawling, indexing and caching in search engines. This, in Nimmer’s words, ‘idiot proof computer program’ lies at the basis of the implied license and the estoppel defense.
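For readers who have never looked at one, the opt-out can be sketched as follows. The robots.txt content, the example.com URLs and the `may_crawl` helper are made up for illustration; the check itself uses the robots.txt parser from Python's standard library, which is only an approximation of what a real search engine crawler does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot out of /essays/, allow everything else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /essays/

User-agent: *
Disallow:
"""

def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    for agent, url in [
        ("Googlebot", "http://example.com/essays/story.html"),
        ("Googlebot", "http://example.com/index.html"),
        ("OtherBot", "http://example.com/essays/story.html"),
    ]:
        print(agent, url, may_crawl(ROBOTS_TXT, agent, url))
```

A webmaster who publishes no such file, or one without a Disallow rule, has not used the opt-out, which is the kind of silence the implied license argument builds on.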

Fair use and DMCA defenses in Field problematic

Nimmer pointed out that he considered the fair use defense problematic, because Google copied and made available the complete material as an archival (cached) copy. Nimmer stated that the defense of Section 512(b) of the DMCA was not available to search engines. He clarified that the statutory provision for this caching safe harbor involves four actors. It is not applicable to the search engine cache, because the actors involved in the statute do not map to the search engine context. The caching safe harbor is written for proxy caching. For a complete discussion, see Peguera 2008.

Nimmer also briefly discussed the famous Napster litigation, in which the safe harbor for linking and search engines did not apply because Napster had knowledge of the infringement taking place with the help of its service.

Perfect 10 v. Google

Moving on to image search, Nimmer discussed the Perfect 10 v. Google case, which hasn’t ended yet. Perfect 10, a rights holder in adult pictures, sued various intermediaries, including Google and major credit card companies, for direct and/or indirect copyright infringement. The case is different from Field because the material that ended up in Google consisted of unauthorized copies. As a result, the implied license and estoppel defenses were not applicable here, since Perfect 10 had no control over the postings of the content like Field had. In his lecture, Nimmer focused on the questions of fair use and indirect infringement. The fair use defense failed at the district court level, because the court found that there was an actual market for thumbnail size pictures (mobile phone downloads). The Ninth Circuit reversed and remanded, with a complex ruling in two steps, arguing that the use of thumbnail images was commercial in nature but that the public interest had to be taken into account when determining whether the use was transformative. The case is back at the district court level. Nimmer said he was agnostic as to who will ultimately win the case.

European developments

It is interesting to note that the European Commission has issued a Green Paper that deals with some of these important issues (in footnote 21) and that in Germany the image search litigation continues to be challenging for search engines.

Debunking the Fiction of Privacy Policies

Thursday, October 9th, 2008

Ars Technica reports: “Do people read online privacy policies? Of course not. But if they did read them at least once a year, it would take an average of 10 minutes per policy and cost $365 billion in lost leisure and productivity time.” The study is here.

