Robots.txt & Privacy Protection
May 7th, 2008
Lately, there seems to be some consensus that the robots.txt instruction, and similar instructions and protocols to be developed in the future, are the way forward to protect privacy and other interests in controlling information online. The eloquent Jonathan Zittrain gives a very nice explanation of this idea in this video:
I tend to agree that meta-data will help to solve some of the problems. The robots.txt file gives publishers some effective control over who gets to index their data. One can imagine extending it so that third parties, e.g. people in a picture who can be identified through facial recognition, can express their wishes about publication and reprocessing of the material, including identification.
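To make the publisher-side control concrete, here is a minimal sketch in Python of how a well-behaved crawler might consult robots.txt before indexing a page. It uses the standard library's `urllib.robotparser`. The rules, the crawler names (`ExampleImageBot`, `OtherBot`) and the `example.org` URLs are made up for illustration and do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, invented for this example:
# one named crawler is kept out of /photos/, everyone else out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleImageBot
Disallow: /photos/

User-agent: *
Disallow: /private/
"""


def may_fetch(user_agent: str, url: str) -> bool:
    """Return True if the rules above allow `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network
    # access is needed for this sketch.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(may_fetch("ExampleImageBot", "https://example.org/photos/party.jpg"))
    print(may_fetch("OtherBot", "https://example.org/photos/party.jpg"))
    print(may_fetch("OtherBot", "https://example.org/private/diary.html"))
```

Note that the check runs entirely on the crawler's side: the file only states the publisher's wishes, and nothing in the protocol forces a crawler to call something like `may_fetch` at all.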
There are, however, a number of problems. First, expressing privacy wishes is itself a privacy problem, so the system has to be very sophisticated. Second, major search engines do obey instructions such as robots.txt, but there are search engines and automated content aggregators that do not, and this can be legitimate. The fact that Google respects robots.txt instructions has implications for the accessibility of information, and robots.txt and similar instructions are used for all sorts of reasons, including illegitimate ones. I expect similar problems to arise when trying to protect online privacy with instruction protocols. The control has to be mediated and breakable. I am currently reading Zittrain’s book and have not yet reached chapter nine. I am very curious about his particular proposal and hope to write more about this myself in the future.
May 15th, 2008 at 8:05 pm
[...] Jonathan Zittrain promotes self-regulation when it comes to search engines and privacy (which I think is naive). (via) [...]