Thursday, May 1, 2008

Privacy through Difficulty

I had lunch today with Harr Chen, a graduate student at MIT, and we were talking about the consequences of information efficiency for privacy.

A nice example comes from LinkedIn's company pages. No company, to my knowledge, publishes statistics on:
  • the schools their employees attended.
  • the companies where their employees previously worked.
  • the companies where their ex-employees work next.
If a company maintains these statistics, it surely considers them to be sensitive and confidential. Nonetheless, by aggregating information from member profiles, LinkedIn computes best guesses at these statistics and makes them public.
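
To make the aggregation concrete, here is a minimal sketch of the kind of computation involved. The profile records, field names, and company names are all made up for illustration; LinkedIn's actual data model and methods are not public.

```python
from collections import Counter

# Hypothetical member profiles; "positions" lists employers in chronological order.
profiles = [
    {"school": "MIT", "positions": ["Acme Corp", "Initech", "Globex"]},
    {"school": "CMU", "positions": ["Initech", "Globex"]},
    {"school": "MIT", "positions": ["Globex", "Hooli"]},
]

def company_stats(profiles, company):
    """Aggregate member profiles into per-company statistics:
    schools attended, previous employers, and next employers."""
    schools = Counter()
    prior_employers = Counter()
    next_employers = Counter()
    for profile in profiles:
        positions = profile["positions"]
        if company not in positions:
            continue
        i = positions.index(company)
        schools[profile["school"]] += 1
        if i > 0:
            prior_employers[positions[i - 1]] += 1
        if i + 1 < len(positions):
            next_employers[positions[i + 1]] += 1
    return schools, prior_employers, next_employers

print(company_stats(profiles, "Globex"))
```

Each individual profile is public and unremarkable; it is only the aggregation across many profiles that yields statistics a company would consider confidential.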

Arguably, information like this was never truly private; it was simply so difficult to aggregate that nobody bothered. As Harr aptly put it, such companies practiced "privacy through difficulty," a privacy analog of security through obscurity.

Some people are terrified by the increasing efficiency of the information market and look to legal remedies as a last-ditch attempt to protect their privacy. I am inclined toward the other extreme (see my previous post on privacy and information theory): let's assume that information flow is efficient and confront the consequences honestly. Then we can have an informed conversation about information privacy.

1 comment:

Daniel Tunkelang said...

It occurred to me that some might see a contradiction between this post and the previous week's post on Accessibility in Information Retrieval. Here, I'm suggesting that difficult-to-access content shouldn't be considered secure; there I'm suggesting that difficult-to-access content shouldn't be considered accessible.

Of course, these are different use cases. Still, it's worth keeping in mind that different users have different motives. What prevents a casual user from accessing information won't stop a sufficiently determined one.
