Nick Bilton over at the NYT Bits Blog has the story of Internet security consultant Ronald Bowes’s recent Facebook caper. Ron noticed that Facebook has a directory of its users, just like the old Bell Telephone White Pages. I agree with Ron’s assessment that this is a very little-noticed feature: normally one searches on Facebook not by looking at a directory, but rather by typing a name into a search box. It’s in plain sight, though, at http://www.facebook.com/directory.
There are two differences that jump out between this awe-inspiring alphabetical listing of all Facebook users and a dog-eared telephone directory. First, Facebook’s directory has a staggering 171 million names in it. Second, in good news for paper prices everywhere given the first difference, the directory is digital — it’s right there, online. And if it’s online, it’s scrapable. Ron, being of the inquisitive engineering sort who can’t help but push a button if he sees one, figured that supply creates demand, and went ahead and scraped the directory.
That means he produced a file on his own hard drive containing more or less the directory’s main contents: for each person listed, a name, the person’s Facebook URL (what one types in to go directly to his or her entry), and unique Facebook ID (not a secret; this is part of a person’s Facebook URL). The resulting file is only a few gigs; it’s amazing that storage has become so cheap that so much data can be roughly the size of an episode of House. Ron then placed it online as a torrent, which means anyone can download the file, and voilà, a snapshot of Facebook’s membership as of July 2010.
So, is this a problem? As I’m writing, news is only just breaking, so it’s like that moment when a toddler trips, falls, and then has to think about whether to cry or not. “You’re OK!” is usually what the alert parent encouragingly says, and if the toddler buys it, it’s usually true. In fact, even if the toddler doesn’t buy it, it’s still usually true. In this case, I think I’m with the metaphorical parent. The data that Ron grabbed is precisely what Facebook users have chosen (or perhaps more accurately, passively acquiesced) to share. Those who lock their privacy settings to avoid having a public listing in a Facebook search are not present here. Those who haven’t are, along with a click-through to their respective Facebook pages, however they’ve chosen to share them.
Ron appears a little disquieted by it because of the prospect that the snapshot can live forevermore. If you remove your Facebook account or up your privacy settings, that will be reflected in real time in the Facebook directory and search (or at least it should be!). But the torrent file exists forever, so one’s privacy choices are locked into that moment. This is an artifact of having a service (Facebook) converted into a product (a Facebook database), the way that universities used to not just maintain online directories, but also publish bound volumes of their alumni with addresses, for those who opted in. (In fact, many universities still do this; someone should tell them about saving the trees.)
There’s some privacy hit there, but there are also benefits. By making a public directory — and a scrapable one, no less — Facebook gets more inbound links and attention as its members become easier to find. And we benefit by having Facebook’s subscribers’ public pages indexed by the likes of Google and Yahoo! search. In fact, when searching on a person’s name in a regular search engine, quite commonly a Facebook entry is one of the top hits. That seems to me a good thing, and once Google, Yahoo!, and Bing have it, why shouldn’t Ron and anyone else who wants it have it too? Indeed, Ron already did some cool stuff with the data. For example, he crunched it all and came up with a list of Facebook’s most commonly used first and last names, discovering “Michael” and “Smith” coming in at number 1 for each. Congratulations, Michael Smith, you are hidden in plain sight, since a search for you turns up so many others at the same time! (Not so much with “Jonathan Zittrain”…)
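For the curious, the kind of name-crunching described above can be sketched in a few lines of Python. This is only an illustration, not Ron's actual method: the function name `top_names`, the sample list, and the rule of treating the first and last whitespace-separated words as first and last names are all my assumptions, and the sample names are made up rather than taken from the directory.

```python
from collections import Counter


def top_names(full_names, n=3):
    """Return the n most common first names and last names.

    Assumption: the first word of an entry is the first name and the
    final word is the last name. Real names are messier than that.
    """
    first, last = Counter(), Counter()
    for full in full_names:
        parts = full.split()
        if len(parts) < 2:
            continue  # single-word entries have no clear first/last split
        first[parts[0]] += 1
        last[parts[-1]] += 1
    return first.most_common(n), last.most_common(n)


# Made-up sample entries, not real directory data.
sample = ["Michael Smith", "Michael Jones", "Anna Smith", "Michael Lee", "Cher"]
firsts, lasts = top_names(sample, n=1)
print(firsts, lasts)
```

On a file of 171 million names the same counting approach should still work if the names are read one line at a time instead of being loaded into a list, since the counters only grow with the number of distinct names.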
Anyway, that’s generativity at work: Facebook makes available a directory on free and open terms, and people do stuff with it, some of which can surprise us. There could be bad surprises, too — Ron and others hint at undesirable data mining — but I’m glad that the gates of Facebook’s gated community have some slats in them, rather than being a solid wall. At most, it seems to highlight the desirability of getting the defaults right: Facebook shouldn’t have people automatically publicly sharing stuff they’d not normally share, without clear markers on what’s about to happen. As Google would say, “Please read this carefully. It’s not the usual yada yada.”
Indeed. There have been so many Facebook privacy mini-scandals that we’re primed for the next, and the involvement of a torrent file adds an element of seeming subversiveness to the mix, given the association of p2p with contraband material. But sometimes when the boy cries wolf it’s just a shadow. I count 8 Yadas in the Facebook directory. And I, along with my cool musician brother Jeff Zittrain, fall in between Aron Zittra and Austin Zittrauer. Until now, who knew? Interesting, but not pitchfork-worthy. …JZ


