Google liberates data
October 11th, 2009 | by elisabeth | Published in cloud, Future of the Internet | 4 Comments
Professor Zittrain has spent time on this blog and elsewhere discussing the future of cloud computing. One of his frequent suggestions is that it should be easier to move data within the cloud, so we don’t all get locked into a certain photo storage system, or spreadsheet provider, or what have you. It seems that Google agrees. A Google engineering team called the Data Liberation Front—complete with revolutionary logo—recently revealed that it’s been working since 2007 to make sure people can move their data out of Google products. Blogger and Gmail are done; Google Sites and Docs are coming next.
As the team explains:
[W]e always encourage people to ask these three questions before starting to use a product that will store their data:
1. Can I get my data out at all?
2. How much is it going to cost to get my data out?
3. How much of my time is it going to take to get my data out?
The ideal answers to these questions are:
1. Yes.
2. Nothing more than I’m already paying.
3. As little as possible.
The Data Liberation Front explains the motivation for doing this in the usual Google manner: Don’t be evil. Don’t trap consumers into products they don’t want to be using.
This is a great step, and a real benefit to the users who want to have choices. Kudos to Google.
One thing to pay attention to, though: the DLF has made it possible to liberate ad-campaign data on AdWords by exporting it in a CSV file. Ben Edelman, a Harvard Business School professor (and sometime Zittrain co-author) who’s been studying online advertising, is skeptical of how well the “liberation” product works for AdWords. He thinks that CSV exports are clumsy, time-consuming, and error-prone, whereas an API-based export could be a powerful tool for advertisers. API-based export is technically possible, but apparently prohibited by Google’s terms and conditions. Without a good way of syncing data across platforms, he argues, advertisers tend to stick solely with Google. In short, he says, “I credit Google’s efforts to facilitate data portability in its ancillary businesses, like document sharing and image hosting. But when it comes to the one business where Google makes billions of dollars—and where Google has 70%+ market share—Google’s actions reveal the company’s willingness to put its own bottom line before advertisers’ interests and, for that matter, fair competition.” Google argues in reply that CSV export is perfectly workable, and that, in fact, many advertisers do use the AdWords editor to run campaigns for multiple platforms.
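To make the CSV route concrete, here is a minimal sketch in Python of what an advertiser’s own script might do with an exported file. The column names and sample rows are hypothetical, chosen only for illustration; they are not the actual AdWords export format, and the function name is made up for this example.

```python
import csv
import io

# Hypothetical export: these column names and rows are illustrative only,
# not the real AdWords CSV layout.
SAMPLE_EXPORT = """Campaign,Keyword,Max CPC
Fall Sale,"shoes, running",0.45
Fall Sale,boots,0.60
"""


def load_campaign_rows(text):
    """Parse exported CSV text into a list of dicts, converting the bid to a float."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # A malformed bid raises ValueError here, which is one place
        # where a hand-edited export can go wrong.
        row["Max CPC"] = float(row["Max CPC"])
        rows.append(row)
    return rows


rows = load_campaign_rows(SAMPLE_EXPORT)
print(len(rows), rows[0]["Keyword"])
```

Even in this toy version, the export, the parsing, and any re-upload to another platform would have to be repeated whenever the campaign changes, which is the kind of friction Edelman is describing.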
The disagreement points out that data portability isn’t an on/off issue—it’s a spectrum, and it bears watching how Google and its competitors fall out along that spectrum.
—By Elisabeth Oppenheimer


October 11th, 2009 at 7:58 pm (#)
Vendor lock-in is an issue with any data storage system – in the cloud or hosted in-house. We need to investigate the tools that the vendor provides for extracting data from the system.
From what I have seen (and experimented with), Google provides an excellent set of APIs to access the data stored in Google’s Cloud. And Google is always working to improve the APIs. Google usually first adds functions to the API, and then introduces them in the UI. Compare this to other software vendors, who usually introduce the new functions in the UI and then at a later time provide API access to those functions – if at all.
I currently use both Google Docs and Windows Live Workspace to store my personal and school-related stuff. I use both of these because they both have their benefits. Windows Live Workspace provides complete integration with Office 2007, whereas Google Docs provides editing capabilities in a Web browser. Recently I have been thinking of writing an application that will synchronize the content of both of these repositories. Google provides APIs that make this task easy from Google’s side, but there are no Windows Live Workspace APIs, so I have to devise a workaround to get documents into the Windows Live Workspace.
“With problems that we are not aware of yet, the ability to put right – not the sheer good luck of avoiding indefinitely – is our only hope, not just of solving problems, but of making progress.” – Physicist David Deutsch
October 12th, 2009 at 6:50 am (#)
There’s certainly a need for a voluntary code of practice, and perhaps regulation:
Picture a future scenario where a significant cloud data provider gets into financial difficulty. As rumours about its viability spread, everyone tries to pull off copies of their own data. This overloads the servers, compounding the fears. The company goes bust, and the liquidators do their best to return data to its rightful owners, but they may not have the resources to do so.
It’s not unlike a run on the bank – and for those with data significantly ‘invested’ in the provider, approaching the same level of seriousness.
October 13th, 2009 at 8:43 am (#)
I think this is a battle that will be difficult to fight in the long run. As soon as you start using some of these products, they gain control of whatever you put onto those networks. Facebook has rights to the photos in Facebook. It is the price you pay to use their services. Nobody is forcing anyone to use a Google blog or email.
October 15th, 2009 at 1:30 pm (#)
I think that it’s dangerous to use these tools and to put your own data on networks that everyone can use….