This week a bunch of news got picked up by The Economist (http://www.economist.com/opinion/displaystory.cfm?story_id=13740181) in articles about "openness". That is, they claim that Cloud computing (and by implication SaaS as well) is at risk of becoming a vehicle for proprietary lock-in. In response to this there is an Open Cloud Manifesto (dot org) which got some press earlier this spring. The claim is that "data residing in the cloud can be difficult to move to another service".
I am not sure there is much to this argument. Data residing in any system can be difficult to move to another system. Cloud or not. Take as an example any system where the data coming out is utterly foreign in structure to any other replacement system one might want to put that data into. The only way to fix this is standardizing the data layer.
For clouds, a big step is: customers must insist on standard data interfaces providing the ability to query and extract their data.
I want to address the application of these arguments to SaaS-deployed applications. Most people think of SaaS applications as things "deployed in the cloud", in the sense that they don't care where they are deployed as long as they are secure and reliable. Many cloud-computing folks are thinking at a lower level - like standard virtual machines running in the cloud. I believe the lock-in phenomenon - and the need for safety from it - occurs at every level of the technology, from the base OS, to the application servers and data stores, to the application level, but the data layer is the one that matters most.
But it is important also to understand that unless everyone is selling indistinguishable commodity goods and services there is no perfect compatibility. If providers can differentiate their offerings and add value over their competitors, then there is some distinction which will make compatibility problematic. I think most people get this notion. Basically, we need to take some precautions to be sure we aren't held hostage to a bad lock-in situation, but making it utterly easy to move is simply not in the cards.
So what should one do with respect to SaaS software to manage the lock-in risk? There are two important things customers should do to avoid lock-in.
First, understand the extent to which things are different for SaaS vs. regular software. Many things really are the same: since software companies want to add value, and they want to distinguish themselves from competitors, you can't truly trade them off against each other interchangeably. There is some investment/pain required to switch, and that's true even if you are switching from one ordinary RDBMS to another, because the SQL standard is quite imperfect. Costs can be very high if you are switching from one ERP system to another, one application server to another, or any other major piece of enterprise software infrastructure. SaaS applications don't really change this. The key thing that prevents total lock-in is that while it may be expensive to switch, it is at least *possible* to switch. You can take the data out of one system, and put it into another with some adaptation. Unless the data is ultimately available to you, you are dead, that is, totally locked in.
This principle applies to SaaS as well. Customers should be careful that the SaaS products to which they subscribe offer standard interfaces for data access, so that the providers cannot hold their data hostage. The data you provide to a SaaS system is yours, and you are entitled to it. If you can't get it in an intelligible bulk format, then you aren't really being given access.
By far the easiest way to avoid this kind of data lock-in is to choose vendors with support for industry-standard data access protocols and languages. Oco does BI solutions. We have a pretty nice display system that is easy for business users to use. But what if you want to use something else? What if you want to take a snapshot of your data for some other purpose? This is a main reason why we also offer secure ODBC tunnels as a way to get at your data, so that you can bring products from other vendors to bear on the same integrated data (we specialize in the integration of the data). You can use ordinary SQL to see your data at what we call the view layer, which gives you a standard star-schema style of data that you could load up into your favorite RDBMS. SQL and ODBC/JDBC are important standards, and customers should insist on the option to have standard connectivity even if they don't need it right away.
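To make the "snapshot" idea concrete, here is a minimal sketch of a bulk export over a standard SQL connection. It is only an illustration under stated assumptions: the table name `sales_fact` and its columns are hypothetical, and an in-memory SQLite database stands in for the remote source so the example is self-contained. With a real ODBC data source you would open the connection with an ODBC driver library instead, and the export function would stay essentially the same because it relies only on the common Python DB-API cursor interface.

```python
import csv
import io
import sqlite3


def export_table_to_csv(conn, table, out):
    """Write every row of `table` to the file-like `out` as CSV, header first.

    Returns the number of data rows written.
    """
    cur = conn.cursor()
    # A table name cannot be bound as a query parameter, so quote it as an
    # identifier. Only pass table names you trust.
    quoted = '"{}"'.format(table.replace('"', '""'))
    cur.execute("SELECT * FROM " + quoted)
    writer = csv.writer(out)
    # cursor.description holds one entry per column; item 0 is the column name.
    writer.writerow([col[0] for col in cur.description])
    count = 0
    for row in cur:
        writer.writerow(row)
        count += 1
    return count


def demo():
    # Stand-in for the remote data source (assumption: hypothetical schema).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales_fact "
        "(date_key INTEGER, product_key INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales_fact VALUES (?, ?, ?)",
        [(20090601, 1, 19.5), (20090602, 2, 7.25)],
    )
    buf = io.StringIO()
    n = export_table_to_csv(conn, "sales_fact", buf)
    conn.close()
    return n, buf.getvalue()
```

The point of the sketch is that nothing in it is vendor-specific: a plain `SELECT` and a CSV file are enough to get a bulk copy of the data out, which is exactly what standard connectivity buys you.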
This is why I've been worried about the lack of standard SQL support in some recent cloud computing announcements. (Amazon SimpleDB is one I have been worried about, see http://snarfed.org/space/amazon simpledb thoughts, though they seem to be evolving it toward a somewhat richer SQL subset. I have a bigger problem with the Google App Engine Datastore, though Google does publish its Data APIs openly.) I'm not saying everyone should support all of SQL-92. But there is no way I'd advocate rewriting the Oco system using Amazon SimpleDB or Google's Datastore. Doing so would preclude us from providing standard data access to our customers, and by the way, these two systems have basically nothing in common other than a philosophy that they needn't be much like SQL. To me SQL is important. A query-only subset is fine, and even restricting the queries somewhat is fine. The point is that for our system to work, and for our customers to feel they're safe from unnecessary lock-in, customers should be able to take the data out, using a means that they roughly understand, and put the data into a different vehicle for execution of their application needs.
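As a rough sketch of what "take the data out and put it into a different vehicle" can look like when both sides speak SQL, the following copies one table between two database connections in batches. Everything named here is an assumption for illustration: the `product_dim` table and its columns are hypothetical, and two in-memory SQLite databases stand in for the source system and the replacement. It also assumes the target table already exists and that both drivers accept `?` placeholders, which is not true of every DB-API driver.

```python
import sqlite3


def copy_table(src, dst, table, columns, batch_size=1000):
    """Copy `columns` of `table` from connection `src` to connection `dst`.

    Table and column names are interpolated into the SQL text, so they must
    come from a trusted source. Returns the number of rows copied.
    """
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    select_sql = f"SELECT {col_list} FROM {table}"
    insert_sql = f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})"

    read = src.cursor()
    read.execute(select_sql)
    write = dst.cursor()
    copied = 0
    while True:
        # Pull rows in batches so a large table never sits fully in memory.
        rows = read.fetchmany(batch_size)
        if not rows:
            break
        write.executemany(insert_sql, rows)
        copied += len(rows)
    dst.commit()
    return copied


def demo_copy():
    # Stand-ins for the old and new systems (assumption: same hypothetical DDL).
    src = sqlite3.connect(":memory:")
    dst = sqlite3.connect(":memory:")
    ddl = "CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, name TEXT)"
    src.execute(ddl)
    dst.execute(ddl)
    src.executemany(
        "INSERT INTO product_dim VALUES (?, ?)",
        [(1, "widget"), (2, "gadget"), (3, "gizmo")],
    )
    src.commit()
    copied = copy_table(src, dst, "product_dim", ["product_key", "name"], batch_size=2)
    rows = dst.execute(
        "SELECT product_key, name FROM product_dim ORDER BY product_key"
    ).fetchall()
    return copied, rows
```

In practice the "adaptation" cost shows up in the parts this sketch glosses over, such as translating the table definitions and data types between the two systems. The extraction itself is the easy part as long as a standard query interface exists.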
There's also a whole host of new BI tools on the market which don't support standard SQL databases. I won't namedrop here, but it should be a big concern to people that these systems haven't yet matured to the point where they support SQL.
Standards-based access to the data is the key to meaningfully avoiding lock-in.