Cloud Expo: Blog Feed Post

SQL Data Services: Your Database in the Cloud

This will really make sharing of data in the cloud so much easier

Two things in the Microsoft cloud I find really interesting are SQL Data Services (SDS) and Huron/Data Hub, a SQL cloud sync service. They are among the “cloud” offerings I believe have lots of potential and will make sharing data in the cloud much easier.

I had the pleasure of sitting down to talk about this subject with Liam Cavanagh, Sr. Program Manager on the SDS/Huron team at Microsoft, and got some insights into the current state and the future of this remarkable new technology. In this article I’ll talk about SQL Data Services, and I’ll follow up with one about Data Hub/Huron.

SQL Data Services is, at its core, nothing more than a (Microsoft SQL) database-as-a-service offering from Microsoft, part of the Azure Services Platform. The first thing you’ll find about SQL Data Services is that it “is just SQL” (at least that’s how Microsoft is advertising it). And it is. You change your connection string from your local database to your cloud database, and you can access the “cloud” SQL. You can use SQL Server Management Studio to run queries, create tables, everything (oh well, almost) you do locally. The first version of SQL Data Services will support tables, indexes, views, stored procedures, triggers, constraints, table variables, session temp tables, etc. It will not support distributed transactions or queries, CLR, Service Broker, Spatial, or physical server or catalog DDL and views. Reporting services and Business Intelligence services will be available sometime in the future. So far there’s no information on when the features left out of the first version will be available.
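To make the "just change the connection string" point concrete, here is a minimal sketch in Python. It only builds and compares two ODBC-style connection strings and does not connect to anything. The host name, database name, and credentials are made up for illustration, and the real SDS server naming scheme may look different.

```python
def build_connection_string(server, database, user, password):
    """Assemble an ODBC-style SQL Server connection string from its parts."""
    parts = {
        "Driver": "{SQL Server}",
        "Server": server,
        "Database": database,
        "Uid": user,
        "Pwd": password,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


def parse_connection_string(conn_str):
    """Split a connection string back into a key/value dictionary."""
    pairs = (item.split("=", 1) for item in conn_str.split(";") if item)
    return {key: value for key, value in pairs}


def changed_keys(old, new):
    """List the keys whose values differ between two connection strings."""
    old_parts = parse_connection_string(old)
    new_parts = parse_connection_string(new)
    return [key for key in old_parts if old_parts[key] != new_parts.get(key)]


# Hypothetical values: neither host name comes from Microsoft documentation.
LOCAL = build_connection_string("localhost", "Orders", "app_user", "secret")
CLOUD = build_connection_string("myserver.example.net", "Orders", "app_user", "secret")

print(changed_keys(LOCAL, CLOUD))
```

If the promise holds, the server entry is the only thing the application has to change, and the queries it sends stay the same.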

The initial commercial release will have some limitations on database size; most likely the limit will be around 10 GB. The limitation might be lifted in future releases, but for now it is here to stay. It exists mainly because Microsoft feels this is a size it can easily manage in the background: backups, moving the database from one server to another, data recovery, etc. You can have as many databases as you want, and let’s be honest, 10 GB is a lot of data to store.

Other limitations will have to do with the duration of transactions and the resource load on the server hosting your data. Keep in mind that your data will be living on servers in Microsoft’s data centers, alongside data from other customers. Microsoft makes sure your data is secure (I’m sure we’ll see some guarantees in the SLA), but to maintain good multi-tenant practices it will have to throttle, or otherwise make sure that all the databases on a server get enough resources to function properly. One of the techniques used is moving more active databases from a loaded server to an idle one.
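Microsoft has not said how throttling will show up on the client side. Assuming it surfaces as a transient error that is safe to retry, a client could wrap its database calls in something like the sketch below. `TransientDatabaseError` and `run_with_retry` are hypothetical names of mine, not part of any SDS or SQL Server API.

```python
import time


class TransientDatabaseError(Exception):
    """Hypothetical stand-in for whatever error a throttled request raises."""


def run_with_retry(operation, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying with exponential backoff on transient errors.

    The delay doubles after every failed attempt (0.5s, 1s, 2s, ...).
    The last failure is re-raised so the caller can handle it.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except TransientDatabaseError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the helper easy to test, and backing off instead of retrying immediately avoids piling more load onto a server that is already busy.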

As with any other database, data corruption can happen in the cloud database too. Microsoft has mechanisms in place to recover from data corruption (mainly by keeping database replicas on multiple servers); however, it doesn’t provide any user-level backup of the database (at least in the first version). As we’ve seen in some of the PDC 2008 presentations, in the future we will probably see database backup/restore and geo-replication (synchronous, where a replica set spans datacenters, and asynchronous, with independent replica sets in different datacenters).

There’s no surprise in how concurrency is handled in the cloud database: SDS uses the same mechanisms as any SQL Server. SQL Server supports optimistic (time-stamps or value comparisons) and pessimistic concurrency models. The presence of the “cloud” doesn’t change the model at all. If you’re really curious about the subject, the SQL Server 2008 concurrency documentation, which essentially deals with how SQL Server handles locking, is a good place to start.
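For readers who haven’t used optimistic concurrency, here is a minimal sketch of the idea. It uses Python’s built-in `sqlite3` module as a stand-in so it runs anywhere; it does not talk to SQL Server or SDS. The `products` table and the integer `version` column are my own illustration. On SQL Server the same check is typically done against a timestamp/rowversion column.

```python
import sqlite3


def update_price(conn, product_id, new_price, expected_version):
    """Update the row only if nobody changed it since we read it.

    Returns True if the update was applied, False if the version we read
    is stale (another writer got there first).
    """
    cursor = conn.execute(
        "UPDATE products SET price = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_price, product_id, expected_version),
    )
    conn.commit()
    return cursor.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL, version INTEGER)"
)
conn.execute("INSERT INTO products VALUES (1, 9.99, 1)")
conn.commit()

# Two clients both read the row while it was at version 1.
first_writer = update_price(conn, 1, 10.99, expected_version=1)   # applied
second_writer = update_price(conn, 1, 8.99, expected_version=1)   # stale, rejected
```

No lock is held between reading and writing. The second writer finds out it lost the race because its `UPDATE` matched no rows, and it can then re-read the row and decide what to do.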

With the database in the cloud, there’s going to be latency when accessing it from your premises. Microsoft recommends running the applications that use the cloud database on the Azure Platform, so that latency is minimal. When you deploy an application on Windows Azure and provision an SDS server, the two are going to be co-located, to provide low latency between the application and the data.
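A quick back-of-the-envelope calculation shows why co-location matters for chatty applications. The numbers below are assumptions I picked for illustration (80 ms round trip from on-premises, 1 ms inside the data center, 2 ms of server work per query), not measurements of SDS.

```python
def total_query_time_ms(queries, round_trip_ms, server_time_ms):
    """Time spent on sequential queries: each one pays a full round trip."""
    return queries * (round_trip_ms + server_time_ms)


# A page that issues 50 small queries one after another.
on_premises = total_query_time_ms(50, round_trip_ms=80.0, server_time_ms=2.0)
co_located = total_query_time_ms(50, round_trip_ms=1.0, server_time_ms=2.0)

print(on_premises, co_located)
```

Under these assumptions the same page spends about 4.1 seconds waiting on the database from on-premises and 0.15 seconds when co-located. The network round trip dominates, not the query itself.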

You will find out rather quickly that there’s no web based administration tool for managing your database in the cloud, but most probably some kind of web admin tool (Microsoft or third party) will be available in the near future.

The exact billing model is not yet available. However, we know from Nigel Ellis (the person responsible for the design, development, and release of SQL Data Services) that customers will be charged for the physical database size including all data and indexes defined.

What does SDS offer over other SQL hosting services? High availability: your data is guaranteed to be available all the time. If you’re hosting SQL Server yourself, you probably need two mirrored servers to get high availability, so that if one goes down, the other can take over. The SDS solution also seems to be cost effective, since you pay just for what you use.

Initially SDS was built to use the SOAP and REST protocols to access the data. With the switch to being a full relational database in the cloud, SDS now uses the Tabular Data Stream (TDS) protocol, an application layer protocol used to transfer data between a database server and a client. TDS was initially developed by Sybase Inc. for their Sybase SQL Server relational database engine in 1984, and later by Microsoft in Microsoft SQL Server. Lots of drivers are already implemented for this protocol: ODBC, OLEDB, ADO.NET, and an ODBC driver for the PHP stack. You can also access it from Ruby, or from Linux using an open source TDS driver such as FreeTDS.

Of course, it will take some time for the platform to mature. The goal of this first version is to address the needs of 95% or more of web and departmental applications.

The SQL Data Services Community Technology Preview (CTP) will be available soon. You can join the mailing list to receive an e-mail notification when it becomes available.
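To give a feel for what "application layer protocol" means here, the sketch below frames a message the way I understand the published TDS specification to describe it: every packet starts with an 8-byte header (type, status, length, SPID, packet ID, window), with multi-byte fields in big-endian order. The function names are mine, and this is not a working client. A real driver also performs a login handshake and, in later protocol versions, puts extra headers inside the batch payload.

```python
import struct

# type (1 byte), status (1), length (2), SPID (2), packet ID (1), window (1)
TDS_HEADER = struct.Struct(">BBHHBB")

SQL_BATCH = 0x01        # packet type for a SQL batch
END_OF_MESSAGE = 0x01   # status bit marking the last packet of a message


def frame_packet(packet_type, payload, packet_id=1):
    """Prefix payload with a TDS-style header; length includes the header."""
    length = TDS_HEADER.size + len(payload)
    header = TDS_HEADER.pack(packet_type, END_OF_MESSAGE, length, 0, packet_id, 0)
    return header + payload


def parse_header(packet):
    """Decode the first 8 bytes of a packet into a dictionary."""
    fields = TDS_HEADER.unpack_from(packet)
    names = ("type", "status", "length", "spid", "packet_id", "window")
    return dict(zip(names, fields))


# SQL text travels as UTF-16 little-endian; payload simplified for illustration.
payload = "SELECT 1".encode("utf-16-le")
packet = frame_packet(SQL_BATCH, payload)
header = parse_header(packet)
```

The drivers listed above all hide this framing, which is why an existing application can switch to SDS without knowing anything about the wire protocol.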

More Stories By Alin Irimie

Alin Irimie is a software engineer (architect, designer, and developer) with over 10 years of experience in various languages and technologies. Currently he is Messaging Security Manager at Sunbelt Software, a security company. He is also the CTO of RADSense Software, a software consulting company. He has expertise in technologies such as the .NET Framework, ASP.NET, AJAX, SQL Server, C#, C++, Ruby on Rails, and cloud computing (Amazon and Windows Azure), and he also blogs about cloud technologies.