Windows Azure for Noobs
Wednesday, September 22, 2010
A few months ago, I had the opportunity to attend a Windows Azure Boot Camp. So this post today is for those of you who haven’t been keeping up and want to know about some of the basics. It’s not meant to teach you the ins and outs of developing with Windows Azure. That said, getting started isn’t too difficult, and the boot camp site has all the materials you need to get going quickly.
If you’re not familiar yet, Windows Azure is part of Microsoft’s cloud computing platform. Specifically, it is built for developing ASP.NET applications and WCF services in a cloud environment. It shouldn’t be confused with Microsoft’s other online product offerings such as BPOS (which includes things like Exchange Online, SharePoint Online, and OCS Online). I’ll try to start out by clarifying some of the basics that I was wondering about when I first started.
The first thing to know is that you need to get a handle on pricing. I’m not going to attempt to explain it all, but the gist is you pay for the number of instances you have of a given application (an instance roughly equates to a VM, but not necessarily), the number of transactions you make to online storage, the amount of storage you use, and the amount of bandwidth you use.
This may sound like it’s going to cost a lot, but keep in mind most of these costs are measured in cents (although they can add up). One important thing to note about pricing is that you are paying for “compute time”, as they call it, whether your application is being used or not. Even if it is suspended, you are paying for that application until you go and delete it.
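To make the four billing dimensions concrete, here is a minimal Python sketch of the arithmetic. The rates are hypothetical placeholders I made up for illustration, not actual Azure prices, and the names `Rates` and `monthly_cost` are mine. The point it shows is that compute is charged per deployed hour, so an idle instance still costs money.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Hypothetical rates in dollars; substitute the real price sheet."""
    compute_per_instance_hour: float
    storage_per_gb_month: float
    per_10k_transactions: float
    bandwidth_per_gb: float


def monthly_cost(rates, instances, hours_deployed, storage_gb,
                 transactions, bandwidth_gb):
    """Estimate one month's bill across the four billing dimensions.

    Compute is charged for every hour the application is deployed,
    whether or not it serves any traffic.
    """
    compute = instances * hours_deployed * rates.compute_per_instance_hour
    storage = storage_gb * rates.storage_per_gb_month
    txn = (transactions / 10_000) * rates.per_10k_transactions
    bandwidth = bandwidth_gb * rates.bandwidth_per_gb
    return round(compute + storage + txn + bandwidth, 2)


# Placeholder numbers only.
example_rates = Rates(
    compute_per_instance_hour=0.12,
    storage_per_gb_month=0.15,
    per_10k_transactions=0.01,
    bandwidth_per_gb=0.15,
)

# One instance left deployed for a 30-day month with zero traffic.
idle = monthly_cost(example_rates, instances=1, hours_deployed=24 * 30,
                    storage_gb=0, transactions=0, bandwidth_gb=0)
print(idle)
```

Even with no storage, transactions, or bandwidth, the idle deployment produces a non-zero bill, which is why deleting unused deployments matters.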
One more thing you’ll want to know before starting. The DevFabric (basically a local version of the cloud) assumes you are using a SQL Express database. If you are using a regular SQL server, then follow the information I found in this blog on how to set up your database.
As for developing in Azure, it’s just like developing in ASP.NET, and you can use most of the things you are used to. However, some things aren’t allowed. For example, code that tries to access machine-specific resources or write to the registry is going to throw a SecurityException.
When you create a new Windows Azure project you are asked to pick a role.
Effectively there are two types of roles: Web and Worker. The Web Role will create an ASP.NET project. It has some variations (MVC 2, WCF Service, etc.), but the choice really just affects the type of ASP.NET project that is created. The Worker Role is very similar to a Windows Service. It typically starts and runs in a loop.
Once you create your project, you’ll get a pair of projects that look like this in Visual Studio.
The roles you have added to your project show up in the Roles folder. Opening a role’s settings lets you configure storage settings and endpoints for the service. You can also choose Full or Partial Trust here. When this project is active and you hit F5 to debug, it launches something called the DevFabric, which is basically a local emulation of the “cloud”. In training, they typically said that the DevFabric covers 90% of what the actual cloud does, but I never saw an example of anything that wouldn’t work. You can fully debug your ASP.NET application in the DevFabric, but that is not possible once your code is in the cloud.
When you’re ready to go to the cloud, you use the Publish menu option. This packages your solution into a .cspkg file. You then upload it along with the configuration settings file (.cscfg) on the deployment page. At this point, you have to wait a few minutes while the deployment runs.
I kind of like the way you stage deployments to production. You start by uploading your package file to a staging instance of your application. Effectively this spins up another instance (i.e. server). Your application in staging gets a separate public-facing address for you to do your QA. Once you’re happy with it, you click the swap button and it simply swaps what you had in production with staging, and you’re done. Of course, the drawback is that while a staging instance is up and running you’re paying compute hours on it. That means you only want to leave staging up as long as you need it.
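The staging workflow can be pictured as two slots whose contents trade places. The Python below is only a toy model I wrote to illustrate that idea; `HostedService` and its methods are hypothetical names, not anything from the Azure tooling or portal.

```python
class HostedService:
    """Toy model of one application with a production and a staging slot."""

    def __init__(self):
        self.slots = {"production": None, "staging": None}

    def deploy(self, slot, package):
        """Put a package (any label, e.g. a version string) into a slot."""
        self.slots[slot] = package

    def swap(self):
        """Exchange production and staging, like the swap button does."""
        self.slots["production"], self.slots["staging"] = (
            self.slots["staging"],
            self.slots["production"],
        )

    def delete(self, slot):
        """Remove a deployment so it stops accruing compute hours."""
        self.slots[slot] = None

    def billed_slots(self):
        """Every slot that holds a deployment is being billed."""
        return [name for name, pkg in self.slots.items() if pkg is not None]


service = HostedService()
service.deploy("production", "v1")
service.deploy("staging", "v2")
service.swap()                 # v2 is now live, v1 sits in staging
service.delete("staging")      # stop paying for the old version
print(service.slots, service.billed_slots())
```

Note that right after the swap both slots are still occupied, so both are still billed until staging is deleted.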
Additional Development Information
You can’t upload your own COM objects since you can’t write to the registry. However, you are allowed to use DllImport to call unmanaged code, from the Windows API for example. Honestly, I am surprised this is allowed, but they claim it is. You can also launch executables, which seems like something you could use to circumvent some of the restrictions that prevent you from knowing much about the underlying operating system. You’re also allowed to spawn background threads.
The AppFabric brings a few more advanced features you can take advantage of. Windows Azure has a service bus which you can use to connect services together. It also supports a security mechanism for securing REST services. It’s based upon OAuth and it sounds a lot like SharePoint 2010’s Claims-Based security, but honestly I don’t know if they are one and the same.
There are a few things that Windows Azure adds to ASP.NET. There are three assemblies that you can reference to take advantage of these extra features.
Windows Azure brings the concepts of Tables, Queues, and BLOB storage. All of these start with the concept of a Storage Account, which is simply a billing and grouping unit that you create through the Azure administration portal. The storage account uses a pair of keys to authenticate any application that accesses it. Keep these keys secure. You can create the items we mentioned above inside that storage account. Any time you write something to a storage account it is written in triplicate, meaning it’s written to three separate locations to help prevent data loss.
We’ll talk about Tables first. Before we start, forget everything you know about SQL tables because it does not apply here. These are not relational tables and have nothing to do with SQL Server in any way. They are highly scalable and capable of containing extremely large amounts of data. Your storage account can contain multiple tables, and inside a table you store entities. The interesting thing is that the entity schema can vary inside the same table. Since it’s not relational, you can’t do server-side joins and there are no foreign key relationships. You get data out of it using LINQ, so you can do some joins via code running on your web server, but remember it’s not a true server-side join. There is a 30 second query timeout, so if your query returns too much, it will get killed.
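Here is a rough Python sketch of what “joining in your own code” means. Plain dictionaries stand in for entities, and the property names (`ProductId`, `OrderId`, and so on) are made up for the example. Notice the two product entities carry different properties, which is the varying-schema idea.

```python
def join_orders_to_products(orders, products):
    """Join two entity sets in application code.

    The table service will not do this for us, so both sets must already
    have been fetched before the join can run.
    """
    products_by_id = {p["ProductId"]: p for p in products}
    joined = []
    for order in orders:
        product = products_by_id.get(order["ProductId"])
        if product is not None:  # no foreign keys, so orphans are possible
            joined.append({
                "OrderId": order["OrderId"],
                "ProductName": product["Name"],
                "Quantity": order["Quantity"],
            })
    return joined


# Entities in the same logical set do not need the same properties.
products = [
    {"ProductId": "p1", "Name": "Widget", "Price": 9.99},
    {"ProductId": "p2", "Name": "Gadget", "Color": "red"},
]
orders = [
    {"OrderId": "o1", "ProductId": "p1", "Quantity": 3},
    {"OrderId": "o2", "ProductId": "p2", "Quantity": 1},
    {"OrderId": "o3", "ProductId": "missing", "Quantity": 5},
]

for row in join_orders_to_products(orders, products):
    print(row)
```

The order that points at a missing product is silently dropped here. Since nothing enforces the relationship for you, your code has to decide how to handle that case.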
As for the entity going into the table, you simply create a class and inherit from TableServiceEntity. In the past people have not liked inheriting from some class in their ORM, so this may be an issue for some people. Tables are partitioned. The concept of partition is kind of hard to grasp, but you effectively split out a single table into different groups of things. I’m not really sure why you do this and what the advantage of it is, but you do have to provide a partition key such as Products, or Contacts. You also have to provide a RowKey in your entity class. This serves as a primary key. In a lot of code I have seen, they auto-generate this in the constructor. To query the data, you will create a DataContext class in a very similar manner to how you use one when using LINQ to SQL.
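As a rough picture of how the two keys work together, here is a Python stand-in. It is not the real .NET API: `ContactEntity` and `TableSketch` are names I invented, and the real entity class would inherit from TableServiceEntity instead. The sketch assumes the partition key plus the row key together identify one entity, and it auto-generates the row key at construction time like the code I have seen.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ContactEntity:
    """Stand-in for an entity class; the RowKey is generated on creation."""
    name: str
    email: str
    partition_key: str = "Contacts"
    row_key: str = field(default_factory=lambda: str(uuid.uuid4()))


class TableSketch:
    """In-memory model: entities grouped by partition, then by row key."""

    def __init__(self):
        self._partitions = {}

    def insert(self, entity):
        partition = self._partitions.setdefault(entity.partition_key, {})
        if entity.row_key in partition:
            raise KeyError("duplicate (PartitionKey, RowKey) pair")
        partition[entity.row_key] = entity

    def get(self, partition_key, row_key):
        """Look up exactly one entity by its two keys."""
        return self._partitions[partition_key][row_key]

    def query_partition(self, partition_key):
        """Return every entity stored under one partition key."""
        return list(self._partitions.get(partition_key, {}).values())


table = TableSketch()
ada = ContactEntity(name="Ada", email="ada@example.com")
bob = ContactEntity(name="Bob", email="bob@example.com")
table.insert(ada)
table.insert(bob)
print(table.get("Contacts", ada.row_key).name)
print(len(table.query_partition("Contacts")))
```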
How do you create a table? Well, there are no design-time tools for it. It’s created completely in code. There also aren’t any included tools to view, edit, or delete the data in your tables. A third-party tool called Cloud Storage Studio does exist from a company called Cerebrata, but keep in mind that any time you interact with the data in your table you are paying for storage transactions and bandwidth.
The last thing to note is that although your data is relatively secure from hardware loss since it’s written in triplicate, there is no concept of a backup. This means if data is accidentally deleted, you have no way to get it back. If you do want a backup, you are going to have to write some code to export the data into something local.
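A do-it-yourself export could be as simple as dumping the entities to a local file. The Python below is a minimal sketch under the assumption that you have already pulled the entities out of the table as plain dictionaries. The function names are hypothetical, and fetching from the real service is left out entirely.

```python
import json


def export_entities(entities, path):
    """Write a list of entity dictionaries to a local JSON file.

    Returns the number of entities written so the caller can sanity-check
    the backup against what it expected to export.
    """
    with open(path, "w", encoding="utf-8") as f:
        json.dump(entities, f, indent=2)
    return len(entities)


def load_backup(path):
    """Read a previously exported backup into a list of dictionaries."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Usage would look something like `export_entities(my_entities, "contacts-backup.json")` on a schedule. Remember that reading everything out of the table for a backup counts as storage transactions and bandwidth too.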
Windows Azure also has the concept of queues, which sit on top of your storage account. They work in many ways like other queuing architectures that you might be familiar with. Queues can hold an unlimited amount of data (or at least that’s what they claim). They work by storing XML-serializable messages, but each message can’t be larger than 8KB. The way you would typically use a queue in Azure is by pushing a message in your Web Role. A Worker Role then comes along and pops the messages off periodically and acts on them.
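The push-from-web, pop-from-worker pattern looks roughly like this in Python. This is a local stand-in built on the standard library, not the Azure queue API: `QueueSketch` and `worker_pass` are invented names, and the only behavior taken from the real service is the 8KB message limit mentioned above.

```python
import queue

MAX_MESSAGE_BYTES = 8 * 1024  # the 8KB per-message limit


class QueueSketch:
    """Local stand-in for a storage queue that enforces the size limit."""

    def __init__(self):
        self._q = queue.Queue()

    def push(self, message):
        """What the Web Role would do when work arrives."""
        size = len(message.encode("utf-8"))
        if size > MAX_MESSAGE_BYTES:
            raise ValueError(f"message is {size} bytes, limit is 8KB")
        self._q.put(message)

    def pop(self):
        """Return the next message, or None when the queue is empty."""
        try:
            return self._q.get_nowait()
        except queue.Empty:
            return None


def worker_pass(q, handle):
    """One polling pass of a Worker Role: drain the queue, act on each item."""
    processed = 0
    while True:
        message = q.pop()
        if message is None:
            return processed
        handle(message)
        processed += 1


orders = QueueSketch()
orders.push("<order id='1'/>")
orders.push("<order id='2'/>")
print(worker_pass(orders, print))
```

A real Worker Role would call something like `worker_pass` inside its run loop with a sleep between passes. If a payload won’t fit in 8KB, one option is to store it as a BLOB and queue only a reference to it.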
I’m sure you all know what BLOBs are already. The purpose of these is to give you a place to store files in Azure. BLOBs are stored in containers and each BLOB can be up to 1TB in size. Containers have an unlimited capacity (at least that is what is claimed) and they act very much like a folder. Containers are private by default, but you can also make them available to the public. BLOBs can also contain metadata. From what I gathered from the training though, I don’t believe there is a way to query upon that metadata. I hope I’m wrong though, so if someone finds out how, please let me know. One thing you can do is upload a .VHD file as a BLOB and mount that as a legacy file system. This is called Windows Azure Drive. This is mainly for legacy use though. The impression I got was that they really want you to use the new storage mechanisms instead of using this. As is common with everything else, there is no included way to back up your files. If you need to back them up, you will need to turn to a third-party tool or write some code.
So if Azure Tables aren’t for you, you can choose to make use of SQL Azure (for an additional fee of course). However, there are a number of limitations. First, currently you can only create databases that are 1GB or 10GB. I’ve recently been told that 50GB databases will be available in June. As far as what you can do in SQL, just stick to the basics. Things like Cross-Database joins, Database Mirroring, SQL CLR, Replication, Extended Stored Procedures, Spatial Data, and Backup and Restore are simply not supported. That’s right, I said no backup and restore. Although your data is automatically written to multiple servers, you have no built-in mechanism to do backups or restores when accidental data deletion occurs. This is obviously going to be a deal breaker for a lot of people, but this is supposed to be a very high priority issue for Microsoft, so we may see this in the future. If this is going to be an issue for you, you can write your own backup routines, or maybe SQL Azure is just not right for you right now.
As for authentication, SSPI is out; only SQL authentication is supported. When you create a SQL Azure account, it spins up a server (rather quickly even), prompts you for the name and password of an administrator, and gives you the address where you can find your server. You then create the database through the web interface. At this point, you can actually connect to your SQL server at the address it provides using SQL Server Management Studio (latest version). Unfortunately, I was never able to get the authentication to work. Also, if you are curious about security, you can actually configure the firewall to allow specific IPs access to your SQL server.
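To picture what an IP allow list does, here is a small Python sketch. It assumes each rule is a start and end address, which is my guess at a reasonable shape and not something taken from the SQL Azure portal. The addresses come from ranges reserved for documentation, and `is_allowed` is a made-up helper.

```python
import ipaddress


def is_allowed(client_ip, rules):
    """Return True if client_ip falls inside any (start, end) rule.

    Both ends of a rule are inclusive, so a single address can be allowed
    by using it as both the start and the end.
    """
    ip = ipaddress.ip_address(client_ip)
    return any(
        ipaddress.ip_address(start) <= ip <= ipaddress.ip_address(end)
        for start, end in rules
    )


# Hypothetical rules: one office range and one single build server.
firewall_rules = [
    ("203.0.113.10", "203.0.113.20"),
    ("198.51.100.7", "198.51.100.7"),
]

print(is_allowed("203.0.113.15", firewall_rules))
print(is_allowed("203.0.113.21", firewall_rules))
```

If your connection is refused before you even get to the login step, a missing rule for your current IP address is one of the first things to check.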
It probably goes without saying at this point, but SQL Reporting Services, Analysis Services, and Integration Services are not supported either.
Non-.NET in the Cloud?
Sure. Provided it runs via FastCGI. This makes things like PHP possible in the cloud. You can actually upload the executable such as php.exe and then set up pages of the appropriate extension to run through CGI. It’s still basically getting routed through IIS but it does open up your options somewhat.
Windows Azure is not the silver bullet for all applications going forward. We all know that’s SharePoint of course. :-) No, but really, you have to look at your application and see if it makes sense. My biggest concern at this point is the development story. There are a number of offers for getting cloud access. Some are paid, some are trials, and some you get for being a partner or with MSDN. If you are a partner, you are only entitled to a few hours a month. There are a few paid options that are reasonable, but you’re going to have to get someone to pony up a credit card number for you to use anything. Also, for partners, only the people who already have download permissions have access to the Azure accounts. This could be an issue with some partners that rule over those accounts with an iron fist.
This is already turning out to be a long post and I’ve far from covered everything in any real detail, but hopefully this is a good start if you just wanted some basic information. Plus, I’ve provided plenty of links with more information and how to get started. I hope I got all of the facts right, but if I missed something, please leave me a comment. If you have a VM with Windows Server 2008 R2 already on it, you can get started pretty fast. Windows Azure has a relatively low startup cost and has great scalability. If your application turns out to be a good fit for it, then I definitely recommend taking a look at it.
If you are interested more in Windows Azure, I have to recommend finding a boot camp. The one I attended was great and very informative. There are still many more being scheduled. What’s cool is that if there is not one in your area, they will help you throw one. I’m seriously considering this for Oklahoma as I think there would be plenty of people interested.
Follow me on twitter.
NOTE: This post is also available at www.DotNetMafia.com.