Virtualization, Cloud, Infrastructure and all that stuff in-between

My ramblings on the stuff that holds it all together

Category Archives: Microsoft Online Services

Windows Azure under the hood


There is an excellent video interview with Manuvir Das from the Azure team on the MSDN Channel 9 site here.

The interview is quite long, but I’ve tried to summarise it for infrastructure people/architects like me as follows:

Azure is an overall “OS” for the cloud, akin to VMWare and their VDC initiative but with a much richer re-usable services and applications framework layer.

In terms of the overall architecture diagram (below), Azure is sort of the “kernel for the cloud” (“Xbox for the cloud”?): buy it in increments and (ab)use it. Don’t worry about building the individual infrastructure components; you get all the tools in the box and the underlying infrastructure is abstracted so you don’t have to worry about it.

[Images: Azure architecture diagrams]

The services Microsoft provide on top of Azure are as follows:

Live Services Mesh: high level user/data sync. Some of it runs on Azure now, and it will be migrated to run as an app on Azure over time.

.net services (Zurich): high level services to enable rich scenarios like authentication, federation, LiveID, OpenID, Active Directory Federation Services etc.

SQL – premium database services in the cloud offering data warehousing, and I would assume massive scalability options, but I’m not sure how this would be implemented.

Sharepoint/Dynamics I understand are coming soon but would offer the same sort of functionality in the cloud.

It’s based around a modified Windows, with Dave Cutler’s involvement (no specifics offered yet). Virtualized server instances are the base building blocks, each with an allocated and guaranteed amount of resource (1×1.9GHz CPU, 2GB RAM, 160GB disk) which is dedicated to your machine and not contended. That would mean MS are doing no over-subscription under the hood, which seems unlikely, and maybe wasteful, to me; DRS anyone?

Dell have provided the underlying physical hardware hosted in Microsoft’s data centres with a customised server model, as noted here – and you can see a video tour inside one of the hosting data centres here from BBC news

There is an overall Fabric Controller, which is essentially a resource manager: it continually monitors hosts, VMs and storage via agents, and deploys/allocates/moves .net code packages around hosts.
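To make the resource manager idea concrete, here is a minimal Python sketch of the kind of loop a controller like this might run: take the latest health reports from the agents and move instances off hosts that have gone away. The interview gave no detail on how the real Fabric Controller does this, so the function name `rebalance`, the fixed `slots_per_host` limit and the least-loaded placement rule are all my own assumptions for illustration.

```python
def rebalance(placement, healthy_hosts, slots_per_host):
    """Return a new instance -> host mapping that only uses healthy hosts.

    placement: dict mapping instance name to its current host.
    healthy_hosts: hosts whose agents reported healthy on the last poll.
    slots_per_host: fixed number of instances per host (assumed: no over-subscription).
    """
    load = {host: 0 for host in healthy_hosts}
    new_placement = {}

    # Instances already on a healthy host stay where they are.
    for instance, host in placement.items():
        if host in load:
            new_placement[instance] = host
            load[host] += 1

    # Everything else moves to the least-loaded healthy host with a free slot.
    for instance in placement:
        if instance in new_placement:
            continue
        candidates = [h for h in sorted(load) if load[h] < slots_per_host]
        if not candidates:
            raise RuntimeError(f"no capacity left for {instance}")
        target = min(candidates, key=lambda h: load[h])
        new_placement[instance] = target
        load[target] += 1

    return new_placement


# Hypothetical example: host-b has stopped reporting, host-c is a spare.
current = {"web-0": "host-a", "web-1": "host-b", "api-0": "host-b"}
moved = rebalance(current, healthy_hosts={"host-a", "host-c"}, slots_per_host=2)
print(moved)
```

A real controller would also have to deal with storage, networking and partial failures; the sketch only shows the monitor-then-move shape of the problem.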

To deploy your service to the Azure cloud:

You build your application as a code package (.net, others coming later).

You build a service model; this describes the number and type of hosts, dependencies etc.
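As a rough illustration of what a service model might capture, here is a small Python sketch. The real Azure service model format wasn't shown in the interview, so the `Role` and `ServiceModel` classes and their fields (`instances`, `depends_on`) are hypothetical names of my own; the only thing taken from the interview is that the model describes how many hosts of each type you want and what depends on what.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """One type of host in the service (hypothetical structure)."""
    name: str
    instances: int
    depends_on: list = field(default_factory=list)


@dataclass
class ServiceModel:
    name: str
    roles: list

    def total_instances(self):
        return sum(role.instances for role in self.roles)

    def start_order(self):
        """Return role names ordered so that dependencies come first."""
        by_name = {role.name: role for role in self.roles}
        ordered, done = [], set()

        def visit(name, path=()):
            if name in done:
                return
            if name in path:
                raise ValueError(f"dependency cycle at {name}")
            for dep in by_name[name].depends_on:
                visit(dep, path + (name,))
            done.add(name)
            ordered.append(name)

        for role in self.roles:
            visit(role.name)
        return ordered


# Hypothetical three-tier service.
model = ServiceModel("shop", [
    Role("web", instances=3, depends_on=["api"]),
    Role("api", instances=2, depends_on=["storage"]),
    Role("storage", instances=1),
])
print(model.total_instances(), model.start_order())
```

The point of describing the service declaratively like this is that the platform, rather than you, can work out how many VMs to allocate and in what order to bring things up.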

The Azure storage layer is a distributed, flat table-based storage system with a distributed lock manager, and it keeps 3 copies of data for availability. It’s not SQL based (interesting), uses a REST API and is more akin to a file system, so it sounds like it’s been written from the ground up.

Interestingly it seems that the storage layer is deployed as a service on Azure itself and is controlled by the Fabric Controller; parts of the current Live Mesh services are using it now in production.
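To show why keeping 3 copies helps availability, here is a toy Python sketch of a flat, key-based table that writes every row to three different nodes and can still read it back when two of them are down. How Azure actually places its replicas wasn't covered, so the `ReplicatedTable` class and the hash-based placement are purely my own assumptions, and there is no lock manager or REST layer in this sketch.

```python
import hashlib


class ReplicatedTable:
    """Toy flat table that stores each row on `copies` distinct nodes."""

    def __init__(self, nodes, copies=3):
        if copies > len(nodes):
            raise ValueError("need at least as many nodes as copies")
        self.order = list(nodes)
        self.copies = copies
        self.nodes = {node: {} for node in self.order}

    def replicas(self, key):
        # Assumed placement rule: hash the key, then take consecutive nodes.
        digest = hashlib.sha256(key.encode()).hexdigest()
        start = int(digest, 16) % len(self.order)
        return [self.order[(start + i) % len(self.order)]
                for i in range(self.copies)]

    def put(self, key, row):
        for node in self.replicas(key):
            self.nodes[node][key] = dict(row)

    def get(self, key, down=()):
        # Read from the first replica that is still up.
        for node in self.replicas(key):
            if node not in down and key in self.nodes[node]:
                return self.nodes[node][key]
        raise KeyError(key)


table = ReplicatedTable(["n1", "n2", "n3", "n4", "n5"])
table.put("user:42", {"name": "Ada"})
holders = table.replicas("user:42")
# Two of the three holders are down, the row is still readable.
print(table.get("user:42", down=holders[:2]))
```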

Interestingly Manuvir describes your service as containing routers and load balancers as well as traditional services, so it sounds like they may have either built a complex provisioning framework for physical devices, or have implemented virtualized versions of such devices (Cisco Nexus type devices implemented as VMs maybe?).

Azure can maintain staging and production platforms within the cloud; you can swap between production/stage etc. with an API command that re-points DNS.
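The staging/production swap can be pictured with a very small Python sketch. This is not the Azure API (which I haven’t seen); `HostedService`, `swap` and the build names are hypothetical, and the dictionary simply stands in for the DNS records that get re-pointed.

```python
class HostedService:
    """Toy model: two deployments behind two DNS names."""

    def __init__(self, production, staging):
        self.dns = {"production": production, "staging": staging}

    def swap(self):
        # Re-point both names in one step; the deployments themselves don't move.
        self.dns["production"], self.dns["staging"] = (
            self.dns["staging"],
            self.dns["production"],
        )

    def resolve(self, name):
        return self.dns[name]


service = HostedService(production="build-41", staging="build-42")
service.swap()
print(service.resolve("production"))
```

Because nothing is redeployed, swapping back is the same operation again, which makes it a cheap rollback if the new build misbehaves.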

There is a concept of an upgrade domain, where VMs are taken out of service for updates/deployments etc. I assume your service description describes the key dependencies and it works out the least-impact sequence?

No automatic parallelism: you can’t just issue a job and have it execute in a distributed fashion using all the Azure resources without it being designed/built as such, which I think Amazon offer (but I may be wrong, as that does sound like something v. complicated to do).

Azure’s strategy for scale-out is the traditional MS one: make the most of the individual resource allocation for your VMs (see above) and scale out multiple independent instances with a shared-nothing architecture.
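Here is a minimal Python sketch of how upgrade domains could keep a service up during a rollout: instances are split into domains and only one domain is taken out of service at a time. The round-robin split and the function names are my assumptions; the interview didn’t say how Azure assigns instances to domains or sequences them.

```python
def upgrade_domains(instances, domain_count):
    """Split instances round-robin into `domain_count` upgrade domains."""
    domains = [[] for _ in range(domain_count)]
    for index, instance in enumerate(instances):
        domains[index % domain_count].append(instance)
    return domains


def rolling_upgrade(instances, domain_count, upgrade):
    """Upgrade one domain at a time; return the fewest instances left in service."""
    min_in_service = len(instances)
    for domain in upgrade_domains(instances, domain_count):
        # While this domain is being updated, every other instance keeps serving.
        in_service = len(instances) - len(domain)
        min_in_service = min(min_in_service, in_service)
        for instance in domain:
            upgrade(instance)
    return min_in_service


upgraded = []
web = ["web-0", "web-1", "web-2", "web-3", "web-4"]
worst_case = rolling_upgrade(web, domain_count=2, upgrade=upgraded.append)
print(upgraded, worst_case)
```

More domains means a slower rollout but fewer instances out of service at any one moment, which is presumably the trade-off the service description lets you express.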

Azure is a programmable API, it’s not an end-user product, it’s a platform for developers to build services on.

There is no absolute requirement for .net; Microsoft will provide PHP/RoR/Python facilities over time, with .net and Visual Studio integration out of the box, but you can use other developer tools too.

A “Developer fabric” is available – it can run on a desktop, it mocks up the whole Azure platform on your desktop and behaves the same way so developers can understand how it works and debug applications on their desktops before pushing out to the cloud – this is an important shiny for Microsoft, as it’s a simple and quick way to get developers hands-on with understanding how to use Azure.

The cool part is that you can export your service model and code packages directly to Azure from your developer tool, akin to a compile and publish option for the cloud. It’s part of the SDK, which can be downloaded here.

You can debug service copies locally using the SDK and developer fabric. There is no debugging in the cloud (yet), but Azure provides an API to get logs and the team are working on an end-to-end transaction tracing API.

Microsoft have made references to making Azure available on-premise as well as in Microsoft’s own data centres, in the same way that VMWare have with the VDC-OS stuff… but I would think that’s going to need some more details on what the Azure OS is to understand how that would be feasible.

As I concluded in an earlier blog post here, Microsoft could be poised to clean up here if they execute quickly and well – they have the most comprehensive offering for the corporate space due to having a very rich applications/services layer that is directly aligned to the desktop & application technology choices of the bigger customers (.net), they just need to solve the trust in the cloud issue first; and the on-premise piece of the puzzle is key to this… Maybe a server version of Windows 7 or MiniWin or Singularity is the enabler for this?

Microsoft Moves into the Clouds


As you’ve probably seen and I mentioned here earlier Microsoft are laying out their vision for Microsoft-centric cloud computing this week at their Professional Developers Conference.

If you’re short of time to understand this, there is a good quick overview here, here and here. Apologies for the lack of posting recently, which has been due to the awful cold I’ve had and a backlog of “real” work to deal with.

I’m attending Microsoft TechEd next week in Barcelona, so I’m hoping to get more real information about how this will work in the real world and I’ll be blogging as much of that content as possible.

Not sure I can live up to the level of posts Scott managed earlier in the year at TechEd US but I’ll try 🙂

Cloud is the new Mesh 🙂

Cloud Wars: VMWare vs Microsoft vs Google vs Amazon Clouds


A short time ago in a data centre, far far away…..

All the big players are setting out their cloud pitches: Microsoft are set to make some big announcements at their Professional Developers Conference at the end of October, VMWare made their VDC-OS announcements at VMWorld a couple of weeks ago, Google have had their App Engine in beta for a while and Amazon AWS is pretty well established.

With this post I hope to give a quick overview of each. I’ll freely admit I’m more knowledgeable on the VMWare/Microsoft offerings, and I stand to be corrected on any assumptions I’ve made on Google/AWS based on my web reading.

So, What’s the difference between them…?

VMWare vCloud – infrastructure led play

VMWare come from the infrastructure space, to-date they have dominated the x86 virtualization market, they have some key strategic partnerships with storage and network vendors to deliver integrated solutions.

The VMWare VDC-OS pitch is about providing a flexible underlying architecture through server, network and storage virtualisation. Why? Because making everything ‘virtual’ makes for quick reconfiguration: reallocating resource from one service to another is a configuration/allocation change rather than requiring an engineer visit (see my other post on this for more info).

Because VMWare’s pitch is infrastructure led, it has a significant practical advantage in that it’s essentially technology agnostic (as long as it’s x86 based). You, or a service provider, have the ability to build and maintain an automated birth-to-death bare ‘virtual metal’ provisioning and lifecycle system for application servers/services, as there is no longer a tight dependency for everything on physical hardware, cabling etc.

There is no one size fits all product in this space so a bespoke solution based around a standard framework tool like Tivoli, SMS, etc. is typically required depending on organisational/service requirements.

No re-development is necessarily required to move your applications into a vCloud (hosted or internal): you just move your VMWare virtual machines to a different underlying VDC-OS infrastructure, or you use P2V/X2V tools like Platespin to migrate to a VDC-OS infrastructure.

In terms of limitations – apps can’t necessarily scale horizontally (yet) as they are constrained by their traditional server based roots. The ability to add a 2nd node doesn’t necessarily make your app scale – there are all kinds of issues around state, concurrency etc. that the application framework needs to manage.

VMWare are building frameworks to build scale-out provisioning tools – but this would only work for certain types of applications and is currently reactive unless you build some intelligence into the provisioning system.

Scott Lowe has a good round-up of VDC-OS information here & VMWare’s official page is online here

Google AppEngine – pure app framework play

An application framework for you to develop your apps within. It provides a vastly parallel application and storage framework, excellent for developing large applications (i.e. Google’s bread & butter).

The disadvantage is that it’s a complete redevelopment of your applications into Google compatible code, services & frameworks. You are tied into Google services; you can’t (as I understand it) take your developed applications elsewhere without significant re-development/porting.

The Google AppEngine blog is here

Microsoft Cloud Services – hosted application stack & infrastructure play

An interesting offering, they will technically have the ability to host .net applications from a shared hosting service, as well as integrating future versions of their traditional and well established office/productivity applications into their cloud platform; almost offering the subscription based/Software+Services model they’ve been mooting for a long time.

Given Microsoft’s current market dominance, they are very well positioned to make this successful, as large shops will be able to modify existing internal .net services and applications to leverage portions of their cloud offering.

With the future developments of Hyper-V Microsoft will be well positioned to offer an infrastructure driven equivalent of VMWare’s VDC-OS proposition to service and support migration from existing dedicated Windows and Linux servers to an internal or externally hosted cloud type platform.

David Chou at Microsoft has a good post on Microsoft and clouds here

Amazon Web Services – established app framework with canned virtualization

The AWS platform provides a range of the same sort of functionality as Google AppEngine with SimpleDB, SQS and S3, but the recently announced ability to run Windows within their EC2 cloud, alongside the existing ability to pick & choose from Linux based virtual machine instances, makes for an interesting offering.

I believe EC2 makes heavy use of Xen under the hood; which I assume is how they are going to be delivering the Windows based services, EC2 also allows you to choose from a number of standard Linux virtual machine offerings (Amazon Machine Image, AMI).

This is an interesting offering, allowing you to develop your applications into their framework and possibly port or build your Linux/Windows application services into their managed EC2 service.

The same caveat applies though: your apps and virtual machines could be tied to the AWS framework, so you lose your portability without significant re-engineering. On the flip-side, they do seem to have the best defined commercial and support models and have been well established for a while with the S3 service.

Amazon’s AWS blog is available here


Microsoft & VMWare are best positioned to pick up business from the corporates, who will likely have a large existing investment in code and infrastructure but are looking to take advantage of reduced cost and complexity by hosting portions of their app/infrastructure with a service provider.

Microsoft & VMWare offerings easily lend themselves to this internal/external cloud architecture as you can build your own internal cloud using their off-the-shelf technology, something that isn’t possible with AWS or Google. This is likely to be the preferred model for most large businesses who need to retain ownership of data and certain systems for legal/compliance reasons.

Leveraging virtualization and commercial X2V or X2X conversion tools will make transition between internal and external clouds simple and quick. This gives organisations a lot of flexibility to operate their systems in the most cost/load-effective manner and to retain detailed control of the application/server infrastructure, while being freed from the day-to-day hardware/capacity management roles.

AWS/Google are ideal for Web 2.0, start-ups and the SME sector, where there is typically no existing or large code-base investment that would need to be leveraged. For a greenfield implementation these services offer low start-up cost and simple development tools to build applications that would be complicated & expensive to build if you had to worry about and develop supporting infrastructure without significant up-front capital backing.

AWS/Google are also great for people wanting to build applications that need to scale to lots of users, but without a deep understanding of the required underlying infrastructure. Whilst this is appealing to corporates, I think the cost of porting and data ownership/risk issues will be a blocker for a significant amount of time.

Google Apps are a good entry point for the SME/start-up sector, and could well draw people into building AppEngine services as the business grows in size and complexity, so we may see a drift towards this over time. Microsoft have a competing model and could leverage their established brand to win over customers if they can make the entry point free/cheap and cross-platform compatible; lots of those SMEs/start-ups are using Macs or netbooks, for example.

Virtualization – the key to delivering "cloud based architecture" NOW.


There is a lot of talk about delivering cloud or elastic computing platforms, a lot of CxO’s are taking this all in and nodding enthusiastically, they can see the benefits.. so make it happen!….yesterday.

Moving your services to the cloud isn’t always about giving your apps and data to Google, Amazon or Microsoft.

You can build your own cloud, and be choosy about what you give to others. Building your own cloud makes a lot of sense; it’s not always cheap, but it’s the kind of thing you can scale up (or down..) with a bit of up-front investment. In this article I’ll look at some of the practical, and more infrastructure focused, ways in which you can do so.


Your “cloud platform” is essentially an internal shared services system where you can actually and practically implement a “platform” team that operates and capacity plans for the cloud platform; they manage its availability, day-to-day maintenance and expansion/contraction.

You then have a number of “service/application” teams that subscribe to services provided by your cloud platform team… they are essentially developers/support teams that manage individual applications or services (for example payroll or SAP, web sites etc.), business units and stakeholders etc.

Using the technology we discuss here you can delegate control to them over most aspects of the service they maintain: full access to app servers etc. and an interface (human or automated) to raise issues with the platform team or log change requests.

I’ve seen many attempts to implement this in the physical/old world and it just ends in tears, as it builds a high level of expectation that the server/infrastructure team must be able to respond very quickly to the end-“customer”. The customer/supplier relationship is very different, regardless of what OLA/SLA you put in place.

However, the reality of traditional infrastructure is that the platform team can’t usually react as quickly as the service/application teams need/want/expect, because they need to have an engineer on-site, wait for an order and a delivery, a network provisioning order etc. (although banks do seem to have this down quite well, it’s still a delay, and time is money).

Virtualization and some of the technology we discuss here enable the platform team to keep one step ahead of the service/application teams by allowing them to do proper capacity planning and maintain a pragmatic headroom of capacity and make their lives easier by consolidating the physical estate they manage. This extra headroom capacity can be quickly back-filled when it’s taken up by adopting a modular hardware architecture to keep ahead of the next requirement.

Traditional infrastructure = OS/App Installations

  • 1 server per ‘workload’
  • Silo’d servers for support
  • Individually underused on average = overall wastage
  • No easy way to move workload about
  • Change = slow, person in DC, unplug, uninstall, move reinstall etc.
  • HP/Dell/Sun Rack Mount Servers
  • Cat 6 Cables, Racks and structured cabling

The ideal is to have an OS/app stack that can have workloads moved from host A to host B. This is a nice idea, but there are a whole heap of dependencies with the typical applications of today (IIS/Apache + scripts, RoR, SQL DB, custom .net applications). Most big/important line of business apps are monolithic, and today that makes this hard. Ever tried to move a SQL installation from OLD-SERVER-A to SHINY-NEW-SERVER-B? Exactly. *NIX is better at this, but not that much better: downtime or a complicated fail-over is still required.

This can all be done today, and virtualization is the key to doing it. It makes it easy to move a workload from A to B because we don’t care about the OS/hardware integration: we standardise/abstract/virtualize it, and that allows us to move it quickly. It’s just a file and a bunch of configuration information in a text file, with no obscure array controller firmware to extract data from or outdated NIC/video drivers to worry about.

Combine this with server (blade) hardware, modern VLAN/L3 switches with trunked connections and virtualised firewalls, and you have a very compelling solution that is not only quick to change, but makes more efficient use of the hardware you’ve purchased… so each kWh you consume brings more return, not less, as you expand.
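To put some numbers on the consolidation argument, here is a short Python sketch that packs a handful of underused “1 server per workload” boxes onto shared hosts while keeping some headroom free for the platform team. The workload names and figures are invented for illustration, and first-fit decreasing is just one simple packing rule I’ve picked; it is not how VMWare’s DRS or any other product actually places VMs.

```python
def consolidate(workloads, host_capacity, headroom_pct=20):
    """Pack workloads onto as few hosts as first-fit decreasing manages.

    workloads: dict of workload name -> demand (e.g. % of one host's CPU).
    host_capacity: capacity of one host in the same units.
    headroom_pct: share of each host deliberately left free.
    """
    usable = host_capacity * (100 - headroom_pct) // 100
    hosts = []  # each host is a list of workload names

    # Place the biggest workloads first, each on the first host with room.
    by_size = sorted(workloads.items(), key=lambda item: item[1], reverse=True)
    for name, demand in by_size:
        if demand > usable:
            raise ValueError(f"{name} does not fit on a single host")
        for host in hosts:
            if sum(workloads[w] for w in host) + demand <= usable:
                host.append(name)
                break
        else:
            hosts.append([name])
    return hosts


# Invented figures: five dedicated servers, each mostly idle.
servers = {"payroll": 30, "web": 25, "sap": 40, "mail": 15, "intranet": 10}
layout = consolidate(servers, host_capacity=100)
print(len(servers), "servers ->", len(layout), "hosts:", layout)
```

In this made-up case five lightly loaded servers fit on two hosts with 20% of each host still held back, which is the sort of headroom that lets the platform team stay one step ahead of the next request.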

Now, move this forward and change the hardware for something much more commodity/standardised

Requirement: Fast, scalable shared storage, flexible allocation of disk space and ability to de-duplicate data, reduce overhead etc., thin provisioning.

Solution: SAN Storage, EMC Clariion, HP-EVA, Sun StorageTek, iSCSI for lower requirements, or storage over single Ethernet fabric – NetApp/Equalogic

Requirement: Common chassis and server modules for quick, easy rip and replace and efficient power/cooling.

Solution: HP/Sun/Dell Blades

Requirement: quick change of network configurations, cross connects, increase & decrease bandwidth

Solution: Cisco switching, trunked interconnects, 10Gb/bonded 1GbE, VLAN isolation, quick change enabled as beyond initial installation there are fewer requirements to send an engineer to plug something in or move it, Checkpoint VSX firewalls to allow delegated firewall configurations or to allow multiple autonomous business units (or customers) to operate from a shared, high bandwidth platform.

Requirement: Ability to load balance and consolidate individual server workloads

Solution: VMWare Infrastructure 3 + management toolset (SCOM, Virtual Center, custom integrations specific to you using the API/SDK etc.)

Requirement: Delegated control of systems to allow autonomy to teams, but within a controlled/auditable framework

Solution: Normal OS/app security delegation, Active Directory, NIS etc. Virtual Center, Checkpoint VSX, custom change request workflow and automation systems which are plugged into platform API/SDK’s etc.

The following diagram is my reference architecture for how I see these cloud platforms hanging together.


As ever more services move into the “cloud” or the “mesh” then integrating them becomes simpler, you have less of a focus on the platform that runs it – and just build what you need to operate your business etc.

In future maybe you’ll be able to use the public cloud services like Amazon AWS to integrate with your own internal cloud, allowing you to retain the important internal company data but take advantage of external, utility computing as required, on demand etc.

I don’t think we’ll ever get to (or want to be) 100% in a public cloud, but this private/internal cloud allows an organisation to retain its own internal agility and data ownership.

I hope this post has demonstrated that whilst architecturally “cloud” computing sounds a bit out-there, you can practically implement it now by adopting this approach for the underlying infrastructure for your current application landscape.

New Microsoft Data Centre is Container Based


Article here, it’s coming people!

Some interesting discussions on how you can measure the productivity of a container and come up with some common metrics to compare and contrast and handle charge-back.

Microsoft Offering Hosted Exchange & Sharepoint


Interesting to note this post and the Register post here about a beta version of hosted Exchange and MOS (MS Office Sharepoint) offered by Microsoft itself.

Would assume this is one of the reasons they are building out vast new datacentres as they try to keep pace with Google’s range of online applications.

Working for a service provider, I’ve seen the technical challenges of offering multi-tenanted versions of these applications in the past (a show stopper for most service providers that need to offer an SLA), to the extent that MS won’t support them unless they have helped design and build them via their consulting arm.

I have to wonder if MS are adopting virtualization under the hood and some kind of on-demand provisioning to handle the isolation required or just piling them onto a shared AD/SQL/Exchange infrastructure. There are a huge number of questionably supportable “tweaks” required to achieve the latter.

Hopefully it’s better in the current 2007/8 round of products. Microsoft do support some of those products under VS2005r2 virtualization for end-customers, so it would be interesting to know if they do it in-house or are moving (or planning to move) to Hyper-V.