Virtualization, Cloud, Infrastructure and all that stuff in-between
My ramblings on the stuff that holds it all together
Category Archives: Virtualization
Installing Windows 7 from a USB Flash Drive and Multi-Boot from VHD
This is a useful tool I hadn't come across before – the Windows 7 USB/DVD Download Tool – it creates a bootable USB flash drive which you can use to install Windows 7.
Combine this with a boot-from-.VHD setup and you have a very flexible multi-boot solution. It also seems to work with Windows 2008 R2 if you need to install Hyper-V on your laptop; combine that with virtualized ESXi in VMware Workstation (or boot ESXi from USB) and you have an excellent hypervisor demo machine and general Windows laptop.
8 Node ESXi Cluster running 60 Virtual Machines – all Running from a Single 500GBP Physical Server
I am currently presenting a follow-up to my previous vTARDIS session for the London VMware Users Group where I demonstrated a 2-node ESX cluster on cheap PC-grade hardware (ML-115g5).
The goal of this build is to create a system you can use for VCP and VCDX type study without spending thousands on normal production-type hardware (see the slides at the end of this page for more info on why this is useful). Techhead and I have a series of joint postings in the pipeline about how to configure the environment and the best hardware to use.
As a bit of a tangent I have been seeing how complex an environment I can get out of a single server (which I have dubbed v.T.A.R.D.I.S: Nano Edition) using virtualized ESXi hosts. The goals were:
- Distributed vSwitch and/or Cisco Nexus 1000V
- Cluster with HA/DRS enabled
- Large number of virtual machines
- Single cheap server solution
- No External hardware networking (all internal v/dvSwitch traffic)
The main stumbling block I ran into with the previous build was the performance of the SATA hard disks I was using. SCSI was out of my budget, and SATA soon gets bogged down with concurrent requests, which makes it slow, so I started to investigate solid-state storage (previous posts here).
By keeping the virtual machine configurations light and using thin provisioning I hoped to squeeze a lot of virtual machines onto a single disk; previous findings suggest that cheaper consumer-grade SSDs can support a massive number of IOPS compared to SATA (Eric Sloof has a similar post on this here).
So, I voted with my credit card and purchased one of these from Amazon – it wasn't "cheap" at c.£200, but it lets me scale my environment further than I could previously manage, which means less power, cost, CO2 and all the other usual arguments you use to convince yourself that a gadget is REQUIRED.
So the configuration I ended up with is as follows:
1 x HP ML115 G5, 8GB RAM, 144GB SATA HDD – c.£300 (see here), but with more RAM
1 x 128GB Kingston 2.5" SSDNow V-Series SSD – c.£205
I installed ESX 4 Update 1 "classic" on the physical hardware, then installed 8 x ESXi 4 Update 1 instances as virtual machines inside that ESX installation.
This diagram shows the physical server’s network configuration
In order for the virtualized ESXi instances to talk to each other you need to update the security settings on the physical host's vSwitch only (setting Promiscuous Mode to Accept), as shown below:
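If you'd rather script that change than click through the VI Client, something along these lines should do it – a minimal pyVmomi sketch, with the host name and credentials as placeholders rather than values from the original lab:

```python
# Hedged sketch (pyVmomi): set Promiscuous Mode to Accept on the physical host's
# standard vSwitches so the virtualized ESXi uplinks can see each other's traffic.
# Host name and credentials are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="esx-physical.lab.local", user="root", pwd="password",
                  sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
# the lab has a single physical host, so just take the first HostSystem found
host = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.HostSystem], True).view[0]

net_sys = host.configManager.networkSystem
for vswitch in net_sys.networkInfo.vswitch:
    spec = vswitch.spec
    if spec.policy is None:
        spec.policy = vim.host.NetworkPolicy()
    if spec.policy.security is None:
        spec.policy.security = vim.host.NetworkPolicy.SecurityPolicy()
    spec.policy.security.allowPromiscuous = True   # the security setting in question
    net_sys.UpdateVirtualSwitch(vswitchName=vswitch.name, spec=spec)

Disconnect(si)
```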
This diagram shows the virtual network configuration within each virtualized ESXi VM, with the vSwitch and dvSwitch configs side-by-side.
I then built a Windows 2008 R2 virtual machine running vCenter 4 Update 1 and added all the virtualized ESXi hosts to it to manage.
I clustered all the virtual ESXi instances into a single DRS/HA cluster (turning off admission control, as we will be heavily oversubscribing the resources of the cluster and this is just a lab/PoC setup).
Cluster summary – 8 x virtualized ESXi instances. Note the heavy RAM oversubscription: this server only has 8GB of physical RAM, but the cluster thinks it has nearly 64GB (8 virtual hosts x 8GB each).
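For anyone scripting the build rather than clicking through it, a rough pyVmomi sketch of the cluster step might look like this – cluster name, host names and credentials are all made up, and this isn't the exact configuration used in the lab:

```python
# Hedged sketch (pyVmomi): create the DRS/HA cluster with HA admission control
# disabled (lab only!) and join one virtualized ESXi instance to it.
# Cluster name, host names and credentials are made up for illustration.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

si = SmartConnect(host="vcenter.lab.local", user="administrator", pwd="password",
                  sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
datacenter = content.rootFolder.childEntity[0]        # assumes a single datacenter

cluster_spec = vim.cluster.ConfigSpecEx(
    dasConfig=vim.cluster.DasConfigInfo(enabled=True,                    # HA on
                                        admissionControlEnabled=False),  # oversubscribe freely
    drsConfig=vim.cluster.DrsConfigInfo(enabled=True))                   # DRS on

cluster = datacenter.hostFolder.CreateClusterEx(name="vTARDIS-Nano", spec=cluster_spec)

# each of the eight virtualized ESXi instances is added the same way
connect_spec = vim.host.ConnectSpec(hostName="esxi-01.lab.local",
                                    userName="root", password="password", force=True)
WaitForTask(cluster.AddHost_Task(spec=connect_spec, asConnected=True))

Disconnect(si)
```

Admission control is only disabled here because, as noted above, the cluster is massively oversubscribed lab kit; you wouldn't do that in production.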
I then built an OpenFiler virtual machine and hooked it up to the internal vSwitch so that the virtualized ESXi VMs can access it via iSCSI. It has a virtual disk installed on the SSD, presenting a 30GB VMFS volume over iSCSI to the virtual cluster nodes (and all the iSCSI traffic is essentially in-memory, as there is no physical networking for it to traverse).
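The iSCSI hookup on each virtualized ESXi host can also be scripted; here's a hedged pyVmomi sketch, assuming a made-up OpenFiler address and host name:

```python
# Hedged sketch (pyVmomi): enable the software iSCSI initiator on a virtualized
# ESXi host, point it at the OpenFiler VM and rescan for the shared VMFS LUN.
# The OpenFiler address and host name are made-up placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="esxi-01.lab.local", user="root", pwd="password",
                  sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
host = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.HostSystem], True).view[0]

storage = host.configManager.storageSystem
storage.UpdateSoftwareInternetScsiEnabled(True)        # enable the software initiator

# find the software iSCSI HBA (typically something like vmhba33)
iscsi_hba = next(hba for hba in storage.storageDeviceInfo.hostBusAdapter
                 if isinstance(hba, vim.host.InternetScsiHba))

target = vim.host.InternetScsiHba.SendTarget(address="172.16.0.10", port=3260)
storage.AddInternetScsiSendTargets(iScsiHbaDevice=iscsi_hba.device, targets=[target])
storage.RescanAllHba()                                 # pick up the OpenFiler-backed LUN

Disconnect(si)
```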
Each virtualized ESXi node then runs a number of nested virtual machines (VMs running inside VMs).
In order to get nested virtual machines to work, you need to enable this setting on each virtualized ESXi host (the nested VMs themselves don't need any special configuration).
Once this was done and all my ESXi nodes were running and had settled down, I ran a script to build out a whole bunch of nested virtual machines on my 8-node cluster. The VMs aren't anything special – each has 512MB allocated to it and won't actually boot past the BIOS, because my goal here is just to simulate a large number of virtual machines and their configuration within vCenter rather than meet an actual workload. Remember this is a single-server configuration and you can't override the laws of physics: there is only really 8GB of RAM and 4 CPU cores available.
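The script itself isn't included in this post, but purely as an illustration, a bulk build-out like that could look something like this in pyVmomi – the datastore name and VM naming are assumptions, and the real script may have been quite different:

```python
# Hypothetical reconstruction (pyVmomi): register 60 tiny 512MB shells against the
# nested cluster. Datastore name and naming convention are assumptions; with no
# disk or OS attached, the VMs stop at the BIOS screen, as described above.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

si = SmartConnect(host="vcenter.lab.local", user="administrator", pwd="password",
                  sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
datacenter = content.rootFolder.childEntity[0]
cluster = datacenter.hostFolder.childEntity[0]     # the 8-node nested cluster
pool = cluster.resourcePool

for i in range(1, 61):
    spec = vim.vm.ConfigSpec(
        name="nested-vm-%02d" % i,
        memoryMB=512,                              # light configuration, as in the post
        numCPUs=1,
        guestId="otherGuest",
        files=vim.vm.FileInfo(vmPathName="[iSCSI-SSD-VMFS]"))  # the OpenFiler volume
    WaitForTask(datacenter.vmFolder.CreateVM_Task(config=spec, pool=pool))

Disconnect(si)
```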
Each of the virtual machines was connected to a dvSwitch for VM traffic – which you can see here in action (the dvUplink is actually a virtual NIC on the ESXi host).
I power up the virtual machines in batches of 10 to avoid swamping the host, but the SSD is holding up very well against the I/O
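A batched power-on like that is easy to script too – for example, a hedged pyVmomi sketch (the VM naming filter follows the illustrative script above, not anything from the original lab):

```python
# Hedged sketch (pyVmomi): power the nested VMs on ten at a time, waiting for each
# batch before starting the next, so the single host and its SSD aren't swamped.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

si = SmartConnect(host="vcenter.lab.local", user="administrator", pwd="password",
                  sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
vms = [vm for vm in content.viewManager.CreateContainerView(
           content.rootFolder, [vim.VirtualMachine], True).view
       if vm.name.startswith("nested-vm-")]

BATCH = 10
for start in range(0, len(vms), BATCH):
    tasks = [vm.PowerOnVM_Task() for vm in vms[start:start + BATCH]]
    for task in tasks:
        WaitForTask(task)      # block until this batch has powered on

Disconnect(si)
```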
With all 60 of the nested VMs and virtualized ESXi instances loaded these are the load stats
I left it to idle overnight and these are the performance charts for the physical host; the big spike @15:00 was the scripts running to deploy the 60 virtual machines
Disk Latency
Physical memory consumption – still a way to go to get it to 8GB – who says oversubscription has no use? 🙂
So, in conclusion – this shows that you can host a large number of virtual machines for a lab setup. It obviously isn't of much use in a production environment, because as soon as those 60 VMs actually start doing something they will consume real memory and CPU and you will run out of raw resources.
The key to making this usable is the solid state disk – in my previous experiments I found SATA disks just got soaked under load and caused things like access to the VMFS to fail (see this post for more details)
Whilst not a production solution, this sort of setup is ideal for VCP/VCDX study as it allows you to play with all the enterprise-level features like dvSwitch and DRS/HA that really need more than just a couple of hosts and VMs to understand how they really work. For example, you can power off one of the virtual ESXi nodes to simulate a host failure and invoke the HA response; similarly, you can disconnect the virtual NIC from the ESXi VM to simulate the host isolation response.
Whilst this post has focused on non-production/lab scenarios, it could be used to test VMware patch releases for production services if you are short on hardware, and you can quite happily run Update Manager in this solution.
If you run this lab at home it's also very power-efficient and quiet; there are no external cables or switches other than a cross-over cable to a laptop to run the VI Client and administer it. You could comfortably have it in your house without it bothering anyone – and with an SSD there is no hard disk noise under load either 🙂
Thin provisioning also makes good use of an SSD in this situation, as this screenshot from a 30GB virtual VMFS volume shows.
The only thing you won't be able to play around with seriously in this environment is the new VMware FT feature. It is possible to enable it using the information in this post and learn how to enable/disable it, but it won't remain stable, and the secondary VM will lose sync with the primary after a while, as FT doesn't seem to work very well on a nested VM. If you need to use FT, for now you'll need at least 2 physical servers (as shown in the original vTARDIS demo).
If you are wondering how noisy it is at power-up/down, TechHead has this video on YouTube showing the scary-sounding start-up noise but how quiet it gets once the fan control kicks in.
Having completed my VCP 4 and 3, I'm on the path to my VCDX; next up is the Enterprise exam, so this lab is going to be key to my study when the vSphere exams are released.
Using the VCE/vBlock concept to aid disaster relief in situations like the Haiti Earthquake
Seeing the tragic events of the last couple of days in Haiti played out on the news spurred me into evolving some thinking I had been working on. The sheer scale of infrastructure destruction left by the earthquake in Haiti is making it hard to get relief distributed via road, so airlifting and military assistance are the only realistic methods of getting help around.
Whilst providing physical, medical, food and engineering relief is of paramount importance during a crisis, communications networks are vital to co-ordinate efforts between agencies. It is likely that whatever civil communications infrastructure exists – cell towers, landlines etc. – has been badly impacted by the earthquake, so aid agencies rely on radio-based systems. However, as in the "business as usual" world, the Internet can act as a well-understood common medium for exchanging digital information and services – if you can get access.
Crisis Camp is a very interesting and noble concept for gathering technically minded volunteers around the world to collaborate on producing useful tools for relief staff on the ground: missing-people databases, geo-mapping mashups on Google Earth and so on. Using open-source tools and donated time makes this a free/low-cost software solution for relief agencies.
However, with the scale of infrastructure destruction in large disasters, getting access to shared networks, bandwidth and cellular communications on the ground is likely to be difficult. In this post I propose a vendor-neutral solution: whilst I reference the VCE/vBlock concept, which is essentially an EMC/Cisco/VMware product line, the idea of a packaged, pre-built and quick-to-deploy infrastructure solution can apply equally to a single- or multi-vendor "infrastructure care package" – standardisation and/or abstraction are the key to making it flexible (sound familiar to your day job?) by using virtual machines as the building blocks of useful services able to run on any donated/purchased/loaned hardware.
These care packages would typically be required for 2-3 months, to aid disaster relief during the worst periods and whilst civil infrastructure is re-established. None of this stuff is free in the normal world – it's a physical product, it's tin, cables, margin and invoices – but it is flexible enough that it could be redeployed again and again as needs dictate. With my UN or DEC hat on, this is a pool of shared equipment that can be sent around the world and deployed in 24 hours to aid on-the-ground relief efforts, donated or loaned by vendors or sponsors.
What is it?
A bunch of low-power footprint commodity servers, storage and communications gear packed into a single, specialised shock-rack with a generator (gas/diesel/solar as available) and battery backup.
It makes heavy use of virtualization technologies to provide high-availability of data and services to work around individual equipment and/or rack failures due to damage or loss of power (generator out of fuel or localized aftershock etc.)
Because systems supporting relief operations will typically only be required for short-term use, virtual appliances are an ideal platform – for example a pre-configured database cluster or web server farm; technologies like SpringSource can be used to deploy and bootstrap web applications around the infrastructure as virtual appliances.
Data storage and replication is achieved not with expensive hardware-array-based solutions but with DAS storage within the blades (or shared disk stores) using virtual storage appliances like the HP LeftHand Networks VSA, the Celerra VSA or OpenFiler – allowing the use of cheap, commodity storage while achieving block-level replication between multiple storage locations via software. Each blade uses storage within the same rack; if access to that storage fails, the workload can be restarted on an alternative blade or an alternative rack (like the HA feature of vSphere).
These racks are deployed across a wide geographic area, creating a meshed wireless network using something like WiMax to handle inter-mesh and backhaul transit and local femtocell/WiFi technology, providing three services:
- private communications – for inter rack replication and data backhaul
- public data communications – wireless IP based internet access with a local proxy server/cache (backhaul via satellite or whatever is available – distributed across the mesh)
- local access to a public cellular system femtocell (GSM, or whatever the local standard is)
The availability/load-balancing features of modern hypervisors, like VMware's HA/DRS and FT technology, can restart virtual machines on an alternative rack should one fail. Because the VSA technology replicates datastores between all racks at a block level using a p2p-type protocol, it's always possible to restart a virtual appliance elsewhere within the infrastructure – but on a much wider scale and with a real impact.
Ok, but what does it do?
Even if you were to establish a meshed communications network to assist with disaster relief activities on the ground, bandwidth and back-haul to the Internet or global public telecoms systems will be at a premium, chances are any high-bandwidth civil infrastructure will be damaged or degraded and satellite technology is expensive and can have limited bandwidth and high-latency.
The mesh system this solution could provide can give a layer of local caching and data storage. Thinking particularly of the Google Maps-type mashups people at Crisis Camp are discussing to help co-ordinate relief efforts, which can require transferring a large amount of data – if you could hold a local cache of all the mapping information within the mesh, transfer times would be drastically reduced.
This is really just a bunch of my thoughts on how you could take current hypervisor technology and build a p2p-type private cloud infrastructure in a hurry. Virtualization brings a powerful opportunity in that it can support a large number of services in a small power footprint: the more services that can be moved from dedicated hardware into a virtual machine (for example a VoIP call manager, video conferencing system or GSM base station manager), the less demand there is for scarce fuel and power resources on the ground. Virtualization also brings portability – less dependence on a dedicated "black box" that is hard to replace in the field; you can use commodity x86 hardware and have enough spares to keep things working or work around failures.
The technology to build this type of emergency service is available today with some tweaking. The key is having it in place and ready to ship on a plane to wherever it is needed in the world; some more developed nations have this sort of service in-country for things like emergency cellular networks following hurricanes, but it will need a lot of international co-operation to make this a reality on a global scale.
Whilst I’m not aware of any current projects by international relief agencies to build this sort of system I’d like to draw people’s attention to the possibilities.
The DEC are accepting donations for the Haiti earthquake relief fund at the following address,
or via the international Red Cross appeal here.
Is your MS Application Supported under VMware, Hyper-V, Xen? – the DEFINITIVE Statement from Microsoft
A colleague has just made me aware of a new tool on the Microsoft website: a wizard that can tell you whether specific Microsoft app/OS/architecture combinations are supported under the SVVP (Server Virtualization Validation Programme) – I previously wrote about the SVVP here, which promised to resolve many of the pains we were experiencing.
The output from the SVVP programme has been compiled into a great web-based wizard that saves all the previous legwork of reading several (sometimes conflicting) whitepapers – here you get it straight from the horse's mouth (so to speak).
You can access the Wizard via this Link
http://www.windowsservercatalog.com/svvp.aspx?svvppage=svvpwizard.htm
The wizard lists all Microsoft products
The list of hypervisor platforms supported is shown below, and you can choose the OS version (Windows 2000 and later) and the CPU architecture (x86, x64 etc.)
And, finally the most important part – a definitive statement on support for this combination
Excellent work Microsoft – come on other vendors (Oracle, Sun this means you…)
Applying “Agile” to Infrastructure…? Virtualization is Your Friend
I have been looking at this for a while. In the traditional model of delivering an IT solution there is an extended phase of analysis and design, which leads through to build and hand-over stages; there are various formalised methodologies for this, and in general all rely on having good upfront requirements to deliver a successful project against. In infrastructure terms this means you need to know exactly what is going to be built (typically a software product) before you can design and implement the required infrastructure to support it.
This has always been a point of contention between customers, development teams and infrastructure teams because it’s hard to produce meaningful sizing data without a lot of up-front work and prototyping unless you really are building something which is easily repeatable (in which case is using a SaaS provider a more appropriate model?)
In any case, the extended period these steps require on larger projects often doesn't keep pace with the rate of technical and organisational change that is typical in modern business. The end result: the tech teams are looking after an infrastructure that was designed to outdated or, at worst, made-up requirements; the developers are having to retro-fit changes to the code to support changing requirements; and the customer has something which is expensive to manage, wonders why they aren't using the latest whizzy technology that is more cost-effective, and is looking at a refresh early in its life-cycle – which means more money thrown at the solution.
With the growing popularity of Agile-type methodologies to solve these sorts of issues for software projects, infrastructure teams are facing a much harder time. Even if they are integrated into the Agile process – which they should be (the attitude should be that you can't deliver a service without infrastructure and vice-versa) – they struggle to keep up with the rate of change because of the physical and operational constraints they work within.
Other than some basic training and some hands-on experience I'm definitely not an Agile expert – but to me "Agile" means starting from an overall vision of what needs to be delivered, then iteratively breaking the solution into bite-sized chunks and tackling them in small parts, delivering small incremental pieces of functionality through a series of "sprints" – for example delivering the basic UI and customer details screen for an order entry application, letting people use it in production, then layering on further functionality through further sprints and releases. A key part of this process is reviewing work done and feeding that experience back into the subsequent sprints and the overall project.
Typically in Agile you would try to tackle the hardest parts of a solution from day one – these are the parts that make or break a project – if you can’t solve it in the 1st or 2nd iteration maybe it actually is impossible and you have a more informed decision on if the project actually is feasible, or at a minimum you take further the learning and practical experience of trying to solve the problem and what does/doesn’t work and are able to produce better estimates.
This has another very important benefit; end-user involvement – the real user feedback means it’s easier to get their buy-in to the solution and the feedback they give from using something tangible day to day rather than a bunch of upfront UI workflow diagrams or a finally delivered solution is invaluable – you get it BEFORE it’s too late (or too expensive) to change it; fail early (cheaply) rather than at the end (costly).
For me, this is how Google have released their various "beta" products like Gmail over the last few years. I don't know if they used "Agile" methodologies, but they set the expectation that it's still a work in progress, it's "good enough" and "safe", and you (the user) have a feedback channel to get something changed to how you think it should be.
Imagine if Google had spent the 2 years doing an upfront design and build project for gMail only for it to become unpopular because it only supported a single font in an email because they hadn’t captured that in their upfront requirements – something that for argument’s sake could be implemented in weeks during a sprint but would take months to implement post-release as it meant re-architecting all the dependent modules that were developed later on.
In application development terms this is fine – this Agile thing is just a continual release/review cycle and just means deploying application code to a bunch of servers – but how does that map to the underlying infrastructure platform, where you need to provide and run something more tangible and physical? Every incremental piece of functionality may need more server roles or more capacity to service the load this functionality places on databases, web servers, firewalls etc.
With physical hardware, implementing this sort of change means physical intervention – people in data centres, cabling, server builds, lead times, purchase orders, deliveries, racking etc. every time there is a release. With typical sprints being 2-4 week iterations, traditional physical infrastructure quite often can't keep up with the rate of change, or at a basic level can't do so in a managed-risk fashion with planned changes.
What if the development sprint radically changes the amount of storage required by a host, needs a totally different firewall and network topology, or needs more CPU or RAM resource than you can physically support in the current hardware?
What if the release has an unexpected and undesirable effect on the platform as a whole – for example a service places a heavy load on a CPU because of some inefficient coding that had not shown up through testing phases and is not trivial to patch? You have two choices: roll back the change, or scale the production hardware to work around it until it can be resolved in a subsequent release.
Both of these examples mean you may need servers to be upgraded or replaced, which all adds up to increased time to deliver – in this case the infrastructure becomes a roadblock, not a facility.
Add to this the complication of doing all this "online", as the system this functionality is being delivered to is in production with real, live users – that makes things difficult to do with low risk or no downtime.
The traditional approach to this lack of accurate requirements and uncertainty has been to over-specify the infrastructure from day one and build in a lot of headroom and redundancy to deal with on-line maintenance, however with traditional infrastructure you can’t easily and quickly move services (web services, applications, code) and capacity (compute, storage, network) from one host to another without downtime, engineering time, risk etc.
Enter virtualization.
Rather than making developers or customers specify a raft of non-functional requirements before any detailed design work has started, what if you could start with some hardware (compute, network, storage) that you can scale out in an incremental and horizontal manner?
If you abstract the underlying hardware from the server instance through virtualization it suddenly becomes much more agile – cloud-like, even.
You can start small, with a moderate investment in platform infrastructure and scale it out as the incremental releases require more, maintain a pragmatic headroom within the infrastructure capacity and you can easily react straight away as long as you are diligent at back-filling that capacity to maintain the headroom.
With virtualization, and particularly at the moment with vMotion, DRS and Live Migration-type technologies, you have an infrastructure that is capable of horizontal scaling far beyond anything you could achieve with physical platforms – even with the most advanced automated bare-metal server and application provisioning platforms.
Virtualization also has a place in scaling individual hosts that need more CPU, compute etc.: even if you need to upgrade the underlying physical hardware to support more CPU cores, virtualization allows you to do most of this online by moving server instances to and from the upgraded hardware.
VMware vSphere, for example, supports up to 8 virtual CPUs and 256GB RAM presented to an individual virtual machine. You can add new higher-capacity servers to a VMware ESX/vSphere cluster and then present these increased resources to the virtual machine, sometimes without downtime to the server instance – this seamless upgrade technology will improve as modern operating systems become better adapted to virtualization. In any case, vMotion allows you to move server instances around online to support such maintenance of the underlying infrastructure platform in a way that was never possible before virtualization.
This approach allows you to right-size your infrastructure solution based on real-world usage: you are running the service in production with some flex/headroom capacity, not only to deal with spikes and satisfy immediate demands but also with a view to capacity planning for the future – backed up with real statistics.
Maybe on day one you don't even need to purchase any hardware or infrastructure to build your first couple of platform iterations – you could take advantage of a number of cloud solutions like EC2 and VMware vCloud to rent capacity to support the initial stages of your product development.
This avoids any upfront investment whilst you are still establishing the real feasibility of the project and outsources the infrastructure pain to someone else for the initial phases. Once you are sure your project is going to succeed (or at least you have identified the major technical roadblocks and have a plan), you can design and specify a dedicated platform based on real-world usage rather than best guesses – the abstraction that virtualization offers makes it much easier to do this kind of transition once you have a dedicated platform in place, or even to move to another service provider.
To solve the release/risk complexity virtualization allows you to snapshot and rollback entire software and infrastructure platform stacks in their entirety – something that is almost impossible in the physical world – you can also clone your production system off to an isolated network for staging/destructive type testing or even disaster recovery.
Hopefully this has given you some food for thought on how Agile can apply to infrastructure and where virtualization can help you out – I only ever see the Agile topic discussed in relation to software development, yet virtualization can help your infrastructure work with Agile methodologies. However, it's important to remember that neither Agile methodologies nor virtualization are a panacea – they are not the cure for all ills and you will need to carefully evaluate your own needs; they are both valuable tools in the architect's toolbox.
Using Virtualization to Extend The Hardware Lifecycle
In harder economic times getting real money to spend on server refreshes is difficult. There are the arguments that new kit is more power-efficient and supports higher VM/CPU-core densities, but the reality is that even if you can show a cost saving over time, most current project budgets are at best frozen until the economic uncertainty passes, at worst eliminated.
Although power costs have become increasingly visible because they've risen so much over the last 18 months, this is still a hidden cost to many organisations – particularly if you run servers in your offices where a facilities team picks up the bill – so the overall energy savings from virtualization and hardware refresh don't always get through.
So, I propose some alternative thinking to ride out the recession and make the kit you have and can’t get budget to replace last longer, as well as delivering a basic disaster recovery or test & development platform (business value) in the meantime.
Breaking the Cycle
In the traditional Wintel world, server, OS, app and configuration are all tightly integrated. It's hard to move a Windows install from an HP server to a cheaper Dell server, for example, without reinstalling or at least some in-depth registry surgery – you can use PlateSpin products to do a P2P conversion but they come at a cost (see point above).
Let's take an example: you have a Microsoft Windows 2003 server loaded with BizTalk Server and a bunch of custom orchestrations running on an HP DL380 G2. If the motherboard on that server were to die, could you get a replacement quickly, or at all? Do you have to carry the cost of a Care Pack on that server? And because it's gone "end of life", what is the SLA around replacement hardware that is becoming increasingly scarce as supplier stocks are used up?
If you can't get hold of replacement hardware in time, what about restoring it to an alternative server that you do have spare – for example a Dell PowerEdge? That type of bare-metal recovery is still not a simple task due to the drivers/OS-level components required, and it is laden with risks and dependent on 3rd-party backup software which you needed to have in place.
Are your backup/recovery procedures good, tested last week…? Yes, they should be, but are they? Will the new array controller drivers or old firmware cause problems with your AV software or management agents, for example?
Virtualization makes this simpler – the hypervisor layer abstracts the complicated bit that you care about (OS/App configuration “workload”) from the underlying hardware – which is essentially a commodity these days, it’s just a “server”.
So, if you virtualize your workload and the underlying hardware dies (for example that old HP DL380g2) restarting that workload on an alternative piece of hardware like the Dell is very simple – no complicated drivers or OS reinstallation, just start it up and go. If you have shared storage then this is even simpler, you might even have had a chance to proactively move workloads away from a failing server using vMotion.
Even if you only run 1 VM per piece of physical hardware to maintain almost equivalent performance – because you can't purchase a new, more powerful host (VMware call this containment) – you've broken the hardware/OS ties and have made replacement easier as and when you are able to do so. VMware provide the VMware Converter tool, which is free/cheap; version 4 does almost everything you could ever want in a P2V tool to achieve this virtualization goal, and if not, PlateSpin PowerConvert is cheap for a one-hit conversion.
So, this leads to my point – this approach can effectively extend the life of your server hardware. If it's gone out of official vendor support, do you care as much? Because the hypervisor has broken the tight workload/hardware integration, you are less tied to a continual refresh cycle of hardware as it goes in and out of vendor support – you can almost treat it as disposable: when it dies or has problems, throw it away or cannibalise it for spare parts to keep other similar servers going – it's just "capacity".
Shiny New or 2nd Hand?
Another angle on this is that businesses almost always buy new hardware, direct from a reseller or manufacturer – traditionally because it’s best-practice and you are less likely to have problems with new kit. The reality is that with virtualization; server hardware is actually pretty flexible, serviceable and as I hope I’ve demonstrated here, disposable.
For example, look on eBay there are hundreds of recent 2nd hand servers and storage arrays on the open market, maybe that’s really something to do with the numbers of companies currently going into administration (hmm).
What's to stop your department or project from buying some 2nd-hand or liquidated servers? You'll probably pay a tiny fraction of the "new" price, and as I hope I've shown here, if one dies you will probably have saved enough money overall to replace it, or to buy some spares up-front to deal with any failures in a controlled way.
This type of commoditisation is where Google really have things sorted – this is exactly the same approach they have taken to their infrastructure, and virtualization is what gets you there now.
Recycle for DR/Dev/Test
Alternatively, if you can show a cost saving through a production kit refresh and are lucky enough to get some budget to buy servers you can recycle the older kit and use ESXi to setup a lab or very basic DR facility.
Cannibalise the de-commissioned servers to build fewer, loaded-up hosts that can run restored copies of virtual machines in the event of a DR situation – your organization has already purchased this equipment, so this is a good way to show your management how you are extending the life-cycle of previous hardware "investments", greater RoI etc. Heck, I'm sure you could get a "green message" out of that as well 🙂
If you are able to do so, you can run this in parallel at an alternative site to the refreshed production system and act as a DR site – virtualization makes the “workloads” entirely portable across sites, servers and storage.
Summary
I do realise that this post is somewhat of a simplification and ignores the power/hosting costs and the new functionality of new hardware, but the reality is that this is still often a sunk/invisible cost to many small/medium businesses.
There is still a wide perception that purchased hardware is an investment by a business, rather than the commodity that the IT community regards it as.
An analogy I often use is with company cars/vans: they are well established as depreciating, disposable assets to a business, and more often than not are leased and regularly replaced for that very reason. If you can't get management to buy into this mindset for IT hardware, virtualization is your only sane solution.
In summary, you can show the powers that be that you can make servers last longer by virtualizing and cannibalising them – something that was a lot harder to do before virtualization came along, as it all meant downtime, hands-on work and risk; now it's just configuration and change.
Microsoft Virtualization User Group Meeting (UK)
I’ll be attending this user group event this evening in London; if you’re local and interested then I believe it’s never too late to register.
If you’re not local then you can view the webcast (details below) online
It looks to be some interesting content, and it's always good to speak to customers who have done it in real life; the Microsoft Virtualisation User Group UK site is here.
Next In-Person Meeting
Microsoft Virtualisation User Group – January 2009 Meeting
Location:
Microsoft London (Cardinal Place)
http://download.microsoft.com/documents/uk/about/downloads/victoria_map.pdf
Date & Time:
Thursday 29th January 2009
18:00 – 21:30
Agenda:
18:00 – 18:15
Arrivals
18:15 – 18:45
Simon Cleland (Unisys) & Colin Power (Slough Borough Council)
Case study: Hyper-V RDP deployment at Slough Borough Council
18:45 – 19:30
Aaron Parker (TFL)
Application virtualisation – what is App-V?
Benefits of App-V & a look inside an enterprise implementation
19:30 – 20:00
Food
20:00 – 21:15
Justin Zarb (Microsoft)
Application virtualisation – in-depth look at App-V architecture
21:15 – 21:30
Q/A and wrap up
Registrations:
Register at the forums for this event here
Or email meeting@mvug.co.uk
Live Meeting:
Click Here
No need for a meeting ID
Room opens at 5.30pm – meeting at 6.30pm
Windows Azure under the hood
There is an excellent video interview with Manuvir Das from the Azure team on the MSDN Channel 9 site here.
The interview is quite long, but I’ve tried to summarise it for infrastructure people/architects like me as follows;
Azure is an overall "OS" for the cloud, akin to VMWare and their VDC-OS initiative but with a much richer layer of re-usable services and application frameworks.
In terms of the overall architecture diagram (below), Azure is sort of the "kernel for the cloud" ("Xbox for the cloud?"): buy it in increments and (ab)use it – don't worry about building the individual infrastructure components; you get all the tools in the box and the underlying infrastructure is abstracted so you don't have to worry about it.
The services layer Microsoft provide on top of Azure is as follows:
Live Services / Mesh (high-level user/data sync) – will run as an app on Azure; it does some of this now and will be migrated to run on Azure over time
.NET Services ("Zurich") – high-level services to enable rich scenarios like authentication and federation: Live ID, OpenID, Active Directory Federation Services etc.
SQL – premium Database services in the cloud offering data warehousing, and I would assume massive scalability options – but I’m not sure how this would be implemented.
SharePoint/Dynamics are, I understand, coming soon and would offer the same sort of functionality in the cloud.
It's based around a modified Windows with Dave Cutler's involvement (no specifics offered yet). Virtualized server instances are the base building blocks, with an allocated and guaranteed amount of resource (1 x 1.9GHz CPU, 2GB RAM, 160GB disk) which is dedicated to your machine and not contended – which would mean MS are doing no over-subscription under the hood? That seems unlikely, and maybe wasteful, to me; DRS anyone?
Dell have provided the underlying physical hardware hosted in Microsoft’s data centres with a customised server model, as noted here – and you can see a video tour inside one of the hosting data centres here from BBC news
There is an overall Fabric Controller which is essentially a resource manager, it continually monitors hosts, VMs, storage via agents and deploys/allocates/moves .net code packages around hosts.
To deploy your service to the Azure cloud:
You build your application as a code package (.NET, others coming later)
You build a service model, which describes the number and type of hosts, dependencies etc.
The Azure storage layer is a distributed, flat, table-based storage system with a distributed lock manager, and it keeps 3 copies of data for availability – it's not SQL-based (interesting), uses a REST API and is more akin to a file system, so it sounds like it's been written from the ground up.
Interestingly, it seems that the storage layer is deployed as a service on Azure itself and is controlled by the Fabric Controller; parts of the current Live Mesh services are using it in production now.
Interestingly, Manuvir describes your service as containing routers and load balancers as well as traditional services, so it sounds like they may have either built a complex provisioning framework for physical devices or implemented virtualized versions of such devices (Cisco Nexus-type devices implemented as VMs, maybe?).
Azure can maintain staging and production platforms within the cloud, you can swap between production/stage etc. with an API command that re-points DNS.
There is a concept of an upgrade domain, where VMs are taken out of service for updates/deployments etc. – your service description, I assume, describes the key dependencies and Azure works out the least-impact sequence?
There is no automatic parallelism: you can't just issue a job and have it execute in a distributed fashion using all the Azure resources without it being designed/built as such – which I think Amazon offer (but I may be wrong, as that does sound like something very complicated to do).
Azure's strategy for scale-out is the traditional MS one: make the most of the individual resource allocation for your VMs (see above), then scale out multiple independent instances with a shared-nothing architecture.
Azure is a programmable API, it’s not an end-user product, it’s a platform for developers to build services on.
There is no absolute requirement for ASP.NET – they will provide PHP/RoR/Python facilities over time; .NET and Visual Studio integration comes out of the box, but you can use other developer tools too.
A “Developer fabric” is available – it can run on a desktop, it mocks up the whole Azure platform on your desktop and behaves the same way so developers can understand how it works and debug applications on their desktops before pushing out to the cloud – this is an important shiny for Microsoft, as it’s a simple and quick way to get developers hands-on with understanding how to use Azure.
The cool part is that you can export your service model and code packages directly to Azure from your developer tool, akin to a compile-and-publish option for the cloud. It's part of the SDK, which can be downloaded here.
You can debug service copies locally using the SDK and developer fabric; there is no debugging in the cloud {yet}, but it provides an API to get logs and they are working on an end-to-end transaction-tracing API.
Microsoft have made references to making Azure available on-premise as well as in Microsoft's own data centres, in the same way that VMWare have with the VDC-OS stuff… but I would think that's going to need some more detail on what the Azure OS actually is to understand how that would be feasible.
As I concluded in an earlier blog post here, Microsoft could be poised to clean up here if they execute quickly and well – they have the most comprehensive offering for the corporate space due to having a very rich applications/services layer that is directly aligned to the desktop & application technology choices of the bigger customers (.net), they just need to solve the trust in the cloud issue first; and the on-premise piece of the puzzle is key to this… Maybe a server version of Windows 7 or MiniWin or Singularity is the enabler for this?
Cloud Wars: VMWare vs Microsoft vs Google vs Amazon Clouds
A short time ago in a data centre, far far away…..
All the big players are setting out their cloud pitches, Microsoft are set to make some big announcements at their Professional Developer Conference at the end of October and VMWare made their VDC-OS announcements at VMWorld a couple of weeks ago, Google have had their App Engine in beta for a while and Amazon AWS is pretty well established.
With this post I hope to give a quick overview of each, I’ll freely admit I’m more knowledgeable on the VMWare/Microsoft offerings… and I stand to be corrected on any assumptions I’ve made on Google/AWS based on my web reading.
So, What’s the difference between them…?
VMWare vCloud – infrastructure led play
VMWare come from the infrastructure space, to-date they have dominated the x86 virtualization market, they have some key strategic partnerships with storage and network vendors to deliver integrated solutions.
The VMWare VDC-OS pitch is about providing a flexible underlying architecture through server, network and storage virtualisation. Why? Because making everything 'virtual' makes for quick reconfiguration – reallocating resource from one service to another is a configuration/allocation change rather than requiring an engineer visit (see my other post on this for more info).
Because VMWare's pitch is infrastructure-led it has a significant practical advantage in that it's essentially technology-agnostic (as long as it's x86-based): you, or a service provider, have the ability to build and maintain an automated birth-to-death bare 'virtual metal' provisioning and lifecycle system for application servers/services, as there is no longer a tight dependency for everything on physical hardware, cabling etc.
There is no one size fits all product in this space so a bespoke solution based around a standard framework tool like Tivoli, SMS, etc. is typically required depending on organisational/service requirements.
No re-development is necessarily required to move your applications into a vCloud (hosted or internal): you just move your VMWare virtual machines to a different underlying VDC-OS infrastructure, or you use P2V/X2V tools like PlateSpin to migrate to a VDC-OS infrastructure.
In terms of limitations – apps can’t necessarily scale horizontally (yet) as they are constrained by their traditional server based roots. The ability to add a 2nd node doesn’t necessarily make your app scale – there are all kinds of issues around state, concurrency etc. that the application framework needs to manage.
VMWare are building frameworks to build scale-out provisioning tools – but this would only work for certain types of applications and is currently reactive unless you build some intelligence into the provisioning system.
Scott Lowe has a good round-up of VDC-OS information here & VMWare’s official page is online here
Google AppEngine – pure app framework play
An application framework for you to develop your apps within – it provides a vastly parallel application and storage framework – excellent for developing large applications (i.e Google’s bread & butter)
The disadvantage is that it's a complete redevelopment of your applications into Google-compatible code, services and frameworks. You are tied into Google services – you can't (as I understand it) take your developed applications elsewhere without significant re-development/porting.
The Google AppEngine blog is here
Microsoft Cloud Services – hosted application stack & infrastructure play
An interesting offering, they will technically have the ability to host .net applications from a shared hosting service, as well as integrating future versions of their traditional and well established office/productivity applications into their cloud platform; almost offering the subscription based/Software+Services model they’ve been mooting for a long time.
Given Microsoft's current market dominance, they are very well positioned to make this successful, as large shops will be able to modify existing internal .net services and applications to leverage portions of their cloud offering.
With the future developments of Hyper-V Microsoft will be well positioned to offer an infrastructure driven equivalent of VMWare’s VDC-OS proposition to service and support migration from existing dedicated Windows and Linux servers to an internal or externally hosted cloud type platform.
David Chou at Microsoft has a good post on Microsoft and clouds here
Amazon Web Services – established app framework with canned virtualization
The AWS platform provides a range of the same sort of functionality as Google AppEngine with SimpleDB, SQS and S3, but the recently announced ability to run Windows within their EC2 cloud, combined with the existing ability to pick & choose from Linux-based virtual machine instances, makes for an interesting offering.
I believe EC2 makes heavy use of Xen under the hood, which I assume is how they are going to deliver the Windows-based services; EC2 also allows you to choose from a number of standard Linux virtual machine offerings (Amazon Machine Images, AMIs).
This is an interesting offering, allowing you to develop your applications into their framework and possibly port or build your Linux/Windows application services into their managed EC2 service.
The same caveat applies though: your apps and virtual machines could be tied to the AWS framework, so you lose your portability without significant re-engineering. On the flip-side, they do seem to have the best-defined commercial and support models and have been well established for a while with the S3 service.
Amazon’s AWS blog is available here
Conclusion
Microsoft & VMWare are best positioned to pick up business from the corporates, who will likely have a large existing investment in code and infrastructure but are looking to take advantage of reduced cost and complexity by hosting portions of their app/infrastructure with a service provider.
Microsoft & VMWare offerings easily lend themselves to this internal/external cloud architecture as you can build your own internal cloud using their off-the-shelf technology, something that isn’t possible with AWS or Google. This is likely to be the preferred model for most large businesses who need to retain ownership of data and certain systems for legal/compliance reasons.
Leveraging virtualization and commercial X2V or X2X conversion tools will make the transition between internal and external clouds simple and quick – which gives organisations a lot of flexibility to operate their systems in the most cost/load-effective manner, as well as to retain detailed control of the application/server infrastructure while being freed from the day-to-day hardware/capacity management roles.
AWS/Google are ideal for Web 2.0, start-ups and the SME sector, where there is typically no existing or large code-base investment that would need to be leveraged. For a greenfield implementation these services offer low start-up cost and simple development tools to build applications that would be complicated & expensive to build if you had to worry about and develop supporting infrastructure without significant up-front capital backing.
AWS/Google are also great for people wanting to build applications that need to scale to lots of users, but without a deep understanding of the required underlying infrastructure; whilst this is appealing to corporates, I think the cost of porting and the data ownership/risk issues will be a blocker for a significant amount of time.
Google Apps are a good entry point for the SME/start-up sector, and could well draw people into building AppEngine services as the business grows in size and complexity, so we may see a drift towards this over time. Microsoft have a competing model and could leverage their established brand to win over customers if they can make the entry point free/cheap and cross-platform compatible – lots of those SMEs/start-ups are using Macs or netbooks, for example.
Free EMC Celerra for your Home/Lab
Virtualgeek has an interesting post here about a freely downloadable VM version of their Celerra product, including an HA version. This is an excellent idea for testing and lab setups, and a powerful tool in your VM Lab arsenal alongside other offerings like Xtravirt Virtual SAN and OpenFiler.
I've been saying for a while that companies that make embedded h/w devices and appliances should try to offer versions of the software running their devices as VMs so people can get them into lab/test environments quickly. Most tech folk would rather download and play with something now, rather than have to book and take delivery of an eval with sales drones (apologies to any readers who work in sales), pre-sales professional services, evaluation criteria etc. If your product is good it's going to get recommended – no smoke and mirrors required.
As such VM appliances are an excellent pre-sales/eval tool, rather than stopping people buying products. Heck, they could even licence the VM versions directly for production use (as Zeus do with their ZXTM products); this is a very flexible approach and something that is important if you get into clouds as an internal or external service provider – the more you standardise on commodity hardware with a clever software layer the more you can recycle, reuse and redeploy without being tied into specific vendor hardware etc.
Most "appliances" in use today are actually low-end PC motherboards with some clever software in a sealed box. For example, I really like the Juniper SA range of SSL VPN appliances; I recently helped out with a problem on one which was caused by a failed HDD – if you hook up the console interface it's a commodity PC motherboard in a sealed case running a proprietary secure OS. As it's all Intel-based, there's no reason it couldn't also run as a VM (SSL accelerator h/w can be turned off in the software, so there can't be any hard dependency on an SSL accelerator card inside the sealed box) – adopting VMs for these appliances provides the same (maybe even better) level of standard {virtual} hardware that appliance vendors need to make their devices reliable/serviceable.
Another example: the firmware that is embedded in the HP Virtual Connect modules I wrote about a while back runs under VMware Workstation – HP have an internal-use version for engineers to do development and testing against; sadly they won't redistribute it, as far as I am aware.