Virtualization, Cloud, Infrastructure and all that stuff in-between

My ramblings on the stuff that holds it all together

Category Archives: Virtualization

Virtualization – the key to delivering "cloud based architecture" NOW.

 

There is a lot of talk about delivering cloud or elastic computing platforms, a lot of CxO’s are taking this all in and nodding enthusiastically, they can see the benefits.. so make it happen!….yesterday.

Moving your services to the cloud, isn’t always about giving your apps and data to Google, Amazon or Microsoft.

You can build your own cloud, and be choosy about what you give to others. building your own cloud makes a lot of sense, it’s not always cheap but its the kind of thing you can scale up (or down..) with a bit of up-front investment, in this article I’ll look at some of the practical; and more infrastructure focused ways in which you can do so.

image

Your “cloud platform” is essentially an internal shared services system where you can actually and practically implement a “platform” team that operates and capacity plans for the cloud platform; they manage it’s availability and maintenance day-day and expansion/contraction.

You then have a number of “service/application” teams that subscribe to services provided by your cloud platform team… they are essentially developers/support teams that manage individual applications or services (for example payroll or SAP, web sites etc.), business units and stakeholders etc.

Using the technology we discuss here you can delegate control to them over most aspects of the service they maintian – full access to app servers etc. and an interface (human or automated) to raise issues with the platform team or log change requests.

I’ve seen many attempts to implement this in the physical/old world and it just ends in tears as it builds a high level of expectation that the server/infrastructure team must be able to respond very quickly to the end-“customer” the customer/supplier relationship is very different… regardless of what OLA/SLA you put in place.

However the reality of traditional infrastructure is that the platform team can’t usually react as quick as the service/application teams need/want/expect because they need to have an engineer on-site, wait for an order and a delivery, a network provisioning order etc. etc (although banks do seems to have this down quite well, it’s still a delay.. and time is money, etc.)

Virtualization and some of the technology we discuss here enable the platform team to keep one step ahead of the service/application teams by allowing them to do proper capacity planning and maintain a pragmatic headroom of capacity and make their lives easier by consolidating the physical estate they manage. This extra headroom capacity can be quickly back-filled when it’s taken up by adopting a modular hardware architecture to keep ahead of the next requirement.

Traditional infrastructure = OS/App Installations

  • 1 server per ‘workload’
  • Silo’d servers for support
  • Individually underused on average = overall wastage
  • No easy way to move workload about
  • Change = slow, person in DC, unplug, uninstall, move reinstall etc.
  • HP/Dell/Sun Rack Mount Servers
  • Cat 6 Cables, Racks and structured cabling

The ideal is to have an OS/app stack that can have workloads moved from host A to host B; this is a nice idea but there are a whole heap of dependencies with the typlical applications of today (IIS/apache + scripts, RoR, SQL DB, custom .net applications). Most big/important line of business apps are monolithic and today make this hard. Ever tried to move a SQL installation from OLD-SERVER-A to SHINY-NEW-SERVER-B? exactly. *NIX better at this, but not that much better.. downtime required or complicated fail over.

This can all be done today, virtualization is the key to doing it – makes it easy to move a workload from a to b we don’t care about the OS/hardware integration – we standardise/abstract/virtualize it and that allows us to quickly move it – it’s just a file and a bunch of configuration information in a text file… no obscure array controller firmware to extract data from or outdated NIC/video drivers to worry about.

Combine this with server (blade) hardware, modern VLAN/L3 switches with trunked connections, and virtualised firewalls then you have a very compelling solution that is not only quick to change, but makes more efficient use of the hardware you’ve purchased… so each KW/hr you consume brings more return, not less as you expand.

Now, move this forward and change the hardware for something much more commodity/standardised

Requirement: Fast, Scalable shared storage, filexible allocation of disk space and ability to de-duplicate data, reduce overhead etc, thin provisioning.

Solution: SAN Storage, EMC Clariion, HP-EVA, Sun StorageTek, iSCSI for lower requirements, or storage over single Ethernet fabric – NetApp/Equalogic

Requirement: Requirement Common chassis and server modules for quick, easy rip and replace and efficient power/cooling.

Solution: HP/Sun/Dell Blades

Requirement: quick change of network configurations, cross connects, increase & decrease bandwidth

Solution: Cisco switching, trunked interconnects, 10Gb/bonded 1GbE, VLAN isolation, quick change enabled as beyond initial installation there are fewer requirements to send an engineer to plug something in or move it, Checkpoint VSX firewalls to allow delegated firewall configurations or to allow multiple autonomous business units (or customers) to operate from a shared, high bandwidth platform.

Requirement: Ability to load balance and consolidate individual server workloads

Solution: VMWare Infrastructure 3 + management toolset (SCOM, Virtual Centre, Custom you-specific integrations using API/SDK etc.)

Requirement: Delegated control of systems to allow autonomy to teams, but within a controlled/auditable framework

Solution: Normal OS/app security delegation, Active Directory, NIS etc. Virtual Center, Checkpoint VSX, custom change request workflow and automation systems which are plugged into platform API/SDK’s etc.

the following diagram is my reference architecture for how I see these cloud platforms hanging together

image 

As ever more services move into the “cloud” or the “mesh” then integrating them becomes simpler, you have less of a focus on the platform that runs it – and just build what you need to operate your business etc.

In future maybe you’ll be able to use the public cloud services like Amazon AWS to integrate with your own internal cloud, allowing you to retain the important internal company data but take advantage of external, utility computing as required, on demand etc.

I don’t think we’ll ever get to.. (or want) to be 100% in a public cloud, but this private/internal cloud allows an organisation to retain it’s own internal agility and data ownership.

I hope this post has demonstrated that whilst, architecturally “cloud” computing sounds a bit out-there, you can practically implement it now by adopting this approach for the underlying infrastructure for your current application landscape.

VMWare/Cisco Switching Integration

 

As noted here there is a doc that has been jointly produced between VMWare and Cisco which has all the details required for integrating VI virtual switches with physical switching.

Especially handy if you need to work with networking teams to make sure things are configured correctly to allow failover properly between redundant switches/fabrics etc. – it’s not as simple as it looks, and people often forget the switch-side configurations that are required.

Doc available here (c.3Mb PDF)

LiveBlogging from TechEd

 

Scott has an excellent series of articles that he’s relaying from sessions at Microsoft TechEd US..

looks good for Hyper V and SCVMM content..

http://blog.scottlowe.org/

Slow vMotion..

 

Note to remember, don’t forget to check the duplex settings on NICs handling your vMotion traffic.

My updated clustered ESX test lab is progressing (more posts on that in the next week or so)… and I’m kind of limited in that I only have an old 24-port 100Mb Cisco hub for the networking at the moment.

vMotion warns about the switch speed as a possible issue.

image

I had my Service Console/ vMotion NICit forced to 100/full and when I 1st tried it vMotion took 2hrs to get to 10%, I changed it to auto-negotiate whilst the task was running and it completed without breaking the vMotion task ain a couple of seconds, dropped only 1 ping to the VM I moved.

Cool, it’s not production or doing a lot of workload but useful to know despite the warning it will work even if you’ve only got an old hub for your networking, and worth remembering that Duplex mis-matches can literally add hours and days onto network transfers.

Free SAN for your Home/Work ESX Lab

 

VM/Etc have posted an excellent article about a free iSCSI SAN VM appliance that you can download from Xtravirt

it uses replication between 2 ESX hosts to allow you to configure DRS/HA etc.

Excellent, I’m going to procure another cheap ESX host in the next couple of weeks so will post back on my experiences with setting this up, my previous plan meant I’d have to get a 3rd box to run an iSCSI server like OpenFiler to enable this functionality, but I really like this approach.

Sidenote  – Xtravirt also have some other useful downloads like Viso templates and an ESX deployment appliance available here

A Closer look at Green IT and Microsoft’s new Container Data Centre in Chicago

 

Link here – good visualisation about 10mins in of how their new Chicago data centre is laid out internally.

With virtualisation breaking the traditional hardware/OS ties; this is becoming an increasingly appealing way of managing commodity compute grid resources for large organisations. Mike makes some good points about the de-comissioning of servers on a large scale where you are adding 10’s of thousands on a regular basis – you need to take them out at some point too, and that’s time consuming. at this scale of operation It’s more efficient to make the the container and/or datacentre the field replaceable unit (as I discussed a while back) in this scenario.

Also interesting point that water consumption may be the next environmental touch paper for legislation and disclosure for IT shops.

Running ESX 3.5 and 3i Under VMWare Workstation 6.5 Beta Build 91182

 

Following on from my earlier post I upgraded my installation to the new build of 6.5. it un-installed the old build and re-installed the latest without a problem, took about 30mins and required a reboot of the host OS.

All my previously suspended XP/2003 VM’s resumed ok without a restart but needed an upgrade to the VMTools which did require a restart of the guest OS – all completed with no problems.

Now, onto installing ESX….

I used the settings from Eric’s post here to edit my .vmx file

ethernet0.virtualDev = “e1000”

monitor.virtual_exec = “hardware”
monitor_control.restrict_backdoor = “true”

Note – you need to select an x64 Linux version from the VM type drop down, if you have to go back and change it via the GUI after you’ve edited the .vmx file it overwrites the Ethernet card “e1000” setting to “vlance” so you need to edit again otherwise the ESX installer won’t find a compatible NIC and won’t install.

it was initially very slow to boot; 5mins on my dual core laptop with only one error – which was expected..

imageimage

To improve the performance I changed my installation to run the non-debug version of the Workstation binaries (rename the vmware-vmx.exe to vmware-vmx-debug.exe)

note: this isn’t recommended unless you know what you are doing, VMWare will rely on the output from the debug version of the code if you need to report any issues)

It also seems to work for the installable version of ESX 3i… (although I’ve not quite figured out the point of that version yet :)).

image

Install prompt

image

it did fail with an error the 1st time round..

image

this was because I had specified an IDE disk as per the ESX instructions, I changed it to a SCSI one and it worked ok.

image

Finished..

imageimage 

The ESX 3i install has a footprint of about 200Mb on disk, and ESX 3.5 uses 1.5Gb.

I’m going to keep the 3.5 install on my laptop and will try to use linked clones to maintain a couple of different versions/configs to save disk space.. I’m sure I could knock up a quick script to change the hostname/IP of each clone – if I do I’ll post it here.

Why would you want to do this? well because you can, of course 🙂 and its handy for testing patch updates and scripts for ESX management etc.

I will  also try to get a ESX DRS cluster running under workstation with a couple of ESX hosts and shared storage over iSCSI using something like OpenFiler as shown here. won’t exactly be production performance, but useful for testing and demo’ing.

New VMWare Workstation 6.5 Build(s) and ability to run ESX 3.5

 

As a result of this post from Eric Sloof I note there is a new build of Workstation 6.5 available; I hadn’t noticed this as I haven’t had much time to follow the forums and my beta/RC (as used in this post and installed here is build 84113) hasn’t notified me there is a new release as all the previous 4.x/5.x beta’s have.

Oddly I checked this morning before I saw Eric’s post and it reported no new builds available – assume this is because its still a beta programme.

Anyway – if you downloaded the previous build before 14th May then go to this page and you can update your registration for the new build (below).

image

I’ll be trying this out in the coming week and hopefully will be able to get ESX running on my laptop under VMWare Workstation (very handy mobile demo platform).

Deleting a Virtual Machine from Virtual Center and Disk

 

If you deploy your VM’s from a master image using Virtual Center’s Deploy from template functionality (below).

image

When you try and delete a virtual machine you’ve created from disk

image 

You get the following prompt

Are you sure you want to delete this VM and it’s associated base disk?

Please note if other VMs are sharing this base disk, they will no longer have access to this disk.

image

This does not refer to the master VM image you deployed from; in other words if you delete the VM it does not break all other VMs deployed from the initial template.

One other point to note, when you perform “Deploy virtual machine from template” operation, the target field (below) is actually the name of the base image you are cloning, rather than the name of the eventual VM you are creating from it – odd, but that’s how it is (below)

image

Solid Sate SAN, Storage vMotion and VMWare – HSM for your VMs

 

You’ve been able to buy solid state SAN technology like the Tera-RAMSAN from TMS which gives you up to 1Tb of storage, presented over 4Gb/s fibre channel or Infiniband @10Gb/s… with the cost of flash storage dropping its going to soon fall in to the realms of affordability (from memory a year ago 1Tb SSD SAN was about £250k, so would assume that’s maybe £150k now – would be happy to see current pricing if anyone has it though).

If you were able to combine this with a set of ESX hosts dual-connected to the RAMSAN and traditional equipment (like an HP EVA or EMC Clariion) over a FC or iSCSI fabric then you could possibly leverage the new Storage vMotion features that are included in ESX 3.5 to achieve a 2nd level of performance and load levelling for a VM farm.

image

It’s pretty common knowledge that you can use vMotion and the DRS features to effectively load level or average VM CPU and memory load across a number of VMWare nodes within a cluster.

Using the infrastructure discussed above could add a second tier of load balancing without downtime to a DRS cluster. If a VM needs more disk throughput or is suffering from latency then you could move them to/from the more expensive solid-state storage tiers to FC-SCSI or even FATA disks, this ensures you are making the best use of fast, expensive storage vs. cheap, slow commodity storage.

Even if Virtual Center doesn’t have a native API for exposing this type of functionality or criteria for the DRS configuration you could leverage the plug-in or scripting architecture to use a manager of managers (or here) to map this across an enterprise and across multiple hypervisors (Sun, Xen, Hyper V)

I also see EMC integrating flash storage into the array itself, would be even better if you could transparently migrate LUNS to/from different arrays and disk storage without having to touch ESX at all.

Note: This is just a theory I’ve not actually tried this – but am hoping to get some eval kit and do a proof on concept…