Virtualization, Cloud, Infrastructure and all that stuff in-between
My ramblings on the stuff that holds it all together
Category Archives: GigE
Virtualization – the key to delivering "cloud based architecture" NOW.
There is a lot of talk about delivering cloud or elastic computing platforms, a lot of CxO’s are taking this all in and nodding enthusiastically, they can see the benefits.. so make it happen!….yesterday.
Moving your services to the cloud, isn’t always about giving your apps and data to Google, Amazon or Microsoft.
You can build your own cloud, and be choosy about what you give to others. building your own cloud makes a lot of sense, it’s not always cheap but its the kind of thing you can scale up (or down..) with a bit of up-front investment, in this article I’ll look at some of the practical; and more infrastructure focused ways in which you can do so.
Your “cloud platform” is essentially an internal shared services system where you can actually and practically implement a “platform” team that operates and capacity plans for the cloud platform; they manage it’s availability and maintenance day-day and expansion/contraction.
You then have a number of “service/application” teams that subscribe to services provided by your cloud platform team… they are essentially developers/support teams that manage individual applications or services (for example payroll or SAP, web sites etc.), business units and stakeholders etc.
Using the technology we discuss here you can delegate control to them over most aspects of the service they maintian – full access to app servers etc. and an interface (human or automated) to raise issues with the platform team or log change requests.
I’ve seen many attempts to implement this in the physical/old world and it just ends in tears as it builds a high level of expectation that the server/infrastructure team must be able to respond very quickly to the end-“customer” the customer/supplier relationship is very different… regardless of what OLA/SLA you put in place.
However the reality of traditional infrastructure is that the platform team can’t usually react as quick as the service/application teams need/want/expect because they need to have an engineer on-site, wait for an order and a delivery, a network provisioning order etc. etc (although banks do seems to have this down quite well, it’s still a delay.. and time is money, etc.)
Virtualization and some of the technology we discuss here enable the platform team to keep one step ahead of the service/application teams by allowing them to do proper capacity planning and maintain a pragmatic headroom of capacity and make their lives easier by consolidating the physical estate they manage. This extra headroom capacity can be quickly back-filled when it’s taken up by adopting a modular hardware architecture to keep ahead of the next requirement.
Traditional infrastructure = OS/App Installations
- 1 server per ‘workload’
- Silo’d servers for support
- Individually underused on average = overall wastage
- No easy way to move workload about
- Change = slow, person in DC, unplug, uninstall, move reinstall etc.
- HP/Dell/Sun Rack Mount Servers
- Cat 6 Cables, Racks and structured cabling
The ideal is to have an OS/app stack that can have workloads moved from host A to host B; this is a nice idea but there are a whole heap of dependencies with the typlical applications of today (IIS/apache + scripts, RoR, SQL DB, custom .net applications). Most big/important line of business apps are monolithic and today make this hard. Ever tried to move a SQL installation from OLD-SERVER-A to SHINY-NEW-SERVER-B? exactly. *NIX better at this, but not that much better.. downtime required or complicated fail over.
This can all be done today, virtualization is the key to doing it – makes it easy to move a workload from a to b we don’t care about the OS/hardware integration – we standardise/abstract/virtualize it and that allows us to quickly move it – it’s just a file and a bunch of configuration information in a text file… no obscure array controller firmware to extract data from or outdated NIC/video drivers to worry about.
Combine this with server (blade) hardware, modern VLAN/L3 switches with trunked connections, and virtualised firewalls then you have a very compelling solution that is not only quick to change, but makes more efficient use of the hardware you’ve purchased… so each KW/hr you consume brings more return, not less as you expand.
Now, move this forward and change the hardware for something much more commodity/standardised
Requirement: Fast, Scalable shared storage, filexible allocation of disk space and ability to de-duplicate data, reduce overhead etc, thin provisioning.
Solution: SAN Storage, EMC Clariion, HP-EVA, Sun StorageTek, iSCSI for lower requirements, or storage over single Ethernet fabric – NetApp/Equalogic
Requirement: Requirement Common chassis and server modules for quick, easy rip and replace and efficient power/cooling.
Solution: HP/Sun/Dell Blades
Requirement: quick change of network configurations, cross connects, increase & decrease bandwidth
Solution: Cisco switching, trunked interconnects, 10Gb/bonded 1GbE, VLAN isolation, quick change enabled as beyond initial installation there are fewer requirements to send an engineer to plug something in or move it, Checkpoint VSX firewalls to allow delegated firewall configurations or to allow multiple autonomous business units (or customers) to operate from a shared, high bandwidth platform.
Requirement: Ability to load balance and consolidate individual server workloads
Solution: VMWare Infrastructure 3 + management toolset (SCOM, Virtual Centre, Custom you-specific integrations using API/SDK etc.)
Requirement: Delegated control of systems to allow autonomy to teams, but within a controlled/auditable framework
Solution: Normal OS/app security delegation, Active Directory, NIS etc. Virtual Center, Checkpoint VSX, custom change request workflow and automation systems which are plugged into platform API/SDK’s etc.
the following diagram is my reference architecture for how I see these cloud platforms hanging together
As ever more services move into the “cloud” or the “mesh” then integrating them becomes simpler, you have less of a focus on the platform that runs it – and just build what you need to operate your business etc.
In future maybe you’ll be able to use the public cloud services like Amazon AWS to integrate with your own internal cloud, allowing you to retain the important internal company data but take advantage of external, utility computing as required, on demand etc.
I don’t think we’ll ever get to.. (or want) to be 100% in a public cloud, but this private/internal cloud allows an organisation to retain it’s own internal agility and data ownership.
I hope this post has demonstrated that whilst, architecturally “cloud” computing sounds a bit out-there, you can practically implement it now by adopting this approach for the underlying infrastructure for your current application landscape.
VMWare/Cisco Switching Integration
As noted here there is a doc that has been jointly produced between VMWare and Cisco which has all the details required for integrating VI virtual switches with physical switching.
Especially handy if you need to work with networking teams to make sure things are configured correctly to allow failover properly between redundant switches/fabrics etc. – it’s not as simple as it looks, and people often forget the switch-side configurations that are required.
Doc available here (c.3Mb PDF)
HP iLo Very Slow for Installing an OS
A bit of a disappointment; we’re trying to do a WinPE 2.0 CD/DVD based installation for our Windows 2003/2008 standard blade servers in an HP c7000 enclosure.
Installing from a .ISO image presented to the iLo via the virtual media applet is dog-slow (5-10 times slower than from a physical CD/DVD- why is this? – surely its technically possible to make this access run faster and GigE chipsets are cheap-as these days. I’ve been through every combination of switching/duplex/port config and even via a cable directly into the Blade OA.
The same issue seems to manifest itself on traditional rack mount HP servers – the iLo just isn’t fast enough to make this a workable solution, unless you are really patient.
I know we could use the RDP and do it as a PXE type installation over the network to each blade, but this doesn’t really achieve what I want…
Most customers maintain an OoB (Out of Band) network to which all of the management interfaces (iLo, DRAC, etc.) are connected to. the reasoning for this is obvious; if you loose your main core switching network you can get access via a totally different physical network and path to assist in troubleshooting.
For this same reasoning I would like to use this method to build servers from a master boot CD/DVD image, you can present a .ISO image to a server via the virtual media applet on the iLo. We have a fully end-end build process that sets up the HP array controllers, flashes BIOS and installs the OS and drivers etc. from a CD/DVD.
We just update the boot CD .ISO file as required and its flexible and it doesn’t rely on any deployment infrastructure (PXE server, RDP server etc.) so we can port it between customers and data centres, VM’s and physical machines and do a bare-metal builds without requiring any build/network infrastructure in place.
This isn’t just limited to a Windows OS – I tried the same with an ESX installation; took over an hour (compared to 5-10 mins from a local CD)
Interesting Article on how DreamWorks are Speeding up Access for Animators
I have a geeky secret; I used to be really into ray-tracing and 3D graphics not so much from an “art” point of view – although I do have an interest in that and computer modelling/visualisation checks a lot of boxes for me as I always wanted to be a civil engineer or architect (well, I kind of am… but with computers..!)
it was one of the only applications I found in the early/mid 90’s that could really tax a machine and I spent a lot of time playing with large render jobs using PovRay and progressed to 3D studio for DOS and then a bit of a dabble with building render farms using 3DS Max before I had to go and get a “proper” job with less spare time.
I would love the time to get back into it, with the power available today you could produce some awesome images, although maybe I am somewhat hampered through lack of talent… maybe that will be downloadable now?
….So anyway, here’s an interesting article on how DreamWorks Animation have sped up access to their render farm using Ibrix Parallel file server software… they shift a lot of data!
I’ve worked on a project where we’ve tried to implement similar high-performance grid-based storage systems for large media files; but they were somewhat less successful/undeveloped; this one looks promising.
I wonder if these kind of vendors will start moving into the virtualization space; it’s essentially the same principal.
Deliver large flat files (.VMDK), over cheap/scalable commodity media (GigE) as quick a possible
This would reduce the depende.ncy on expensive back-end fibre channel SANs, and you could invest more in flexible Ethernet – or maybe Infiniband to deliver networking and storage within a “virtual fabric”
If it’s “virtual” and “grid” based the quality/features of individual hardware devices (DL380, NAS device etc.) that make it up the overall grid are less important and a 100% software approach gives you the flexibility to pick & choose building blocks from the most appropriate/affordable manufacturer rather than be locked into a costly single vendor solution (HP EVA, EMC Clariion, DMX etc.)
Thanks to Martin at Bladewatch for the link.