My ramblings on the stuff that holds it all together
VMware ESX 5
OK, so vSphere (ESX 4) has only just been released, but what would you like to see in the next major version? Hyper-V R2 will be out soon, and I would expect its successor within a further 18 months. Whilst vSphere is the technically better product right now, Microsoft are going to be throwing a significant amount of resource at building up the Hyper-V product line, so VMware need to keep innovating to stay significantly ahead.
As the VMware vendor and partner ecosystem grows, will it stifle growth in the core product? I see this happening with Microsoft – they don’t want to produce an all-singing, all-dancing core product as there are literally thousands of ISVs that they don’t necessarily want to put out of business; so Microsoft core products are “good enough”, but for more advanced features you turn to an ISV (think Terminal Services & Citrix).
So, open question really – here’s my starter for 10 – What would you like to see in ESX 5?
Host Based Replication
SAN storage brings a single point of failure; even with all the best HA controllers and disk arrangements, it’s still one unit – human error or a bad firmware update could corrupt all your disks. You can buy a second one and do replication, but that’s expensive (twice as expensive, in fact) and failover can require downtime (automated with SRM etc.). And what if you need to physically move it to another datacentre? That’s a lot of risk.
In this previous post I proposed a slightly different architecture, leveraging the FT features for a branch office solution – that same model could mean a more distributed architecture with n+1, 2 or 3 x ESX nodes running FT’d VMs for high availability on cheap, commodity hardware – using DAS storage and replicating over standard IP networks.
If you look at companies like Amazon, Google etc., their cloud platforms leverage virtualization (Xen), but I would bet they don’t rely on enormous SANs to run them. They use DAS storage and replication; they expect individual (or even datacentre) failures and can work around them by keeping multiple copies of everything. They don’t have an expensive storage model – they use cheap commodity kit and provide the HA in software. With some enhancements, the FT feature could provide an equivalent.
Host based replication also makes long-distance clustering more realistic – relying on plain old IP to do the replication, rather than proprietary SAN-SAN replication (previous thoughts on this here)
Microsoft have already moved in this direction with core products like Exchange and SQL Server; Exchange CCR and SQL mirroring are pure IP-based replication technologies that address the issues with traditional single-copy clusters.
Now, with VMware being owned by EMC I could see this being something of a problem, but I hope they can see the opportunity here. You can already achieve some of this using storage virtual machines (like Openfiler + replication in a VM, or DataCore).
Stateless ESX Nodes
A mode where nodes can be PXE booted (or booted from firmware, like ESXi) and have their configurations assigned/downloaded – no manual installs, all DHCP (or reserved DHCP) addressing.
When combined with cheap, automatically provisioned and managed virtualization nodes with commodity DAS storage, you could envisage the following scenario:
- Rack a new HP DL360 G7 with ESX 5i on a USB key (or PXE booted), attach power and network, and walk away
- It registers itself at boot time with the management node(s) and downloads its configuration
- Based on dynamically assigned HA policy, it replicates copies of virtual machines from elsewhere in the ESX cloud; once up to speed it becomes a secondary or tertiary copy
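To make the registration step a bit more concrete, here is a minimal sketch in Python of what the management node's side of that conversation might look like. Everything in it is hypothetical (the ManagementNode class, the template keys, the hostname scheme); it is not based on any real VMware API, just an illustration of "boot, register, receive config".

```python
from dataclasses import dataclass, field


@dataclass
class ManagementNode:
    # Hypothetical config templates keyed by hardware model, plus a "default".
    templates: dict
    # Inventory of registered nodes, keyed by MAC address.
    inventory: dict = field(default_factory=dict)

    def register(self, mac: str, model: str) -> dict:
        """Record a newly booted node and return the config it should apply."""
        base = self.templates.get(model, self.templates["default"])
        # Derive a hostname from the last six hex digits of the MAC (an assumption).
        suffix = mac.replace(":", "")[-6:]
        config = dict(base, hostname=f"esx-{suffix}")
        self.inventory[mac] = config
        return config


# Usage: a freshly racked node announces itself and gets its configuration back.
mgmt = ManagementNode(templates={"default": {"vswitch": "vSwitch0", "role": "replica"}})
cfg = mgmt.register("00:1a:4b:aa:bb:cc", "DL360")
```

The point is simply that the node carries no state of its own; whatever identity and role it has comes from the management node at boot time.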
You can imagine a policy-driven intelligent load and availability controller (vCenter 5) which ensures there are always copies of a VM on at least 2 or 3 physical machines in more than one location
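As a rough illustration of that kind of policy, the following Python sketch checks whether a VM has enough copies across enough locations and, if not, picks where the next copy should go. The Host type, the function names and the default thresholds (2 copies, 2 sites) are all my own assumptions for the example; a real controller would also weigh load, capacity and network distance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    site: str


def missing_copies(replica_hosts, min_copies=2, min_sites=2):
    """Return (extra copies needed, extra sites needed) for one VM."""
    sites = {h.site for h in replica_hosts}
    return (max(0, min_copies - len(replica_hosts)),
            max(0, min_sites - len(sites)))


def pick_target(replica_hosts, candidates, min_copies=2, min_sites=2):
    """Pick a host for the next copy, or None if the policy is already met."""
    need_copies, need_sites = missing_copies(replica_hosts, min_copies, min_sites)
    if need_copies == 0 and need_sites == 0:
        return None
    used_sites = {h.site for h in replica_hosts}
    free = [h for h in candidates if h not in replica_hosts]
    if need_sites > 0:
        # Prefer a host in a location that does not yet hold a copy.
        fresh = [h for h in free if h.site not in used_sites]
        if fresh:
            return fresh[0]
    if need_copies > 0 and free:
        return free[0]
    return None


# Usage: one copy in London, so the next copy should land in another site.
a, b, c = Host("esx01", "london"), Host("esx02", "london"), Host("esx03", "leeds")
target = pick_target([a], [a, b, c])
```

Here the controller skips esx02, even though it is free, because a second London copy would not satisfy the "more than one location" rule.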
This is getting a bit sci-fi, but the foundations in infrastructure and technology are being laid now with high-speed interconnects like Infiniband…
With more operating systems and applications starting to optimize for multi-core and hot-add CPU and memory, a very advanced hypervisor scheduler combined with very fast host interconnects like InfiniBand or 10GbE could see actual CPU load and memory access being distributed across multiple physical hypervisors.
For example, imagine a 24 vCPU SQL Server virtual machine with 1TB of vRAM having its code executed across 10 quad-CPU physical hosts: effectively multi-core processing, but across multiple physical machines – moving what currently happens within a single physical CPU and bus across the network between disparate machines.
The advantage of this is that developers would only have to write apps that work within current SMP technology – the hypervisor masks the complexity of doing this across multiple hosts, CPUs and networks with a high degree of caching and manages concurrency between processes.
You could combine this with support for hot-add CPU and memory features for apps that could scale massively on-demand and then down again, without having to engineer complex layer 7 type solutions.
Anyway, and please note this is pure personal conjecture rather than anything I have heard from VMware or elsewhere – enough from me; what would YOU like to see…?