Virtualization, Cloud, Infrastructure and all that stuff in-between
My ramblings on the stuff that holds it all together
vSphere – How to Enable FT for a Nested VM
As in my previous post, I am working on a lab with virtual ESX4 servers in it. I can vMotion VMs from a physical vSphere cluster into the virtual vSphere cluster perfectly, and performance is very good (just 1 dropped ping in my testing).
One of the physical hosts belongs to www.techhead.co.uk which he has kindly lent for this joint experiment – see his posts here, here and here on running vSphere on these HP ML115g5 servers and their FT compatibility. We have some joint postings in the pipeline on guest performance with complicated apps like SQL & Exchange when protected via FT, so keep your eyes peeled.
As the physical ESX hosts themselves are FT compatible I thought I’d see if I can enable FT for a VM running inside a virtual ESX server cluster, so a VM running inside a hypervisor, inside another hypervisor..!
Out of the box, unfortunately not; it gives the following error message 😦
Power On virtual machine: Record/Replay is not supported on this CPU for this guest operating system. You may have an incompatible CPU, you may have specified the wrong guest operating system type, or you may have conflicting options set in your config file. See the online help for a list of supported guest operating systems, CPUs and associated config options. Unable to enter fault tolerance mode.
To work around this you can set the following advanced (and likely totally unsupported) options to allow FT on the nested VM; the default is/was false. Thanks to the comment on this post for the replay.allowBTOnly = TRUE setting!
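If you would rather script the change than edit the file by hand, here is a minimal sketch in Python. It only uses the two option names that appear in this post and its comments (replay.supported and replay.allowBTOnly), so it may not be the complete list. The function name set_vmx_options and the sample displayName are made up for illustration, and it assumes the usual key = "value" layout of a .vmx file, with the VM powered off while you edit it.

```python
def set_vmx_options(vmx_text: str, options: dict[str, str]) -> str:
    """Return vmx_text with every key in options set to the given value.

    Assumptions (not taken from VMware documentation): one key = "value"
    pair per line, and key names compared case-insensitively.
    """
    wanted = {name.lower(): (name, value) for name, value in options.items()}
    seen = set()
    out = []
    for line in vmx_text.splitlines():
        key = line.split("=", 1)[0].strip().lower() if "=" in line else None
        if key in wanted:
            if key not in seen:
                name, value = wanted[key]
                out.append(f'{name} = "{value}"')
                seen.add(key)
            # Any repeated definition of the same key is dropped.
        else:
            out.append(line)
    # Append the options that were not already present in the file.
    for key, (name, value) in wanted.items():
        if key not in seen:
            out.append(f'{name} = "{value}"')
    return "\n".join(out) + "\n"


# Only the options named in this post and its comments.
FT_OPTIONS = {"replay.supported": "true", "replay.allowBTOnly": "true"}

# Hypothetical .vmx fragment, for illustration only.
sample = 'displayName = "nested-vm"\nreplay.supported = "false"\n'
updated = set_vmx_options(sample, FT_OPTIONS)
print(updated)
```

In practice you would read the nested VM's .vmx file, pass its contents through the function and write the result back, keeping a backup copy of the original first.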
And here it is – Nested VM running, with FT enabled
Very nice
Later on you can see some warnings about hosts getting a bit behind. I also had some initial problems getting FT to bring up the second VM properly: the UI said it was restarting and it got stuck there. I dropped the virtual ESXi host down to a single vCPU rather than two and it worked OK from then on. I decided to do this because the virtual ESXi nodes were coming up reporting 2 x quad-core CPUs, whilst the physical host only has 1 x quad-core CPU, so I guess that was causing some confusion.
At this point both of my virtual ESXi hosts were on the same physical vSphere server, and I seemed to have problems with the secondary getting behind (vLockstep interval).
In this instance my nested VM is running an x86 Windows 2003 unattended setup.
I vMotioned one of the virtual ESXi hosts to the second physical vSphere server (very cool in itself) and it seemed to be better for a while; I assume there was some CPU contention from the nested VM.
However, in the end it flagged up similar errors. I assume this is due to the overhead of running a VM inside a hypervisor, inside another hypervisor 🙂 This is a lab setup, but it will prove very useful if you have to learn about this stuff or experiment with different configurations.
This is probably totally unsupported, use at your own risk – but it does work well enough to play about with in the lab.
Hey there, I found the same problem as you with regard to the VM flicking between protected and not protected. I've dropped down to a single CPU per vSphere host but it makes no difference. Any other ideas on this?
I still find the FT'd VM does that as well: it keeps up for a while, then gets behind again. No real solution found so far; I assume it's something to do with available CPU time and overhead from the hypervisor. It works well enough to figure out how it works and do a bit of experimentation.
Hi, nice post, but when I turn on the virtual ESX machine, it changes the .vmx file and puts replay.supported = false.
Do you have any idea how to block the changes that it makes?
I have a two-CPU machine with Intel Xeon E5506.
Thanks in advance,
Miguel
Hi, I have a similar setup to yours: ML115 G5, 8GB RAM, AMD Opteron 1352. Try as I might, I can't get FT to work. When I go to the Summary tab of the ESXi 4.1 VM and click on the callout beside "Host Configured for FT:" it says "Host CPU does not support hardware virtualisation which is required for FT". Everything else works: HA, vMotion, DRS, nested VMs. Any ideas?
Mmm, that's odd. It definitely works OK on mine. Did you make all the .vmx file tweaks?
Do you have AMD-V enabled in the BIOS?
I’m getting the same as Miguel. I make the changes to the .VMX file but as soon as I power on the vESX server it changes to replay.supported = FALSE. The underlying host has FT capable hardware (the summary tab says it’s fine, Intel Xeon 5440) but the vESX host says the CPU isn’t compatible. I’ve tried both vESX and vESXi with the same outcome. If time permits I’ll try vESX4 instead of v4.1 in case that’s any better. Shame, looks like a useful lab setting.
Sorry for the late reply. Yes, I made the .vmx changes and enabled AMD-V. The weird thing is that the physical ESX host says that FT is supported but the vESX hosts say not. It looks like something is being filtered out as you go from pESX to vESX.
Do not forget to enable the advanced settings on the NESTED VM that needs FT to be enabled.
When enabling replay.supported on the vESX this DOES get reset to ‘false’ when the nested ESX boots.
This was even more problematic than I originally thought. I eventually had a chance to test FT in a nested environment, but I guess when you don't have a supported CPU this option doesn't work properly. I tested this with 2 computers (CPU Intel i7 920, 2.67GHz), with Workstation on each (one ran vCenter and the iSCSI server, the other ran 2 ESXi servers). The best results I got with FT were like this:
-Managed to turn FT on and migrate the secondary to the other host
-Managed to power on a 32-bit Ubuntu VM. Startup took roughly 15 mins and the VM's vCPU was at 100% usage the whole time, which caused alarms etc. vCPU usage dropped below 10% at some point, but even little tasks caused it to go near 100%.
-While FT was on, I powered off the primary's ESXi host and still managed to ping this Ubuntu guest. FT did work as intended and execution of the guest moved to the other host, but although I could ping the guest, the console window didn't work even after 15 mins of waiting…