Now that VMware are moving away from ESX classic (with service console) to the ESXi model, I have experienced a couple of issues recently that got me wondering if NFS will be a more appropriate model for VM storage going forward. In recent versions of ESX (3.5 and 4), NFS has moved beyond being recommended just for .ISO/template storage and has some big names behind it for production VM storage.
I’m far from a storage expert, but I know enough to be dangerous… feel free to comment if you see it differently.
“Out of Band” Access Speed
Because VMFS is a proprietary block-storage file system, you can only access it via an ESX host; you can’t easily go direct (VCB, maybe, but it’s not easy). In the past this hasn’t been too much of an issue; however, whilst building a new ESXi lab environment on standard hardware I found excessive transfer times using the Datastore Browser in the VI Client: 45 minutes+ to copy a 1.8GB .ISO file to a VMFS datastore, or to import virtual machines and appliances. Even using Veeam FastSCP didn’t make a significant difference.
I spent ages checking for network/duplex issues, but in desperation I tried it against ESX classic (based on this blog post I found) installed on the same host, and that transfer time was less than half (22 minutes), which still wasn’t brilliant, but then I cranked up Veeam FastSCP and did it in 6 minutes!
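Put in throughput terms, those transfer times are striking. A quick back-of-the-envelope calculation (a sketch, assuming the 1.8GB file size quoted above and ignoring protocol overhead) shows how far below gigabit line rate (~119 MB/s theoretical) each method falls:

```python
# Effective throughput for copying a 1.8GB .ISO, per transfer method.
# Times are the ones observed above.

FILE_MB = 1.8 * 1024  # 1.8 GB expressed in MB

timings_min = {
    "VI Client datastore browser (ESXi)": 45,
    "VI Client datastore browser (ESX classic)": 22,
    "Veeam FastSCP (ESX classic)": 6,
}

for method, minutes in timings_min.items():
    mb_per_s = FILE_MB / (minutes * 60)
    print(f"{method}: {mb_per_s:.2f} MB/s")
```

Even the best case here (roughly 5 MB/s) is a tiny fraction of what a gigabit link can carry, which points at the hypervisor's management path rather than the wire as the bottleneck.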
So, lesson learnt? Relying on the VI Client/native interfaces to transfer large .ISO files or VMs into datastores is slow, and you have to go via the hypervisor layer, which oddly doesn’t seem optimized for this sort of access. Veeam FastSCP fixes most of this, but only on ESX classic, as it relies on some service-console cleverness that just isn’t possible on ESXi.
With ESX classic going away in favour of ESXi, there will need to be an alternative for out-of-band access to datastores: either direct access or an improved network stack for datastore browsing.
This is important where you manage standalone ESX hosts (SME), or where you want to perform a lot of P2V operations, as all of those transfers use this method.
In the case of NFS, given appropriate permissions you can go direct to the volume holding the VMs using a standard network protocol, entirely outside of ESX/vCenter. Upload/download transfers thus run at the speed of the data mover or server hosting the NFS mount point and are not constrained by ESX.
To me, Fibre Channel was always more desirable for VM storage as it offered lossless bandwidth up to 4Gb/s (now 8Gb/s), but Ethernet (which is obviously required to serve NFS) now offers 10Gb/s bandwidth and lossless technology like FCoE. Some materials put NFS about 10% slower than VMFS; considering the vast cost difference between dedicated FC hardware and commodity Ethernet/NAS storage, I think that’s a pretty marginal difference when you factor in the simplicity of managing NFS vs. FC (VLANs and IPs vs. zoning, masking, etc.).
FCoE perhaps redresses the balance and provides the best compromise between performance and complexity, but it doesn’t really address the out-of-band access issue I’ve mentioned here, as it’s a block-storage protocol.
Emergency Access
If you have a problem with your vCenter/ESX installation, you are essentially locked out of your virtual machines; it’s not easy to just mount the VMFS volume on a host with a different operating system and pull out/recover the raw virtual machines.
With NFS you have more options in this situation, particularly in small environments.
Storage Host Based Replication
For smaller environments SAN-to-SAN replication is expensive, and using NFS presents some interesting options for data replication across multiple storage hosts using software solutions.
I’d love to hear your thoughts..
I agree with you on the need for fast out-of-band access and emergency access, and NFS seems to be a balanced solution.
The only drawback I see is that a 10Gb/s Ethernet infrastructure (even if used only for storage) is still quite expensive, and the whole network infrastructure must be coherent (switches, NICs and so on).
I think the whole thing will end up costing about as much as Fibre Channel.
In small environments, where LAN teaming can support the load, I agree with you that NFS can be the best solution.
I don’t agree. 10Gb is getting cheap, thanks to copper:
10Gb switch port: 500 euros per port
10Gb dual-port NIC: less than 500 euros
SFP+ copper cable (5m): 150 euros
I don’t see the same level of price with FC.
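Taking those quoted prices at face value, the per-host cost of a redundant dual-port 10Gb copper connection works out roughly as below (a sketch using the commenter's figures; it ignores switch chassis cost, longer-distance optics and volume discounts, and rounds the "less than 500" NIC up to 500):

```python
# Per-host cost of a redundant dual-port 10Gb copper setup,
# using the prices quoted above (in euros).
switch_port = 500   # per 10Gb switch port
dual_nic    = 500   # dual-port 10Gb NIC ("less than 500", taken as 500)
sfp_cable   = 150   # SFP+ copper cable, 5m

ports = 2  # two switch ports and two cables, one dual-port NIC
total = ports * switch_port + dual_nic + ports * sfp_cable
print(f"~{total} euros per host")  # -> ~1800 euros per host
```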
One other big factor with ESXi right now is alignment of the VMDKs. Currently you can use vOptimizer or NetApp’s mbralign tool to fix the issue. The problem is that you need a service console to run them today! So, if you are using NFS with ESXi, you can mount the datastore on a Linux box and run the tools there. There is no way to align VMDKs with ESXi on VMFS today. This is my single biggest reason to resist ESXi today.
You don’t need ESX to use mbralign; all you need is a Linux guest that can mount your NFS datastore, and you can run mbralign from there.
Great post. You give compelling arguments for using NFS over FC or FCoE. The FCoE standards for some things, like losslessness, still haven’t been finalized, but they are in the works. Whenever designing a storage solution, everyone always asks “How much space do I need?” The answer should always be “You really need to worry about IOPS and MB/s performance.” Even at 1GbE, you can use NFS if designed properly.