Join 2,575 other subscribers
My ramblings on the stuff that holds it all together
I’m currently doing some work on SSD storage and virtual machines so I need an easy way of comparing I/O performance between a couple of virtual machines, each backed onto different types of storage.
I normally use IOmeter for this kind of work but generally only in a standalone manner – i.e I can run IOmeter inside a single VM guest and get statistics on the console etc.
With a bit of a read of the manual I quickly realised IOmeter was capable of so much more! (amazing things, manuals :)).
You can run a central console which runs the IOmeter Windows GUI application then add any number of “managers” – which are machines doing the actual benchmarking activities (disk thrashing etc.)
The use of the term manager is a bit confusing to me as you would think the “manager” is the machine running the IOmeter console, but actually each VM or physical server you want to load-test is a known as a manager, which in turn runs a number of workers which carry out the I/O tasks you specify and reports the results back to a central console (the IOmeter GUI application shown above).
Each VM that you want to test runs the dynamo.exe command with some switches to point it at an appropriate IOmeter console to report results.
On the logging machine run IOmeter.exe
on each VM (or indeed physical machine) that you want to benchmark at the same time run the dynamo.exe command with the following switches
dynamo.exe /i <IP of machine running IOmeter.exe> /n <display name of this machine – can be anything> /m <IP address or hostname of this machine>
in my case;
dynamo.exe /i 192.168.66.11 /n SATA-VM /m 192.168.66.153
You will then see output similar to the following;
The IOmeter console will now show all the managers you have logged on – in my case I have one VM backed to a SATA disk and one VM backed to an SSD disk.
I can now assign some disk targets and access specifications to each worker and hit start to make it “do stuff which I can measure” 🙂 for more info on how to do this see the rather comprehensive IOmeter manual
If you want to watch in realtime, click the results display tab and move the update frequency slider to as few seconds as possible
If you want to compare figures from multiple managers (VMs) against each other you can just drag and drop them on to the results tab
Then chose the metric you want to compare from the boxes – which don’t look like normal drop down elements so you probably didn’t notice them.
You can now compare the throughput of both machines in real-time next to each other – in this instance the SSD backed VM achieves less throughput than the SATA drive backed VM (more on this consumer-grade SSD in a later post)
Depending on the options you chose when starting the test run the results may have been logged out to a CSV file for later analysis.
Hope that helps get you going – if you want to use this approach to benchmark your storage array with a standard set of representative IOmeter loads – see these VMware communities threads
from a quick scan of the thread, this file seems to be the baseline everyone is measuring against
To use the above file you need to open it with IOmeter, then start up your VMs that you want to benchmark as described earlier in this post.
You will need to manually assign the disk target to each worker once you have opened that .icf file in IOmeter unless you set them in the .icf file manually.
This is the test whilst running with the display adjusted to show interesting figures – note the standard test contains a number of different iterations and access profiles – this is just showing averages since the start of the test and are not final figures.
This screenshot shows the final results of the run, and the verdict is; overall consumer-grade SSD sucks when compared against a single 7.2k RPM 1Tb SATA drive plugged into an OpenFiler 🙂 I still have some analysis to do on that one – and it’s not quite that simple as there are a number of different tests run as part of the sequence some of which are better suited to SSD’s
More posts on this to follow on SSD & SATA performance for your lab in the coming weeks, stay tuned..
The following screen dump is from an HP DL380G5 server that runs all the core infrastructure under VMWare Server (the free one) for a friend’s company which I admin sometimes.
It is housed in some co-lo space and runs the average range of Windows servers used by a small but global business, Exchange SQL, Windows 2003 Terminal Services.
As a result of some planned (but not very well communicated!) power maintenance the whole building lost power earlier today, when it was restored I grabbed the following screenshot as the 15 or so Virtual Machines automatically booted.
interesting to note that all the VM’s had been configured to auto-start with the guest OS, meaning there wasn’t any manual intervention required, even though it was a totally dirty shutdown for both the host and guest OS’es (No UPS, as the building and suite is supposed to have redundant power feeds to each rack – in this instance the planned maintenance was on the building wiring so required taking down all power feeds for a 5 yearly inspection..)
There are no startup delay settings in the free version of VMWare Server so they all start at the same time, interesting to note the following points..
The blue line that makes a rapid drop is the pages/second counter, and the 2nd big drop (green) is the disk queue length. the hilighted (white) line is the overall %CPU time, note the sample frequency was 15 seconds on this perfmon.
After it had settled down, I took the following screenshot, it hardly breaks a sweat during its working day. there are usually 10-15 concurrent users on this system from around the world (access provisioned via an SSL VPN device) and a pretty heavily used Exchange mail system.
The box is an HP DL380 G5 with 2 x quad core CPUs (8 cores in total) and 16Gb of RAM, it has 8 x 146Gb 15k HDDs in a single RAID 5 set + hot-spare, it was purchased in early 2007 and cost c.£8,000 (UK Prices)
It runs Windows 2003 Enterprise Edition x64 edition with VMWare Server 1.0.2 (yes, its an old build.. but if it ain’t broke..) and they have purchased multiple w2k3 ent-edition licences to take advantage of the virtualisation use-rights to cover the installed virtual OS’es.
It’s been in-place for a year and hardly ever has to be touched, its rock-solidly available and the company have noticed several marked improvements since they P2V’d their old servers onto this platform, as follows;
Hopefully this goes to show the free version of VMWare’s server products can work almost as well if budget is a big concern, ESX would definitely give some better features and make backup easier, they are considering upgrading and combining with something like Veeam Backup to handle failover/backup.
Interesting article here on some stress testing VMWare have done running Exchange 2007 under virtualization on VI3.5.
It’s working.. .and working well, now – official support?
Martin’s post here prompted me to blog something I’ve been meaning to do for a while.
Virtualization projects and services are cool; we all understand the advantages in power/cooling and the flexibility it can bring to our infrastructures.
But what about support, if you are a service provider (internal or outsourcing) you normally need to be able to offer an end-end SLA on your services. typically this would be backed off against a vendor like Microsoft or Oracle via one of their premium support arrangements.
From what I see in the industry, with most software vendors especially Microsoft there is almost no way a service provider can underwrite an SLA as application/OS vendors give themselves significant scope to say “unsupported configuration” if you are running it under a hypervisor or other VM technology… Microsoft use the term commercially reasonable in their official policy – who decides what this is?
I would totally accept that a vendor would not guarantee performance under a hypervisor – that’s understandable and we have tools to analyse, monitor and improve (Virtual Centre, MOM, DRS, increase resources etc.). but too many vendors seem to use it as a universal “get out of jail free card”.
Issues of applications with dependency on physical hardware aside (fax cards, realtime CPU, DSP, PCI cards etc.) In my entire career working with VM technology I’ve only ever seen one issue that could be directly attributed to being caused by virtualization – and to be fair that was really a VMTools issue; rather than VMWare itself.
Microsoft have an official list of their applications that are not supported here – why is this? speech server I could maybe understand as it would probably be timer/DSP sensitive – but the rest? Sharepoint? I know for a fact ISA does work under VMWare as I use it all the time.
Microsoft Virtual Server support policy http://support.microsoft.com/kb/897613
Support policy for Microsoft software running in non-Microsoft hardware virtualization software http://support.microsoft.com/kb/897615/
Exchange is specifically excluded (depending on how you read the articles)
· On the Exchange Server 2007 System requirements page it only mentioned Unified messaging as being unsupportable in a virtual environment http://technet.microsoft.com/en-us/library/aa996719.aspx
· Yet on TechNet it is clear stated that “Neither Exchange 2007 nor Exchange 2007 SP1 is supported in production in a virtual environment” http://technet.microsoft.com/en-us/library/bb232170(EXCHG.80).aspx
Credit due to a colleague for pulling together the relevant Microsoft linkage
But I know it….
a) works fully – I do it all the time.
b) Lots of people are doing this in production with lots of users (many people at VMWorld US last year)
c) VMWare have a fully-supportable x64 hypervisor – It’s just MS that don’t
Dont’ tell/ask – 99% of the time a tech support rep won’t know its running under VMWare/a.n.other hypervisor so why complicate matters by telling them – could of course back-fire on you!
Threaten – “If you won’t support under VMWare we’ll use one of your competitors applications”; however this only really works if you are the US govt. or Globocorp Inc. or operate in a very niche application market.
Mitigate – reflect this uncertainty in an SLA, best-endeavours etc. this would kill most virtualization efforts in their tracks for an enterprise customer.
The same support issue has been around for a long time; Citrix/Terminal Services, application packaging, automated installations, etc. are treated as “get out of jail free cards” by support organisations…
But whilst there are some technical constraints (usually only affecting badly written apps) with terminal services and packaging, virtualization changes the game and should make it simpler for a vendor to support as there is no complex runtime integration with a host OS + bolt-ons/hacks it’s just an emulated CPU/disk/RAM you can do whatever you like within it.
So – the open debate; what do you do? and how do you manage it?
I’ve not done anything with my home ESX server this week as I’ve been busy with work; so this will be interesting – it’s been powered up all the time with all the VM’s spinning; but not doing very much.
Whist running this set of VMs.. (the CPU stats for VMEX01 and VMEX02 are a bit skewed as I added this bit after the original post and they are both running seti@home (hence increased CPU)
So, nothing interesting to see here – but might be worth bearing in mind for some kind of sizing estimate; this is a single core CPU (HT enabled) PC with 4Gb RAM and a single 500Gb SATA disk
I have a geeky secret; I used to be really into ray-tracing and 3D graphics not so much from an “art” point of view – although I do have an interest in that and computer modelling/visualisation checks a lot of boxes for me as I always wanted to be a civil engineer or architect (well, I kind of am… but with computers..!)
it was one of the only applications I found in the early/mid 90’s that could really tax a machine and I spent a lot of time playing with large render jobs using PovRay and progressed to 3D studio for DOS and then a bit of a dabble with building render farms using 3DS Max before I had to go and get a “proper” job with less spare time.
I would love the time to get back into it, with the power available today you could produce some awesome images, although maybe I am somewhat hampered through lack of talent… maybe that will be downloadable now?
….So anyway, here’s an interesting article on how DreamWorks Animation have sped up access to their render farm using Ibrix Parallel file server software… they shift a lot of data!
I’ve worked on a project where we’ve tried to implement similar high-performance grid-based storage systems for large media files; but they were somewhat less successful/undeveloped; this one looks promising.
I wonder if these kind of vendors will start moving into the virtualization space; it’s essentially the same principal.
Deliver large flat files (.VMDK), over cheap/scalable commodity media (GigE) as quick a possible
This would reduce the depende.ncy on expensive back-end fibre channel SANs, and you could invest more in flexible Ethernet – or maybe Infiniband to deliver networking and storage within a “virtual fabric”
If it’s “virtual” and “grid” based the quality/features of individual hardware devices (DL380, NAS device etc.) that make it up the overall grid are less important and a 100% software approach gives you the flexibility to pick & choose building blocks from the most appropriate/affordable manufacturer rather than be locked into a costly single vendor solution (HP EVA, EMC Clariion, DMX etc.)
Thanks to Martin at Bladewatch for the link.
I thought I’d post some performance graphs from my cheap HP D530 ESX server using the Virtual Centre console (which incidentally, is good for getting this info quickly and simply).
Screenshot of the UI for querying performance stats.
View of currently running VMs – a mix of Windows 2003/2008 VMs
Current Overall ESX Host statistics (with a clone from template going on)
As I noted elsewhere on my blog it has 4Gb RAM and a single 2.8GHz HT CPU – and with this VM load it gives an average CPU load of 25-30%. Almost all of these VM’s are idling but all respond in good time to network access/TS etc- not bad at all for a desktop PC!
CPU usage for the last 24 hours
The big spike around 22:00 was when I cloned up a whole load more VM’s – seems to have upset the stats so need to try and have a look at that..
It’s also interesting to note that I added 4 Windows 2003 VM’s last night but that hasn’t actually increased the overall CPU average – ESX must be quite efficient at time-slicing all those idle VMs.
I had 3-4 “deploy from template..” operations going on at the same time and it really bogged down the performance of the VM’s (usable, but only just..) but it is just a single SATA disk drive so I can live with that.
Deploying 1 VM at a time had little or no impact – slight CPU spike to ~50% as you’ll see to the far right of the chart as I kicked off another one just now.
When i get time I’m going to drop some jobs into the VM’s that will tax the virtual CPUs a bit more and compare results – maybe some Folding@Home activity Mmmmm that would definitley tax it.