My ramblings on the stuff that holds it all together
The Traditional Approach to Test Labs & Why I Think It’s Wrong
If you’re an infrastructure person, there are all kinds of business drivers and company schemes that mean you often get madcap requests** for changes to your nice, stable, secure infrastructure, the one that lets everyone get on with their jobs without having to worry about it.
Not all of them make sense, and hopefully most of them are worthwhile, but they all need to be done NOW!
Now, as much as they would like you to, you’re unlikely to give John Doe, head of marketing, full access to install complicated software like Microsoft Symantec CA WizzBang Enterprise v666sp4 into your production environment without some kind of testing, because:
- If the production system falls over once someone has installed it, people are going to come and beat you until you get it going again.
- Its seemingly innocuous and lightweight (the salesman said so…) data exchange agent, which you need to install on all your key hosts, may have some nasty incompatibilities with your standard installation. For example, it breaks the AV agent (or vice versa), or it needs a .NET Framework version that is incompatible or untested with your production .NET apps.
Herein lies the problem: you need to install it to check it out, run it against your standard configs, builds, apps etc., and achieve a reasonable level of confidence that the moment you click setup.exe it won’t be a case of not seeing your family again for a long time as servers and networks come crashing down around your ears.
The traditional approach to doing this (and I say traditional because this is what I’ve seen MSCS and all the big consultancy orgs and vendors recommend) is to get a bunch of servers, install the OS and the new app, and see what happens.
This has a number of problems in my eyes.
It’s a vanilla, fresh install: it doesn’t have all the upgrades, patches, badly uninstalled apps and lurking data corruption issues that your production hosts have been exposed to.
Change control: great, you’ve got it, and everyone has been on the ITIL courses, but there is ALWAYS deviation from the standard config.
I’ve yet to see any installation that looks exactly like the CMDB says it should. It’s both a management and a technical issue:
- Engineers are people. If someone is on the on-call rota and something needs fixing at 3am, they may hack away at it to get things running again. That doesn’t necessarily mean they can remember every unsuccessful change they made when they update the problem ticket and incident report the following morning.
- Every app under the sun wants to auto-update itself these days. Are you really 100% confident you have regression tested every combination of Windows Updates etc.? Particularly security updates, where there is a security compliance driver to get them out ASAP.
- There is a huge amount of inter-dependency in Windows infrastructures: apps rely on authentication, which relies on AD, which relies on a bunch of DLLs, which rely on an OS, which relies on sub-components that rely on drivers, frameworks, runtimes etc. If you build from scratch, can you really capture all of them? That goes particularly for those hosts that have been around since the dawn of time, where nobody knows exactly what they do or, sometimes, where they physically are (apart from that contractor who left last week :)), but when they go wrong you know about it straight away.
- Someone may have adjusted permissions or group memberships in AD or on a server at some point in the past in a way that would cause the new app to fail, and that would not be picked up by just installing a fresh server and the new app.
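To make the "CMDB says one thing, the host says another" point a bit more concrete, here is a minimal Python sketch of a drift check. Everything in it is made up for illustration: the component names, the version numbers and the idea that you already have both lists as simple name-to-version mappings are all assumptions, and in real life getting an honest inventory off the host is the hard part.

```python
# Hypothetical sketch: compare what the CMDB records for a host with what
# an inventory scan of that host actually reports. All names and versions
# below are invented example data, not output from any real tool.

def config_drift(expected, actual):
    """Return components that are missing, unexpected, or at a different version."""
    # Recorded in the CMDB but not found on the host.
    missing = {name: ver for name, ver in expected.items() if name not in actual}
    # Found on the host but never recorded in the CMDB.
    unexpected = {name: ver for name, ver in actual.items() if name not in expected}
    # Present in both, but the versions disagree: (recorded, observed).
    changed = {
        name: (expected[name], actual[name])
        for name in expected.keys() & actual.keys()
        if expected[name] != actual[name]
    }
    return {"missing": missing, "unexpected": unexpected, "changed": changed}


# What the CMDB claims is installed (assumed example data).
cmdb_record = {"av-agent": "10.2", "dotnet-framework": "3.5", "backup-agent": "7.0"}
# What a scan of the production host finds (assumed example data).
host_scan = {"av-agent": "10.4", "dotnet-framework": "3.5", "old-monitoring-tool": "2.1"}

drift = config_drift(cmdb_record, host_scan)
for kind, items in drift.items():
    print(kind, items)
```

A freshly built test server would match the CMDB record perfectly, which is exactly why it tells you so little: the auto-updated AV agent and the forgotten monitoring tool only exist on the real host.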
I’m not advocating spending so much time testing that the change you wanted to implement is out of date by the time you’ve finished. You’ll never catch everything, but within this group of posts I’m hoping to propose a more pragmatic approach using copies of real systems and real data, rather than just making some up and hoping for the best.
It’s always easier to install something in a clean environment than in an existing one.
*are you new here?
**For example: “Hi, are you the computer people? We’ve decided to re-brand the company this week and we need to change the Active Directory domain name thing to wizzynewcorpname.net, oh yeah and we need it doing at 1pm tomorrow in line with the press release, thanks <click>” (yep, that’s a real example!)