How is a data centre like a software development house? It sounds like the beginning of a bar bet, but the truth is that the two have a great deal in common. Besides the long hours, the need for pizza at odd hours of the night, and the always-on air conditioning in the office, both organisations have to tread very lightly when making changes to their work.
When writing software, programmers test all changes against the previous versions of their work to make sure that the changes have not damaged the delicate balance between different elements of a program, and a program’s interaction with a computer or network.
The best way to do this, professionals agree, is via regression testing – where programmers re-test the changed code and compare the software's behaviour and performance with how it behaved before the changes were implemented.
It’s the best way to counteract the “butterfly effect” – where even a slight change in one part of the system can throw an entire project out of whack – whether it’s a program with thousands of lines of code that needs to interact with a computer, network, and user input, or a data centre with thousands of programs and hundreds of pieces of hardware and peripherals that need to work in tandem.
For programmers, regression testing represents an important tool to ensure that changes to code do not negatively impact the performance of the program. Using regression testing tools, programmers can test changes to the code they are working on and examine the performance of the program before and after those changes. And the regression testing tools in use today are automated – allowing every line of code that is changed, added, or removed to be examined, a task that would be impossible without automated tools.
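As a minimal sketch of what that automation can look like, the hypothetical test below compares a function's current output against a stored "golden" baseline, so any behavioural change is flagged as soon as it is introduced. The function name, baseline file, and use of pytest-style assertions are illustrative assumptions, not part of the original example.

```python
# A minimal regression-test sketch (illustrative; names are hypothetical).
# The idea: capture a known-good "baseline" output once, then have every
# subsequent test run compare the current output against that baseline.
import json
from pathlib import Path

def format_patient_summary(record: dict) -> str:
    """Toy function under test: formats a patient record for display."""
    return f"{record['name']} ({record['id']}): {record['status']}"

BASELINE_FILE = Path("baseline_summary.json")

def test_summary_matches_baseline():
    record = {"name": "A. Example", "id": 42, "status": "active"}
    current = format_patient_summary(record)

    if not BASELINE_FILE.exists():
        # First run: record the known-good output as the baseline.
        BASELINE_FILE.write_text(json.dumps(current))

    baseline = json.loads(BASELINE_FILE.read_text())
    # Any code change that alters the output fails this regression test.
    assert current == baseline
```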
For example, an application that enables medical staff to update details of patients in a health clinic could include buttons that allow for adding, deleting, or saving information. The program does what it is supposed to do, but now management has decided that it wants a refresh function added to update a record in real time. Programmers add that module, and it works as well – but the overall performance of the program suffers. Why? And what can be done about it?
With regression testing tools, programmers can evaluate the full impact of the new module on the rest of the program and on the wider computing environment. If a change negatively impacts the performance of the program in any way – if, for example, a change to the user interface results in a delay in connecting to the Internet – programmers know right away, and can correct or revise the changes they made in order to ensure optimal performance.
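To make the performance side of this concrete, here is a hedged sketch assuming a hypothetical save_record function in the patient-record application: the test times the operation and fails if it has slowed down markedly relative to a recorded baseline. The function, threshold, and baseline file are illustrative placeholders, not taken from any real product.

```python
# Performance-regression sketch (function, threshold, and data are assumptions).
import json
import time
from pathlib import Path

def save_record(record: dict) -> None:
    """Stand-in for the clinic app's Save operation."""
    time.sleep(0.01)  # simulate the real work

TIMING_BASELINE = Path("baseline_timing.json")
ALLOWED_SLOWDOWN = 1.5  # fail if more than 50% slower than the baseline

def test_save_record_performance():
    start = time.perf_counter()
    save_record({"name": "A. Example", "id": 42, "status": "active"})
    elapsed = time.perf_counter() - start

    if not TIMING_BASELINE.exists():
        # First run: record the known-good timing as the baseline.
        TIMING_BASELINE.write_text(json.dumps(elapsed))

    baseline = json.loads(TIMING_BASELINE.read_text())
    # A new module (e.g. the refresh function) that drags down Save
    # performance shows up here immediately.
    assert elapsed <= baseline * ALLOWED_SLOWDOWN
```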
Data centre managers face the same dilemma. The addition of new software, services, equipment, and clients to the “mix” can impact the performance of the whole, creating havoc – not just for a group of programmers working on deadline, but for thousands of workers in hundreds of firms that have migrated most of their major operations to the cloud, which resides on the servers of the data centre.
An automated regression testing system – where the impact of each change or addition is analysed, and performance pre- and post-change is examined and compared – could save data centre managers, along with their cloud customers, a great deal of anguish, frustration, and money.
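What such a pre- and post-change comparison might look like is sketched below, under stated assumptions: a hypothetical collect_kpis helper gathers a few infrastructure metrics before a change, a second snapshot is taken afterwards, and any KPI that regresses beyond a tolerance is reported. The metric names, values, and tolerance are placeholders, not a description of any particular monitoring product.

```python
# Pre/post change KPI comparison (a sketch; metrics and tolerance are hypothetical).
TOLERANCE = 0.10  # flag any KPI that worsens by more than 10%

def collect_kpis() -> dict:
    """Placeholder for pulling metrics from the monitoring system."""
    return {
        "storage_latency_ms": 4.2,
        "failover_time_s": 38.0,
        "replication_lag_s": 1.5,
    }

def compare_kpis(before: dict, after: dict) -> list[str]:
    """Return a list of KPIs that regressed beyond the tolerance."""
    regressions = []
    for name, old_value in before.items():
        new_value = after.get(name)
        if new_value is not None and new_value > old_value * (1 + TOLERANCE):
            regressions.append(f"{name}: {old_value} -> {new_value}")
    return regressions

before = collect_kpis()   # snapshot taken before the change window
# ... change is applied to the data centre here ...
after = collect_kpis()    # snapshot taken after the change
for issue in compare_kpis(before, after):
    print("KPI regression:", issue)
```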
Could something as simple as a software upgrade to a single system – one of thousands in a data centre – really have such an impact? Ask the folks at the New York Stock Exchange, who found themselves facing a major outage on July 8, 2015, a day that will live in infamy for the many traders who were stuck for hours while the world’s most important stock trading platform was out of commission.
A subsequent examination showed that it wasn’t something as scary – or sexy – as hackers or cyber-terrorists that had brought the market to a halt, but a “glitch” in the rollout of a new version of software used by the NYSE. According to the Exchange, “the rollout of a software release” that was loaded onto computers “not loaded with the proper configuration compatible with the new release” caused the outage. Although not a data centre per se, the NYSE’s platform is relied upon by a great many people – just as a data centre is.
The same applies to any change. Outages are far too common at data centres, and they can occur for any number of reasons. Even something as simple as adding more storage or applying routine patches and updates can cause one, while incorrect driver or firmware configuration, or patches applied inconsistently, can leave systems exposed.
Of course, many changes made to a data centre are far more complex – a major update of the virtualisation software, for example, involves adapting to hundreds of new vendor best practices and, if incorrectly designed or performed, might affect thousands of VMs. With millions of possibilities, many of them interconnected and mutually dependent, there is no way for a human being to check them all.
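One small piece of that checking can be automated along the lines of the sketch below, which assumes a hypothetical inventory of hosts and compares each host's driver and firmware versions against an expected baseline, reporting any drift. The host names, components, and version numbers are invented for illustration.

```python
# Configuration-consistency check across hosts (illustrative data and versions).
EXPECTED = {"nic_driver": "4.19.2", "hba_firmware": "12.8.1"}

# Hypothetical inventory, e.g. exported from a CMDB or collected by an agent.
hosts = {
    "esx-host-01": {"nic_driver": "4.19.2", "hba_firmware": "12.8.1"},
    "esx-host-02": {"nic_driver": "4.19.2", "hba_firmware": "12.7.0"},  # drifted
    "esx-host-03": {"nic_driver": "4.18.9", "hba_firmware": "12.8.1"},  # drifted
}

def find_drift(inventory: dict, expected: dict) -> list[str]:
    """Return one message per host component that deviates from the baseline."""
    findings = []
    for host, config in inventory.items():
        for component, wanted in expected.items():
            actual = config.get(component)
            if actual != wanted:
                findings.append(f"{host}: {component} is {actual}, expected {wanted}")
    return findings

for finding in find_drift(hosts, EXPECTED):
    print("Configuration drift:", finding)
```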
In fact, according to a recent study by the University of Chicago, the most common reported cause of outages in organisations is “unknown.” If a regression testing approach were adopted for IT change validation, a process could be established to thoroughly assess the correctness of each change – validating that no critical vendor best practices had been breached, that changes were applied consistently across the entire data centre, and that important resiliency KPIs were still being met.
Of course, given the complexity of a modern data centre and the frequency of change, it would make sense to fully automate regression testing – with a workflow that flags each deviation in the IT change control environment. An approach like this could help ensure that even small changes that might escape the attention of staff are tested individually.
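Pulling these threads together, the sketch below shows, under stated assumptions, what such an automated workflow could look like: each pending change is run through a small set of best-practice rules, and any deviation is flagged for change control before the change is approved. The rules and the change record are hypothetical placeholders, not the checks of any specific tool.

```python
# Automated change-validation workflow (a sketch; rules and data are hypothetical).
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Change:
    description: str
    config: dict  # the configuration state the change would produce

# Each rule inspects the resulting configuration and returns a finding, or None.
def rule_snapshot_before_change(change: Change) -> Optional[str]:
    if not change.config.get("pre_change_snapshot"):
        return "No pre-change snapshot/backup recorded"
    return None

def rule_redundant_paths(change: Change) -> Optional[str]:
    if change.config.get("storage_paths", 0) < 2:
        return "Fewer than two storage paths – resiliency KPI at risk"
    return None

RULES: list[Callable[[Change], Optional[str]]] = [
    rule_snapshot_before_change,
    rule_redundant_paths,
]

def validate(change: Change) -> list[str]:
    """Run every rule and collect the deviations to flag in change control."""
    return [finding for rule in RULES if (finding := rule(change))]

change = Change("Apply storage firmware update",
                {"pre_change_snapshot": True, "storage_paths": 1})
for deviation in validate(change):
    print(f"Flagged for review ({change.description}):", deviation)
```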
Sourced by Yaniv Valik, VP Product of Continuity Software