When two of the UK’s financial services giants, Halifax and Bank of Scotland, merged in 2001, IT executives at the new entity were only too aware of the vast volumes of data they were bringing together. After the initial surge, however, they had anticipated a more modest rise in the quantity of data under management. Instead, it has kept growing at an unbridled pace of over 60% a year.
In an effort to bring that under control, managers have embraced a concept that has been often talked of but rarely adopted comprehensively: information lifecycle management (ILM) – the classification, storage and movement of data according to its value to the organisation.
Before starting its ILM project, though, HBOS first needed to know where it stood, says Bob Sibley, the bank’s senior infrastructure architect.
As of early 2006, Sibley estimates that HBOS has somewhere in excess of a petabyte of data stored in its data centres (there is even more in remote locations), with total volumes growing at around 63% a year. Around 400 terabytes of that is associated with mainframe systems, and a further 600 terabytes relates to its open systems.
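At that rate, the arithmetic compounds quickly. A back-of-envelope sketch (using only the figures above – a 1 petabyte baseline growing at 63% a year – as assumptions) shows stored volumes doubling roughly every 17 months:

```python
import math

# Assumed figures from the article: ~1 PB today, growing 63% a year.
baseline_pb = 1.0
annual_growth = 0.63

# Doubling time under compound growth: solve (1 + g)^t = 2 for t.
doubling_years = math.log(2) / math.log(1 + annual_growth)
print(f"Doubling time: {doubling_years:.2f} years (~{doubling_years * 12:.0f} months)")

# Projected volumes after five years of unchecked growth.
for year in range(1, 6):
    print(f"Year {year}: {baseline_pb * (1 + annual_growth) ** year:.2f} PB")
```

Left unchecked, the same petabyte becomes more than 11 petabytes within five years – which is why simply buying more disk stops being an answer.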
To date, it has been “cheaper and easier” for business units to “just add more storage, rather than think about how to manage [the data] in the long term”, says Sibley. But, in common with many other organisations, HBOS has naturally been practising some aspects of ILM for many years.
On assessing its credentials against a five-stage ILM maturity model, HBOS ranked itself between the third and fourth levels – “somewhere between ‘proactive’ and ‘optimised’,” says Sibley. “But we’re a bank; we anticipated that we’d be reasonably sophisticated.”
A more sophisticated level of ILM relies on the classification of all data, so that the most essential data can be stored on always-accessible high-end arrays, while data that is deemed of little importance can be streamed off to tape archives.
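The mapping from classification to storage tier is conceptually simple, even if agreeing on the classes is not. The sketch below is purely illustrative; the class names and tiers are assumptions, not HBOS’s actual scheme:

```python
from enum import Enum

class DataClass(Enum):
    CRITICAL = 1    # e.g. live account transactions
    IMPORTANT = 2   # e.g. recent statements, still queried
    ARCHIVAL = 3    # e.g. closed accounts kept for compliance

# Hypothetical mapping of business value to storage tier.
TIER_FOR_CLASS = {
    DataClass.CRITICAL: "high-end array (always accessible)",
    DataClass.IMPORTANT: "mid-range disk",
    DataClass.ARCHIVAL: "tape archive",
}

def placement(data_class: DataClass) -> str:
    """Return the storage tier for a given data classification."""
    return TIER_FOR_CLASS[data_class]

print(placement(DataClass.ARCHIVAL))  # -> tape archive
```

The hard part, as Sibley makes clear below, is not the lookup table but persuading the business to populate it.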
But for that to be feasible, users have to accept a new discipline. “Getting the users to classify their data is really hard,” says Sibley. It requires line-of-business managers to accept that some data “isn’t so important” – something that is met with innate resistance.
Then there is the question of compliance. “Finding out from our business units what the regulatory requirements are for [storing] data is a nightmare”, says Sibley. The confusing and sometimes contradictory array of regulations makes it simpler for managers to keep everything, permanently, rather than work out what actually needs to be kept.
One solution is to make them pay. HBOS’s IT infrastructure group already uses a chargeback model for its services, and as managers begin to be charged per gigabyte of stored data per month, some units will find that their storage costs rise. Getting the managers facing an increase in costs to buy in to the scheme remains a challenge, says Sibley.
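The logic of the incentive is easy to demonstrate. With hypothetical per-gigabyte rates (the tariff below is invented for illustration, not HBOS’s actual pricing), moving cold data off premium disk cuts a unit’s monthly bill sharply:

```python
# Hypothetical chargeback rates, in pence per GB per month, by tier.
RATE_PENCE_PER_GB = {
    "high-end array": 50,
    "mid-range disk": 15,
    "tape archive": 2,
}

def monthly_charge(gb: float, tier: str) -> float:
    """Monthly storage charge in pounds for a business unit."""
    return gb * RATE_PENCE_PER_GB[tier] / 100

# A unit keeping 10 TB on premium disk vs. moving 8 TB of it to tape.
all_premium = monthly_charge(10_000, "high-end array")
tiered = monthly_charge(2_000, "high-end array") + monthly_charge(8_000, "tape archive")
print(f"All on premium disk: £{all_premium:,.0f}/month")  # £5,000
print(f"Tiered:              £{tiered:,.0f}/month")        # £1,160
```

Once managers see that kind of difference on their own cost lines, classifying data stops being an abstract IT request.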
Despite the difficulties, the way data is handled within HBOS has to change, says Sibley. The ultimate goal is to automate data movement, so that data can be moved to the appropriate storage tier as its value declines. “This will be done by defining policies and doing it manually until the process is proven. Once we’ve got that trust we can look to automate it.”
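Such policies typically key off how long data has gone unaccessed. The sketch below shows one plausible shape for a demotion rule of that kind; the thresholds and names are assumptions, not HBOS’s actual policy:

```python
from datetime import datetime, timedelta

# Hypothetical policy: demote data as it ages without being accessed.
POLICY = [
    (timedelta(days=90), "mid-range disk"),   # untouched for 90 days
    (timedelta(days=365), "tape archive"),    # untouched for a year
]

def target_tier(last_access: datetime, now: datetime) -> str | None:
    """Return the tier a dataset should move to, or None to leave it in place."""
    idle = now - last_access
    destination = None
    for threshold, tier in POLICY:
        if idle >= threshold:
            destination = tier  # keep the deepest tier whose threshold is met
    return destination

print(target_tier(datetime(2005, 1, 1), datetime(2006, 1, 15)))  # -> tape archive
```

Running such a rule by hand first, as Sibley describes, lets administrators verify each proposed move before the same logic is handed to a scheduler.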
The transition to ILM is still in its infancy, admits Sibley. But HBOS “has so much data that we need to manage it better, and ideally we need it to ‘self-manage’,” he says. If you have a petabyte of data, ILM seems inevitable.