Storage – not the most glamorous segment of the IT industry.
At least compared with UC’s (unified communications) distinctly cultish promotion of abstract utopian concepts of technological connectedness, or IT Security’s ongoing three-way battle between the Good, the Bad and the User.
Storage, by comparison, is mundane and physical; unlike business intelligence, it does what it says on the tin.
In fact I’ve often (OK, not really) wondered at the reaction of EMC execs when faced with the dinner party question: “so, what do you do?”
To the layman, “storage” as a response probably conjures up images of besuited salesmen canvassing suburban neighbourhoods selling tupperware to bored housewives in floral print dresses; a necessary, indeed noble pursuit, but not one about to make the cover of TIME.
So it’s significant that two of the most profound IT developments in the last year have come from the storage industry: the rise of enterprise-grade Flash storage, and the validation of deduplication (dedupe), most recently dramatised in the frenetic and unprecedented bidding war between EMC and NetApp for dedupe leader Data Domain. Result: a win for EMC at a cost of $2.1 billion, and a house in the Caribbean for the lucky creator of a hashing algorithm.
When we wrote about dedupe in June last year, the mere thought of an appliance deleting data in its care was enough to cause palpitations amongst the data centre managers we spoke to. That role was one to be filled only by viruses and butter-fingered work experience students delivering coffee.
Perceptions have changed, because the dedupe concept is just so compelling. According to analyst firm IDC’s estimates, the amount of stored data is growing at the rate of 60% a year. Furthermore, a massive 75% of that data is thought to be replicated, across innumerable storage servers, tapes and local PCs, all sucking power, needing maintenance, purchase and disposal.
That hasn’t really been a worry until now. With storage so cheap and plentiful, when you ran out of room, you just bought more. However, the curve showing stored data to capacity growth (the storage industry’s Moore’s Law equivalent) is levelling, and businesses are realising that storing data inefficiently wastes space and power, and can make compliance (more of) a headache.
Advanced dedupe technologies scan at a sub-file level, removing duplicate data and replacing it with ‘pointers’ to an original. Unlike inferior single-instance storage (SIS), which works at a file level, proper dedupe will recognise a company logo stored on all corporate correspondence, office stationery and every page of every presentation, and will store it only once for the entire organisation.
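For the curious, the 'pointers' idea can be sketched in a few lines of Python. This is a toy illustration only, not how Data Domain or any other vendor actually does it: the `ChunkStore` class, its method names and the fixed 4 KB chunk size are all assumptions made for the example, and commercial products typically use cleverer variable-size chunking so that a logo is recognised wherever it sits in a file.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; purely illustrative


class ChunkStore:
    """Toy sub-file dedupe store (hypothetical, for illustration only)."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.chunks = {}  # SHA-256 digest -> chunk bytes, each stored once
        self.files = {}   # file name -> ordered list of digests ("pointers")

    def put(self, name, data):
        """Split data into chunks, keeping only chunks not seen before."""
        pointers = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # setdefault leaves an existing chunk untouched, so a
            # duplicate costs a pointer rather than a second copy.
            self.chunks.setdefault(digest, chunk)
            pointers.append(digest)
        self.files[name] = pointers

    def get(self, name):
        """Rebuild a file by following its pointers in order."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def logical_size(self):
        """Bytes the files would occupy without dedupe."""
        return sum(len(self.chunks[d])
                   for pointers in self.files.values() for d in pointers)

    def physical_size(self):
        """Bytes actually held after dedupe."""
        return sum(len(chunk) for chunk in self.chunks.values())


# A stand-in "logo" that fills exactly one chunk, reused by two documents.
logo = bytes(range(256)) * 16
store = ChunkStore()
store.put("letter.doc", logo + b"Dear customer")
store.put("slides.ppt", logo + b"Q3 results")
```

Both documents come back intact from `store.get(...)`, yet the logo chunk is held once, so `store.physical_size()` is roughly half of `store.logical_size()`. A file-level SIS scheme would see two different files and save nothing.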
The effects can be dramatic. Vendors might boast of compressing databases up to 20 times, but the real benefits shine through alongside technologies like virtualisation that involve a large amount of replicated data: with VMware, dedupe ratios of 40 to 60 times are not uncommon.
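To put those multiples in terms a finance director would recognise, a dedupe ratio converts to a fraction of capacity saved as 1 - 1/ratio. The helper below is a back-of-envelope sketch; `space_saved` is a name invented here, and the figures are simply the ratios quoted above, not measurements.

```python
def space_saved(ratio):
    """Fraction of raw capacity saved at a given dedupe ratio (e.g. 20 for 20:1)."""
    if ratio < 1:
        raise ValueError("a dedupe ratio below 1 would mean the data grew")
    return 1 - 1 / ratio


# The ratios quoted in the article: 20x for databases, 40-60x with VMware.
for ratio in (20, 40, 60):
    print(f"{ratio}:1 saves {space_saved(ratio):.1%} of capacity")
```

Note the diminishing returns: going from 20 times to 60 times only moves the saving from 95% to a little over 98%, so most of the benefit arrives well before the headline numbers.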
There are big potential savings behind numbers like those, and big profits, which is why EMC pursued Data Domain so rabidly: “Deduplication is going to exist in every part of your business… in many forms,” said EMC CEO Joe Tucci, a full year ago.
By dramatically increasing the efficiency of storage, dedupe also meshes well with the storage industry’s other recent epiphany: enterprise-grade Flash. Flash is great stuff: no moving parts, a predictable lifespan, lightning-fast reads, durability and next to no power draw. The catch is that it’s far more expensive per gigabyte than conventional spinning disk, and likely to remain so for the foreseeable future.
With dedupe optimising what’s stored, Flash becomes even more compelling, further justifying EMC’s ‘whatever they’re paying you, we’ll double it’ approach to acquiring Data Domain.
You can bet that company’s execs now have no problem explaining what they do at dinner parties: Golf. Pass the canapés.