An insurance company that recently sought to implement cross-site data replication for disaster recovery and business continuity found that it couldn’t achieve the network performance it needed to back up its data.
To overcome this issue, the firm manually backed up its data to disk and then to tape, before physically transporting the tapes to the organisation’s disaster recovery data centre at another site.
This kind of manual process is very slow, and it requires more resources than an automated back-up over a virtual private network (VPN) or a wide area network (WAN).
“People are doing cross-site replication, but there are two flavours of it,” says David Trossell, CEO of Bridgeworks. In his view the first flavour is a “continuous drip feed between storage arrays for failover capabilities – whether that be synchronous or asynchronous replication; and then the next is a security back-up to fall back on whenever a disaster occurs in order to recover the situation”.
Under consideration
Companies want to replicate data across sites. The problem is that, at the moment, it is something that is more under consideration than put into practice – and that is down to the speed of the networks.
Clive Longbottom, client services director at analyst firm Quocirca, adds that cross-site replication for business continuity through live synchronisation is considered to be too expensive for the majority of companies.
“Therefore, most still go for disaster recovery, using more of a batch-mode copy function, and images and incrementals to provide the capability for restoring data to a prime site if it’s ever required,” he says.
“With modern capabilities there are several things that fall in between these two extremes: one is to replicate the data remotely to a known point (semi-synchronous to avoid the replication of any data corruption), and the other is to hold application image files in storage as well.”
This means that when a disaster occurs, the images can be rapidly spun up and run against the data stored at the second site – an approach he claims is cheaper than a “full hot-hot business continuity strategy”.
Performance challenges
So why isn’t it possible for some customers to gain the performance they require to back up growing amounts of data with their existing WAN optimisation tools?
According to Trossell, many WAN optimisation products rely on deduplication and compression as core components, and many won’t handle block storage data. He has therefore found that few can cope with the transfer requirements of cross-site replication.
“It’s going to vary but we are talking gigabytes up, and – as well as network latency – the availability of high-speed WAN is also limiting,” he explains.
With these points in mind, Longbottom seems to concur. “If a company is heavy on images, voice, video or encrypted data,” he says, “then WAN acceleration is pretty useless as it tends to major on deduplication, compression and packet shaping, which don’t work on these data types.”
In his experience, caching is the only area where it works, and yet he believes this is not particularly useful when it comes to developing and putting in place a disaster recovery and business continuity strategy.
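To make the point concrete, here is a minimal sketch, in Python, of why compression-centric acceleration struggles with encrypted content: ciphertext looks statistically random, so it barely compresses, whereas repetitive text compresses dramatically. The payloads and sizes below are illustrative assumptions, not measurements from any particular WAN product.

```python
# Compare how well zlib compresses repetitive text versus random-looking bytes,
# which stand in here for encrypted data. Figures are illustrative only.
import os
import zlib

repetitive_text = b"quarterly policy renewal record " * 4096   # compresses well
random_like = os.urandom(len(repetitive_text))                  # stand-in for ciphertext

for label, payload in [("text-like data", repetitive_text),
                       ("encrypted/random-like data", random_like)]:
    compressed = zlib.compress(payload, level=6)
    ratio = len(compressed) / len(payload)
    print(f"{label}: {len(payload)} -> {len(compressed)} bytes "
          f"({ratio:.0%} of original size)")
```

The text-like payload shrinks to a tiny fraction of its original size, while the random-looking payload stays essentially the same size – which is why deduplication and compression add little for encrypted, pre-compressed or media-heavy traffic.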
Making a difference
Yet a difference can be made with the right solution. So how did the insurance company manage to change its approach to cross-site replication? With two 10Gb links, it was able to move 2.2GB of data per second in each direction simultaneously.
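For context, those figures sit close to the raw capacity of the links. A rough back-of-the-envelope calculation, assuming two 10Gbit/s links and ignoring protocol overheads, looks like this:

```python
# Rough arithmetic behind the quoted figures (assumed values, not vendor data):
# two 10 Gbit/s links give roughly 2.5 GB/s per direction in theory, so a
# sustained 2.2 GB/s each way is about 88% of the raw capacity.
LINKS = 2
LINK_GBPS = 10                                # gigabits per second per link
theoretical_gb_per_s = LINKS * LINK_GBPS / 8  # convert bits to bytes
observed_gb_per_s = 2.2

print(f"Theoretical maximum: {theoretical_gb_per_s:.2f} GB/s per direction")
print(f"Observed throughput: {observed_gb_per_s:.2f} GB/s "
      f"({observed_gb_per_s / theoretical_gb_per_s:.0%} of raw capacity)")
```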
By removing its dark fibre, which it had previously deployed because it perceived restrictions on data movement, the company reverted to its two original 10Gb pipes.
The company’s existing equipment restricted its ability to do high-speed transfer across sites, but by using machine intelligence it was able to monitor the data from ingress to egress and everywhere in between, with the software automatically making the necessary adjustments.
This highlighted a problem that is common with IT vendors when data can’t be injected into the network quickly enough. Trossell explains that it’s not good enough to just have the connectivity, because on its own it won’t allow organisations to analyse the data flow across a WAN.
The further away the two data centres are, the more latency there will be whenever cross-site data replication is enacted.
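The effect of that latency on a single connection can be sketched with the bandwidth-delay product: one TCP stream can only keep a window’s worth of unacknowledged data in flight per round trip, so its throughput is capped at window size divided by round-trip time, however fat the pipe. The window size and RTT values below are assumptions for illustration, not details of the company’s deployment.

```python
# Upper bound on single-stream TCP throughput: window_size / round-trip time.
# Values are illustrative assumptions, not measurements from the deployment above.

def max_tcp_throughput_gbps(window_bytes: int, rtt_ms: float) -> float:
    """Return the single-stream TCP throughput ceiling in Gbit/s."""
    rtt_s = rtt_ms / 1000.0
    return (window_bytes * 8) / rtt_s / 1e9

WINDOW = 64 * 1024              # a classic 64 KB TCP window
for rtt in (1, 10, 50, 100):    # round-trip times in milliseconds
    print(f"RTT {rtt:>3} ms -> at most "
          f"{max_tcp_throughput_gbps(WINDOW, rtt):6.3f} Gbit/s per stream")
```

As the round-trip time grows, the ceiling on each stream collapses, which is why distant data centres struggle to fill even a modest link without techniques such as larger windows or many parallel streams.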
By deploying a solution for self-configuring infrastructure and optimised networks (SCIONs), the company’s data flow was able to run at a constant speed – improving its data transfer performance by 50-60% and enabling it to undertake cross-site replication.
Back-up comparisons
But with the insurance company using on-premise disks rather than back-up as a service (BaaS), the question arises of whether backing up to disk is as secure as its cloud counterpart.
“If you are backing up to physical media and it gets stolen, then everything is available to the thief who steals the drives,” says Longbottom, “or if it’s backed up to the cloud, then a middle man could gain access to the data as well.”
Trossell thinks that tape is the greenest and most efficient means of long-term storage. And while he is unaware of SLA guarantees around the speed and performance of BaaS, cross-site replication is nevertheless needed for security purposes as it provides more control.
In spite of his comments, he adds that there is not one single solution that fits all purposes. In other words, an organisation could operate a cross-site replication strategy along with cloud back-up, as well as tape and security back-ups.
Longbottom adds that only certain types of encryption can be applied while data is on the move, because only certain kinds of WAN acceleration can act on encrypted data.
“So the physical pigs (NAS-based storage units used to get the data from one data centre to another for at least the initial copy) can be more effective and secure than trying to copy large amounts of encrypted data across the WAN,” he explains.
To back up its data, the insurance company took a snapshot to disk, which was then backed up to tape and copied to a second tape before being sent to a remote data centre at another site.
This process took too long to complete: the snapshot approach was meant to architect around the network latency problem, but if the back-up didn’t finish on time the company had to fall back on the first snapshot, which meant the data could be up to 24 hours old, Trossell explains. This was unsatisfactory.
However, there are methods that allow organisations’ databases to track any changes that have been made – such as redo logs – and he says that asynchronous cross-site replication can keep the data up to date.
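As a rough illustration of the idea, the sketch below ships changes from a local change log to a remote copy in the background, in the spirit of redo-log-driven asynchronous replication. The queue, data structures and names are simplified stand-ins for illustration, not any vendor’s or database’s actual mechanism.

```python
# Minimal sketch of asynchronous replication driven by a change log: writes are
# applied locally and appended to a log, and a background worker replays them
# on the remote copy. The remote site lags slightly but stays far fresher than
# a daily tape cycle.
import queue
import threading
import time

local_copy: dict = {}
remote_copy: dict = {}
change_log: queue.Queue = queue.Queue()

def record_change(key: str, value: str) -> None:
    """Write locally and append the change to the log for asynchronous shipping."""
    local_copy[key] = value
    change_log.put({"key": key, "value": value, "ts": time.time()})

def ship_changes() -> None:
    """Background worker: drain the log and replay each change on the remote copy."""
    while True:
        change = change_log.get()
        remote_copy[change["key"]] = change["value"]
        change_log.task_done()

threading.Thread(target=ship_changes, daemon=True).start()

record_change("policy:1001", "renewed")
record_change("policy:1002", "lapsed")
change_log.join()                    # wait for the replication backlog to drain
print(remote_copy == local_copy)     # True once the remote copy has caught up
```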