Cashing in on open data

In 2009, the UK government asked Tim Berners-Lee, the inventor of the ‘World Wide Web’, and Nigel Shadbolt, a renowned computer scientist, to help make non-personal public sector data openly available through a single point of access. That project led to data.gov.uk, the online portal for open government data in the UK.

Political commitment to open data survived the transition to the current coalition government. “The phrase ‘open data’ and its use in transparency were everywhere in the Conservative manifesto and so, when they came to power, the commitments they were giving about releasing more data were clearly going to continue,” explains Shadbolt.

That is because the cause of open data traverses the political spectrum, he says. “If you believe in the use of non-personal government data for the public good, then it becomes a little bit like clean air – why wouldn’t you have it?”

This is not to say that everyone is in agreement about how open data should be produced, managed and used. One bone of contention relates to the notion, present in the Labour Party’s policy but accentuated by the coalition government, that open data should provide private businesses with commercial opportunities.

On the eve of the 2010 election, now-Cabinet Office minister Francis Maude promised that open government data would “boost British jobs”. He cited research by Dr Rufus Pollock of Cambridge University which claimed that open data would create an estimated £6 billion in additional value for the UK.

There are many ways in which open government data can aid businesses. For example, government spending data reveals how much public sector bodies are paying for goods and services, allowing alternative suppliers to pitch for business.

Another approach is for businesses to charge for commercial services that are based on open government data. This is a more contentious issue, however, as Spikes Cavell, a financial analytics service provider to local government organisations, found last year after it set up a website that published its clients’ spending data online.

Spikes Cavell said that SpotlightOnSpend, which was a free add-on service for its paying customers, helped local government organisations meet their transparency obligations, but some open data campaigners disagreed. In particular, they objected to the format in which the data was published, which was non-machine-readable and in summary form.

At the time, open data expert and blogger Chris Taggart wrote that “this is not open data, not a desirable approach, will not achieve the results of transparency or of equality of access, and is not good for the public sector”.

The storm blew over after Spikes Cavell agreed to publish the data in machine-readable format. But the episode showed that the commercial exploitation of open government data, despite cross-party support for open data itself, is still a matter of ideological debate.

Shadbolt is equivocal. “One notion is that the taxpayer has paid for collecting the data once already, and if a company makes a product that makes revenues, who’s going to benefit?” he says. “But there’s another argument that says the exchequer is going to benefit – it’s going to take tax revenue from it, and that’s better than it sitting there unused.”

He supports the likes of SpotlightOnSpend benefiting commercially from open government data. “They are taking data that is being published every month by local authorities, putting their own additional information and analytics onto that, and selling it back as a set of services around intelligent procurement or business intelligence. Why shouldn’t they do that?”

However, there are many unanswered questions, he admits. “There is a sense that public data is a new natural resource, and we don’t necessarily know at this point how the value will be generated. But we’ve got a good idea that some of it has high demand and there are people who want to do innovative things with it.”

Opening up Companies House data

One of those people is Chris Taggart himself. Last year, he launched OpenCorporates, an attempt to make government data about businesses more usable and more valuable.

“I’m old enough to remember going to Companies House and doing a search [for business information] on the microfiche,” says ex-journalist Taggart. “The problem is that the way companies are structured today is sufficiently complex that the old system doesn’t work. Any company of significant size will probably have numerous subsidiaries, and larger companies will almost certainly have overseas subsidiaries.”

The inefficiencies of the current system can therefore conceal the actions of a business from public scrutiny, he says. “The reason we have company registers is so that society can take an informed decision about doing business with those companies, and see whether they are behaving in the way society wants.”

The first step towards fixing this problem, Taggart says, is to have a framework for company information to hang on – a ‘who’s who’ of companies. OpenCorporates is building this from sources around the world; in the UK, these sources include the London Gazette, a journal of record that has been published by the government since the 17th century.

Every one of the 20 million-plus companies listed on OpenCorporates is allocated a unique URL. This will make other datasets that are integrated with OpenCorporates more useful, Taggart argues. “One of the things that we’re importing at the moment is the data protection register (DPR), which every UK organisation that holds data in electronic form has to register with,” he explains. “We’re matching that to OpenCorporates, which will show us which companies haven’t bothered registering and what purpose the others are registered for. That information has always been kept in the DPR, but actually, separated from all the other information, it’s pretty darn useless.”
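
To illustrate why a shared identifier makes this kind of matching straightforward, here is a minimal Python sketch. It is not OpenCorporates' actual method: the URL pattern, company names, company numbers and register entries are all invented, and it assumes both datasets already carry the same identifier, which in practice is likely to be the hard part.

```python
# Hypothetical URL pattern, used only for illustration.
BASE_URL = "https://example.org/companies"


def company_url(jurisdiction: str, company_number: str) -> str:
    """Build one unique URL per company from its jurisdiction and number."""
    return f"{BASE_URL}/{jurisdiction.lower()}/{company_number}"


# Invented sample of the company "who's who", keyed by unique URL.
companies = {
    company_url("gb", "00000001"): "Alpha Widgets Ltd",
    company_url("gb", "00000002"): "Beta Holdings Ltd",
    company_url("gb", "00000003"): "Gamma Services Ltd",
}

# Invented register entries, keyed by the same URL, listing registered purposes.
register = {
    company_url("gb", "00000001"): ["Staff administration"],
    company_url("gb", "00000003"): ["Advertising and marketing", "Accounts and records"],
}


def match_register(companies: dict, register: dict) -> tuple:
    """Split companies into those found in the register and those missing from it."""
    registered = {url: register[url] for url in companies if url in register}
    unregistered = sorted(url for url in companies if url not in register)
    return registered, unregistered


registered, unregistered = match_register(companies, register)
for url in unregistered:
    print(f"Not in register: {companies[url]} ({url})")
for url, purposes in registered.items():
    print(f"{companies[url]}: {', '.join(purposes)}")
```

Once every company has a single key, the question "which companies have not registered?" reduces to a simple lookup; without one, the same question requires fuzzy matching on names and addresses.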

While motivated to improve the current company registry system, Taggart wants OpenCorporates to make money, not least in order to make it sustainable.

His approach is to publish the data under a “share-alike” licence. This means that anyone can use it for free, as long as they make their own work available under the same licence. If they want to use it behind closed doors, however, they must pay.

“OpenCorporates allows anyone to use the data commercially – we don’t mind if you surround it with ads – as long as they make the datasets available under the same share-alike licence that we do,” Taggart explains. “But if someone wants to have it as a closed database – for a mailing list, say – then they must buy a copy. If [credit rating agency] Experian wanted to use it, for example, they wouldn’t have to open up the rest of their database, they could just buy a licence.”

Taggart admits that this model is experimental, however. “We are right at the beginning of this, and as yet there aren’t any open data business models that are really proven,” he says.

How the government sells data

A further issue with the monetisation of public sector data is that the government is already at it. Selling valuable datasets is a source of income for many government departments, including the Ordnance Survey, the Met Office and the Department for Transport. And with IT opening up new uses of data all the time, government datasets that are not being sold today may be commercially viable in the future.

Earlier this year, the Cabinet Office identified this as a sticking point for open data. “Many state agencies face a conflict between maximising revenues from the sale of data and making the data freely available to be exploited for social and economic gain,” Francis Maude acknowledged in January 2011.

Maude’s solution is the establishment of a Public Data Corporation, a single entity that will manage both open and commercial releases of public sector data.

The PDC “will allow us to make data freely available, and where charging for data is appropriate [then] to do so on a consistent basis,” said Maude. “It will be a centre of excellence where expertise in collecting, managing, storing and distributing data can be brought together. [And] it can be a vehicle which will attract private investment.”

In August 2011, the Cabinet Office published consultation documents on how the PDC might operate. The documents discussed the PDC’s various options for charging for public data: it could maintain the status quo, but with a commitment to making more data available for free; it could charge a flat rate for all datasets that are deemed suitable for sale; or it could adopt a ‘freemium’ model, allowing all datasets to be used for free until some limit has been reached – be it the quantity of data or the numbers of users – after which it is charged for.
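
The three options can be compared with a rough Python sketch. The consultation documents are not quoted here in enough detail to say how charges would actually be calculated, so the function names, prices and limits below are all assumptions; in particular, the freemium function assumes that only usage above the free limit is charged for.

```python
# All amounts are in pence; every price and limit here is invented for illustration.


def status_quo_charge(current_price: int, released_free: bool) -> int:
    """Option 1: keep today's price unless the dataset has been released for free."""
    return 0 if released_free else current_price


def flat_rate_charge(suitable_for_sale: bool, flat_rate: int) -> int:
    """Option 2: one flat rate for every dataset deemed suitable for sale."""
    return flat_rate if suitable_for_sale else 0


def freemium_charge(usage: int, free_limit: int, unit_price: int) -> int:
    """Option 3: free up to a limit, then charge per unit of usage beyond it.

    'usage' could be a quantity of data or a number of users; charging only
    for the excess is an assumption of this sketch.
    """
    billable = max(0, usage - free_limit)
    return billable * unit_price


# A user who stays under the free limit pays nothing under the freemium option.
print(freemium_charge(usage=800, free_limit=1000, unit_price=2))
# A heavier user pays only for the 500 units above the limit.
print(freemium_charge(usage=1500, free_limit=1000, unit_price=2))
```

The sketch makes the trade-off visible: the flat rate charges every user of a saleable dataset the same amount, while the freemium option leaves small-scale users untouched and shifts the cost to heavy users.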

The concept of the PDC has not been universally welcomed. Michael Cross, a journalist who launched The Guardian newspaper’s Free our Data campaign back in 2006, questioned whether a monolithic public body is best placed to discover new ways of exploiting data, commercially or otherwise.

“Whitehall-created corporations do not have a strong track record of innovation,” he wrote in a recent piece for The Daily Telegraph. “The state no longer tries to earn revenues by building telephones and ships: what makes it think it can do so at the cutting edge of the knowledge economy?”

For the past few years, the open data movement has enjoyed cross-party support and has been surrounded by a general air of optimism. Lately, however, it has become clear that if the government is to derive the maximum possible value from its non-personal datasets, it has to decide who will be the recipient of that value and how it will be created. These are two questions that seem to divide opinion.

Perhaps the topic of open data is more political than it first appeared.

Beatrice Bartlay

Beatrice Bartlay founded 2B Interface, a temporary and permanent staffing agency, in 2005 and has since been serving the UK recruitment sector with specialised services.
