Solving the data quality conundrum: how to go from big data to trustworthy data

Let's consider the classic meeting scenario where a Sales Director presents the company's most profitable customers to the management team.

Before she is even halfway through her first slide, someone interjects to point out that Cable&Wireless Worldwide is no longer trading and that General Electric reduced its regular order significantly last quarter.

People around the table have spotted errors in the data, and trust in the analysis is immediately lost; the faces in the room tell the story all too clearly. How can that team go on to make an effective decision?

> See also: The key steps to achieving data quality over quantity

Similarly, if you're a data analyst, you've probably experienced the often tense relationship that can develop with the IT colleagues who integrate and prepare data on your behalf.

Comments such as 'I thought we asked for all three customer databases to be brought together for this analysis' or 'sorry, for us to extract, transform and load that much data in different formats, it's going to be another three months' are not unusual.

These scenarios highlight two of the most pressing problems preventing companies from making effective, data-driven decisions. Business leaders are witnessing the impact of poor quality data; data analysts and IT managers are struggling with the challenge of pulling disparate data together from multiple sources.

In a world where companies must make increasingly rapid decisions, both of the above can frustrate, hamper and sabotage even the best-laid plans.

Making sense of a company's data typically requires individuals in different roles with different skill sets: business executives who know what questions to ask of the data, data analysts who hunt through the data to answer those questions, and IT people who ensure the data is prepared and ready for analysts to work with.

While everyone has a role to play in preparing data, the challenge lies in empowering all of these groups with the processes and tools they need to manage data collaboratively.

'Big Data' has of course made the opportunities (or the problems) even bigger, with the promise of unprecedented insights into markets, customers and operations.

But what really matters is trustworthy data – be it 'big', 'regular' or 'small'. And the vast majority of decision makers know they can't trust the quality of the data, or the way it has been manipulated before it's presented to them as 'intelligence'.

This is not a new problem. It is decades old. Yet until very recently the overwhelming majority of organisations have ignored it. Why?

Put simply, it is because the process of determining which data is required for analysis, pulling it together, preparing it for analysis and then actually producing the analysis, is immensely complex and difficult in any organisation larger than a few dozen employees:

- Responsibility for data access, ownership, storage, quality, compilation, preparation and analysis is generally scattered across multiple organisational functions.

- The sheer detail and complexity of large corporate datasets, coupled with a widespread lack of data governance, quality and management processes, make it effectively impossible to achieve any meaningful collaboration between these owners.

- Traditional technologies do not support the efficient processing of large or increasingly complex datasets that contain both structured and unstructured data.

After many years of making do, IT, Data and BI functions are now starting to work together to rethink traditional ETL (Extract, Transform and Load) techniques and deliver new, streamlined, automated processes that turn data into highly trustworthy intelligence, without the endeavour being stressful.
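To make the ETL shorthand concrete, here is a minimal, illustrative sketch of the traditional pattern being rethought: extract raw records from source systems, transform them into an analysable shape, and load the result into a reporting store. It uses pandas and SQLite, and the file, table and column names are hypothetical.

```python
import sqlite3

import pandas as pd

# --- Extract: pull raw data from (hypothetical) source systems ---
orders = pd.read_csv("orders_export.csv")        # e.g. an ERP export
customers = pd.read_csv("crm_customers.csv")     # e.g. a CRM export

# --- Transform: clean and reshape the data so it is fit for analysis ---
orders["order_date"] = pd.to_datetime(orders["order_date"], errors="coerce")
orders = orders.dropna(subset=["order_date", "customer_id"])
customers["name"] = customers["name"].str.strip().str.title()

# Join orders to customers and aggregate revenue per customer
revenue = (
    orders.merge(customers, on="customer_id", how="inner")
          .groupby(["customer_id", "name"], as_index=False)["order_value"]
          .sum()
          .rename(columns={"order_value": "total_revenue"})
)

# --- Load: write the prepared table into a reporting database ---
with sqlite3.connect("reporting.db") as conn:
    revenue.to_sql("customer_revenue", conn, if_exists="replace", index=False)
```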

> See also: How data quality analytics can help businesses 'follow the rabbit'

These initiatives are driven by the realisation that huge value can be gained by empowering business analysts to undertake the data preparation, quality and analysis process in its entirety.

A new generation of technologies enables analysts to search for data across the enterprise and to join, normalise, filter and transform data sets through point-and-click interfaces, meaning that reliance on IT colleagues to undertake such legwork is all but a thing of the past.

Analysts can now rapidly find and combine the data they need before presenting it to business executives using data visualisation tools such as Tableau and QlikView. This leaves IT colleagues more time to focus on managing the systems that hold and process data, and on setting appropriate access controls.
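The tools described above are point-and-click, but the underlying operations an analyst performs (combining sources, normalising fields, removing duplicates, filtering) can be sketched in a few lines of pandas. The three customer files and column names below are hypothetical stand-ins for the 'three customer databases' mentioned earlier, and the output is a simple extract that a tool such as Tableau or QlikView could connect to.

```python
import pandas as pd

# Combine the three (hypothetical) regional customer databases into one view
sources = ["customers_uk.csv", "customers_us.csv", "customers_apac.csv"]
customers = pd.concat([pd.read_csv(path) for path in sources], ignore_index=True)

# Normalise the fields used to match records across systems
customers["company"] = customers["company"].str.strip().str.upper()
customers["country"] = customers["country"].str.strip().str.upper()

# Remove duplicates introduced by overlap between the source systems
customers = customers.drop_duplicates(subset=["company", "country"])

# Filter to active accounts, then rank by annual revenue
top_customers = (
    customers[customers["status"] == "ACTIVE"]
    .sort_values("annual_revenue", ascending=False)
    .head(50)
)

# Write a clean extract for a visualisation tool (Tableau, QlikView, ...) to pick up
top_customers.to_csv("top_customers.csv", index=False)
```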

Sourced from Ed Wrazen, VP product management, big data, Trillium Software
