The 530 miles of shelving in Washington’s Library of Congress, the largest library in the world, hold the equivalent of 10 terabytes of information in text form. Search engine turned online media giant Yahoo receives the same amount of data every single day.
Yahoo uses this information, mainly click path and search logs, to help its advertisers target marketing material and advertisements to users with unmatched accuracy. This places data mining – finding salient patterns in data – right at the centre of its business model.
Yahoo’s chief data officer, Dr Usama Fayyad, gives his team of data analysts free rein over the scope of the work they undertake, setting them targets only for the value they are expected to create for the business. Supporting this flexibility of analysis in the face of so much data has meant Yahoo has amassed an analytics toolbox made up of products from a wide variety of sources – both external and internal.
Analysis is conducted using Oracle data marts which are extracted from the vast Yahoo data warehouse. Interrogation of these data marts is carried out using analytical tools from a supplier list that includes SAS, MicroStrategy, DecisionTree and Cognos. Even with this mix, some analyses require the use of in-house developed tools. “Whatever we can’t buy, we build,” says Fayyad.
With a well-equipped staff exploring correlations and patterns in the data, Yahoo can work with advertisers to target those customers most receptive to advertisers’ messages, and identify those customers who will be of greatest value.
Working with a wireless telecommunications carrier recently, the Yahoo data team built a model to predict any given user’s propensity to switch carriers and their prospective lifetime value, based upon click and search data (unlike his contemporaries at Google, Fayyad insists the content of users’ email should not provide the basis for analysis). This model provided a profile of key customers and determined which adverts the telecoms company would push to them and where these ads should appear on the Yahoo site.
“The beauty of this,” says Fayyad, “is that underneath there is some extremely complicated analysis, but any business person can easily understand the results.”
Yahoo also applies its analytics capability to the optimisation of its own services. When investigating what factors induced loyalty in users of its instant messaging service, analysts found that loyalty jumped when users progressed from four to five regular correspondents.
“This information gave the sales team the goal of getting each user to sign up five friends or colleagues,” explains Fayyad. Through the detailed analysis of user behaviour, Yahoo was able to optimise its sales targets to gain lasting value from users.
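The kind of threshold finding described above can be illustrated with a minimal sketch. It assumes each user record holds a count of regular correspondents and a flag for whether the user was still active later; the records, the function names and the resulting rates are all invented for illustration and are not Yahoo's data or method.

```python
from collections import defaultdict

# Hypothetical records: (regular_correspondents, still_active_later).
# Invented purely to echo the reported four-to-five jump.
USERS = [
    (3, False), (3, True), (3, False), (3, False),
    (4, True), (4, False), (4, False), (4, False),
    (5, True), (5, True), (5, True), (5, False),
    (6, True), (6, True), (6, True), (6, False),
]


def retention_by_contacts(users):
    """Return the share of retained users for each correspondent count."""
    counts = defaultdict(lambda: [0, 0])  # contacts -> [retained, total]
    for contacts, retained in users:
        counts[contacts][1] += 1
        if retained:
            counts[contacts][0] += 1
    return {k: kept / total for k, (kept, total) in sorted(counts.items())}


def largest_jump(rates):
    """Return the pair of neighbouring counts with the biggest rise in retention."""
    keys = sorted(rates)
    return max(zip(keys, keys[1:]), key=lambda pair: rates[pair[1]] - rates[pair[0]])


rates = retention_by_contacts(USERS)
print(rates)
print(largest_jump(rates))
```

With real data, the step between neighbouring groups would rarely be this clean, and an analyst would want to check that the jump is not an artefact of small group sizes before turning it into a sales target.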
As chief analyst at a site which is used by 70% of the Internet population, Fayyad is in a position to be highly critical of what he sees as the widespread abuse of web analytics.
“Often web analytics tools are used to find out which third-party sites refer to a business’s own home page,” he explains. “That business might find they are receiving many times more visitors from site A than site B. At the moment, this is how deep the analysis goes in most cases.”
This information might inform where and how the business advertises, but simplistic metrics such as site referrals are unlikely to serve the business as well as more penetrating analysis, he says.
“If you do more research, you might find that the visitors from site B are more likely to purchase an item, and then deeper still you might find that purchases made via site B have a far greater margin on them than from site A. The deeper the analysis, the more valuable it will be.”
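Fayyad's layered example can be sketched in a few lines. The snippet below assumes a visit log in which each row records the referring site, whether a purchase was made, and the margin on that purchase; the log, the site names and the `summarise_referrers` helper are hypothetical, chosen only to show how raw referral counts and margin per visit can point in opposite directions.

```python
from collections import defaultdict

# Hypothetical visit log: (referrer, purchased, margin_on_sale).
# All values are invented for illustration.
VISITS = [
    ("site_a", False, 0.0),
    ("site_a", False, 0.0),
    ("site_a", True, 2.0),
    ("site_a", False, 0.0),
    ("site_a", False, 0.0),
    ("site_a", False, 0.0),
    ("site_b", True, 15.0),
    ("site_b", False, 0.0),
]


def summarise_referrers(visits):
    """Compute visits, conversion rate and margin per visit for each referrer."""
    totals = defaultdict(lambda: {"visits": 0, "purchases": 0, "margin": 0.0})
    for referrer, purchased, margin in visits:
        row = totals[referrer]
        row["visits"] += 1
        if purchased:
            row["purchases"] += 1
            row["margin"] += margin
    return {
        referrer: {
            "visits": row["visits"],
            "conversion_rate": row["purchases"] / row["visits"],
            "margin_per_visit": row["margin"] / row["visits"],
        }
        for referrer, row in totals.items()
    }


summary = summarise_referrers(VISITS)
for referrer, stats in sorted(summary.items()):
    print(referrer, stats)
```

In this made-up log, site A sends three times as many visitors as site B, yet site B converts more often and earns far more margin per visit, which is the pattern a referral count alone would miss.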
Fayyad believes that data analysis will become so central to business that his role of chief data officer (CDO) will become commonplace – and, reflecting that critical role, other CDOs will be drafted onto their companies’ boards, as he has been. “Data is becoming one of the primary senses for the business, and its proper place is on the executive table,” says Fayyad.
That is going to change the relationship between data and IT. For businesses to derive the untapped value from their data, argues Fayyad, they must treat it as a function distinct from IT. “Utilisation of data is still an art, but a lot of companies relegate it to the IT department which is operational. If you take it out of IT, it solves a lot of problems.”