US Election data — embracing uncertainty when making predictions

Andy Cotgreave, senior director at Tableau Software, explores why people shouldn't take data, such as that from the US Election, at face value.

Elections can bring out the best and worst in how data on opinions and issues is presented and promoted. This is mainly because many data sets carry an inherent bias and, of course, because so much is riding on the results.

The 2020 US Presidential Election put polling data under the microscope, especially as pollsters seem to have made predictions that did not match reality. To be fair, ahead of the election, many of the US Election poll trackers came with health warnings about room for uncertainty because of perceived errors in 2016. Before election day, pollsters were determined that people should not come away expecting that one candidate or another “would” certainly win.

Nonetheless, with polling data failing to spot the narrowness of the election, can lessons be learnt from the 2020 US Election about how data is presented to help interpret outcomes that influence decision-making? How can these lessons be applied in business, where data science can also be affected by unreliable data? And how can we guard against this?

Embracing uncertainty

Predictions are uncertain. The US Election proved this once again. In polling, response rates are lower than ever before, and people may not accurately share who they are planning to vote for. Poll trackers try to account for this in their modelling.

In this year’s election cycle, they were less keen to make a solid prediction, instead encouraging you, the reader, to engage with the data more closely. For example, The New York Times’ election tracker didn’t make the final call: it asked you to choose the toss-up states yourself in order to see who would reach 270 Electoral College votes. FiveThirtyEight very deliberately didn’t show the prediction chart until you had scrolled down past descriptions of the models and written commentary about the state of the polls. In both examples, the sites were forcing the reader to consider the uncertainty of the data.

This approach can and should be applied in your organisation too. To build trust in your data, it’s imperative that the audience you are presenting to understands the levels of uncertainty that the data may contain.

Context enables engagement

On the night of the election itself, TV audiences needed clear visualisations that provided context in order to follow the race closely. I live-tweeted my reactions to the TV networks’ coverage through the night, and my final reaction was: underwhelmed. I found that although the TV networks employ incredibly talented anchors (such as the indefatigable “chart-throbs” John King on CNN and Steve Kornacki on MSNBC), the visuals on their telestration boards didn’t help the audience see or understand their stories.

Through the night, they talked about individual counties. We would see a US map, and then zoom into the county. That’s good, because geographical context is important.

At this point, as the anchor began their commentary, the physical location became secondary. The anchor described three data-informed points every time they zoomed into a county: what is the current situation between the candidates, how many votes have been counted, and how does this compare to 2016?

Did the screen show this? No. It still showed the geographical map with, at best, the current standing for each candidate.

It is not hard to imagine how to tweak the screen display to support the anchor’s conversation and get the insight to the audience a lot faster. I explored different options in my latest piece for Nightingale, the journal of the Data Visualization Society. When the visuals strongly support the words of the presenter, the inherent uncertainty of the fast-moving election situation is more readily understood by the audience.

The same principle should be applied to any data communications you do inside your business. Before showing a chart (in a presentation, on a dashboard, in a publication), consider what question the chart is intended to answer. Then consider how to show the answer as efficiently as possible. It is a great discipline to stop and make this a part of every data-driven presentation you prepare. Sometimes asking colleagues is the only way to get an honest appraisal of this.

Interrogate to uncover truths

Polling data and election night coverage reveal the challenges inherent in any organisation’s data strategy. It is tempting to take data at face value, but we need to be fully aware of the boundaries of certainty, the assumptions being made, and the progress towards goals.

As individuals, we should not take any data, not just that from elections, at face value. Instead, we should use it as a way to guide conversations. As we build data strategies for our own organisations, it’s important to be informed by data, not to expect it to make our decisions for us.

Andy Cotgreave

Senior Director and Data Evangelist at Tableau.
