Big Data and Confirmation Bias

Tips on overcoming our own foibles and limitations.
Jan | 5 | 2016

“If you torture the data long enough, it will confess.”

—Ronald Coase

The new year has arrived. As such, it’s tough to call Big Data new anymore. (It’s been on my radar for a tad under five years, and Wiley published Too Big to Ignore in March of 2012.) Recent technologies such as Hadoop have matured during that time. Still, tech alone only gets us so far. The need for education on the topic is as strong as ever—if not more so.

Put differently, there’s no shortage of widely held myths around Big Data. Perhaps the most dangerous is that Big Data knows all and that it obviates the need for human judgment. The almighty “data” will unequivocally tell us what to do and when and how to do it. In this way, data is like Gabbo from The Simpsons.

Nothing could be further from the truth, but don’t take my word for it.

The Essential and Oft-Ignored Human Element of Big Data

Speaking at the recent Rich Data Summit, Nate Silver of 538 said, “Data is a lot messier and noisier than people want to acknowledge.”

I’ll let that sink in for a moment.

As anyone with a modicum of statistics knowledge knows, even mature, ostensibly “objective” statistical and quantitative methods such as regression analysis don’t run themselves, even on small datasets. They require key human elements (read: judgment and decision making), and because of this, they are far from perfect. This is why there’s a world of difference between an analyst and a true data scientist. With regard to regressions, frequent errors from newbies include:

  • Neglecting key independent variables.
  • Stating that a relationship exists among variables when one does not (and vice-versa).
  • Getting the causal chain completely wrong. (For instance, saying that A causes B when B causes A.)
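To make these errors concrete, here is a minimal sketch in plain Python. Everything in it is invented for illustration: the variables x, y, and z are simulated noise, and the helper names slope and residuals are my own. By construction, a hidden factor z drives both x and y, and x has no effect on y at all.

```python
import random

def slope(xs, ys):
    """Least-squares slope of ys regressed on a single predictor xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

def residuals(xs, ys):
    """What remains of ys after removing its straight-line fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = slope(xs, ys)
    return [(y - mean_y) - b * (x - mean_x) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 5000

# Simulated data (an assumption for illustration, not real measurements).
z = [rng.gauss(0, 1) for _ in range(n)]        # the neglected variable
x = [zi + rng.gauss(0, 1) for zi in z]         # x depends on z
y = [2 * zi + rng.gauss(0, 1) for zi in z]     # y depends on z only, never on x

# Error 1 and 2: leave z out and x appears to "explain" y.
naive = slope(x, y)

# Account for z (regress the leftovers of y on the leftovers of x)
# and the apparent relationship all but disappears.
adjusted = slope(residuals(z, x), residuals(z, y))

# Error 3: the regression runs just as happily in the other direction,
# so a nonzero slope by itself says nothing about which way causation flows.
reverse = slope(y, x)

print(f"y on x, ignoring z:   {naive:.2f}")
print(f"y on x, adjusted for z: {adjusted:.2f}")
print(f"x on y, ignoring z:   {reverse:.2f}")
```

In this setup the naive slope comes out near 1, even though x does nothing to y. Once z is accounted for, the slope falls to roughly zero. The software will never flag the missing variable or the backwards arrow; a person has to think to look.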

What’s more, we make these mistakes both inadvertently and intentionally. (For more on this, see Eli Pariser’s excellent book The Filter Bubble.)

This raises the question: How do we square this circle? How can we realize the legitimate benefits of Big Data while minimizing the chance for error?

Simon Says: Big Data and confirmation bias go hand in hand.

Big Data does not obviate the need for human judgment.

Remember the following when getting started with Big Data. First, recognize that confirmation bias is alive and well. If you’re intent on finding something, you will. The more important question is, What else are you missing?
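The claim that you will find whatever you are intent on finding can be sketched in a few lines of plain Python. The data here is pure random noise that I made up for illustration, and the names pearson, target, and columns are hypothetical. The sketch searches 200 meaningless columns for the one that best "predicts" an equally meaningless target.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)

rng = random.Random(42)  # fixed seed so the sketch is repeatable
n_rows, n_columns = 30, 200

# Every value below is independent noise: no column is related to the target.
target = [rng.gauss(0, 1) for _ in range(n_rows)]
columns = [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_columns)]

correlations = [pearson(col, target) for col in columns]

# Torture the data: keep only the most flattering result.
best = max(correlations, key=abs)
strong = sum(1 for r in correlations if abs(r) > 0.3)

print(f"Strongest correlation found in pure noise: {best:.2f}")
print(f"Columns with |r| above 0.3: {strong} of {n_columns}")
```

With only 30 rows and 200 candidates, at least one column is all but certain to show a correlation that looks respectable on a slide. Report that one column, stay quiet about the other 199, and the data has confessed.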

Second, question everything. Far too few people are willing to go where the data takes them. This is especially pronounced as they ascend to senior levels within organizations. Many senior folks are loath to challenge preexisting assumptions and to question what they know. At the same time, though, Big Data does not negate or minimize the importance of intuition.


What say you?
