It’s hard to find a sexier job these days than data scientist. Demand far exceeds supply and has for some time now. But what if you’re one of the organizations lucky enough to land one of these coveted individuals? Are you home free? Is it only a matter of time before you start to reap massive rewards from Big Data?
In a word, no. For several reasons, even vaunted data scientists can only do so much. In this post, I’ll look at several factors that limit what organizations get out of the proper data scientists they employ.
The Limitations of Data Scientists
Data scientists don’t bring with them magical tools to purify enterprise data. They don’t look at a dataset and de-dupe records, provide key metadata, and purge bad data with a wave of their wands. Lamentably, many organizations still don’t know how many customers, products, and employees they have.
Now don’t get me wrong. This doesn’t mean that data scientists are impotent here. Rather, all else being equal, the better an organization’s internal data, data management practices, and governance, the more it will get out of its data scientists.
Better data equals better data scientists.
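To make the "how many customers do we have?" problem concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the records, the field names, and the choice of a trimmed, lowercased email address as the matching key are assumptions for illustration, not a recommended de-duplication method. Real matching is far messier.

```python
# Hypothetical customer records; names, emails, and fields are invented.
records = [
    {"name": "Acme Corp.", "email": "billing@acme.example"},
    {"name": "ACME Corp", "email": "Billing@Acme.example "},
    {"name": "Globex", "email": "ap@globex.example"},
    {"name": "Globex Inc.", "email": "ap@globex.example"},
    {"name": "Initech", "email": ""},
]


def normalize_email(email):
    """Trim whitespace and lowercase so trivially different spellings match."""
    return email.strip().lower()


def count_customers(rows):
    """Return (distinct customers by email, rows with no usable email)."""
    seen = set()
    missing_key = 0
    for row in rows:
        key = normalize_email(row["email"])
        if not key:
            # No email at all: this row cannot be matched automatically.
            missing_key += 1
            continue
        seen.add(key)
    return len(seen), missing_key


naive_count = len(records)  # simply counting rows
unique_count, unresolved = count_customers(records)

print("Rows in the table:", naive_count)
print("Distinct customers by email:", unique_count)
print("Rows nobody can match:", unresolved)
```

Three different answers to one simple question, and none of them is obviously right. Someone still has to decide what counts as the same customer, and that is a data management decision, not a data science one.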
Data scientists know how to use new and sophisticated tools like R. But they don’t link themselves to data warehouses and start spitting out clusters and nodes. They don’t plug in zip drives that deploy Hadoop or columnar databases throughout the entire organization. To be successful, data scientists need the ability to access and analyze vast troves of information. If you think that your legacy data warehouse or relational database is good enough, you’re probably wrong.
A willingness to explore the unknown is perhaps the key ingredient in maximizing the likelihood of data scientists’ success. As my friend Melinda Thielbar recently wrote, “Great customers may not know what they need, but they’re willing to find out.” She continues:
A great customer understands that they’ve employed us because they have a problem that needs solving. They work with us to define the project scope early, and they are open to conversation during the project. A data science project always involves a bit of the unknown, and while clear expectations at the beginning are important, re-checking assumptions and explanations is vital to the process.
Spot on. Organizations afraid of the unknown are unlikely to benefit from the knowledge, skills, and perspectives that data scientists bring to the table. Historically, science has progressed steadily, but not in a linear fashion. Employees, especially senior leaders, who insist upon complete certainty prior to proceeding will more often than not stifle creativity and innovation.
Yes, data scientists matter. Before you make the considerable investment of hiring one, however, heed the advice in this post.
What say you?
I wrote this post as part of the IBM for Midsize Business program. It provides midsize businesses with the tools, expertise, and solutions they need to become engines of a smarter planet. I’ve been compensated to contribute to this program, but the opinions expressed in this post are my own and don’t necessarily represent IBM’s positions.