In my analytics capstone class, we use Analytics: The Agile Way. Indeed, I wrote that book in 2017 primarily because the existing text was lacking. No, my book isn’t short, but one can hardly cover everything that analytics students need to know in a 300-page text.
To this end, I liberally—and legally—incorporate excerpts from other thoughtful websites and tomes during my 13-week course. Here are three of those books.
Everything Is Miscellaneous
I certainly didn’t fully appreciate metadata when I was a senior at Carnegie Mellon. Sure, my own book devotes a few pages to “data about data.” Still, in this regard, it doesn’t hold a candle to David Weinberger’s Everything Is Miscellaneous: The Power of the New Digital Disorder.
I usually give students the book’s prologue—titled “Information in Space.” I specifically want the students to think about the difference between the physical and digital worlds. It’s particularly important for them to recognize that you can put a physical item in one location—and one location only. This is true with albums, books, and products in brick-and-mortar stores. That same limitation, though, doesn’t exist in the digital world. If I want to label this post with two categories and six tags, I can do so. This dramatically increases findability.
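To make the point about labels concrete, here is a minimal Python sketch. It is an illustration only, not something taken from Weinberger's book: the post titles, the labels, and the `build_index` helper are all hypothetical. It shows a single digital item being filed under several labels at once, which a physical shelf cannot do.

```python
from collections import defaultdict


def build_index(posts):
    """Map each label to the set of post titles that carry it.

    `posts` is a dict of {title: set_of_labels}. Both the structure and
    the sample data below are hypothetical, chosen only for illustration.
    """
    index = defaultdict(set)
    for title, labels in posts.items():
        for label in labels:
            # Unlike a physical item, one post can sit under many labels.
            index[label].add(title)
    return index


posts = {
    "Books for an analytics capstone": {"analytics", "teaching", "books"},
    "Why metadata matters": {"analytics", "metadata"},
}

index = build_index(posts)

# The same post is reachable from every label attached to it.
print(sorted(index["analytics"]))
print(sorted(index["books"]))
```

Looking up any one of a post's labels returns that post, which is the findability gain the prologue describes.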
Superforecasting
In theory, many of us can access largely the same data. Strangely, though, most experts aren’t terribly accurate at making predictions. The obvious question is, Why?
Enter Superforecasting: The Art and Science of Prediction, Philip E. Tetlock’s excellent book answering that very question. As we cover in my course, descriptive analytics are critical in understanding the past, but the holy grail for many data-driven organizations is prediction. Tetlock adroitly summarizes the potential benefits of curiosity and ignorance. Those who believe that they already understand the subtext of complex business, geopolitical, and social issues often find themselves grossly wrong. Confirmation bias is alive and well.
The Signal and the Noise
In keeping with the prediction motif, students will need to refine models over time when turning data into analytics. Those who believe that this process is linear are in for a rude awakening.
Nate Silver’s epic The Signal and the Noise: Why So Many Predictions Fail—but Some Don’t discusses the benefits of the oft-maligned Bayes’ Theorem. No, it may not be perfect, but there’s a reason that it just won’t die. I am especially fond of Silver’s examples and his use of iterative methods to improve model accuracy—a key point in my own book.
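For readers who want to see the iterative idea in action, here is a small Python sketch of repeated Bayesian updating. It is not drawn from Silver's book. The prior of 10% and the likelihoods of 0.8 and 0.3 are made-up numbers, and the function names are my own. Each new observation takes the previous posterior as its prior, so the estimate is refined step by step.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) using Bayes' Theorem.

    prior                -- P(hypothesis) before seeing the evidence
    p_evidence_if_true   -- P(evidence | hypothesis is true)
    p_evidence_if_false  -- P(evidence | hypothesis is false)
    """
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / denominator


def update_sequence(prior, observations):
    """Apply bayes_update once per observation, feeding each posterior
    back in as the next prior. Returns every belief along the way."""
    beliefs = [prior]
    for p_true, p_false in observations:
        beliefs.append(bayes_update(beliefs[-1], p_true, p_false))
    return beliefs


# Hypothetical numbers: start at 10% and see the same kind of
# supporting evidence three times in a row.
evidence = [(0.8, 0.3)] * 3
beliefs = update_sequence(0.10, evidence)

for step, belief in enumerate(beliefs):
    print(f"after {step} observations: {belief:.3f}")
```

With these assumed inputs, the belief climbs from 10% to roughly 23%, 44%, and 68%. No single observation settles the question, but each one moves the estimate.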
Note how these books focus on concepts and not tools such as Python. Of course, the latter matter, but properly framing a question and thinking about methodology are more important at the start than just reflexively analyzing data.
What say you?