Predictive analytics is one of the most popular IT terms of our day, and like the others (Big Data, Data Science, etc.), it’s often defined far too loosely. People who work in the field of predictive analytics, however, use the term fairly precisely and meaningfully. No one, in my experience, does a better job of explaining predictive analytics—what it is, how it works, and why it’s important—than Eric Siegel, the founder of Predictive Analytics World, Executive Editor of the Predictive Analytics Times, and author of the new best-selling book in the field, Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die.
Predictive analytics is a computer-based application of statistics that has grown out of an academic discipline that is traditionally called machine learning. Yes, even though computers can’t think, they can learn (i.e., acquire useful knowledge from data). Siegel defines predictive analytics as “technology that learns from experience (data) to predict the future behavior of individuals in order to drive better decisions.” (p. 11)
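To make that definition a little more concrete, here is a minimal Python sketch of what "learning from experience to predict the behavior of individuals" can look like at its very simplest. It is not taken from Siegel's book. The records, the segment names, and the functions `learn_response_rates` and `predict` are all hypothetical, invented for illustration, and real predictive models are far richer than a per-segment average.

```python
from collections import defaultdict

# Hypothetical past observations: (customer segment, whether they responded).
history = [
    ("returning", True), ("returning", True), ("returning", False),
    ("new", False), ("new", False), ("new", True), ("new", False),
]


def learn_response_rates(records):
    """Estimate a response rate per segment from past (segment, responded) pairs."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [responses, total seen]
    for segment, responded in records:
        counts[segment][0] += int(responded)
        counts[segment][1] += 1
    return {seg: hits / total for seg, (hits, total) in counts.items()}


def predict(rates, segment, default=0.5):
    """Return the learned response probability for an individual's segment.

    Segments never seen in the data fall back to an assumed default.
    """
    return rates.get(segment, default)


rates = learn_response_rates(history)
print(predict(rates, "returning"))
print(predict(rates, "new"))
```

The "learning" here is nothing more than counting, but it has the shape Siegel describes: past data goes in, and a per-individual probability comes out that can inform a decision, such as whom to contact first.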
I appreciate the fact that Siegel doesn’t gush about the wonders of data and technology to the hyperbolic degree that is common today; he keeps a level head as he describes what can be done in realistic and practical terms. Here’s what he says about data:
As data piles up, we have ourselves a genuine gold rush. But data isn’t the gold. I repeat, data in its raw form is boring crud. The gold is what’s discovered therein. (p. 4)
And again here:
Big data does not exist. The elephant in the room is that there is no elephant in the room. What’s exciting about data isn’t how much of it there is, but how quickly it is growing. We’re in a persistent state of awe at data’s sheer quantity because of one thing that does not change: There’s always so much more today than yesterday. Size is relative, not absolute. If we use the word big today, we’ll quickly run out of adjectives: “big data,” “bigger data,” “even bigger data,” and “biggest data.” The International Conference on Very Large Databases has been running since 1975. We have a dearth of vocabulary with which to describe a wealth of data…
There’s a ton of it—so what? What guarantees that all this residual rubbish, this by-product of organizational functions, holds value? It’s no more than an extremely long list of observed events, an obsessive-compulsive enumeration of things that have happened.
The answer is simple. Everything is connected to everything else—if only indirectly—and this is reflected in data…
Data always speaks. It always has a story to tell, and there’s always something to learn from it…Pull some data together and, although you can never be certain what you’ll find, you can be sure you’ll discover valuable connections by decoding the language it speaks and listening. (pp. 78 and 79)
Siegel demonstrates that you can embrace technology without becoming a drooling idiot sitting around the campfire singing Kumbaya and toasting the imminence of the Singularity while chugging homemade wine produced by an algorithm:
I have good news: a little prediction goes a long way. I call this The Prediction Effect, a theme that runs throughout the book. The potency of prediction is pronounced—as long as the predictions are better than guessing. The Effect renders predictive analytics believable. We don’t have to do the impossible and attain true clairvoyance. The story is exciting yet credible: Putting odds on the future to lift the fog just a bit off our hazy view of tomorrow means pay dirt. In this way, predictive analytics combats financial risk, fortifies healthcare, conquers spam, toughens crime fighting, and boosts sales. (p. XVI)
This is a great introduction to predictive analytics. It won’t teach you how to develop predictive models, but it surveys the territory, explains why it’s worthwhile, and points you in the right direction if you want to claim some of this territory as your own.