One question we frequently get asked is “What is Machine Learning, and how is it different than Predictive Analytics?” As with any hot technology trend, Machine Learning has started to become a buzzword that is both misused and misunderstood.
The right question is not “How is Machine Learning different than Predictive Analytics?” but rather “How can Machine Learning be used for Predictive Analytics, and how is it different than classical statistics?” Said differently, Predictive Analytics is a use and Machine Learning is a technique. To answer this, let’s decompose the question into its component parts.
3 Components that Underlie Predictive Analytics
Let’s start with understanding “Predictive Analytics.” This term originated as an evolution from “Descriptive Analytics,” or just plain “Analytics.” Descriptive analytics refers to the process of distilling large amounts of data into summary information that is more easily consumed by humans. Example techniques used in Descriptive Analytics include counts and averages to answer a question such as “What were my average sales by region last quarter?” By its nature, descriptive analytics is a backward-looking view of “what happened.”
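To make the “average sales by region” question concrete, here is a minimal sketch in Python. The sales records and the function name average_sales_by_region are invented for illustration; a real analysis would read from your own sales data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sales records for last quarter: (region, sale amount).
# These figures are made up purely for illustration.
sales = [
    ("East", 120.0),
    ("East", 80.0),
    ("West", 200.0),
    ("West", 100.0),
    ("West", 150.0),
]


def average_sales_by_region(records):
    """Group sale amounts by region and return the mean for each region."""
    by_region = defaultdict(list)
    for region, amount in records:
        by_region[region].append(amount)
    return {region: mean(amounts) for region, amounts in by_region.items()}


averages = average_sales_by_region(sales)
print(averages)
```

Note that nothing here looks forward: the code only summarizes what has already happened, which is exactly the limit of descriptive analytics.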
As a natural progression, Predictive Analytics attempts to answer the question “what might happen in the future?” In common usage, Predictive Analytics typically applies more advanced classical statistical techniques such as linear regression to answer a question such as “If I increase my advertising spending by 10%, how much will my sales increase next quarter?”
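As a rough illustration of the advertising question, the sketch below fits a straight line to past advertising spend and sales using ordinary least squares, then estimates the effect of a 10% spending increase. The history figures and the names fit_line and extra_sales are assumptions made for this example, and the data is deliberately clean; real data would be noisier and would call for checking how well the line actually fits.

```python
# Hypothetical quarterly history: (advertising spend, sales), in $ thousands.
# These numbers are invented so the arithmetic is easy to follow.
history = [(10.0, 120.0), (20.0, 140.0), (30.0, 160.0), (40.0, 180.0)]


def fit_line(points):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(history)

# Assume the most recent quarter's spend is the baseline to increase by 10%.
current_spend = history[-1][0]
new_spend = current_spend * 1.10
extra_sales = slope * (new_spend - current_spend)

print(f"Each extra $1k of advertising is associated with ${slope:.2f}k of sales")
print(f"A 10% spending increase predicts about ${extra_sales:.2f}k more sales")
```

The prediction is only as trustworthy as the assumption behind it: that the relationship seen in past quarters will continue to hold next quarter.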
In his HBR article “A Predictive Analytics Primer,” Tom Davenport does a nice job laying out the three basic components that underlie predictive analytics:
The Data: A predictive model is only as good as the historical data that underlies it. Google's Chief Economist Hal Varian was famous for saying that Google doesn’t have better models; it just has more data.
The Statistics: This is the set of mathematical techniques, ranging from basic to advanced, that are applied to the data to derive inference, meaning, and insight. The most common statistical technique used in predictive analytics is linear regression, which Davenport nicely describes as the iterative process of selecting and testing the impact of variables on the outcome.
The Assumptions: These are the things that are presumed to be true, with the most common being that the future will continue to be like the past.
Why Machine Learning Holds So Much Power
With this framework for understanding predictive analytics, we can now consider why machine learning holds such potential power. The answer lies in the difference between classical statistics and machine learning techniques. The fundamental difference is that classical statistics relies on a human expert to formulate and test the relationship between cause and effect, for example the hypothesis that advertising is a driver of sales.
Machine learning flips this process on its head; it starts with the outcome (e.g. how much were my sales) and teaches a computer to automatically uncover the factors that are driving this particular outcome. These relationships may be incredibly complex, including hundreds of possible causes, interactions, and non-linear responses. If done properly, the result can be a far more accurate predictive model that has the ability to automatically adjust and improve over time.
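One very small way to picture this “start from the outcome” search is a single decision-tree split: the program tries every candidate factor and every cut-off value, and keeps whichever one best separates high outcomes from low ones. The sketch below is only that one building block, not a complete machine learning system, and the records, factor names, and the function best_split are all hypothetical. It also finds association rather than proven cause.

```python
from statistics import mean

# Hypothetical records: three candidate factors plus the outcome (sales).
# The data is invented so that sales jump once the discount passes a cut-off,
# a non-linear response that a straight-line model would not capture well.
rows = [
    {"ad_spend": 10, "discount": 0, "temperature": 15, "sales": 100},
    {"ad_spend": 40, "discount": 5, "temperature": 30, "sales": 104},
    {"ad_spend": 20, "discount": 0, "temperature": 22, "sales": 98},
    {"ad_spend": 30, "discount": 5, "temperature": 12, "sales": 102},
    {"ad_spend": 15, "discount": 10, "temperature": 28, "sales": 150},
    {"ad_spend": 35, "discount": 15, "temperature": 14, "sales": 155},
    {"ad_spend": 25, "discount": 10, "temperature": 20, "sales": 148},
    {"ad_spend": 45, "discount": 20, "temperature": 25, "sales": 152},
]


def sse(values):
    """Sum of squared errors of the values around their own mean."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(records, features, target):
    """Return (feature, threshold, error) for the split with the lowest error.

    Each candidate threshold is the midpoint between two neighboring
    observed values of a feature; error is the squared error left over
    after predicting the mean outcome on each side of the threshold.
    """
    best = None
    for feature in features:
        levels = sorted({r[feature] for r in records})
        for low, high in zip(levels, levels[1:]):
            threshold = (low + high) / 2
            left = [r[target] for r in records if r[feature] <= threshold]
            right = [r[target] for r in records if r[feature] > threshold]
            error = sse(left) + sse(right)
            if best is None or error < best[2]:
                best = (feature, threshold, error)
    return best


feature, threshold, error = best_split(
    rows, ["ad_spend", "discount", "temperature"], "sales"
)
print(f"Strongest single driver found: {feature} (cut-off {threshold})")
```

No one told the program that discounts matter; it was given only the outcome and a list of candidates. Practical machine learning methods repeat and combine searches like this across many factors and re-run them as new data arrives.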
What outcomes are you trying to predict? Learn how to uncover the hidden patterns that drive those outcomes today.