Predictive Analytics: An Overview

by Plaster Group’s Data & Analytics Team

Meet Predictive Analytics

  • “The Dow Jones Industrial Average will rise above 20,000 by the end of 2014”
  • “Our Year over Year Revenue will be 15% higher in 2012 than 2011”
  • “The San Francisco Giants will win the World Series in 2014 and 2016.”

Which of the above predictions seems the most implausible? The first one appeared on prominent financial news web sites, with little to back the assumption that the Dow Jones' rate of increase at the time would continue throughout the year. The second prediction was actually the basis for a major retail company’s planning and budgeting, even though the previous two years of recession should have indicated a more modest projection. The last one may seem silly, but it may turn out to be as accurate as the previous two, and it is based on past performance.

Predictive Analytics is the science of using historical data, analyzed through descriptive statistics, to develop a model (or algorithm) that makes predictions about the future state of a given topic. Whether or not your company makes calculated predictions based on rational analysis of its performance, its customers’ behavior or conditions in the marketplace, it is still making decisions based on predictions; they just may not be scientific and probabilistic. Whenever you order future inventory, create a hiring and training plan, or plan to expand and open new stores, you are predicting that business will continue to grow. However, if you are simply ‘guesstimating’ or making a gut call, and you are significantly off, you may leave potential revenue ‘on the table’ or incur unnecessary costs.


Predictive Analytics is part of the broader discipline of “Data Science” (the definition of which is still being actively debated). Data Science can include data discovery, data management, descriptive statistics and probability, machine learning, pattern recognition, data visualization and other disciplines. Predictive analytics draws on these techniques to develop models that estimate the probabilities of various outcomes given a certain set of input variables. Some common applications of predictive analytics include:

  • Marketing and Sales departments analyzing customer segmentation for targeting of direct marketing campaigns, cross-selling and up-selling and customer retention efforts
  • Operations and supply chain teams forecasting inventory levels, resource allocation, distribution models, etc.
  • Economic forecasting, risk analysis and fraud detection, employed extensively in financial services

The most common and familiar predictive analytics techniques are:

  • linear regression
  • multivariate regression
  • correlation and cluster analysis
  • nearest neighbor analysis
  • time series analysis
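
To make the first technique in this list concrete, here is a minimal sketch of a simple linear regression fitted by ordinary least squares, using only the Python standard library. The quarterly revenue figures and the function names (fit_line, predict) are hypothetical and chosen purely for illustration; a real analysis would also examine residuals and the uncertainty of the forecast.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Apply the fitted line to a new input value."""
    return intercept + slope * x


# Hypothetical data: revenue (in $ millions) for six consecutive quarters.
quarters = [1, 2, 3, 4, 5, 6]
revenue = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(quarters, revenue)
forecast_q7 = predict(slope, intercept, 7)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, Q7 forecast={forecast_q7:.2f}")
```

The made-up data is perfectly linear, so the fitted line reproduces it exactly; real historical data will scatter around the line, and the size of that scatter determines how much confidence the forecast deserves.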

Some more advanced techniques gaining popularity are:

  • network analysis
  • market basket (or affinity) analysis
  • geospatial distribution modeling
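
As an illustration of market basket (affinity) analysis, the sketch below computes the three measures commonly used to evaluate a rule such as "customers who buy diapers also buy beer": support, confidence and lift. The transactions and the function name rule_metrics are hypothetical; production tools typically search for all strong rules rather than scoring one rule at a time.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a = set(antecedent)
    c = set(consequent)
    n = len(transactions)
    if n == 0:
        raise ValueError("need at least one transaction")
    baskets = [set(t) for t in transactions]
    count_a = sum(1 for b in baskets if a <= b)           # baskets containing the antecedent
    count_c = sum(1 for b in baskets if c <= b)           # baskets containing the consequent
    count_both = sum(1 for b in baskets if (a | c) <= b)  # baskets containing both
    support = count_both / n
    confidence = count_both / count_a if count_a else 0.0
    # Lift compares the rule's confidence with the baseline rate of the consequent.
    lift = confidence / (count_c / n) if count_c else 0.0
    return support, confidence, lift


# Hypothetical point-of-sale baskets.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
]

support, confidence, lift = rule_metrics(transactions, ["diapers"], ["beer"])
print(f"support={support:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
```

A lift above 1 suggests the two items appear together more often than they would if purchases were independent, which is the signal cross-selling efforts look for.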

While it is not necessary for a data analyst or data scientist to know the intricate details of the mathematics of the methods they apply, they should be well aware of the proper application of these methodologies and the pitfalls of their misuse.

Enter Big Data

The field of predictive analytics is converging with the advent of “Big Data” to provide companies with unique insights into customer behavior at price points and systems availability never seen before. Companies can now analyze web traffic, mobile application data and billions of traditional point-of-sale transactions in volumes, and with methods, that were out of reach of most IT departments in the not too distant past. However, business stakeholders who request these analyses and the data scientists who perform them should keep in mind the diminishing returns of processing millions or billions of additional records: confidence intervals narrow only in proportion to the square root of the sample size. It is quite possible that a much smaller randomly selected sample would yield adequately reliable results.
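
The point about additional records and confidence intervals can be illustrated with a short calculation. The sketch below assumes a simple random sample and the normal approximation for a proportion (for example, the share of customers who respond to an offer); the function name margin_of_error and the sample sizes are illustrative assumptions, not figures from any real data set.

```python
import math
from statistics import NormalDist


def margin_of_error(sample_size, proportion=0.5, confidence=0.95):
    """Half-width of a normal-approximation confidence interval for a proportion."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


sample_sizes = [1_000, 1_000_000, 1_000_000_000]
margins = {n: margin_of_error(n) for n in sample_sizes}

for n in sample_sizes:
    print(f"n = {n:>13,}: +/- {margins[n] * 100:.4f} percentage points")
```

Under these assumptions, a sample of 1,000 records already gives a margin of roughly 3 percentage points. Processing a thousand times more records narrows that margin by a factor of only about 32 (the square root of 1,000), which is why a well-drawn sample is often sufficient.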

The technologies that comprise ‘Machine Learning’ leverage the results of predictive analytics as input to develop predictive models. Systems can then accept changing variables as parameters to launch automated programs with actions prescribed by the predictive model. A simple application might be a company’s inventory controls automatically reordering items that are running low, or a manufacturing management system shifting production from an over-booked plant to a less utilized one if conditions warrant. Monitoring system variables can provide feedback to the predictive model to refine its accuracy.
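
The inventory example can be sketched in a few lines. This is a deliberately simplified illustration, not a description of any particular inventory system: the class name ReorderPolicy, its parameters and the numbers are all hypothetical. A demand forecast sets the reorder point, and observed demand is fed back through exponential smoothing (a basic time series technique) to refine the forecast.

```python
class ReorderPolicy:
    """Hypothetical automated reorder rule driven by a demand forecast."""

    def __init__(self, forecast_daily_demand, lead_time_days, safety_stock, smoothing=0.5):
        self.forecast_daily_demand = forecast_daily_demand  # units per day
        self.lead_time_days = lead_time_days                # days until an order arrives
        self.safety_stock = safety_stock                    # buffer against forecast error
        self.smoothing = smoothing                          # weight given to new observations

    def reorder_point(self):
        """Stock level at or below which a replenishment order should be placed."""
        return self.forecast_daily_demand * self.lead_time_days + self.safety_stock

    def should_reorder(self, on_hand):
        """The automated action prescribed by the model."""
        return on_hand <= self.reorder_point()

    def observe(self, actual_daily_demand):
        """Feedback step: blend observed demand into the forecast."""
        self.forecast_daily_demand = (
            self.smoothing * actual_daily_demand
            + (1 - self.smoothing) * self.forecast_daily_demand
        )


policy = ReorderPolicy(forecast_daily_demand=40.0, lead_time_days=5, safety_stock=50.0)
on_hand = 300

before = policy.should_reorder(on_hand)  # reorder point is 250, so no order yet
policy.observe(60.0)                     # demand came in higher than forecast
after = policy.should_reorder(on_hand)   # forecast rises to 50, reorder point to 300
print(before, after)
```

The same stock level triggers no action before the feedback step and a reorder after it, which is the sense in which monitored variables refine the behavior of the automated system.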

The application of machine learning programs in automated commerce is growing in speed and consequence every day. The May 6, 2010 “Flash Crash” in U.S. equity markets is a famous example of automated systems reacting to algorithms based on predictive models. On that day, at around 2:40pm Eastern time, the Dow Jones Industrial Average plunged roughly 600 points within 5 minutes, only to recover most of that loss within about 20 minutes. The huge spikes in trading volume observed at the beginning and end of every trading day are driven in part by automated trading systems placing large volumes of orders based on trading algorithms, some of which rely on machine learning.

Tools and ERPs

The toolbox for employing predictive analytics has grown rapidly in the last few years. The marketplace for these tools has expanded from the traditional industry leaders such as SAS, Dell StatSoft, IBM SPSS, and Minitab.

By adding advanced analytics modules to existing business intelligence packages, such as Microsoft SQL Server Analysis Services, Oracle Data Mining, MicroStrategy Data Mining, or SAP Predictive Analysis Library, vendors now offer powerful desktop capabilities to any data scientist, data analyst or business analyst. Powerful open source tools also exist for predictive analytics, notably R and KNIME.

In Conclusion

Companies that seek to augment their existing business intelligence reporting capabilities with predictive analytics should evaluate their existing data management processes to ensure they are analyzing high-quality data. In a high-functioning organization, the insights discovered by the data science team can provide input and guidance to the data management team to improve overall data quality, which in turn leads to better analysis and insight.

If you and your company would like to leverage the power of advanced analytics methodologies, Plaster Group Data & Analytics Consulting offers expert guidance. Our team of seasoned professionals offers expertise in the evaluation, design and implementation of predictive analytics, machine learning, Big Data analytics and more, helping you extract insight and business value from your ever-increasing data assets.