The Role of Standards in Predictive Analytics: A Series

December 19, 2013

I am working on a paper, for publication in early 2014, on the role of standards such as R, Hadoop and PMML in the mainstreaming of predictive analytics. As I do so, I will be publishing a few blog posts. I thought I would start with a quick introduction to the topic now and then finish the series in the new year.

Just a few years ago it was common to develop a predictive analytic model using a single proprietary tool against a sample of structured data. The model would then be applied in batch, storing scores for future use in a database or data warehouse. This approach has been disrupted in recent years:

  • There is a move to real-time scoring, calculating the scores of predictive analytic models at the moment they are needed.
  • At the same time, the variety of model execution platforms has expanded, with in-database analytics and MapReduce-based execution becoming increasingly common.
  • The open source analytic modeling language R has become extremely popular, with up to 70% of analytic professionals using it at least occasionally (see the Rexer survey).
  • Big Data is starting to have an impact, especially on advanced teams (as we saw in our Predictive Analytics in the Cloud work).

This increasingly complex and multi-vendor environment has increased the value of both published standards and open source standards. The paper is going to explore the growing role of standards in predictive analytics. It will discuss the role of R in expanding the analytic ecosystem, the way Hadoop helps organizations handle Big Data in the context of predictive analytics, and the way PMML supports the move to real-time scoring. Plus, of course, there is the longer-term impact of standards like the Decision Model and Notation standard.
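To make the interoperability point concrete, here is a minimal sketch (my illustration, not from the paper) of the workflow PMML enables: a model is trained in R and exported as PMML so that a separate, PMML-compliant engine can score it in real time. It assumes the open source pmml and XML packages are installed; the model, data and file name are invented for illustration.

    # Minimal sketch: train a model in R, then export it as PMML so a
    # separate scoring engine can execute it in real time.
    library(pmml)   # converts fitted R models to PMML
    library(XML)    # provides saveXML() for writing the XML document

    # Fit a simple linear regression on a built-in dataset
    fit <- lm(mpg ~ wt + hp, data = mtcars)

    # Convert the fitted model to PMML and write it out; the resulting file
    # can be deployed to any PMML-consuming scoring platform
    saveXML(pmml(fit), file = "mpg_model.pmml")

The point of the sketch is the separation of concerns: the model is built in one tool but can be scored wherever the PMML document is deployed, whether in-database, on Hadoop or in a real-time decision service.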

I’ll write a blog post about each of these areas in the new year. I’d like to thank the Data Mining Group, Revolution Analytics and Zementis for their support of this research.
