With PMML, interoperability is truly attainable

January 17, 2011

Developed by the Data Mining Group (DMG), an independent, vendor-led committee, PMML (Predictive Model Markup Language) provides an open standard for representing data mining models. Models can therefore be shared easily between different applications, avoiding proprietary issues and incompatibilities. All major commercial and open source data mining tools already support PMML, including IBM/SPSS, SAS, KXEN, TIBCO, STATISTICA, MicroStrategy, R, KNIME, and RapidMiner (for a full list of PMML-compliant tools, see the list of PMML-powered tools at DMG.org).

PMML is an XML-based language that follows a very intuitive structure to describe data pre- and post-processing as well as predictive algorithms. Not only does PMML represent a wide range of statistical techniques, but it can also describe the input data and the transformations needed to turn raw data into meaningful features.
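To make that structure concrete, here is a minimal sketch in Python. The embedded PMML document is a hand-written, hypothetical example (the field names `income`, `spend`, and `log_income` and all coefficients are invented for illustration), and the helper `summarize` is not part of any PMML tool. It simply reads the file with the standard library and reports the input fields, the derived (transformed) fields, and the model elements it finds.

```python
import xml.etree.ElementTree as ET

# Hypothetical PMML 4.0 document, written by hand for illustration only.
PMML_TEXT = """<PMML version="4.0" xmlns="http://www.dmg.org/PMML-4_0">
  <Header description="Illustrative linear regression"/>
  <DataDictionary numberOfFields="2">
    <DataField name="income" optype="continuous" dataType="double"/>
    <DataField name="spend" optype="continuous" dataType="double"/>
  </DataDictionary>
  <TransformationDictionary>
    <DerivedField name="log_income" optype="continuous" dataType="double">
      <Apply function="ln">
        <FieldRef field="income"/>
      </Apply>
    </DerivedField>
  </TransformationDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="income"/>
      <MiningField name="spend" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="log_income" coefficient="0.8"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

# Top-level PMML children that are not models.
NON_MODEL_ELEMENTS = {
    "Header", "MiningBuildTask", "DataDictionary",
    "TransformationDictionary", "Extension",
}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def summarize(pmml_text):
    """Return the version, input fields, derived fields, and model elements."""
    root = ET.fromstring(pmml_text)
    elements = list(root.iter())
    return {
        "version": root.get("version"),
        "data_fields": [e.get("name") for e in elements
                        if local_name(e.tag) == "DataField"],
        "derived_fields": [e.get("name") for e in elements
                           if local_name(e.tag) == "DerivedField"],
        "models": [local_name(child.tag) for child in root
                   if local_name(child.tag) not in NON_MODEL_ELEMENTS],
    }


if __name__ == "__main__":
    for key, value in summarize(PMML_TEXT).items():
        print(f"{key}: {value}")
```

In this example the raw input (`income`) lives in the DataDictionary, the feature the model actually uses (`log_income`) is defined in the TransformationDictionary, and the model element refers to that derived field by name.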

As part of the Data Mining Group, Zementis is committed to the continual development of PMML. It is our vision for the community that users will be free to share models among many solutions, benefiting from an environment in which interoperability is truly attainable.

In this spirit, Zementis has made available a tool called the PMML Converter, which converts older versions of PMML to the latest, Version 4.0. The converter can also be used to validate a data mining model against the PMML specification for versions 2.0, 2.1, 3.0, 3.1, 3.2, and 4.0. If validation is not successful, the converter returns a file explaining why the validation failed (click on the “details” button).

Conversion takes place only after the validation phase succeeds, i.e., the model file must conform to the PMML specification as published by the DMG (for any of the older PMML versions listed above). For known PMML issues (from a variety of sources/vendors), the PMML Converter will correct the model file itself so that it can be converted appropriately.
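As a rough illustration of the validate-then-convert idea, the sketch below performs a simple pre-check in Python: it confirms that a file is well-formed XML with a PMML root element and reads its declared version. This is an assumption-laden stand-in, not the PMML Converter's actual logic. It does not validate against the DMG schema, and the function name `check_version` and its status labels are hypothetical.

```python
import xml.etree.ElementTree as ET

# Versions the article lists as accepted by the converter.
SUPPORTED_VERSIONS = ("2.0", "2.1", "3.0", "3.1", "3.2", "4.0")
TARGET_VERSION = "4.0"


def check_version(pmml_text):
    """Return a (status, message) pair for a PMML document given as text.

    status is one of "invalid", "current", or "convert" (hypothetical labels).
    """
    try:
        root = ET.fromstring(pmml_text)
    except ET.ParseError as exc:
        return ("invalid", f"not well-formed XML: {exc}")

    # ElementTree prefixes tags with '{namespace}', so strip it first.
    if root.tag.rsplit("}", 1)[-1] != "PMML":
        return ("invalid", "root element is not PMML")

    version = root.get("version")
    if version not in SUPPORTED_VERSIONS:
        return ("invalid", f"unsupported or missing version: {version!r}")
    if version == TARGET_VERSION:
        return ("current", f"already PMML {TARGET_VERSION}")
    return ("convert", f"PMML {version} would need conversion to {TARGET_VERSION}")


if __name__ == "__main__":
    sample = '<PMML version="3.2" xmlns="http://www.dmg.org/PMML-3_2"/>'
    print(check_version(sample))
```

A check like this only catches the most basic problems. Full conformance to the specification still requires validating the file against the schema for its declared version, which is what the converter's validation phase does.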

The PMML Converter currently converts the following model elements to PMML 4.0:

  • Association Rules
  • Clustering Models
  • Decision Trees
  • General Regression Models
  • Naive Bayes Classifiers
  • Neural Networks
  • Regression Models
  • Ruleset Models
  • Support Vector Machines

It will also convert pre- and post-processing PMML elements.

The PMML Converter can be accessed directly from the Data Mining Group (DMG) website or from the Zementis PMML Resources page.

For more information on how to use the converter, please refer to the how-to guide.