Readability of Decision Trees

November 26, 2008

One of the most often cited advantages of decision trees is their readability. Many data miners (myself included) justify using this technique because the resulting model is easy to understand (no black box). However, certain issues can make decision trees unreadable.

First, there is normalization (or standardization). In most projects, data are normalized before a decision tree is built. As a result, once you plot the tree, the split values are in normalized units and mean little to a reader. Of course, you can map them back to the original scale, but that is an extra step that has to be done.
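Mapping a split value back to the original scale can be sketched in a few lines. This is only an illustration, assuming z-score standardization; the age feature, the data, and the split value 0.5 are hypothetical and not taken from any real project.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return the z-scores plus the mean and standard deviation used."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values], mu, sigma

def to_original_units(z_threshold, mu, sigma):
    """Invert z = (x - mu) / sigma for a split threshold."""
    return z_threshold * sigma + mu

# Hypothetical feature: customer age in years.
ages = [20, 30, 40, 50, 60]
z_ages, mu, sigma = standardize(ages)

# Suppose a tree trained on the z-scores splits at "age <= 0.5".
z_split = 0.5
age_split = to_original_units(z_split, mu, sigma)
print(f"age <= {z_split} (standardized) means age <= {age_split:.1f} years")
```

The same idea applies to min-max scaling, with the inverse x = z * (max - min) + min. Either way, the mean and standard deviation (or the min and max) computed on the training data have to be kept alongside the tree, otherwise the plotted thresholds cannot be translated.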

Second, there is the number of trees. In the project I work on at my job, I can end up with 100 or more decision trees per month (see this post for more details). It is clearly impossible to read all of these trees, even if each one is understandable on its own. The same happens with random forests: when 1000 trees vote for a given class, how can one understand the process (or rules) that produces the class output?
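A toy sketch makes the voting problem concrete. It is not a real random forest: each "tree" here is a hypothetical one-split stump with a randomly drawn threshold, which is an assumption made purely for illustration. Every stump is readable on its own, yet the prediction is only a vote tally.

```python
from collections import Counter
import random

def make_stump(threshold):
    """Return a one-split 'tree' that is trivially readable on its own."""
    def stump(x):
        return "high" if x > threshold else "low"
    return stump

random.seed(0)  # fixed seed so the sketch is repeatable
# 1000 hypothetical stumps, each with its own threshold between 40 and 60.
forest = [make_stump(random.uniform(40, 60)) for _ in range(1000)]

def predict(forest, x):
    """Return the majority class and the full vote count."""
    votes = Counter(tree(x) for tree in forest)
    label, _ = votes.most_common(1)[0]
    return label, votes

label, votes = predict(forest, 52)
print(label, dict(votes))
```

The output is a class and a count of votes on each side. No single rule of the form "if x > t then high" explains the decision, which is exactly the loss of readability described above.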

Decision trees still have a lot of advantages. However, the “readability” advantage must be treated with care. It may hold in some applications, but it can often be a mirage.

