
9.5 Feature Importance


In addition to their primary purpose (prediction), tree models can be used to learn about the dataset itself.  

After a random forest model has been fitted, the modeler can view a table of feature importances.  The higher a variable appears in this table, the more effective it was at separating records into distinct groups across the entire ensemble.  Variables are ranked by their overall contribution to Gini impurity reduction.

The numbers shown are not measured in any particular units; they are proportions that sum to 1.
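A minimal sketch of this workflow in scikit-learn.  The dataset here is synthetic, and the column names (visits and the homestate dummies) simply mirror the renewal example in the text; the model parameters are arbitrary choices, not the ones behind the table discussed here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500
features = ["visits", "homestate_NJ", "homestate_NY", "homestate_CT"]

# Hypothetical data: a count feature plus one-hot state dummies
visits = rng.poisson(5, n)
state = rng.integers(0, 3, n)          # 0=NJ, 1=NY, 2=CT
X = np.column_stack([visits] + [(state == k).astype(int) for k in range(3)])

# Renewal is driven mostly by visits, so visits should rank highest
y = (visits + rng.normal(0, 1, n) > 5).astype(int)

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Rank features by mean decrease in Gini impurity across the ensemble
for name, imp in sorted(zip(features, rf.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name:15s} {imp:.3f}")

# The importances are proportions that sum to 1
print("sum:", rf.feature_importances_.sum())
```

Because the importances are normalized proportions, doubling the number of trees or features does not change their scale; only their relative sizes are meaningful.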

When assessing feature importances, be careful with their interpretation.  A relatively high importance tells us only that a variable was effective at separating records into distinct groups; it does not tell us which outcome class higher values of that feature are associated with.  The top two features in the table above are visits and homestate_NJ.  As it turns out, higher values for visits are associated with a greater probability of renewal, whereas a value of 1 for homestate_NJ is associated with a decreased probability of renewal, relative to homestate_NY and homestate_CT.
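The direction-blindness of importance scores can be demonstrated directly.  In this synthetic sketch (invented data and feature names, not the renewal dataset), two features have opposite effects on the outcome yet receive similar importance scores; a simple group comparison is one way to recover the direction.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 1000
x_pos = rng.normal(size=n)   # higher values -> more likely class 1
x_neg = rng.normal(size=n)   # higher values -> more likely class 0
y = ((x_pos - x_neg + rng.normal(0, 0.5, n)) > 0).astype(int)

X = np.column_stack([x_pos, x_neg])
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Both features are equally effective separators, so their importance
# scores are similar even though their effects point in opposite directions
print("importances:", rf.feature_importances_)

# One way to recover direction: compare the outcome rate in the top vs.
# bottom half of each feature
for j, name in enumerate(["x_pos", "x_neg"]):
    hi = y[X[:, j] > np.median(X[:, j])].mean()
    lo = y[X[:, j] <= np.median(X[:, j])].mean()
    print(f"{name}: rate in high half={hi:.2f}, low half={lo:.2f}")
```

Partial dependence plots (e.g. `sklearn.inspection.PartialDependenceDisplay`) offer a more systematic way to see which direction a feature pushes the prediction.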