
9.7 Ensemble Modeling and the Wisdom of Crowds


Other learning ensembles are built from entirely distinct types of models.  

In a “hard-voting” ensemble, each model gets one vote toward the outcome.  For instance, suppose a modeler performing a classification task is not sure whether she wants to use logistic regression, k-nearest neighbors, or a support vector machine.  Rather than having to choose just one, she could model the data with each of these methods.  Each model’s prediction counts as one “vote” toward the outcome, and the majority vote wins.

Suppose, for example, that the logistic regression and support vector machine models each predict class “1” while the k-nearest neighbors model predicts class “0”.  The hard-voting ensemble would predict an outcome class of “1”, based on the majority vote of the three ensemble members, as in the sketch below.
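The source does not tie this example to any particular library, but the following minimal sketch shows how a hard-voting ensemble of these three model types might be built with scikit-learn’s VotingClassifier; the dataset here is a synthetic stand-in, not data from this chapter:

from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in data, purely for illustration
X, y = make_classification(n_samples=200, random_state=42)

# Each fitted model casts one vote; the majority class wins
hard_ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression()),
        ("knn", KNeighborsClassifier()),
        ("svm", SVC()),
    ],
    voting="hard",
)
hard_ensemble.fit(X, y)
print(hard_ensemble.predict(X[:1]))  # majority vote of the three models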

In a “soft-voting” ensemble, the predicted probabilities produced by each model are averaged together, and the ensemble predicts the class with the highest average probability.  Suppose, for instance, that the three models assign predicted probabilities of “1” class membership of 0.9, 0.4, and 0.45.  Although two of the three models, voting individually, would have favored class “0”, the average probability is about 0.58, so the soft-voting ensemble predicts class “1” — the one highly confident model outweighs the two lukewarm ones.
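A companion sketch of soft voting, again assuming scikit-learn and synthetic stand-in data, follows; note that SVC needs probability=True so it can supply the class probabilities that soft voting averages:

from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=42)

# Each model contributes predicted class probabilities, which are
# averaged; the class with the highest mean probability wins.
soft_ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression()),
        ("knn", KNeighborsClassifier()),
        ("svm", SVC(probability=True)),  # SVC must expose probabilities
    ],
    voting="soft",
)
soft_ensemble.fit(X, y)
print(soft_ensemble.predict_proba(X[:1]))  # averaged class probabilities
print(soft_ensemble.predict(X[:1]))        # class with the highest mean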

The logic behind using an ensemble is akin to the logic of any other situation in which someone seeking guidance asks for input from more than one source.  Just as a medical patient might seek an opinion from more than one doctor, or someone facing a pending legal matter might consult multiple attorneys, a modeler may wish to pool the collective “wisdom” of several model types in order to generate a predicted outcome.