7.2 Accuracy and Error

A model’s accuracy rate essentially answers the question, “what share of all the predictions that you made did you get right?”

To calculate the model’s accuracy rate, we need to sum the values in the cells for which the model’s predictions matched the actual results, and divide that sum by the total number of records that were included in the model.

	actual renew	actual non-renew	TOTAL
predict renew	541	59	600
predict non-renew	38	112	150
TOTAL	579	171	750

For the season pass renewal model, the cells on the diagonal of the table above (541 predicted and actual renewals, 112 predicted and actual non-renewals) are the cases in which the model’s classification was correct, and the bottom-right cell (750) gives the total number of records analyzed. Since there are 653 (541+112) correct predictions out of 750 records, this model achieved an accuracy of 653/750, or 87.07%.

To find the error rate, we can subtract the accuracy rate from 1 (1 − 0.8707 = 0.1293), or we can sum the number of instances in which the model’s prediction was incorrect and divide that sum by the overall total. Since the model made 97 incorrect predictions (38+59), the error rate is 97/750, or 12.93%.
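
The arithmetic above can be reproduced with a short Python sketch. The counts come straight from the season pass renewal table; the dictionary layout and the variable names are illustrative choices, not part of any particular library.

```python
# Confusion matrix counts from the season pass renewal table.
# Keys are (predicted, actual) pairs; the names are illustrative.
confusion = {
    ("renew", "renew"): 541,
    ("renew", "non-renew"): 59,
    ("non-renew", "renew"): 38,
    ("non-renew", "non-renew"): 112,
}

# Total number of records analyzed (the bottom-right cell of the table).
total = sum(confusion.values())

# Correct predictions are the cells where predicted and actual agree.
correct = sum(n for (pred, actual), n in confusion.items() if pred == actual)
incorrect = total - correct

accuracy = correct / total
error_rate = incorrect / total  # equivalently, 1 - accuracy

print(f"accuracy:   {accuracy:.2%}")    # accuracy:   87.07%
print(f"error rate: {error_rate:.2%}")  # error rate: 12.93%
```

Because every record is either classified correctly or incorrectly, the two rates always sum to 1.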

Often, a modeler’s primary goal will be to achieve high accuracy. However, this will not always be the case.

Suppose we were building a model that aimed to detect credit card fraud. For the sake of this example, let’s also assume that 99.95% of credit card transactions are NOT fraudulent.

If our model simply predicted that every transaction was non-fraudulent, its accuracy would be 99.95%. However, such a model would offer no practical value, as it would fail to detect the more noteworthy outcome class, the fraudulent transactions. In order to think properly about models whose goal is to identify members of a particular outcome class, we will need to examine some classification metrics beyond just accuracy and error.
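
To make the fraud example concrete, here is a minimal Python sketch. The batch size of 100,000 transactions is a hypothetical figure, chosen only so that the assumed 99.95% non-fraud share works out to whole numbers; the label strings are likewise illustrative.

```python
# Hypothetical batch: 100,000 transactions, 0.05% of them fraudulent.
# These counts are an assumption made for illustration, not real data.
n_transactions = 100_000
n_fraud = 50

actual = ["fraud"] * n_fraud + ["legit"] * (n_transactions - n_fraud)

# A "model" that predicts every transaction is non-fraudulent.
predicted = ["legit"] * len(actual)

# Accuracy: share of predictions that match the actual label.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# Fraudulent transactions that the model actually flagged as fraud.
frauds_caught = sum(a == "fraud" and p == "fraud" for a, p in zip(actual, predicted))

print(f"accuracy: {accuracy:.2%}")  # accuracy: 99.95%
print(f"fraudulent transactions detected: {frauds_caught} of {n_fraud}")
```

The accuracy is very high even though the model never flags a single fraudulent transaction, which is why accuracy alone can mislead when one outcome class is rare.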