# Thread: Measuring the accuracy of a prediction

1. This is perhaps as much of a maths question as it is an Excel question, but I know that a number of Excel gurus here are likely to be skilled in both fields.
I have a large list of customers, each of which was assigned a 'probability' score 12 months ago, based on the likihood of them renewing their contract with their supplier. I now know which of those customers renewed and which didn't.

For example, my data set reads :-
ID_____Customer ______ Probabiliity of renewal _______Contract Renewed (1 = true)
1_____ customer 1_____, 97.4% __________ 1
2_____ customer 2_____, 94.1% __________ 0
etc
etc
10000__ customer 10000, 5.4% __________ 1

Some customers with high probability of renewal did renew but others didn't. Conversly some customers witha low probility of renewal did renew, whilst others didnt.

Overall, historical data shows that 20% of customers would renew their contracts.
I would like to calculate whether the overall predictive scores are better or worse than if the predictive rates had been chosen at random?

Many thanks

Rob

2. I suggest that you need a LOT more information about the customer to predict this accurately.

3. Something simple...
Enter a zero for no renewal, enter 1 for renewal.
Sort ascending by forecast percent.
Count the number of zeros in the bottom half vs. the top half.

Or almost a simple...
Use the Correl function on the above sorted data.
'--
Jim Cone
Portland, Oregon USA
30+ ways to sort (excel add-in)

4. To determine the actual number add the "contract renewed column" (1s and zeroes).
To determine the "predicted number" of renewals, [Essentially a weighted average] just add up the list of Probabilities 97.4% + 94.1%+ ... + 5.4% = a number from 0 to total number of customers [Note remove the % in the sum, the per 100 is not meaningful...]

You can compare that predicted number vs some random number. You don't mention how you would determine this raqndom number, but if 20% is the historical number, then you would multiply the number of customers by 20% to get the "random prediction".

You can determine the % Accuracy by the (predicted - Actual)/Actual

Then you can compare the Accuracy using each prediction and see which is better. If you have historical values of each you can see if they are different they are year to year...

Steve

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•