1. ## Statistics

I have a frequency distribution table. One column gives the age, the second column the frequency. How do I get Excel97 to give me the mean and SD for this data

2. ## Re: Statistics

Either you work with the raw data and calculate the average and stdev with the worksheetfunctions or create a pivottable. Therefore, you need to define classes for your age date, e.g. class 1: from 0 to 10 years old, class 2 : from 11 to 20 years old etc. This may correspond to the classes you used to calculate the frequency (eventually these classes only have a width of 1). Then put your data in two columns, one with the age data (and with label age in the first row) and the second column with the class data, that is, put next to each age-value the corresponding class value. Then select Data >> Pivottable, and drag the label class to the Row field and drag it also to the data field. Double click it and change the function to 'count'. Now drag the label age to the data field, double click it and change the function into 'average'. repeat the last step and change the function into 'stdev'. This should give you a pivottable with all you want.

3. ## Re: Statistics

With an example

4. ## Re: Statistics

Thanks for the suggestion. I was hoping that Excel could calculate the mean and SD direct rather than long hand. The long hand involves using the mean m = sum of the frequencies times the values and divided by the sum of frequencies i.e. m=Sum(f*x)/sum(f). Excel only seems to calculate the simple mean of sum of values divided by number of values. Similarly the SD calculation does not cater for frequencies of data. Any ideas?

Thanks

Chris

5. ## Re: Statistics

See if =sumproduct() helps for true mean. See <A target="_blank" HREF=http://www.wopr.com/cgi-bin/w3t/showflat.pl?Cat=&Board=xl&Number=36909&page=1&view =expanded&sb=5>this thread</A> for a similar discussion.

6. ## Re: Statistics

It's been a mere 30 years since I've had to work with standard deviation, so I can't help there, but <A target="_blank" HREF=http://www.beyondtechnology.com/tips016.shtml>beyondtechnology</A> looks to have "soup to nuts" coverage.

7. ## Re: Statistics

I don't think Excel has a formula for calculating mean and sd from grouped data. The long-hand formulae are

mean = sum(f*x)/sum(f)

sd = sqrt( (sum(x*x*f) - (sum(f*x)^2)/sum(f))/(sum(f)-1) )
or
sd = sqrt( sum(f*((x-mean)^2))/(sum(f)-1) )
(the second version is less prone to rounding error, but requires you to have the mean calculated separately)

8. ## Re: Statistics

Can you see if this SD is correct? I can't figure how to enable the second formula, because I don't understand how to derive "(x-mean)" for multiple x values.

9. ## Re: Statistics

I have attached a word document with instructions on how to compute the weighted Std Deviation in XL. It is essentially the same as below, except for the failure of the subscripts to carry over from what I copied and pasted from Word below:

If you have NI values of age AI, then you get the weighted mean, WA, as follows:
2. insert = NI * AI and copying it down the length of your age column;
3. sum the new column; and
4. divide that sum by the sum of the frequencies

The data points for the weighted variance calculation are (NI * AI ), so the variance can be computed as follows:
a) create an other column for (AI - WA) and another for [NI * (AI - WA)]^2; and
[img]/forums/images/smilies/cool.gif[/img] add these [NI * (AI - WA)]^2 up and divide by sum(NI )

You get the weighted Std Deviation by taking the square root of the number calculated in [img]/forums/images/smilies/cool.gif[/img], above.

10. ## Re: Statistics

Hi John:

{=SQRT( SUM(Units*(Cost-B31)^2)/(SUM(Units)-1))}

since you have stored the mean of the Costs in B31.
This gives the same value as you get.

Ian.

11. ## Re: Statistics

epic60sman's version is the same formula as mine, but with the terms in the sum spread out over a column of cells. It would be better for explaining the calculations, though takes up more space on the sheet.

Incidentally, does anyone know if there is any performance difference between using SUMPRODUCT or SUM in formulae like these?

eg is it 'better' to use
= SUMPRODUCT(A1:A1000,B1:B1000)
or
{=SUM(A1:A1000*B1:B1000)}
where the latter is entered as an array formula (ctrl/shift/enter)

Ian.

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•