Results 1 to 11 of 11
Thread: Statistics

20010508, 16:49 #1Chris GloverGuest
Statistics
I have a frequency distribution table. One column gives the age, the second column the frequency. How do I get Excel97 to give me the mean and SD for this data

20010508, 17:51 #2
 Join Date
 Jan 2001
 Location
 Kortrijk, Belgium
 Posts
 571
 Thanks
 0
 Thanked 0 Times in 0 Posts
Re: Statistics
Either you work with the raw data and calculate the average and stdev with the worksheetfunctions or create a pivottable. Therefore, you need to define classes for your age date, e.g. class 1: from 0 to 10 years old, class 2 : from 11 to 20 years old etc. This may correspond to the classes you used to calculate the frequency (eventually these classes only have a width of 1). Then put your data in two columns, one with the age data (and with label age in the first row) and the second column with the class data, that is, put next to each agevalue the corresponding class value. Then select Data >> Pivottable, and drag the label class to the Row field and drag it also to the data field. Double click it and change the function to 'count'. Now drag the label age to the data field, double click it and change the function into 'average'. repeat the last step and change the function into 'stdev'. This should give you a pivottable with all you want.

20010508, 17:55 #3
 Join Date
 Jan 2001
 Location
 Kortrijk, Belgium
 Posts
 571
 Thanks
 0
 Thanked 0 Times in 0 Posts
Re: Statistics
With an example

20010509, 15:36 #4Chris GloverGuest
Re: Statistics
Thanks for the suggestion. I was hoping that Excel could calculate the mean and SD direct rather than long hand. The long hand involves using the mean m = sum of the frequencies times the values and divided by the sum of frequencies i.e. m=Sum(f*x)/sum(f). Excel only seems to calculate the simple mean of sum of values divided by number of values. Similarly the SD calculation does not cater for frequencies of data. Any ideas?
Thanks
Chris

20010509, 15:46 #5
 Join Date
 Dec 2000
 Location
 Salt Lake City, Utah, USA
 Posts
 9,508
 Thanks
 0
 Thanked 6 Times in 6 Posts
Re: Statistics
See if =sumproduct() helps for true mean. See <A target="_blank" HREF=http://www.wopr.com/cgibin/w3t/showflat.pl?Cat=&Board=xl&Number=36909&page=1&view =expanded&sb=5>this thread</A> for a similar discussion.
John ... I float in liquid gardens
UTC 7ąDS

20010509, 16:41 #6
 Join Date
 Dec 2000
 Location
 Salt Lake City, Utah, USA
 Posts
 9,508
 Thanks
 0
 Thanked 6 Times in 6 Posts
Re: Statistics
It's been a mere 30 years since I've had to work with standard deviation, so I can't help there, but <A target="_blank" HREF=http://www.beyondtechnology.com/tips016.shtml>beyondtechnology</A> looks to have "soup to nuts" coverage.
John ... I float in liquid gardens
UTC 7ąDS

20010510, 03:54 #7
 Join Date
 Jan 2001
 Location
 Adelaide, South Australia, Australia
 Posts
 85
 Thanks
 0
 Thanked 0 Times in 0 Posts
Re: Statistics
I don't think Excel has a formula for calculating mean and sd from grouped data. The longhand formulae are
mean = sum(f*x)/sum(f)
sd = sqrt( (sum(x*x*f)  (sum(f*x)^2)/sum(f))/(sum(f)1) )
or
sd = sqrt( sum(f*((xmean)^2))/(sum(f)1) )
(the second version is less prone to rounding error, but requires you to have the mean calculated separately)

20010510, 22:38 #8
 Join Date
 Dec 2000
 Location
 Salt Lake City, Utah, USA
 Posts
 9,508
 Thanks
 0
 Thanked 6 Times in 6 Posts
Re: Statistics
Can you see if this SD is correct? I can't figure how to enable the second formula, because I don't understand how to derive "(xmean)" for multiple x values.
John ... I float in liquid gardens
UTC 7ąDS

20010511, 00:55 #9
 Join Date
 Jan 2001
 Location
 Baltimore, MD, Maryland, USA
 Posts
 254
 Thanks
 0
 Thanked 0 Times in 0 Posts
Re: Statistics
I have attached a word document with instructions on how to compute the weighted Std Deviation in XL. It is essentially the same as below, except for the failure of the subscripts to carry over from what I copied and pasted from Word below:
If you have NI values of age AI, then you get the weighted mean, WA, as follows:
1. create an additional column;
2. insert = NI * AI and copying it down the length of your age column;
3. sum the new column; and
4. divide that sum by the sum of the frequencies
The data points for the weighted variance calculation are (NI * AI ), so the variance can be computed as follows:
a) create an other column for (AI  WA) and another for [NI * (AI  WA)]^2; and
[img]/forums/images/smilies/cool.gif[/img] add these [NI * (AI  WA)]^2 up and divide by sum(NI )
You get the weighted Std Deviation by taking the square root of the number calculated in [img]/forums/images/smilies/cool.gif[/img], above.

20010511, 01:11 #10
 Join Date
 Jan 2001
 Location
 Adelaide, South Australia, Australia
 Posts
 85
 Thanks
 0
 Thanked 0 Times in 0 Posts
Re: Statistics
Hi John:
Yes, your formula is correct. The second version of the formula for your spreadsheet is
{=SQRT( SUM(Units*(CostB31)^2)/(SUM(Units)1))}
since you have stored the mean of the Costs in B31.
This gives the same value as you get.
Ian.

20010511, 01:21 #11
 Join Date
 Jan 2001
 Location
 Adelaide, South Australia, Australia
 Posts
 85
 Thanks
 0
 Thanked 0 Times in 0 Posts
Re: Statistics
epic60sman's version is the same formula as mine, but with the terms in the sum spread out over a column of cells. It would be better for explaining the calculations, though takes up more space on the sheet.
Incidentally, does anyone know if there is any performance difference between using SUMPRODUCT or SUM in formulae like these?
eg is it 'better' to use
= SUMPRODUCT(A1:A1000,B1:B1000)
or
{=SUM(A1:A1000*B1:B1000)}
where the latter is entered as an array formula (ctrl/shift/enter)
Ian.