Results 1 to 9 of 9

20060429, 17:45 #1
 Join Date
 Mar 2002
 Location
 Buenos Aires, Argentina
 Posts
 877
 Thanks
 0
 Thanked 0 Times in 0 Posts
Reliability of Excel for Statistical Purposes? (XP or inferior)
Check out this link (PDF): http://www.stat.unimuenchen.de/~knu...lv/excelxp.pdf.
I'm using Excel 2000 at home. According to a friend, at least the problem regarding the variance function seems to have been solved in Excel 2003. Anyway, I've written two small Excel functions via VBA script to compute variance based on an entire population (first script) and on a sample (second script) and they work alright for both the cases mentioned in the paper and the cases I tried myself. However these are my first attempts at writing VBA functions for Excel and I would like to know if you have comments with regards to what you would have done in a different way / ways in which errors might arise. I'm also interested in hearing comments on the paper.
Here are the scripts:
<code>' ~~~~~~~~~~~~~~~~~~~~~~~~
' This function computes the variance based on the entire population
' ~~~~~~~~~~~~~~~~~~~~~~~~
Function VarianzaPX(Elementos) As Double
Dim i As Variant
Dim n As Integer ' size of the population
Dim Suma As Double ' sum of the values of the variable
Dim Media As Double ' mean value
Dim SumaDesvios2 As Double ' sum of squared deviations from the mean
n = 0
Suma = 0
For Each i In Elementos
Suma = Suma + i
n = n + 1
Next i
Media = Suma / n
SumaDesvios2 = 0
For Each i In Elementos
SumaDesvios2 = SumaDesvios2 + (i  Media) ^ 2
Next i
VarianzaPX = SumaDesvios2 / n
End Function
' ~~~~~~~~~~~~~~~~~~~~~~~~
' This function computes the variance based on a sample
' (unbiased estimator of the variance of the variable)
' ~~~~~~~~~~~~~~~~~~~~~~~~
Function VarianzaX(Elementos) As Double
Dim i As Variant
Dim n As Integer ' sample size
Dim Suma As Double ' sum of the sample values of the variable
Dim Media As Double ' sample mean
Dim SumaDesvios2 As Double ' sum of squared deviations from the mean
n = 0
Suma = 0
For Each i In Elementos
Suma = Suma + i
n = n + 1
Next i
Media = Suma / n
SumaDesvios2 = 0
For Each i In Elementos
SumaDesvios2 = SumaDesvios2 + (i  Media) ^ 2
Next i
VarianzaX = SumaDesvios2 / (n  1)
End Function</code>
Thank you in advance<img src=/w3timages/blue3line.gif width=33% height=2>
<img src=/S/flags/Argentina.gif border=0 alt=Argentina width=30 height=18> <big><font color=4682b4><font face="Comic Sans MS">Diegol</font face=comic></font color=4682b4> </big>

20060430, 08:04 #2
 Join Date
 Feb 2001
 Location
 Weert, Limburg, Netherlands
 Posts
 4,812
 Thanks
 0
 Thanked 0 Times in 0 Posts
Re: Reliability of Excel for Statistical Purposes? (XP or inferior)
Well, first of all, I would declare n as a Long, since an integer cannot be greater than 32,767 .
Secondly, I doubt if your current method will handle multiple areas.
Third: you could combine both functions into one which accepts an argument to determine which version you want:
Function VarianzaPX(Elementos,Type as Integer) As Double
'All code...
If Type=0 Then
VarianzaPX = SumaDesvios2 / n
Else
VarianzaPX = SumaDesvios2 / (n  1)
End If
End FunctionJan Karel Pieterse
Microsoft Excel MVP, WMVP
www.jkpads.com
Professional Office Developers Association

20060502, 00:02 #3
 Join Date
 Mar 2002
 Location
 Buenos Aires, Argentina
 Posts
 877
 Thanks
 0
 Thanked 0 Times in 0 Posts
Re: Reliability of Excel for Statistical Purposes? (XP or inferior)
Hello Jan,
> Well, first of all, I would declare n as a Long, since an integer cannot be greater than 32,767 .
You're right.
> Secondly, I doubt if your current method will handle multiple areas.
Do you mean separate ranges? If so, you're correct. Is there an easy workaround this?
> Third: you could combine both functions into one which accepts an argument to determine which version you want:
When I wrote the code I was thinking of being consistent with Excel's separate VAR() and VARP() formulas, but you're right again.
I was wondering if, for completeness, there was the chance to include a description of a) what the formula does and [img]/forums/images/smilies/cool.gif[/img] what the field is meant to be. For instance, the attached shows the description of the formula xPi() included in the XNUMBERS addin. The XNUMBERS functions are not passwordprotected, so I tried searching their project to find the description "Multiprecision constant Pi", but to no avail. However it would seem a) can be accomplished. Not so sure about [img]/forums/images/smilies/cool.gif[/img].
Thank you<img src=/w3timages/blue3line.gif width=33% height=2>
<img src=/S/flags/Argentina.gif border=0 alt=Argentina width=30 height=18> <big><font color=4682b4><font face="Comic Sans MS">Diegol</font face=comic></font color=4682b4> </big>

20060502, 06:54 #4
 Join Date
 Feb 2001
 Location
 Weert, Limburg, Netherlands
 Posts
 4,812
 Thanks
 0
 Thanked 0 Times in 0 Posts
Re: Reliability of Excel for Statistical Purposes? (XP or inferior)
The structure would be like this:
Function Demo(Arg as Variant)
Dim oArea as Range
Dim oCell as range
If typename(Arg)="Range" Then
For each oArea in Arg.Areas
For Each oCell in oArea
'Use oCell
Next
Next
Else
'Arg is not a range, probably a constant?
End IfJan Karel Pieterse
Microsoft Excel MVP, WMVP
www.jkpads.com
Professional Office Developers Association

20060502, 22:56 #5
 Join Date
 Dec 2001
 Location
 Doorn, Netherlands
 Posts
 311
 Thanks
 0
 Thanked 0 Times in 0 Posts
Re: Reliability of Excel for Statistical Purposes?
Although you seem to be more interested in writing a high precision function in VBA due to Excel's inaccuracy when using it as a statistical tool, this site http://digilander.libero.it/foxes/xnumber_review_47.htm (free) or http://precisioncalc.com/ (not free) may actually have the tools already.
Regards, Teunis

20060502, 23:26 #6
 Join Date
 Mar 2002
 Location
 Buenos Aires, Argentina
 Posts
 877
 Thanks
 0
 Thanked 0 Times in 0 Posts
Re: Reliability of Excel for Statistical Purposes? (XP or inferior)
Hi Jan,
I combined your code with mine and came up with this (eg. for the population variance):
<code>Function VarianzaPX(Arg As Variant) As Double
Dim n As Long
Dim Suma As Double
Dim Media As Double
Dim SumaDesvios2 As Double
Dim oArea As Range
Dim oCell As Range
If TypeName(Arg) = "Range" Then
For Each oArea In Arg.Areas
For Each oCell In oArea
Suma = Suma + oCell.Value
n = n + 1
Next
Next
Media = Suma / n
SumaDesvios2 = 0
For Each oArea In Arg.Areas
For Each oCell In oArea
SumaDesvios2 = SumaDesvios2 + (oCell.Value  Media) ^ 2
Next
Next
PruebaPX = SumaDesvios2 / n
' should Arg be a constant, the algorithm handles it adequately so the output is 0
' (not really aware of what other type Arg may be)
End If
End Function</code>
As far as I tried it, it seems to work exactly like the former I had posted. I thought using range objects would make it possible to work with separate ranges (obtained by holding CTRL as you select them). When selecting separate ranges with the CTRL key as arguments for a formula, Excel writes them, for instance:
B2:B4;D24
And the revamped function won't work with such an argument either.
I'm sure I'm missing something, but I don't know what exactly! <img src=/S/stupidme.gif border=0 alt=stupidme width=30 height=30>
Thanks<img src=/w3timages/blue3line.gif width=33% height=2>
<img src=/S/flags/Argentina.gif border=0 alt=Argentina width=30 height=18> <big><font color=4682b4><font face="Comic Sans MS">Diegol</font face=comic></font color=4682b4> </big>

20060503, 00:49 #7
 Join Date
 Mar 2002
 Location
 Buenos Aires, Argentina
 Posts
 877
 Thanks
 0
 Thanked 0 Times in 0 Posts
Re: Reliability of Excel for Statistical Purposes?
Hello Teunis,
The days before reading the article I was looking for inspiration to delve into Excel functions coding in VBA. When I read the article I got somewhat astonished and didn't realize XNUMBERS (which I already knew) would suffice. Instead I found myself coding to get a taste of it. As I described in a previous post above, I even checked some VBA functions of XNUMBERS without realizing the solution was at the reach of my hand.
BTW, the XVAR() formula in the XNUMBERS plugin looks as though it'll take me a little longer to understand. I'll try to grasp it gradually...<img src=/w3timages/blue3line.gif width=33% height=2>
<img src=/S/flags/Argentina.gif border=0 alt=Argentina width=30 height=18> <big><font color=4682b4><font face="Comic Sans MS">Diegol</font face=comic></font color=4682b4> </big>

20060504, 13:13 #8
 Join Date
 Feb 2001
 Location
 Weert, Limburg, Netherlands
 Posts
 4,812
 Thanks
 0
 Thanked 0 Times in 0 Posts
Re: Reliability of Excel for Statistical Purposes? (XP or inferior)
You are correct if you are indeed using direct references to ranges. But when you are using named ranges, the named range may consist of multiple areas.
If you use the mouse and control key to select noncontiguous areas, than you get multiple arguments for each area indeed, so your function should then be able to cope with an unknown number of arguments. It can be done using the ParamArray keyword in the argument list of the function. (look up in Help) But in your case I'd leave it as is and have it accept a single range argument.Jan Karel Pieterse
Microsoft Excel MVP, WMVP
www.jkpads.com
Professional Office Developers Association

20060504, 23:08 #9
 Join Date
 Mar 2002
 Location
 Buenos Aires, Argentina
 Posts
 877
 Thanks
 0
 Thanked 0 Times in 0 Posts
Re: Reliability of Excel for Statistical Purposes? (XP or inferior)
<hr>But when you are using named ranges, the named range may consist of multiple areas.<hr>So THAT's what I was missing.
Thank you.<img src=/w3timages/blue3line.gif width=33% height=2>
<img src=/S/flags/Argentina.gif border=0 alt=Argentina width=30 height=18> <big><font color=4682b4><font face="Comic Sans MS">Diegol</font face=comic></font color=4682b4> </big>