# Thread: Reliability of Excel for Statistical Purposes? (XP or inferior)

1. ## Reliability of Excel for Statistical Purposes? (XP or inferior)

Check out this link (PDF): http://www.stat.uni-muenchen.de/~knu...lv/excelxp.pdf.

I'm using Excel 2000 at home. According to a friend, at least the problem regarding the variance function seems to have been solved in Excel 2003. Anyway, I've written two small Excel functions via VBA script to compute variance based on an entire population (first script) and on a sample (second script) and they work alright for both the cases mentioned in the paper and the cases I tried myself. However these are my first attempts at writing VBA functions for Excel and I would like to know if you have comments with regards to what you would have done in a different way / ways in which errors might arise. I'm also interested in hearing comments on the paper.

Here are the scripts:

<code>' ~~~~~~~~~~~~~~~~~~~~~~~~
' This function computes the variance based on the entire population
' ~~~~~~~~~~~~~~~~~~~~~~~~

Function VarianzaPX(Elementos) As Double

Dim i As Variant
Dim n As Integer ' size of the population
Dim Suma As Double ' sum of the values of the variable
Dim Media As Double ' mean value
Dim SumaDesvios2 As Double ' sum of squared deviations from the mean

n = 0
Suma = 0
For Each i In Elementos
Suma = Suma + i
n = n + 1
Next i
Media = Suma / n

For Each i In Elementos
Next i

End Function

' ~~~~~~~~~~~~~~~~~~~~~~~~
' This function computes the variance based on a sample
' (unbiased estimator of the variance of the variable)
' ~~~~~~~~~~~~~~~~~~~~~~~~

Function VarianzaX(Elementos) As Double

Dim i As Variant
Dim n As Integer ' sample size
Dim Suma As Double ' sum of the sample values of the variable
Dim Media As Double ' sample mean
Dim SumaDesvios2 As Double ' sum of squared deviations from the mean

n = 0
Suma = 0
For Each i In Elementos
Suma = Suma + i
n = n + 1
Next i
Media = Suma / n

For Each i In Elementos
Next i

VarianzaX = SumaDesvios2 / (n - 1)

End Function</code>

2. ## Re: Reliability of Excel for Statistical Purposes? (XP or inferior)

Well, first of all, I would declare n as a Long, since an integer cannot be greater than 32,767 .

Secondly, I doubt if your current method will handle multiple areas.

Third: you could combine both functions into one which accepts an argument to determine which version you want:

Function VarianzaPX(Elementos,Type as Integer) As Double

'All code...
If Type=0 Then
Else
VarianzaPX = SumaDesvios2 / (n - 1)
End If
End Function

3. ## Re: Reliability of Excel for Statistical Purposes? (XP or inferior)

Hello Jan,

> Well, first of all, I would declare n as a Long, since an integer cannot be greater than 32,767 .

You're right.

> Secondly, I doubt if your current method will handle multiple areas.

Do you mean separate ranges? If so, you're correct. Is there an easy workaround this?

> Third: you could combine both functions into one which accepts an argument to determine which version you want:

When I wrote the code I was thinking of being consistent with Excel's separate VAR() and VARP() formulas, but you're right again.

I was wondering if, for completeness, there was the chance to include a description of a) what the formula does and [img]/forums/images/smilies/cool.gif[/img] what the field is meant to be. For instance, the attached shows the description of the formula xPi() included in the XNUMBERS add-in. The XNUMBERS functions are not password-protected, so I tried searching their project to find the description "Multiprecision constant Pi", but to no avail. However it would seem a) can be accomplished. Not so sure about [img]/forums/images/smilies/cool.gif[/img].

Thank you

4. ## Re: Reliability of Excel for Statistical Purposes? (XP or inferior)

The structure would be like this:

Function Demo(Arg as Variant)
Dim oArea as Range
Dim oCell as range
If typename(Arg)="Range" Then
For each oArea in Arg.Areas
For Each oCell in oArea
'Use oCell
Next
Next
Else
'Arg is not a range, probably a constant?
End If

5. ## Re: Reliability of Excel for Statistical Purposes?

Although you seem to be more interested in writing a high precision function in VBA due to Excel's inaccuracy when using it as a statistical tool, this site http://digilander.libero.it/foxes/xnumber_review_47.htm (free) or http://precisioncalc.com/ (not free) may actually have the tools already.

Regards, Teunis

6. ## Re: Reliability of Excel for Statistical Purposes? (XP or inferior)

Hi Jan,

I combined your code with mine and came up with this (eg. for the population variance):

<code>Function VarianzaPX(Arg As Variant) As Double

Dim n As Long
Dim Suma As Double
Dim Media As Double
Dim oArea As Range
Dim oCell As Range

If TypeName(Arg) = "Range" Then
For Each oArea In Arg.Areas
For Each oCell In oArea
Suma = Suma + oCell.Value
n = n + 1
Next
Next

Media = Suma / n

For Each oArea In Arg.Areas
For Each oCell In oArea
Next
Next

' should Arg be a constant, the algorithm handles it adequately so the output is 0
' (not really aware of what other type Arg may be)
End If

End Function</code>

As far as I tried it, it seems to work exactly like the former I had posted. I thought using range objects would make it possible to work with separate ranges (obtained by holding CTRL as you select them). When selecting separate ranges with the CTRL key as arguments for a formula, Excel writes them, for instance:

B2:B4;D24

And the revamped function won't work with such an argument either.
I'm sure I'm missing something, but I don't know what exactly! <img src=/S/stupidme.gif border=0 alt=stupidme width=30 height=30>

Thanks

7. ## Re: Reliability of Excel for Statistical Purposes?

Hello Teunis,

The days before reading the article I was looking for inspiration to delve into Excel functions coding in VBA. When I read the article I got somewhat astonished and didn't realize XNUMBERS (which I already knew) would suffice. Instead I found myself coding to get a taste of it. As I described in a previous post above, I even checked some VBA functions of XNUMBERS without realizing the solution was at the reach of my hand.
BTW, the XVAR() formula in the XNUMBERS plug-in looks as though it'll take me a little longer to understand. I'll try to grasp it gradually...

8. ## Re: Reliability of Excel for Statistical Purposes? (XP or inferior)

You are correct if you are indeed using direct references to ranges. But when you are using named ranges, the named range may consist of multiple areas.

If you use the mouse and control key to select non-contiguous areas, than you get multiple arguments for each area indeed, so your function should then be able to cope with an unknown number of arguments. It can be done using the ParamArray keyword in the argument list of the function. (look up in Help) But in your case I'd leave it as is and have it accept a single range argument.

9. ## Re: Reliability of Excel for Statistical Purposes? (XP or inferior)

<hr>But when you are using named ranges, the named range may consist of multiple areas.<hr>
So THAT's what I was missing.

Thank you.

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•