Results 1 to 9 of 9
  1. #1
    5 Star Lounger
    Join Date
    Mar 2002
    Location
    Buenos Aires, Argentina
    Posts
    877
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Reliability of Excel for Statistical Purposes? (XP or inferior)

    Check out this link (PDF): http://www.stat.uni-muenchen.de/~knu...lv/excelxp.pdf.

    I'm using Excel 2000 at home. According to a friend, at least the problem regarding the variance function seems to have been solved in Excel 2003. Anyway, I've written two small Excel functions via VBA script to compute variance based on an entire population (first script) and on a sample (second script) and they work alright for both the cases mentioned in the paper and the cases I tried myself. However these are my first attempts at writing VBA functions for Excel and I would like to know if you have comments with regards to what you would have done in a different way / ways in which errors might arise. I'm also interested in hearing comments on the paper.

    Here are the scripts:


    <code>' ~~~~~~~~~~~~~~~~~~~~~~~~
    ' This function computes the variance based on the entire population
    ' ~~~~~~~~~~~~~~~~~~~~~~~~

    Function VarianzaPX(Elementos) As Double

    Dim i As Variant
    Dim n As Integer ' size of the population
    Dim Suma As Double ' sum of the values of the variable
    Dim Media As Double ' mean value
    Dim SumaDesvios2 As Double ' sum of squared deviations from the mean

    n = 0
    Suma = 0
    For Each i In Elementos
    Suma = Suma + i
    n = n + 1
    Next i
    Media = Suma / n

    SumaDesvios2 = 0
    For Each i In Elementos
    SumaDesvios2 = SumaDesvios2 + (i - Media) ^ 2
    Next i

    VarianzaPX = SumaDesvios2 / n

    End Function


    ' ~~~~~~~~~~~~~~~~~~~~~~~~
    ' This function computes the variance based on a sample
    ' (unbiased estimator of the variance of the variable)
    ' ~~~~~~~~~~~~~~~~~~~~~~~~

    Function VarianzaX(Elementos) As Double

    Dim i As Variant
    Dim n As Integer ' sample size
    Dim Suma As Double ' sum of the sample values of the variable
    Dim Media As Double ' sample mean
    Dim SumaDesvios2 As Double ' sum of squared deviations from the mean

    n = 0
    Suma = 0
    For Each i In Elementos
    Suma = Suma + i
    n = n + 1
    Next i
    Media = Suma / n

    SumaDesvios2 = 0
    For Each i In Elementos
    SumaDesvios2 = SumaDesvios2 + (i - Media) ^ 2
    Next i

    VarianzaX = SumaDesvios2 / (n - 1)

    End Function</code>

    Thank you in advance
    <img src=/w3timages/blue3line.gif width=33% height=2>
    <img src=/S/flags/Argentina.gif border=0 alt=Argentina width=30 height=18> <big><font color=4682b4><font face="Comic Sans MS">Diegol</font face=comic></font color=4682b4> </big>

  2. #2
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Weert, Limburg, Netherlands
    Posts
    4,812
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Reliability of Excel for Statistical Purposes? (XP or inferior)

    Well, first of all, I would declare n as a Long, since an integer cannot be greater than 32,767 .

    Secondly, I doubt if your current method will handle multiple areas.

    Third: you could combine both functions into one which accepts an argument to determine which version you want:

    Function VarianzaPX(Elementos,Type as Integer) As Double

    'All code...
    If Type=0 Then
    VarianzaPX = SumaDesvios2 / n
    Else
    VarianzaPX = SumaDesvios2 / (n - 1)
    End If
    End Function
    Jan Karel Pieterse
    Microsoft Excel MVP, WMVP
    www.jkp-ads.com
    Professional Office Developers Association

  3. #3
    5 Star Lounger
    Join Date
    Mar 2002
    Location
    Buenos Aires, Argentina
    Posts
    877
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Reliability of Excel for Statistical Purposes? (XP or inferior)

    Hello Jan,

    > Well, first of all, I would declare n as a Long, since an integer cannot be greater than 32,767 .

    You're right.

    > Secondly, I doubt if your current method will handle multiple areas.

    Do you mean separate ranges? If so, you're correct. Is there an easy workaround this?

    > Third: you could combine both functions into one which accepts an argument to determine which version you want:

    When I wrote the code I was thinking of being consistent with Excel's separate VAR() and VARP() formulas, but you're right again.

    I was wondering if, for completeness, there was the chance to include a description of a) what the formula does and [img]/forums/images/smilies/cool.gif[/img] what the field is meant to be. For instance, the attached shows the description of the formula xPi() included in the XNUMBERS add-in. The XNUMBERS functions are not password-protected, so I tried searching their project to find the description "Multiprecision constant Pi", but to no avail. However it would seem a) can be accomplished. Not so sure about [img]/forums/images/smilies/cool.gif[/img].

    Thank you
    <img src=/w3timages/blue3line.gif width=33% height=2>
    <img src=/S/flags/Argentina.gif border=0 alt=Argentina width=30 height=18> <big><font color=4682b4><font face="Comic Sans MS">Diegol</font face=comic></font color=4682b4> </big>

  4. #4
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Weert, Limburg, Netherlands
    Posts
    4,812
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Reliability of Excel for Statistical Purposes? (XP or inferior)

    The structure would be like this:

    Function Demo(Arg as Variant)
    Dim oArea as Range
    Dim oCell as range
    If typename(Arg)="Range" Then
    For each oArea in Arg.Areas
    For Each oCell in oArea
    'Use oCell
    Next
    Next
    Else
    'Arg is not a range, probably a constant?
    End If
    Jan Karel Pieterse
    Microsoft Excel MVP, WMVP
    www.jkp-ads.com
    Professional Office Developers Association

  5. #5
    3 Star Lounger
    Join Date
    Dec 2001
    Location
    Doorn, Netherlands
    Posts
    311
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Reliability of Excel for Statistical Purposes?

    Although you seem to be more interested in writing a high precision function in VBA due to Excel's inaccuracy when using it as a statistical tool, this site http://digilander.libero.it/foxes/xnumber_review_47.htm (free) or http://precisioncalc.com/ (not free) may actually have the tools already.

    Regards, Teunis

  6. #6
    5 Star Lounger
    Join Date
    Mar 2002
    Location
    Buenos Aires, Argentina
    Posts
    877
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Reliability of Excel for Statistical Purposes? (XP or inferior)

    Hi Jan,

    I combined your code with mine and came up with this (eg. for the population variance):

    <code>Function VarianzaPX(Arg As Variant) As Double

    Dim n As Long
    Dim Suma As Double
    Dim Media As Double
    Dim SumaDesvios2 As Double
    Dim oArea As Range
    Dim oCell As Range

    If TypeName(Arg) = "Range" Then
    For Each oArea In Arg.Areas
    For Each oCell In oArea
    Suma = Suma + oCell.Value
    n = n + 1
    Next
    Next

    Media = Suma / n
    SumaDesvios2 = 0

    For Each oArea In Arg.Areas
    For Each oCell In oArea
    SumaDesvios2 = SumaDesvios2 + (oCell.Value - Media) ^ 2
    Next
    Next

    PruebaPX = SumaDesvios2 / n

    ' should Arg be a constant, the algorithm handles it adequately so the output is 0
    ' (not really aware of what other type Arg may be)
    End If

    End Function</code>

    As far as I tried it, it seems to work exactly like the former I had posted. I thought using range objects would make it possible to work with separate ranges (obtained by holding CTRL as you select them). When selecting separate ranges with the CTRL key as arguments for a formula, Excel writes them, for instance:

    B2:B4;D24

    And the revamped function won't work with such an argument either.
    I'm sure I'm missing something, but I don't know what exactly! <img src=/S/stupidme.gif border=0 alt=stupidme width=30 height=30>

    Thanks
    <img src=/w3timages/blue3line.gif width=33% height=2>
    <img src=/S/flags/Argentina.gif border=0 alt=Argentina width=30 height=18> <big><font color=4682b4><font face="Comic Sans MS">Diegol</font face=comic></font color=4682b4> </big>

  7. #7
    5 Star Lounger
    Join Date
    Mar 2002
    Location
    Buenos Aires, Argentina
    Posts
    877
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Reliability of Excel for Statistical Purposes?

    Hello Teunis,

    The days before reading the article I was looking for inspiration to delve into Excel functions coding in VBA. When I read the article I got somewhat astonished and didn't realize XNUMBERS (which I already knew) would suffice. Instead I found myself coding to get a taste of it. As I described in a previous post above, I even checked some VBA functions of XNUMBERS without realizing the solution was at the reach of my hand.
    BTW, the XVAR() formula in the XNUMBERS plug-in looks as though it'll take me a little longer to understand. I'll try to grasp it gradually...
    <img src=/w3timages/blue3line.gif width=33% height=2>
    <img src=/S/flags/Argentina.gif border=0 alt=Argentina width=30 height=18> <big><font color=4682b4><font face="Comic Sans MS">Diegol</font face=comic></font color=4682b4> </big>

  8. #8
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Weert, Limburg, Netherlands
    Posts
    4,812
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Reliability of Excel for Statistical Purposes? (XP or inferior)

    You are correct if you are indeed using direct references to ranges. But when you are using named ranges, the named range may consist of multiple areas.

    If you use the mouse and control key to select non-contiguous areas, than you get multiple arguments for each area indeed, so your function should then be able to cope with an unknown number of arguments. It can be done using the ParamArray keyword in the argument list of the function. (look up in Help) But in your case I'd leave it as is and have it accept a single range argument.
    Jan Karel Pieterse
    Microsoft Excel MVP, WMVP
    www.jkp-ads.com
    Professional Office Developers Association

  9. #9
    5 Star Lounger
    Join Date
    Mar 2002
    Location
    Buenos Aires, Argentina
    Posts
    877
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Reliability of Excel for Statistical Purposes? (XP or inferior)

    <hr>But when you are using named ranges, the named range may consist of multiple areas.<hr>
    So THAT's what I was missing.

    Thank you.
    <img src=/w3timages/blue3line.gif width=33% height=2>
    <img src=/S/flags/Argentina.gif border=0 alt=Argentina width=30 height=18> <big><font color=4682b4><font face="Comic Sans MS">Diegol</font face=comic></font color=4682b4> </big>

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •