Results 1 to 11 of 11
  1. #1
    2 Star Lounger
    Join Date
    Mar 2003
    Posts
    174
    Thanks
    0
    Thanked 0 Times in 0 Posts

    List Unique Value Problems (Excel 2000)

    I use the following formula to list unique values in a list (D101105). However, something strange occurs. If the list is a plain list, i.e. just values, such as a, b, d,d,b ... the formula works. If the list has lots of formulas such as "vlookup", "Index( ....)" .... then the formula below doens't work. I'm puzzled.

    ={INDEX(D101105,SMALL(IF((ROW(INDIRECT("1:"&ROWS(D101105)))=MATCH(D101105,D101105,0)),ROW(INDIRECT("1:"&ROWS(D101105))),""),ROW(INDIRECT("1:"&ROWS(D101105)))))}

  2. #2
    WS Lounge VIP sdckapr's Avatar
    Join Date
    Jul 2002
    Location
    Pittsburgh, Pennsylvania, USA
    Posts
    11,225
    Thanks
    14
    Thanked 342 Times in 335 Posts

    Re: List Unique Value Problems (Excel 2000)

    A couple of corrections to your items to help avoid confusion:
    You state:
    "In F1 enter & copy down:"

    I think that you mean:
    "In F101 enter & copy down:"

    You don't mention naming a range LIST for your formula:
    =IF(ROW()-ROW(F$101)+1<=MAX($E$101:$E$105),INDEX(List,MATCH(ROW()-ROW($E$101)+1,$E$101:$E$105,0)),"")

    'List" refers to: $D$101:$D$105
    =IF(ROW()-ROW(F$101)+1<=MAX($E$101:$E$105),INDEX($D$101:$D$105,MATCH(ROW()-ROW($E$101)+1,$E$101:$E$105,0)),"")

    TQ2,
    A warning about using array formulas with large ranges:
    Each formula does calcs on the whole list. If you have losts of these formulas and your range is large, your spreadsheet can get very very sluggish. At some point you might want to consider just having a macro extract the unique list and "dump" it where you want.

    Steve

  3. #3
    3 Star Lounger
    Join Date
    Jan 2002
    Location
    The Hague, Netherlands
    Posts
    283
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: List Unique Value Problems (Excel 2000)

    Let D100105 house: {"X";3;1;6;3;10}, where X is a label/header.

    In E100 enter: Count

    In E101 enter & copy down:

    =IF((D101<>"")*ISNA(MATCH(D101,$D$100100,0)),MAX($E$100:E100)+1,"")

    In F101 enter & copy down:

    =IF(ROW()-ROW(F$101)+1<=MAX($E$101:$E$105),INDEX(D101105,MATCH(ROW()-ROW($E$101)+1,$E$101:$E$105,0)),"")

    The foregoing requires 2*N cells for a list of size N.

    Another approach is ( N cells for a list of size N, see http://makeashorterlink.com/?B19D510B6 )...

    In E101 enter:

    =INDEX(D101105,1)

    In E102 control+shift+enter & copy down

    =INDEX($D$101:$D$105,MATCH(0,COUNTIF(E$101:E101,$D $101:$D$105),0))

    Edit: Substituted the real range for "List".
    Microsoft MVP - Excel

  4. #4
    3 Star Lounger
    Join Date
    Jan 2002
    Location
    The Hague, Netherlands
    Posts
    283
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: List Unique Value Problems (Excel 2000)

    Thanks for catching typo and "List", a name that I failed to replace by what it refers to.

    Do you think that a macro would work faster than the first method?
    Microsoft MVP - Excel

  5. #5
    WS Lounge VIP sdckapr's Avatar
    Join Date
    Jul 2002
    Location
    Pittsburgh, Pennsylvania, USA
    Posts
    11,225
    Thanks
    14
    Thanked 342 Times in 335 Posts

    Re: List Unique Value Problems (Excel 2000)

    Yes a macro would be faster if the list is long enough. I made a list of 2500 items and added the 2 cols of formulas. It took a little over 1 sec to calc. My macro would give the same unsorted list in about 0.05 secs and about 0.15 secs for a sorted list.

    The macro was also less sensitive to increased sizes, 5000 items took 3 secs to calc with formulas 0.07 /0.18 macro unsorted and sorted and with 10,000 took ca 7 secs with formulas and 0.14 and 0.22 with the macro.
    FYI, Here is my "macro solution"

    Steve


    <pre>Sub ExtractUniqueList()
    Dim wksList As Worksheet
    Dim rngList As Range
    Dim rngDest As Range
    Dim iSortOrder As Integer

    '********Change as appropriate ***************
    '1=Asc, 0 = unsorted, -1 = desc
    iSortOrder = iorder
    Set rngDest = Worksheets("Sheet1").Range("D1")
    Set wksList = Worksheets("Sheet1")
    With wksList
    Set rngList = .Range(.Range("A1"), .Range("a65536").End(xlUp))
    End With
    '************************************************* ********

    Call ExtractList(UniqueList(rngList.Value, iSortOrder), rngDest)
    Set rngDest = Nothing
    Set rngList = Nothing
    Set wksList = Nothing
    End Sub
    Sub ExtractList(vArray, rng As Range)
    Dim vExtract()
    Dim x As Long
    Dim lItems As Long

    lItems = UBound(vArray)

    ReDim vExtract(1 To lItems, 1 To 1)

    For x = 1 To lItems
    vExtract(x, 1) = vArray(x)
    Next

    With rng
    .Resize(65537 - .Row, 1).ClearContents
    .Resize(lItems, 1).Value = vExtract
    End With

    End Sub

    Public Function UniqueList(vArray, Optional iorder As Integer = 1)
    'Takes an array and gives a unique list as an array output
    'iOrder determines the sort at the end:
    ' 1 = Ascending (Default)
    ' 0 = Unsorted
    ' -1 = Descending

    Dim lItems As Long
    Dim NoDupes As New Collection
    Dim x As Integer
    Dim y As Integer
    Dim i As Integer
    Dim j As Integer
    Dim Temp1
    Dim Temp2
    Dim vTemp()

    On Error Resume Next
    For lItems = LBound(vArray) To UBound(vArray)
    If Not (IsEmpty(vArray(lItems, 1))) Then
    If vArray(lItems, 1) <> "" Then _
    NoDupes.Add vArray(lItems, 1), CStr(vArray(lItems, 1))
    End If
    Next lItems
    On Error GoTo 0

    lItems = NoDupes.Count
    If iorder <> 0 Then
    For x = 1 To lItems - 1
    For y = x + 1 To lItems
    i = IIf(iorder < 0, y, x)
    j = IIf(iorder < 0, x, y)
    If NoDupes(i) > NoDupes(j) Then
    Temp1 = NoDupes(i)
    Temp2 = NoDupes(j)
    NoDupes.Add Temp1, before:=j
    NoDupes.Add Temp2, before:=i
    NoDupes.Remove x + 1
    NoDupes.Remove y + 1
    End If
    Next y
    Next x
    End If

    ReDim vTemp(1 To lItems)
    For x = 1 To lItems
    vTemp(x) = NoDupes(x)
    Next x
    UniqueList = vTemp
    End Function</pre>


  6. #6
    3 Star Lounger
    Join Date
    Jan 2002
    Location
    The Hague, Netherlands
    Posts
    283
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: List Unique Value Problems (Excel 2000)

    That's an interesting result.

    Compared the initial set

    =IF((A3<>"")*ISNA(MATCH(A3,$A$2:A2,0)),MAX($B$2:B2 )+1,"")

    =IF(ROW()-ROW(C$3)+1<=MAX($B$3:$B$5002),INDEX($A$3:$A$5002,M ATCH(ROW()-ROW($C$3)+1,$B$3:$B$5002,0)),"")

    where the data and processing area (A1:C5002) would look like:

    {"","","";"X","Count","U-List";473,1,473;453,2,453;907,3,907,...}


    with

    =MAX($B$3:$B$5002)

    =IF((A3<>"")*ISNA(MATCH(A3,$A$2:A2,0)),LOOKUP(9.99 999999999999E+307,$B$2:B2)+1,"")

    =IF(ROW()-ROW(C$3)+1<=$A$1,INDEX($A$3:$A$5002,MATCH(ROW()-ROW($C$3)+1,$B$3:$B$5002,0)),"")

    where the data and processing area (A1:C5002) would look like:

    {995,"","";"X",0,"U-List";473,1,473;453,2,453;907,3,907,...} with a required 0 in B2,

    I get using FastExcel following averages (of 4 runs):

    Re Calc: 1.6 vs 0.6 (in milliseconds)
    Full Calc: 23,610.7 vs 922.9 (in milliseconds)
    MicroSecs/Formula: 1,361.1 vs 92.3

    It looks like the second set would be comparable to the macro in speed. It's interesting to note the impact of replacing MAX with LOOKUP.
    Microsoft MVP - Excel

  7. #7
    WS Lounge VIP sdckapr's Avatar
    Join Date
    Jul 2002
    Location
    Pittsburgh, Pennsylvania, USA
    Posts
    11,225
    Thanks
    14
    Thanked 342 Times in 335 Posts

    Re: List Unique Value Problems (Excel 2000)

    I just want to make sure (it is not clear to me, from your response): You get "similar" results that I report-the macro is much faster with large datasets than the array formulas?

    Steve

  8. #8
    3 Star Lounger
    Join Date
    Jan 2002
    Location
    The Hague, Netherlands
    Posts
    283
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: List Unique Value Problems (Excel 2000)

    The formula sets I tested for speed are not array formulas.

    My intial reply contains two methods: (1) a set of two ordinary (non-array) formulas, (2) a single array-formula.

    I thought you compared the macro with method 1 and observed that the macro beats method 1 with large sets. My last post compares method 1 with an improved version of it. The improved version is so much better that I concluded it could be as good as the macro method.
    Microsoft MVP - Excel

  9. #9
    WS Lounge VIP sdckapr's Avatar
    Join Date
    Jul 2002
    Location
    Pittsburgh, Pennsylvania, USA
    Posts
    11,225
    Thanks
    14
    Thanked 342 Times in 335 Posts

    Re: List Unique Value Problems (Excel 2000)

    Sorry, wrong /inadequate nomenclature.
    I realize they are not "array formulas", but they are "array" formulas. The distinction being they still work with arrays even though you don;t have to enter them using ctrl-alt-enter. MMULT and SUMPRODUCT are "array" formulas since they are written that way.

    These are still "array calculations" since excel works thru your data. The problem is that in these and the the true "array" formulas that excel must look thru the list multiple time for each cell with a formula. It adds to the complexity of the worksheet and is inefficient.

    In my tests I found:
    the macro to be by far the best (perhaps it is also dependent on the dataset itself) with large datasets, the larger the dataset the more it indicated that the macro should be used.

    Your Method one (with MAX) was the best
    Your method 2 the "array formula" with INDEX was by the worst and I gave up on it after several minutes waiting for it to calculate a 5000 record dataset
    Your Modification of Method 1 (with LOOKUP) I found to be worse then your original taking about 40 secs to do a full calc.

    The methods you propose do have the advantage of "non -VB" and are also "live". But these are both at a price in dramatically increased filesizes and slower calcs with larger datasets. Formulas with large number of "list manipulations" and "list lookups" are able to be done with arrays, but their practicality becomes limited very rapidly. The aspect of them being "live" even gets negated if you must turn auto-calc off. Also even if they are "close" in calculation, the macro is calculated only when called, but the sluggishness will still occur with all the formulas: even if excel does not have to recalc, it still must keep the calc rings updated which can be a big part of initial calcs and when major changes are made to the dataset.

    I think the other advantage of the macro, is that the coding is generally much more straightforward to debug and modify. I work with arrays and frequently provide them as solutions, and then are rarely "obvious" about what they do.

    Steve

  10. #10
    3 Star Lounger
    Join Date
    Jan 2002
    Location
    The Hague, Netherlands
    Posts
    283
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: List Unique Value Problems (Excel 2000)

    > These are still "array calculations" since excel works thru your data.

    I don't subscribe that view. SumProduct, MMult, and formulas confirmed with control+shift+enter operates on (computed) arrays, but the formulas of Method 1 with MAX, including the LOOKUP version of it operate on range objects. I do not consider such formulas as "array calculations".

    > Your Method one (with MAX) was the best
    > Your method 2 the "array formula" with INDEX was by the worst and I gave up on it after several minutes waiting for it to calculate a 5000 record dataset
    > Your Modification of Method 1 (with LOOKUP) I found to be worse then your original taking about 40 secs to do a full calc.

    FastExcel indicates otherwise: The set with LOOKUP performs better than the set with MAX, which is understandable for MAX must examine every cell value, while LOOKUP doesn't (using probably binary search).

    > I think the other advantage of the macro, is that the coding is generally much more straightforward to debug and modify. I work with arrays and frequently provide
    > them as solutions, and then are rarely "obvious" about what they do.

    I agree about the misgivings about the formulas operating on computed arrays. If they can be restricted to apply to relevant subranges within the larger range of interest, the effects can be dramatic. See for an example: http://www.mrexcel.com/board2/viewto...62357&start=10

    Aladin
    Microsoft MVP - Excel

  11. #11
    WS Lounge VIP sdckapr's Avatar
    Join Date
    Jul 2002
    Location
    Pittsburgh, Pennsylvania, USA
    Posts
    11,225
    Thanks
    14
    Thanked 342 Times in 335 Posts

    Re: List Unique Value Problems (Excel 2000)

    I agree that the "array" is nomenclature. I think of the "logic" of the calc rather than the entry and the output. Any time you do a formula with a range, you are doing something of an "array" calc since excel must look thru the list, but that is another discussion.

    I don't know why I see a different calc result than you do, it might be my system, it might be the dataset (My set was all company names), the barometric pressure in Pittsburgh, ??, but TQ2 can decide what works best in the application on that system.

    I do perfer formulas as much as possible (even the arrays and megaformulas or even several intermediate if necessary) since i find them to be faster, less work, and "live", but I have seen instances where the large number of formulas and lookups [See <post#=315619>post 315619</post#> for a recent example] can be bothersome and some "autorunning" macros will work much better.

    Good Discussion!
    Thanks,
    Steve

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •