# Thread: List Unique Value Problems (Excel 2000)

1. ## List Unique Value Problems (Excel 2000)

I use the following formula to list unique values in a list (D101105). However, something strange occurs. If the list is a plain list, i.e. just values, such as a, b, d,d,b ... the formula works. If the list has lots of formulas such as "vlookup", "Index( ....)" .... then the formula below doens't work. I'm puzzled.

={INDEX(D101105,SMALL(IF((ROW(INDIRECT("1:"&ROWS(D101105)))=MATCH(D101105,D101105,0)),ROW(INDIRECT("1:"&ROWS(D101105))),""),ROW(INDIRECT("1:"&ROWS(D101105)))))}

2. ## Re: List Unique Value Problems (Excel 2000)

A couple of corrections to your items to help avoid confusion:
You state:
"In F1 enter & copy down:"

I think that you mean:
"In F101 enter & copy down:"

You don't mention naming a range LIST for your formula:
=IF(ROW()-ROW(F\$101)+1<=MAX(\$E\$101:\$E\$105),INDEX(List,MATCH(ROW()-ROW(\$E\$101)+1,\$E\$101:\$E\$105,0)),"")

'List" refers to: \$D\$101:\$D\$105
=IF(ROW()-ROW(F\$101)+1<=MAX(\$E\$101:\$E\$105),INDEX(\$D\$101:\$D\$105,MATCH(ROW()-ROW(\$E\$101)+1,\$E\$101:\$E\$105,0)),"")

TQ2,
A warning about using array formulas with large ranges:
Each formula does calcs on the whole list. If you have losts of these formulas and your range is large, your spreadsheet can get very very sluggish. At some point you might want to consider just having a macro extract the unique list and "dump" it where you want.

Steve

3. ## Re: List Unique Value Problems (Excel 2000)

Let D100105 house: {"X";3;1;6;3;10}, where X is a label/header.

In E100 enter: Count

In E101 enter & copy down:

=IF((D101<>"")*ISNA(MATCH(D101,\$D\$100100,0)),MAX(\$E\$100:E100)+1,"")

In F101 enter & copy down:

=IF(ROW()-ROW(F\$101)+1<=MAX(\$E\$101:\$E\$105),INDEX(D101105,MATCH(ROW()-ROW(\$E\$101)+1,\$E\$101:\$E\$105,0)),"")

The foregoing requires 2*N cells for a list of size N.

Another approach is ( N cells for a list of size N, see http://makeashorterlink.com/?B19D510B6 )...

In E101 enter:

=INDEX(D101105,1)

In E102 control+shift+enter & copy down

=INDEX(\$D\$101:\$D\$105,MATCH(0,COUNTIF(E\$101:E101,\$D \$101:\$D\$105),0))

Edit: Substituted the real range for "List".

4. ## Re: List Unique Value Problems (Excel 2000)

Thanks for catching typo and "List", a name that I failed to replace by what it refers to.

Do you think that a macro would work faster than the first method?

5. ## Re: List Unique Value Problems (Excel 2000)

Yes a macro would be faster if the list is long enough. I made a list of 2500 items and added the 2 cols of formulas. It took a little over 1 sec to calc. My macro would give the same unsorted list in about 0.05 secs and about 0.15 secs for a sorted list.

The macro was also less sensitive to increased sizes, 5000 items took 3 secs to calc with formulas 0.07 /0.18 macro unsorted and sorted and with 10,000 took ca 7 secs with formulas and 0.14 and 0.22 with the macro.
FYI, Here is my "macro solution"

Steve

<pre>Sub ExtractUniqueList()
Dim wksList As Worksheet
Dim rngList As Range
Dim rngDest As Range
Dim iSortOrder As Integer

'********Change as appropriate ***************
'1=Asc, 0 = unsorted, -1 = desc
iSortOrder = iorder
Set rngDest = Worksheets("Sheet1").Range("D1")
Set wksList = Worksheets("Sheet1")
With wksList
Set rngList = .Range(.Range("A1"), .Range("a65536").End(xlUp))
End With
'************************************************* ********

Call ExtractList(UniqueList(rngList.Value, iSortOrder), rngDest)
Set rngDest = Nothing
Set rngList = Nothing
Set wksList = Nothing
End Sub
Sub ExtractList(vArray, rng As Range)
Dim vExtract()
Dim x As Long
Dim lItems As Long

lItems = UBound(vArray)

ReDim vExtract(1 To lItems, 1 To 1)

For x = 1 To lItems
vExtract(x, 1) = vArray(x)
Next

With rng
.Resize(65537 - .Row, 1).ClearContents
.Resize(lItems, 1).Value = vExtract
End With

End Sub

Public Function UniqueList(vArray, Optional iorder As Integer = 1)
'Takes an array and gives a unique list as an array output
'iOrder determines the sort at the end:
' 1 = Ascending (Default)
' 0 = Unsorted
' -1 = Descending

Dim lItems As Long
Dim NoDupes As New Collection
Dim x As Integer
Dim y As Integer
Dim i As Integer
Dim j As Integer
Dim Temp1
Dim Temp2
Dim vTemp()

On Error Resume Next
For lItems = LBound(vArray) To UBound(vArray)
If Not (IsEmpty(vArray(lItems, 1))) Then
If vArray(lItems, 1) <> "" Then _
End If
Next lItems
On Error GoTo 0

lItems = NoDupes.Count
If iorder <> 0 Then
For x = 1 To lItems - 1
For y = x + 1 To lItems
i = IIf(iorder < 0, y, x)
j = IIf(iorder < 0, x, y)
If NoDupes(i) > NoDupes(j) Then
Temp1 = NoDupes(i)
Temp2 = NoDupes(j)
NoDupes.Remove x + 1
NoDupes.Remove y + 1
End If
Next y
Next x
End If

ReDim vTemp(1 To lItems)
For x = 1 To lItems
vTemp(x) = NoDupes(x)
Next x
UniqueList = vTemp
End Function</pre>

6. ## Re: List Unique Value Problems (Excel 2000)

That's an interesting result.

Compared the initial set

=IF((A3<>"")*ISNA(MATCH(A3,\$A\$2:A2,0)),MAX(\$B\$2:B2 )+1,"")

=IF(ROW()-ROW(C\$3)+1<=MAX(\$B\$3:\$B\$5002),INDEX(\$A\$3:\$A\$5002,M ATCH(ROW()-ROW(\$C\$3)+1,\$B\$3:\$B\$5002,0)),"")

where the data and processing area (A1:C5002) would look like:

{"","","";"X","Count","U-List";473,1,473;453,2,453;907,3,907,...}

with

=MAX(\$B\$3:\$B\$5002)

=IF((A3<>"")*ISNA(MATCH(A3,\$A\$2:A2,0)),LOOKUP(9.99 999999999999E+307,\$B\$2:B2)+1,"")

=IF(ROW()-ROW(C\$3)+1<=\$A\$1,INDEX(\$A\$3:\$A\$5002,MATCH(ROW()-ROW(\$C\$3)+1,\$B\$3:\$B\$5002,0)),"")

where the data and processing area (A1:C5002) would look like:

{995,"","";"X",0,"U-List";473,1,473;453,2,453;907,3,907,...} with a required 0 in B2,

I get using FastExcel following averages (of 4 runs):

Re Calc: 1.6 vs 0.6 (in milliseconds)
Full Calc: 23,610.7 vs 922.9 (in milliseconds)
MicroSecs/Formula: 1,361.1 vs 92.3

It looks like the second set would be comparable to the macro in speed. It's interesting to note the impact of replacing MAX with LOOKUP.

7. ## Re: List Unique Value Problems (Excel 2000)

I just want to make sure (it is not clear to me, from your response): You get "similar" results that I report-the macro is much faster with large datasets than the array formulas?

Steve

8. ## Re: List Unique Value Problems (Excel 2000)

The formula sets I tested for speed are not array formulas.

My intial reply contains two methods: (1) a set of two ordinary (non-array) formulas, (2) a single array-formula.

I thought you compared the macro with method 1 and observed that the macro beats method 1 with large sets. My last post compares method 1 with an improved version of it. The improved version is so much better that I concluded it could be as good as the macro method.

9. ## Re: List Unique Value Problems (Excel 2000)

I realize they are not "array formulas", but they are "array" formulas. The distinction being they still work with arrays even though you don;t have to enter them using ctrl-alt-enter. MMULT and SUMPRODUCT are "array" formulas since they are written that way.

These are still "array calculations" since excel works thru your data. The problem is that in these and the the true "array" formulas that excel must look thru the list multiple time for each cell with a formula. It adds to the complexity of the worksheet and is inefficient.

In my tests I found:
the macro to be by far the best (perhaps it is also dependent on the dataset itself) with large datasets, the larger the dataset the more it indicated that the macro should be used.

Your Method one (with MAX) was the best
Your method 2 the "array formula" with INDEX was by the worst and I gave up on it after several minutes waiting for it to calculate a 5000 record dataset
Your Modification of Method 1 (with LOOKUP) I found to be worse then your original taking about 40 secs to do a full calc.

The methods you propose do have the advantage of "non -VB" and are also "live". But these are both at a price in dramatically increased filesizes and slower calcs with larger datasets. Formulas with large number of "list manipulations" and "list lookups" are able to be done with arrays, but their practicality becomes limited very rapidly. The aspect of them being "live" even gets negated if you must turn auto-calc off. Also even if they are "close" in calculation, the macro is calculated only when called, but the sluggishness will still occur with all the formulas: even if excel does not have to recalc, it still must keep the calc rings updated which can be a big part of initial calcs and when major changes are made to the dataset.

I think the other advantage of the macro, is that the coding is generally much more straightforward to debug and modify. I work with arrays and frequently provide them as solutions, and then are rarely "obvious" about what they do.

Steve

10. ## Re: List Unique Value Problems (Excel 2000)

> These are still "array calculations" since excel works thru your data.

I don't subscribe that view. SumProduct, MMult, and formulas confirmed with control+shift+enter operates on (computed) arrays, but the formulas of Method 1 with MAX, including the LOOKUP version of it operate on range objects. I do not consider such formulas as "array calculations".

> Your Method one (with MAX) was the best
> Your method 2 the "array formula" with INDEX was by the worst and I gave up on it after several minutes waiting for it to calculate a 5000 record dataset
> Your Modification of Method 1 (with LOOKUP) I found to be worse then your original taking about 40 secs to do a full calc.

FastExcel indicates otherwise: The set with LOOKUP performs better than the set with MAX, which is understandable for MAX must examine every cell value, while LOOKUP doesn't (using probably binary search).

> I think the other advantage of the macro, is that the coding is generally much more straightforward to debug and modify. I work with arrays and frequently provide
> them as solutions, and then are rarely "obvious" about what they do.

I agree about the misgivings about the formulas operating on computed arrays. If they can be restricted to apply to relevant subranges within the larger range of interest, the effects can be dramatic. See for an example: http://www.mrexcel.com/board2/viewto...62357&start=10

11. ## Re: List Unique Value Problems (Excel 2000)

I agree that the "array" is nomenclature. I think of the "logic" of the calc rather than the entry and the output. Any time you do a formula with a range, you are doing something of an "array" calc since excel must look thru the list, but that is another discussion.

I don't know why I see a different calc result than you do, it might be my system, it might be the dataset (My set was all company names), the barometric pressure in Pittsburgh, ??, but TQ2 can decide what works best in the application on that system.

I do perfer formulas as much as possible (even the arrays and megaformulas or even several intermediate if necessary) since i find them to be faster, less work, and "live", but I have seen instances where the large number of formulas and lookups [See <post#=315619>post 315619</post#> for a recent example] can be bothersome and some "autorunning" macros will work much better.

Good Discussion!
Thanks,
Steve

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•