1. Count unique numbers in a spreadsheet

Hi there

I have been able to calculate how many times each document has been downloaded using the COUNTIFS function, but I am battling to identify how many unique downloads there have been....

There is a function FREQUENCY that I thought would help, but it does not appear to give the right answers - it looks for a range...

Format for the spreadsheet of data is:
A B C D E
Filename - Document Number - Description - Email - Date/Time

There are 2200 rows.................!!

2. A pivot table should give you the frequency if you count the documents and individual downloaders, but if the text has extra spaces or capitals vs lowercase you may not get the right answer. You'll need to convert everything to the same case and remove leading and trailing spaces before running the pivot.

cheers, Paul

3. It would be helpful if you attached a sample workbook. It does not have to contain 2200 rows, but just enough to demonstrate the function not returning what you want and to show us what you need. An output of the results of the sample would also be helpful so that we can check our logic in computing.

Steve
If I understand the setup (and I may not) and you have no blanks in the data range, the array formula (confirm with ctrl-shift-enter) may be want you want (adjust the range as needed):
Code:
=SUM(1/COUNTIFS(A2:A2200,A2:A2200,D2:D2200,D2:D2200))

4. Thanks for the suggestions.........

Here is a sample of the data that I want to extract.

The data is in one sheet, whilst the summary is in the other worksheet of the file.

Thanks so much!!!!

Cheers

5. Photerer,

Here are 2 sample codes:

1. When placed in the Worksheet_Activate event subroutine, this code will filter all the unique downloaded files, document their frequency, and then tally the number of unique files found in cell C2. The Download page will be updated automatically each time it is viewed.

Filtered:

Code:
Private Sub Worksheet_Activate()
Application.ScreenUpdating = False
'-------------------------------------------------------
LastRow = ActiveSheet.Cells(Rows.Count, 2).End(xlUp).Row
Range(Cells(5, 1), Cells(LastRow, 2)).ClearContents
'-------------------------------------------------------
'COPY FILES
With Worksheets("Sheet1")
LastRow = .Cells(Rows.Count, 1).End(xlUp).Row
For I = 2 To LastRow
Cells(I + 3, 2) = .Cells(I, 1)
Cells(I + 3, 1) = .Cells(I, 2)
Next I
'------------------------------------------------------
'CREATE UNIQUE LIST
LastRow = ActiveSheet.Cells(Rows.Count, 2).End(xlUp).Row
LastRow = ActiveSheet.Cells(Rows.Count, 2).End(xlUp).Row
'------------------------------------------------------
'TOTAL UNIQUE FILES
Cells(2, 3) = LastRow - 4
End With
Application.ScreenUpdating = True
End Sub
2. This code will display all the file names and their frequencies but tally only those whose frequencies are not 0 (unique) in cell C2

Not Filtered:

Code:
Private Sub Worksheet_Activate()
Application.ScreenUpdating = False
Counter = 0
'-------------------------------------------------------
'COUNT UNIQUE FILES
LastRow = ActiveSheet.Cells(Rows.Count, 2).End(xlUp).Row
For I = 5 To LastRow
If Cells(I, 3) <> 0 Then Counter = Counter + 1
Next I
Cells(2, 3) = Counter
Application.ScreenUpdating = True
End Sub

6. The Following User Says Thank You to Maudibe For This Useful Post:

Photorer (2014-03-18)

7. Thanks - that is great, however what I am after is that for each document I need to know how many times it has been downloaded, and then how many unique e-mail users downloaded the document. Sometimes the same user downloads the same document more than once, so it distorts the results.

Are there any built in functions to do this? There is a button on the Data Tab of the ribbon that removes duplicates. Is there any way of doing that without changing the original table?

8. So you are not talking about # of unique documents rather # of each document downloaded by a unique user?

9. Yes - that's exactly right.... sorry if I didn't explain correctly.

10. I am not sure you can do it directly with an formula (I will think some more about it), but it can be done with a formula using an intermediate calc. In Sheet1!F2:
=1/SUMPRODUCT((\$A\$2:\$A\$32&\$D\$2:\$D\$32=A2&D2)*1)
Copy F2 down the column. This is the intemediate calc. It is the inverse of the number of copies for each row. If the number in F is 1 then the combo of filename and email is unique. If the value is 0.5 (=1/2) there are 2 copies of the combination, if 0.33333 (=1/3) there are 3 copies, etc.

=SUMIF(Sheet1!\$A\$2:\$A\$32,B5,Sheet1!\$F\$2:\$F\$32)

Copy E5 down the column. This should be the sum of unique entries for a particular filename.

It could be also done with a user defined function, but that is more work and since each formula would need to calculate based on the entire range, with 2000 calls it may be sluggish, the intermediate column eliminates some of the redundant calculations and even the most complex formulas are more efficient than a even a simple UDF.

Steve

11. The Following User Says Thank You to sdckapr For This Useful Post:

Photorer (2014-03-18)

12. If you want to try the User defined function here is simple means:
Code:
Option Explicit
Function CountUnique(sFilename As String, rFilename As Range, rEmail As Range) As Long
Dim x As Long
Dim colUnique As New Collection
Dim sTemp As String
On Error Resume Next
For x = 1 To rFilename.Count
sTemp = rFilename(x)
If sTemp = sFilename Then
End If
Next
CountUnique = colUnique.Count
End Function
=countunique(B5,Sheet1!\$A\$2:\$A\$32,Sheet1!\$D\$2:\$D\$3 2)

Copy it down the column.

Steve

13. The Following User Says Thank You to sdckapr For This Useful Post:

Photorer (2014-03-18)

14. Hey Steve

How can I thank you!??? That is awesome! Just the result I was looking for. What a great help!

I can easily work on the two formulas in the two spreadsheets..... it is easy to do, and copying of data when it is updated is also easy to the data sheet.... it just gets longer! I will amend the ranges to something longer than the actual data, just to make sure I don't miss any of the downloads.

Please would you explain how this works for me:
=1/SUMPRODUCT((\$A\$2:\$A\$32&\$D\$2:\$D\$32=A2&D2)*1)
I understand that it is the inverse of the SUMPRODUCT result, but I have not seen the & sign used in formulas that don't entail a text string!

Thanks for educating me along the way!

15. Perhaps this will help
\$A\$2:\$A\$32 is the filenames: the 31 cells from A2 to A32
\$D\$2:\$D\$32 is the email range: the the 31 cells from D2 to D32

Imagine in column F you add the formula:
A2&D2
Which would concatenate the filename and the email together
Now imagine copying it down from F3:F32

Now you would have
F2:F32 is the Combined file & email range of the 31 combinations.

The formula:
\$A\$2:\$A\$32&\$D\$2:\$D\$32
is equivalent to that range only it is internal, not physical in the worksheet

SUMPRODUCT((\$A\$2:\$A\$32&\$D\$2:\$D\$32=A2&D2)*1)

Checks an individual combined result versus all the results and creates an array of true/false values and teh "*1" converts to numbers: FALSE = 0, TRUE = 1. This essentially counts the number of the matching combinations.

[It is akin to (using our imaginary F-col range);
SUMIF(\$F\$2:\$F\$32, A2:B2)
but SUMIF does not like the array manipulation of creating the internal intermediate column, where sumproduct does]

So the sumproduct would give the number of copies of each of those combinations.
=1/SUMPRODUCT((\$A\$2:\$A\$32&\$D\$2:\$D\$32=A2&D2)*1)
is the inverse. The inverse is used, so when you sum them up, all of the copies add to "1":
1 unique = 1
With 2 copies each has a value of 1/2 [2* 1/2 = 1]
with 3 copies each has a value of 1/3 [3* 1/3 = 1]
Etc.

I could only figure out a way to eliminate one intermediate column, but not the 2nd, so that inverse of the copy count needs to be used. If you want and it makes more sense to you, you could use the intermediate "F-col calc" with the inverse of the sumif to achieve the same results. Other than using another column, the performance should not be dramatically affected. [The "understanding" may be of more value to you than the elimination of the 2nd intermediate calculation in a column, that is your call...]

Steve

16. The Following User Says Thank You to sdckapr For This Useful Post:

Photorer (2014-03-18)

17. Steve - I have only one thing to say: "Give that man a Bells!"

Thank you so much for explaining this to me. Combining the filename and the e-mail is a stroke of genius! That is definitely the cleverest and simplest way of comparing the two - I would never have thought of that!

Your explanation is also really clear - thanks for that.... I am still learning about array calculations / functions so this is a big help.

Thanks a million both for your time to work out a solution, and even more so for explaining it to me so clearly!

18. You are very welcome. I am glad that I could help. If you have any other questions don't hesitate to post back.

Chip Pearson has an excellent primer on Array formulas at http://www.cpearson.com/excel/ArrayFormulas.aspx and Bob Umlas has some great examples of them as well at http://www.emailoffice.com/excel/arrays-bobumlas.html if you are looking for some good reading on them to help understand them.
Steve

19. You already have a solution, so just fyi:
Originally Posted by Photorer
There is a button on the Data Tab of the ribbon that removes duplicates. Is there any way of doing that without changing the original table?
Quick n dirty would be to record a macro which copies the sheet to a new sheet, and then runs Remove Duplicates on the columns you select in the new sheet. Count formulas would then give the numbers you need from each sheet.

Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•