Results 1 to 15 of 15

20140317, 03:40 #1
 Join Date
 Jan 2011
 Posts
 193
 Thanks
 21
 Thanked 2 Times in 2 Posts
Count unique numbers in a spreadsheet
Hi there
I have a spreadsheet of documents that have been downloaded in one column, and the email address of the person who downloaded it in another.
Some people have downloaded the same document more than once.
I have been able to calculate how many times each document has been downloaded using the COUNTIFS function, but I am battling to identify how many unique downloads there have been....
There is a function FREQUENCY that I thought would help, but it does not appear to give the right answers  it looks for a range...
Please give me some tips.....
Format for the spreadsheet of data is:
A B C D E
Filename  Document Number  Description  Email  Date/Time
There are 2200 rows.................!!
Thanks for the advice!

Subscribe to our Windows Secrets Newsletter  It's Free!
Get our unique weekly Newsletter with tips and techniques, how to's and critical updates on Windows 7, Windows 8, Windows XP, Firefox, Internet Explorer, Google, etc. Join our 480,000 subscribers!
+ Get this BONUS — free!
Get the most of Excel! Learn about new features, basics of creating a new spreadsheet and using the infamous Ribbon in the first chapter of Excel 2013: The Missing Manual  Subscribe and download Chapter 1 for free!

20140317, 06:01 #2
 Join Date
 Dec 2009
 Location
 Earth
 Posts
 3,463
 Thanks
 7
 Thanked 214 Times in 203 Posts
A pivot table should give you the frequency if you count the documents and individual downloaders, but if the text has extra spaces or capitals vs lowercase you may not get the right answer. You'll need to convert everything to the same case and remove leading and trailing spaces before running the pivot.
cheers, Paul

20140317, 06:48 #3
 Join Date
 Jul 2002
 Location
 Pittsburgh, Pennsylvania, USA
 Posts
 11,198
 Thanks
 14
 Thanked 329 Times in 322 Posts
It would be helpful if you attached a sample workbook. It does not have to contain 2200 rows, but just enough to demonstrate the function not returning what you want and to show us what you need. An output of the results of the sample would also be helpful so that we can check our logic in computing.
Steve
If I understand the setup (and I may not) and you have no blanks in the data range, the array formula (confirm with ctrlshiftenter) may be want you want (adjust the range as needed):
Code:=SUM(1/COUNTIFS(A2:A2200,A2:A2200,D2:D2200,D2:D2200))
Last edited by sdckapr; 20140317 at 06:56.

20140317, 07:46 #4
 Join Date
 Jan 2011
 Posts
 193
 Thanks
 21
 Thanked 2 Times in 2 Posts
Thanks for the suggestions.........
Here is a sample of the data that I want to extract.
The data is in one sheet, whilst the summary is in the other worksheet of the file.
Thanks so much!!!!
Cheers
AlainDownload sample.xlsx

20140317, 20:04 #5
 Join Date
 Aug 2010
 Location
 Pa, USA
 Posts
 1,089
 Thanks
 39
 Thanked 189 Times in 176 Posts
Photerer,
Here are 2 sample codes:
1. When placed in the Worksheet_Activate event subroutine, this code will filter all the unique downloaded files, document their frequency, and then tally the number of unique files found in cell C2. The Download page will be updated automatically each time it is viewed.
Filtered:
downloads1.png
Code:Private Sub Worksheet_Activate() Application.ScreenUpdating = False ' 'CLEAR DOWNLOADS SHEET LastRow = ActiveSheet.Cells(Rows.Count, 2).End(xlUp).Row Range(Cells(5, 1), Cells(LastRow, 2)).ClearContents ' 'COPY FILES With Worksheets("Sheet1") LastRow = .Cells(Rows.Count, 1).End(xlUp).Row For I = 2 To LastRow Cells(I + 3, 2) = .Cells(I, 1) Cells(I + 3, 1) = .Cells(I, 2) Next I ' 'CREATE UNIQUE LIST LastRow = ActiveSheet.Cells(Rows.Count, 2).End(xlUp).Row ActiveSheet.Range("A5:B" & LastRow).RemoveDuplicates Columns:=1, Header:=xlNo LastRow = ActiveSheet.Cells(Rows.Count, 2).End(xlUp).Row ' 'TOTAL UNIQUE FILES Cells(2, 3) = LastRow  4 End With Application.ScreenUpdating = True End Sub
Not Filtered:
downloads2.png
Code:Private Sub Worksheet_Activate() Application.ScreenUpdating = False Counter = 0 ' 'COUNT UNIQUE FILES LastRow = ActiveSheet.Cells(Rows.Count, 2).End(xlUp).Row For I = 5 To LastRow If Cells(I, 3) <> 0 Then Counter = Counter + 1 Next I Cells(2, 3) = Counter Application.ScreenUpdating = True End Sub

The Following User Says Thank You to Maudibe For This Useful Post:
Photorer (20140318)

20140318, 01:57 #6
 Join Date
 Jan 2011
 Posts
 193
 Thanks
 21
 Thanked 2 Times in 2 Posts
Thanks  that is great, however what I am after is that for each document I need to know how many times it has been downloaded, and then how many unique email users downloaded the document. Sometimes the same user downloads the same document more than once, so it distorts the results.
Are there any built in functions to do this? There is a button on the Data Tab of the ribbon that removes duplicates. Is there any way of doing that without changing the original table?

20140318, 06:23 #7
 Join Date
 Aug 2010
 Location
 Pa, USA
 Posts
 1,089
 Thanks
 39
 Thanked 189 Times in 176 Posts
So you are not talking about # of unique documents rather # of each document downloaded by a unique user?

20140318, 07:01 #8
 Join Date
 Jan 2011
 Posts
 193
 Thanks
 21
 Thanked 2 Times in 2 Posts
Yes  that's exactly right.... sorry if I didn't explain correctly.

20140318, 08:44 #9
 Join Date
 Jul 2002
 Location
 Pittsburgh, Pennsylvania, USA
 Posts
 11,198
 Thanks
 14
 Thanked 329 Times in 322 Posts
I am not sure you can do it directly with an formula (I will think some more about it), but it can be done with a formula using an intermediate calc. In Sheet1!F2:
=1/SUMPRODUCT(($A$2:$A$32&$D$2:$D$32=A2&D2)*1)
Copy F2 down the column. This is the intemediate calc. It is the inverse of the number of copies for each row. If the number in F is 1 then the combo of filename and email is unique. If the value is 0.5 (=1/2) there are 2 copies of the combination, if 0.33333 (=1/3) there are 3 copies, etc.
In Downloads!E5:
=SUMIF(Sheet1!$A$2:$A$32,B5,Sheet1!$F$2:$F$32)
Copy E5 down the column. This should be the sum of unique entries for a particular filename.
It could be also done with a user defined function, but that is more work and since each formula would need to calculate based on the entire range, with 2000 calls it may be sluggish, the intermediate column eliminates some of the redundant calculations and even the most complex formulas are more efficient than a even a simple UDF.
Steve

The Following User Says Thank You to sdckapr For This Useful Post:
Photorer (20140318)

20140318, 09:09 #10
 Join Date
 Jul 2002
 Location
 Pittsburgh, Pennsylvania, USA
 Posts
 11,198
 Thanks
 14
 Thanked 329 Times in 322 Posts
If you want to try the User defined function here is simple means:
Code:Option Explicit Function CountUnique(sFilename As String, rFilename As Range, rEmail As Range) As Long Dim x As Long Dim colUnique As New Collection Dim sTemp As String On Error Resume Next For x = 1 To rFilename.Count sTemp = rFilename(x) If sTemp = sFilename Then colUnique.Add sTemp, CStr(sTemp & rEmail(x)) End If Next CountUnique = colUnique.Count End Function
=countunique(B5,Sheet1!$A$2:$A$32,Sheet1!$D$2:$D$3 2)
Copy it down the column.
Steve

The Following User Says Thank You to sdckapr For This Useful Post:
Photorer (20140318)

20140318, 14:27 #11
 Join Date
 Jan 2011
 Posts
 193
 Thanks
 21
 Thanked 2 Times in 2 Posts
Hey Steve
How can I thank you!??? That is awesome! Just the result I was looking for. What a great help!
I can easily work on the two formulas in the two spreadsheets..... it is easy to do, and copying of data when it is updated is also easy to the data sheet.... it just gets longer! I will amend the ranges to something longer than the actual data, just to make sure I don't miss any of the downloads.
Please would you explain how this works for me:
=1/SUMPRODUCT(($A$2:$A$32&$D$2:$D$32=A2&D2)*1)
I understand that it is the inverse of the SUMPRODUCT result, but I have not seen the & sign used in formulas that don't entail a text string!
Thanks for educating me along the way!

20140318, 14:47 #12
 Join Date
 Jul 2002
 Location
 Pittsburgh, Pennsylvania, USA
 Posts
 11,198
 Thanks
 14
 Thanked 329 Times in 322 Posts
Perhaps this will help
$A$2:$A$32 is the filenames: the 31 cells from A2 to A32
$D$2:$D$32 is the email range: the the 31 cells from D2 to D32
Imagine in column F you add the formula:
A2&D2
Which would concatenate the filename and the email together
Now imagine copying it down from F3:F32
Now you would have
F2:F32 is the Combined file & email range of the 31 combinations.
The formula:
$A$2:$A$32&$D$2:$D$32
is equivalent to that range only it is internal, not physical in the worksheet
SUMPRODUCT(($A$2:$A$32&$D$2:$D$32=A2&D2)*1)
Checks an individual combined result versus all the results and creates an array of true/false values and teh "*1" converts to numbers: FALSE = 0, TRUE = 1. This essentially counts the number of the matching combinations.
[It is akin to (using our imaginary Fcol range);
SUMIF($F$2:$F$32, A2:B2)
but SUMIF does not like the array manipulation of creating the internal intermediate column, where sumproduct does]
So the sumproduct would give the number of copies of each of those combinations.
=1/SUMPRODUCT(($A$2:$A$32&$D$2:$D$32=A2&D2)*1)
is the inverse. The inverse is used, so when you sum them up, all of the copies add to "1":
1 unique = 1
With 2 copies each has a value of 1/2 [2* 1/2 = 1]
with 3 copies each has a value of 1/3 [3* 1/3 = 1]
Etc.
I could only figure out a way to eliminate one intermediate column, but not the 2nd, so that inverse of the copy count needs to be used. If you want and it makes more sense to you, you could use the intermediate "Fcol calc" with the inverse of the sumif to achieve the same results. Other than using another column, the performance should not be dramatically affected. [The "understanding" may be of more value to you than the elimination of the 2nd intermediate calculation in a column, that is your call...]
Steve

The Following User Says Thank You to sdckapr For This Useful Post:
Photorer (20140318)

20140318, 15:33 #13
 Join Date
 Jan 2011
 Posts
 193
 Thanks
 21
 Thanked 2 Times in 2 Posts
Steve  I have only one thing to say: "Give that man a Bells!"
Thank you so much for explaining this to me. Combining the filename and the email is a stroke of genius! That is definitely the cleverest and simplest way of comparing the two  I would never have thought of that!
Your explanation is also really clear  thanks for that.... I am still learning about array calculations / functions so this is a big help.
Thanks a million both for your time to work out a solution, and even more so for explaining it to me so clearly!

20140318, 18:10 #14
 Join Date
 Jul 2002
 Location
 Pittsburgh, Pennsylvania, USA
 Posts
 11,198
 Thanks
 14
 Thanked 329 Times in 322 Posts
You are very welcome. I am glad that I could help. If you have any other questions don't hesitate to post back.
Chip Pearson has an excellent primer on Array formulas at http://www.cpearson.com/excel/ArrayFormulas.aspx and Bob Umlas has some great examples of them as well at http://www.emailoffice.com/excel/arraysbobumlas.html if you are looking for some good reading on them to help understand them.
Steve

20140324, 05:52 #15
 Join Date
 Jun 2010
 Location
 Indy
 Posts
 95
 Thanks
 34
 Thanked 6 Times in 5 Posts