Results 1 to 15 of 15
  1. #1
    2 Star Lounger
    Join Date
    Jan 2011
    Posts
    199
    Thanks
    21
    Thanked 2 Times in 2 Posts

    Count unique numbers in a spreadsheet

    Hi there

    I have a spreadsheet of documents that have been downloaded in one column, and the e-mail address of the person who downloaded it in another.

    Some people have downloaded the same document more than once.

    I have been able to calculate how many times each document has been downloaded using the COUNTIFS function, but I am battling to identify how many unique downloads there have been....

    There is a function FREQUENCY that I thought would help, but it does not appear to give the right answers - it looks for a range...

    Please give me some tips.....

    Format for the spreadsheet of data is:
    A B C D E
    Filename - Document Number - Description - Email - Date/Time

    There are 2200 rows.................!!

    Thanks for the advice!

  2. Get our unique weekly Newsletter with tips and techniques, how to's and critical updates on Windows 7, Windows 8, Windows XP, Firefox, Internet Explorer, Google, etc. Join our 480,000 subscribers!

    Excel 2013: The Missing Manual

    + Get this BONUS — free!

    Get the most of Excel! Learn about new features, basics of creating a new spreadsheet and using the infamous Ribbon in the first chapter of Excel 2013: The Missing Manual - Subscribe and download Chapter 1 for free!

  3. #2
    Platinum Lounger
    Join Date
    Dec 2009
    Location
    Earth
    Posts
    3,626
    Thanks
    7
    Thanked 231 Times in 219 Posts
    A pivot table should give you the frequency if you count the documents and individual downloaders, but if the text has extra spaces or capitals vs lowercase you may not get the right answer. You'll need to convert everything to the same case and remove leading and trailing spaces before running the pivot.

    cheers, Paul

  4. #3
    WS Lounge VIP sdckapr's Avatar
    Join Date
    Jul 2002
    Location
    Pittsburgh, Pennsylvania, USA
    Posts
    11,225
    Thanks
    14
    Thanked 341 Times in 334 Posts
    It would be helpful if you attached a sample workbook. It does not have to contain 2200 rows, but just enough to demonstrate the function not returning what you want and to show us what you need. An output of the results of the sample would also be helpful so that we can check our logic in computing.

    Steve
    If I understand the setup (and I may not) and you have no blanks in the data range, the array formula (confirm with ctrl-shift-enter) may be want you want (adjust the range as needed):
    Code:
    =SUM(1/COUNTIFS(A2:A2200,A2:A2200,D2:D2200,D2:D2200))
    Last edited by sdckapr; 2014-03-17 at 06:56.

  5. #4
    2 Star Lounger
    Join Date
    Jan 2011
    Posts
    199
    Thanks
    21
    Thanked 2 Times in 2 Posts

    Thanks for the suggestions.........

    Here is a sample of the data that I want to extract.

    The data is in one sheet, whilst the summary is in the other worksheet of the file.

    Thanks so much!!!!

    Cheers
    AlainDownload sample.xlsx

  6. #5
    Bronze Lounger Maudibe's Avatar
    Join Date
    Aug 2010
    Location
    Pa, USA
    Posts
    1,213
    Thanks
    46
    Thanked 231 Times in 212 Posts
    Photerer,

    Here are 2 sample codes:

    1. When placed in the Worksheet_Activate event subroutine, this code will filter all the unique downloaded files, document their frequency, and then tally the number of unique files found in cell C2. The Download page will be updated automatically each time it is viewed.

    Filtered:
    downloads1.png

    Code:
    Private Sub Worksheet_Activate()
    Application.ScreenUpdating = False
    '-------------------------------------------------------
    'CLEAR DOWNLOADS SHEET
    LastRow = ActiveSheet.Cells(Rows.Count, 2).End(xlUp).Row
    Range(Cells(5, 1), Cells(LastRow, 2)).ClearContents
    '-------------------------------------------------------
    'COPY FILES
    With Worksheets("Sheet1")
    LastRow = .Cells(Rows.Count, 1).End(xlUp).Row
    For I = 2 To LastRow
        Cells(I + 3, 2) = .Cells(I, 1)
        Cells(I + 3, 1) = .Cells(I, 2)
    Next I
    '------------------------------------------------------
    'CREATE UNIQUE LIST
    LastRow = ActiveSheet.Cells(Rows.Count, 2).End(xlUp).Row
    ActiveSheet.Range("A5:B" & LastRow).RemoveDuplicates Columns:=1, Header:=xlNo
    LastRow = ActiveSheet.Cells(Rows.Count, 2).End(xlUp).Row
    '------------------------------------------------------
    'TOTAL UNIQUE FILES
    Cells(2, 3) = LastRow - 4
    End With
    Application.ScreenUpdating = True
    End Sub
    2. This code will display all the file names and their frequencies but tally only those whose frequencies are not 0 (unique) in cell C2

    Not Filtered:
    downloads2.png

    Code:
    Private Sub Worksheet_Activate()
    Application.ScreenUpdating = False
    Counter = 0
    '-------------------------------------------------------
    'COUNT UNIQUE FILES
    LastRow = ActiveSheet.Cells(Rows.Count, 2).End(xlUp).Row
    For I = 5 To LastRow
        If Cells(I, 3) <> 0 Then Counter = Counter + 1
    Next I
    Cells(2, 3) = Counter
    Application.ScreenUpdating = True
    End Sub
    Attached Files Attached Files

  7. The Following User Says Thank You to Maudibe For This Useful Post:

    Photorer (2014-03-18)

  8. #6
    2 Star Lounger
    Join Date
    Jan 2011
    Posts
    199
    Thanks
    21
    Thanked 2 Times in 2 Posts
    Thanks - that is great, however what I am after is that for each document I need to know how many times it has been downloaded, and then how many unique e-mail users downloaded the document. Sometimes the same user downloads the same document more than once, so it distorts the results.

    Are there any built in functions to do this? There is a button on the Data Tab of the ribbon that removes duplicates. Is there any way of doing that without changing the original table?

  9. #7
    Bronze Lounger Maudibe's Avatar
    Join Date
    Aug 2010
    Location
    Pa, USA
    Posts
    1,213
    Thanks
    46
    Thanked 231 Times in 212 Posts
    So you are not talking about # of unique documents rather # of each document downloaded by a unique user?

  10. #8
    2 Star Lounger
    Join Date
    Jan 2011
    Posts
    199
    Thanks
    21
    Thanked 2 Times in 2 Posts
    Yes - that's exactly right.... sorry if I didn't explain correctly.

  11. #9
    WS Lounge VIP sdckapr's Avatar
    Join Date
    Jul 2002
    Location
    Pittsburgh, Pennsylvania, USA
    Posts
    11,225
    Thanks
    14
    Thanked 341 Times in 334 Posts
    I am not sure you can do it directly with an formula (I will think some more about it), but it can be done with a formula using an intermediate calc. In Sheet1!F2:
    =1/SUMPRODUCT(($A$2:$A$32&$D$2:$D$32=A2&D2)*1)
    Copy F2 down the column. This is the intemediate calc. It is the inverse of the number of copies for each row. If the number in F is 1 then the combo of filename and email is unique. If the value is 0.5 (=1/2) there are 2 copies of the combination, if 0.33333 (=1/3) there are 3 copies, etc.

    In Downloads!E5:
    =SUMIF(Sheet1!$A$2:$A$32,B5,Sheet1!$F$2:$F$32)

    Copy E5 down the column. This should be the sum of unique entries for a particular filename.

    It could be also done with a user defined function, but that is more work and since each formula would need to calculate based on the entire range, with 2000 calls it may be sluggish, the intermediate column eliminates some of the redundant calculations and even the most complex formulas are more efficient than a even a simple UDF.

    Steve

  12. The Following User Says Thank You to sdckapr For This Useful Post:

    Photorer (2014-03-18)

  13. #10
    WS Lounge VIP sdckapr's Avatar
    Join Date
    Jul 2002
    Location
    Pittsburgh, Pennsylvania, USA
    Posts
    11,225
    Thanks
    14
    Thanked 341 Times in 334 Posts
    If you want to try the User defined function here is simple means:
    Code:
    Option Explicit
    Function CountUnique(sFilename As String, rFilename As Range, rEmail As Range) As Long
      Dim x As Long
      Dim colUnique As New Collection
      Dim sTemp As String
      On Error Resume Next
      For x = 1 To rFilename.Count
        sTemp = rFilename(x)
        If sTemp = sFilename Then
          colUnique.Add sTemp, CStr(sTemp & rEmail(x))
        End If
      Next
      CountUnique = colUnique.Count
    End Function
    In Downloads!F2:
    =countunique(B5,Sheet1!$A$2:$A$32,Sheet1!$D$2:$D$3 2)

    Copy it down the column.

    Steve

  14. The Following User Says Thank You to sdckapr For This Useful Post:

    Photorer (2014-03-18)

  15. #11
    2 Star Lounger
    Join Date
    Jan 2011
    Posts
    199
    Thanks
    21
    Thanked 2 Times in 2 Posts
    Hey Steve

    How can I thank you!??? That is awesome! Just the result I was looking for. What a great help!

    I can easily work on the two formulas in the two spreadsheets..... it is easy to do, and copying of data when it is updated is also easy to the data sheet.... it just gets longer! I will amend the ranges to something longer than the actual data, just to make sure I don't miss any of the downloads.

    Please would you explain how this works for me:
    =1/SUMPRODUCT(($A$2:$A$32&$D$2:$D$32=A2&D2)*1)
    I understand that it is the inverse of the SUMPRODUCT result, but I have not seen the & sign used in formulas that don't entail a text string!

    Thanks for educating me along the way!

  16. #12
    WS Lounge VIP sdckapr's Avatar
    Join Date
    Jul 2002
    Location
    Pittsburgh, Pennsylvania, USA
    Posts
    11,225
    Thanks
    14
    Thanked 341 Times in 334 Posts
    Perhaps this will help
    $A$2:$A$32 is the filenames: the 31 cells from A2 to A32
    $D$2:$D$32 is the email range: the the 31 cells from D2 to D32

    Imagine in column F you add the formula:
    A2&D2
    Which would concatenate the filename and the email together
    Now imagine copying it down from F3:F32

    Now you would have
    F2:F32 is the Combined file & email range of the 31 combinations.

    The formula:
    $A$2:$A$32&$D$2:$D$32
    is equivalent to that range only it is internal, not physical in the worksheet

    SUMPRODUCT(($A$2:$A$32&$D$2:$D$32=A2&D2)*1)

    Checks an individual combined result versus all the results and creates an array of true/false values and teh "*1" converts to numbers: FALSE = 0, TRUE = 1. This essentially counts the number of the matching combinations.

    [It is akin to (using our imaginary F-col range);
    SUMIF($F$2:$F$32, A2:B2)
    but SUMIF does not like the array manipulation of creating the internal intermediate column, where sumproduct does]

    So the sumproduct would give the number of copies of each of those combinations.
    =1/SUMPRODUCT(($A$2:$A$32&$D$2:$D$32=A2&D2)*1)
    is the inverse. The inverse is used, so when you sum them up, all of the copies add to "1":
    1 unique = 1
    With 2 copies each has a value of 1/2 [2* 1/2 = 1]
    with 3 copies each has a value of 1/3 [3* 1/3 = 1]
    Etc.

    I could only figure out a way to eliminate one intermediate column, but not the 2nd, so that inverse of the copy count needs to be used. If you want and it makes more sense to you, you could use the intermediate "F-col calc" with the inverse of the sumif to achieve the same results. Other than using another column, the performance should not be dramatically affected. [The "understanding" may be of more value to you than the elimination of the 2nd intermediate calculation in a column, that is your call...]

    Steve

  17. The Following User Says Thank You to sdckapr For This Useful Post:

    Photorer (2014-03-18)

  18. #13
    2 Star Lounger
    Join Date
    Jan 2011
    Posts
    199
    Thanks
    21
    Thanked 2 Times in 2 Posts
    Steve - I have only one thing to say: "Give that man a Bells!"

    Thank you so much for explaining this to me. Combining the filename and the e-mail is a stroke of genius! That is definitely the cleverest and simplest way of comparing the two - I would never have thought of that!

    Your explanation is also really clear - thanks for that.... I am still learning about array calculations / functions so this is a big help.

    Thanks a million both for your time to work out a solution, and even more so for explaining it to me so clearly!

  19. #14
    WS Lounge VIP sdckapr's Avatar
    Join Date
    Jul 2002
    Location
    Pittsburgh, Pennsylvania, USA
    Posts
    11,225
    Thanks
    14
    Thanked 341 Times in 334 Posts
    You are very welcome. I am glad that I could help. If you have any other questions don't hesitate to post back.

    Chip Pearson has an excellent primer on Array formulas at http://www.cpearson.com/excel/ArrayFormulas.aspx and Bob Umlas has some great examples of them as well at http://www.emailoffice.com/excel/arrays-bobumlas.html if you are looking for some good reading on them to help understand them.
    Steve

  20. #15
    2 Star Lounger
    Join Date
    Jun 2010
    Location
    Indy
    Posts
    101
    Thanks
    36
    Thanked 7 Times in 6 Posts
    You already have a solution, so just fyi:
    Quote Originally Posted by Photorer View Post
    There is a button on the Data Tab of the ribbon that removes duplicates. Is there any way of doing that without changing the original table?
    Quick n dirty would be to record a macro which copies the sheet to a new sheet, and then runs Remove Duplicates on the columns you select in the new sheet. Count formulas would then give the numbers you need from each sheet.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •