Results 1 to 13 of 13
  1. #1
    2 Star Lounger
    Join Date
    Mar 2003
    Posts
    191
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Finding duplicate records (2003)

    Hi there. I have a data set of many columns. I am trying to work out which is the best way of finding and deleteing the rows where that data in two particular is the same and are duplicates. Any ideas?
    What is the best way for me to acheive this?
    Thanks heaps.

    Bill

  2. #2
    Plutonium Lounger
    Join Date
    Mar 2002
    Posts
    84,353
    Thanks
    0
    Thanked 29 Times in 29 Posts

    Re: Finding duplicate records (2003)

    Is there a word missing from your post? ("... that data in two particular is the same ...")
    Are you looking for rows that are completely identical, or for rows that are identical in two specific columns?

  3. #3
    2 Star Lounger
    Join Date
    Mar 2003
    Posts
    191
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Finding duplicate records (2003)

    Whoops, sorry Hans there was a word missing.
    I am looking for rows that are identical in two specific columns. Is this possible to do?
    Thanks as always.

    Bill

  4. #4
    WS Lounge VIP sdckapr's Avatar
    Join Date
    Jul 2002
    Location
    Pittsburgh, Pennsylvania, USA
    Posts
    11,225
    Thanks
    14
    Thanked 342 Times in 335 Posts

    Re: Finding duplicate records (2003)

    you could create a new column and place something like (assuming you want to check COl A and B
    =A2=B2
    then copy it down the column.

    Then you can use data-filter- autofilter
    filter for true on this column
    select all the rows
    then delete these rows
    remove the autofilter (data-filer autofilter)
    Delete the new column

    Steve

  5. #5
    Plutonium Lounger
    Join Date
    Mar 2002
    Posts
    84,353
    Thanks
    0
    Thanked 29 Times in 29 Posts

    Re: Finding duplicate records (2003)

    Here is a macro that works on a different assumption than Steve's solution. I interpreted your request to mean that if two rows have identical values in two specific columns, then one of the rows should be deleted. You can decide which one you need.

    Sub DeleteDups()
    Dim rng As Range
    Dim lngMaxRow As Long
    Dim lngRow As Long
    Const Col1 = 1
    Const Col2 = 2
    Set rng = Range("A1").CurrentRegion
    rng.Sort Key1:=rng.Cells(1, Col1), Key2:=rng.Cells(1, Col2)
    lngMaxRow = rng.Rows.Count
    For lngRow = lngMaxRow To 2 Step -1
    If rng.Cells(lngRow, Col1) = rng.Cells(lngRow - 1, Col1) And _
    rng.Cells(lngRow, Col2) = rng.Cells(lngRow - 1, Col2) Then
    rng.Cells(lngRow, 1).EntireRow.Delete
    End If
    Next lngRow
    Set rng = Nothing
    End Sub

    If the data don't begin in A1, change the line mentioning Range("A1") accordingly.
    The constants Col1 and Col2 specify which columns are to be compared.

  6. #6
    3 Star Lounger
    Join Date
    Dec 2000
    Location
    Vancouver, Br. Columbia, Canada
    Posts
    268
    Thanks
    0
    Thanked 1 Time in 1 Post

    Re: Finding duplicate records (2003)

    As I understand it (or perhaps, as I misunderstand it) you have more than two columns of data (and many rows). You wish to compare rows and identify rows that are identical in two or more comumns, like this:

    <table border=1><td>Row Num</td><td>Ref Col1</td><td>Ref Col2</td><td>Other Data</td><td>More Fields...</td><td>1</td><td>A</td><td>123</td><td>Different Data</td></tr><td>2</td><td>B</td><td>123</td><td>Description<td><td>3</td><td>B</td><td>123</td><td>Different Description</td><td>4</td><td>B</td><td>578</td><td>Same Description</td><td>More...</td></table>
    You will compare the entires in Ref Col1 and Ref Col2 - if they match in both rows, you will delete one of the rows - in this case, like row 2 and 3.

    How will you know which row to delete, and which to retain?

  7. #7
    2 Star Lounger
    Join Date
    Mar 2003
    Posts
    191
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Finding duplicate records (2003)

    Dean,
    It doesn't really matter which of the duplicate records is deleted as I want to delete any duplicate record and leave only one instance remaining.
    Hope this helps.
    Thanks all for your help thus far. Much appreciated.

    Bill

  8. #8
    Plutonium Lounger
    Join Date
    Mar 2002
    Posts
    84,353
    Thanks
    0
    Thanked 29 Times in 29 Posts

    Re: Finding duplicate records (2003)

    See if the macro I posted does what you want. (Make a backup copy of your workbook before trying the macro).

  9. #9
    2 Star Lounger
    Join Date
    Mar 2003
    Posts
    191
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Finding duplicate records (2003)

    Hans,
    I've got to be honest, I've had a mental blank here.
    Where do I cut and paste your code into? Sorry, it's been a while!
    Thanks.

    Bill

  10. #10
    3 Star Lounger
    Join Date
    Dec 2000
    Location
    Vancouver, Br. Columbia, Canada
    Posts
    268
    Thanks
    0
    Thanked 1 Time in 1 Post

    Re: Finding duplicate records (2003)

    <hr>It doesn't really matter which of the duplicate records is deleted...<hr>

    Then I think we really do have a misunderstanding. When you initially described what you wanted to do as determine "...the best way of finding and deleteing the rows where that data in two particular..." - I thought you were referring to the data being the same in two particular (specified) columns. Looking at the code Hans supplied, I can see that he made the same assumption. Based on your comment above, though, were you referring to all the data being the same (ie, in every column) on two particular rows?

    In this case, if you had -say- twelve columns of data, and two rows were identical in eleven columns, but differed in the twelfth column, you would retain them; only if they were identical in all columns would you delete them - in which case, as you say, it wouldn't matter which one was retained.

    The followoing routine will step from the bottom of your data range to the top comparing the values in each column of one row to the values in the row immediately above it. If they all match it will delete the row, otherwise it will move up to the next row. It only goes as far as comparing the third row to the second row: I have implicitly assumed that the first row is a header row. In order for this to work, you will have to sort the data range on every column. If you have more data columns than the three that excel allows for sorting there are techniques to sort on all columns: the easiest is to sort on the three least-significant columns first, then on the three next-least, and so on, until you are sorting on the most significant columns.

    <pre>Sub DelDupes()
    Dim i As Long
    Dim j As Long
    Dim iRows As Long
    Dim iCols As Long
    Dim blDel As Boolean


    iRows = ActiveSheet.UsedRange.Rows.Count
    iCols = ActiveSheet.UsedRange.Columns.Count
    blDel = True
    For i = iRows To 3 Step -1
    For j = 1 To iCols
    If Cells(i, j) <> Cells(i - 1, j) Then blDel = False
    Next j
    If blDel = True Then
    Cells(i, 1).EntireRow.Delete
    End If
    blDel = True
    Next i

    End Sub
    </pre>


    I have attached a small sample.

  11. #11
    Plutonium Lounger
    Join Date
    Mar 2002
    Posts
    84,353
    Thanks
    0
    Thanked 29 Times in 29 Posts

    Re: Finding duplicate records (2003)

    - Copy the code from my post to the clipboard.
    - Switch to Excel, with the workbook open.
    - Activate the Visual Basic Editor (Alt+F11)
    - Select Insert | Module.
    - Paste the code into the module (Edit | Paste or Ctrl+V).
    - Adjust the column numbers Col1 = 1 and Col2 = 2 to the columns on which you want to compare.
    - It might be a good idea to save the workbook at this point (there is a Save button in the toolbar in the Visual Basic Editor)
    - To run the code, click somewhere inside it and press F5.

  12. #12
    3 Star Lounger
    Join Date
    Aug 2004
    Posts
    361
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Finding duplicate records (2003)

    Attached is a spreadsheet that shows duplicates being found. It will not automatically delete the duplicates. I like to check the data before I delete the so-called duplicates. The sheet does not use VBA, just a simple formul and some conditional formatting for the last column.

    Note that you can use the VBA shown in post 457117 that will delete all rows that have 'dup' in column A. Thanks to Legare Coleman

  13. #13
    2 Star Lounger
    Join Date
    Mar 2003
    Posts
    191
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Finding duplicate records (2003)

    Hans,
    With a bit of tweaking, your code worked a treak. Thanks so much for your help.

    Bill

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •