Page 1 of 2 12 LastLast
Results 1 to 15 of 22
  1. #1
    2 Star Lounger
    Join Date
    Jun 2001
    Posts
    103
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Macro to Remove Duplicate Entries (MS Word 2003)

    I receive documents that create many duplicated entries. There are a lot of "garbage" at the top of the document so I have to be at the start of where the entries begin.

    The original document is in RTF format and I import it into Word. I am using MS Word 2003.

    I work with 3 documents. Doc 1 is the original, Doc 2 is the new document that will be checked against it, and then Doc 3 is where the duplicated entries are placed. I manually make sure that the entries that are duplicated on Doc 2 are placed at the end of Doc 3 so that I can keep track of how many times the same thing is reported (I sort the entries after they are in). I then copy Doc 2 into Doc 1 with the duplicated entries removed.

    I am looking for a macro to remove the duplicated entries from Doc 2 and place it at the end of Doc 3. It takes an awful long time to go through all the hundreds of pages that I sometimes get. After a while I do get tired and miss entries and that is definitely a nuisance. (I cannot sort Doc 1. The new entries are placed at the end of Doc 1.) That
    Attached Files Attached Files

  2. #2
    3 Star Lounger
    Join Date
    Apr 2004
    Location
    Boston, Massachusetts, USA
    Posts
    389
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Macro to Remove Duplicate Entries (MS Word 200

    This sounds like a problem best addressed with an RTF parser (or at least an RTF tokenizer), but I'm assuming you already know that, and have reasons for asking for what you're asking for (If you do want any tips on finding a good RTF tokenizer, let me know). Clearly, these are RTF command sequences in your document.

    It was easiest to split this into three separate macros (mostly to make up for functions missing in VBA), but it's still fairly concise. I wasn't sure how you wanted to handle the final output, but the result of the main macro is (1). Duplicated entries are removed, and (2). Duplicate entries are stored in an array, ready to be inserted wherever you need them. I commented out a simple example; you may need to modify it to suit your needs.

    <pre>Sub RemoveDupEntries()
    Dim para As Paragraph
    Dim doc As Document
    Dim vEntries() As Variant
    Dim vDupEntries() As Variant
    Dim sParaText As String
    ReDim vEntries(0)
    ReDim vDupEntries(0)
    Set doc = ActiveDocument
    For Each para In doc.Paragraphs
    sParaText = Left(para.Range.Text, para.Range.Characters.count - 1)
    If IsMember(vEntries, sParaText) Then
    If IsMember(vDupEntries, sParaText) = False Then
    Push vDupEntries, sParaText
    End If
    para.Range.Delete
    Else
    Push vEntries, sParaText
    End If
    Next para
    ' Now vDupEntries contains any items that were duplicated.
    ' You can insert it at the end of doc 3 as needed
    ' For example:
    ' Documents("Doc3").Content.InsertAfter Join(vDupEntries, vbCr)
    End Sub
    '
    Function Push(ByRef vArray As Variant, ByVal str As String)
    ReDim Preserve vArray(UBound(vArray) + 1)
    vArray(UBound(vArray)) = str
    End Function
    '
    Function IsMember(vArray As Variant, ByVal str As String)
    Dim v As Variant
    For Each v In vArray
    If v = str Then
    IsMember = True
    Exit Function
    End If
    Next v
    IsMember = False
    End Function
    </pre>


  3. #3
    2 Star Lounger
    Join Date
    Jun 2001
    Posts
    103
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Macro to Remove Duplicate Entries (MS Word 200

    Thank you so much for the SUPER fast reponse.

    The macro runs but it keeps bringing the dialog box

  4. #4
    3 Star Lounger
    Join Date
    Apr 2004
    Location
    Boston, Massachusetts, USA
    Posts
    389
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Macro to Remove Duplicate Entries (MS Word 200

    Doc3 would need to be open when you run the macro as posted. Otherwise, you'll have to explicitly open it:
    <pre>Dim doc3 as Document
    Set doc3 = Documents.Open("C:My DocumentsSomePathDoc3.doc")
    doc3.Content.InsertAfter ...' and so forth
    </pre>


  5. #5
    2 Star Lounger
    Join Date
    Jun 2001
    Posts
    103
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Macro to Remove Duplicate Entries (MS Word 200

    I did change the macro as you suggested:

    Dim doc3 As Document
    Set doc3 = Documents.Open("Cocuments and Settings......Doc3.doc")
    doc3.Content.InsertAfter Join(vDupEntries, vbCr)

    The macro opens "Doc3.doc" but the result is a blank document.

    If I use the original instead of the above (with the document open):
    Documents("Doc3").Content.InsertAfter Join(vDupEntries, vgCr)
    I still get nothing on Doc3.

    So, both ways I am getting a blank Doc3.

    I even created two other documents with with some identical entries to make sure that I was not doing something wrong, but still I get a blank Doc3.

    I know I am missing something.

    Thank you again.

  6. #6
    3 Star Lounger
    Join Date
    Apr 2004
    Location
    Boston, Massachusetts, USA
    Posts
    389
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Macro to Remove Duplicate Entries (MS Word 200

    Try recording a macro of yourself opening the correct Doc3, then take a look at the recorded macro to inspect the path that Word recorded.

  7. #7
    3 Star Lounger
    Join Date
    Apr 2004
    Location
    Boston, Massachusetts, USA
    Posts
    389
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Macro to Remove Duplicate Entries (MS Word 200

    As a sidebar to this thread ...

    This particular task is a great example of how handy it can be to have at least some familiarity with one of the big three so-called "scripting languages", Perl, Python, or Ruby.

    Obviously, VBA is the most appropriate solution for 95% of Word scripting tasks, but occassionally certain ones come along that can really take advantage of some of the strengths of those other languages. What takes dozens of lines to accomplish in VBA (see above) can be done in just a handful of Ruby:
    <pre>require 'win32ole'
    wrd = WIN32OLE.connect('Word.Application')
    doc3 = wrd.Documents("Doc3.doc")
    doc3.Content.Text = wrd.ActiveDocument.Content.Text.split("r").uniq!.j oin("r")
    </pre>


    Sorting and comparing are areas where some of these other languages really excel, and can offer simpler solutions than even native VBA macros.

    Here's another quick example (I'll use Ruby again, cuz' it's my favorite, though Python and Perl are also well equipped here). Say you wanted to delete any paragraph in a document whose style begins with a lowercase letter or a number, or ends with a number (hey, I've done stranger things ...):
    <pre>require 'win32ole'
    wrd = WIN32OLE.connect('Word.Application')
    doc = wrd.ActiveDocument
    doc.Paragraphs.each {|p| p.Range.Delete if p.Style.NameLocal.match(/(^([a-z]|d)|d$)/) }
    </pre>

    Don't get me wrong, I'm not knocking VBA here. Again, it's obviously the most appropriate tool in most cases. But I'd also say that it only takes a day or two to become familiar enough with one of these other languages to make it worth the effort for those other 5% of times when VBA's just not up to the task. (This second example's actually fairly easy to do using the VBScript RegExp object, but still requires more lines than the Ruby version.)

    Sorry for the random sidebar, but wasn't sure where else it belonged.

    Cheers!

  8. #8
    2 Star Lounger
    Join Date
    Jun 2001
    Posts
    103
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Macro to Remove Duplicate Entries (MS Word 200

    This is what I have in that section of the program:

    Dim doc3 As Document

    Documents.Open FileName:="Doc3.doc", ConfirmConversions:=True, ReadOnly:= _
    False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate:= _
    "", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="", _
    Format:=wdOpenFormatAuto, XMLTransform:=""

    doc3.Content.InsertAfter Join(vDupEntries, vbCr)
    End Sub

    Now the macro opens the document, still with nothing in it, with a message box: Object variable or With block variable not set. When I click on OK the line doc3.Content.InsertAfter Join(vDupEntries, vbCr) is highlighted in yellow in the de######.

    I am sorry I am causing such ruckus.

  9. #9
    3 Star Lounger
    Join Date
    Apr 2004
    Location
    Boston, Massachusetts, USA
    Posts
    389
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Macro to Remove Duplicate Entries (MS Word 200

    Looks like you need to assign the document to the doc3 variable:
    <pre>Dim doc3 as Document
    Set doc3 = Documents.Open(FileName:="Doc3.doc", ConfirmConversions:=True, ReadOnly:= _
    False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate:= _
    "", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="", _
    Format:=wdOpenFormatAuto, XMLTransform:="")
    </pre>


  10. #10
    2 Star Lounger
    Join Date
    Jun 2001
    Posts
    103
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Macro to Remove Duplicate Entries (MS Word 200

    Thank you so much for all your help, however, I made the recommended changes. Now the macro says "File could not be found. Check the spelling. Try different name. (Doc3.doc)." and brings me back to the de######. There's got to be an easier way.

  11. #11
    3 Star Lounger
    Join Date
    Apr 2004
    Location
    Boston, Massachusetts, USA
    Posts
    389
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Macro to Remove Duplicate Entries (MS Word 200

    Can you post the exact code you're using?

  12. #12
    2 Star Lounger
    Join Date
    Jun 2001
    Posts
    103
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Macro to Remove Duplicate Entries (MS Word 200

    Sub RemoveDupEntries()
    Dim para As Paragraph
    Dim doc As Document
    Dim vEntries() As Variant
    Dim vDupEntries() As Variant
    Dim sParaText As String
    ReDim vEntries(0)
    ReDim vDupEntries(0)

    Set doc = ActiveDocument

    For Each para In doc.Paragraphs
    sParaText = Left(para.Range.Text, para.Range.Characters.Count - 1)
    If IsMember(vEntries, sParaText) Then
    If IsMember(vDupEntries, sParaText) = False Then
    Push vDupEntries, sParaText
    End If
    para.Range.Delete
    Else
    Push vEntries, sParaText
    End If

    Next para

    Dim doc3 As Document
    Set doc3 = Documents.Open(FileName:="Doc3.doc", ConfirmConversions:=True, _
    ReadOnly:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate:= _
    "", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="", _
    Format:=wdOpenFormatAuto, XMLTransform:="")
    doc3.Content.InsertAfter Join(vDupEntries, vbCr)

    End Sub
    Function Push(ByRef vArray As Variant, ByVal str As String)
    ReDim Preserve vArray(UBound(vArray) + 1)
    vArray(UBound(vArray)) = str
    End Function
    Function IsMember(vArray As Variant, ByVal str As String)
    Dim v As Variant

    For Each v In vArray
    If v = str Then
    IsMember = True
    Exit Function
    End If
    Next v
    IsMember = False
    End Function

  13. #13
    3 Star Lounger
    Join Date
    Apr 2004
    Location
    Boston, Massachusetts, USA
    Posts
    389
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Macro to Remove Duplicate Entries (MS Word 200

    When you only give a relative pathname "doc3.doc", instead of a full pathname ("C:My Documentsdoc3.doc"), Word looks for the file in the "current" directory, which may or may not be what you think is the "current" directory. It's whatever folder appears when you choose File->Open.

    Next time you get the "File could not be found" error, stop the macro, go back to Word, choose File->Open, and make sure you're in the same folder as doc3. Then just cancel the Open dialog, and try running the macro again. Alternately, give the macro the full path to doc3.doc.

  14. #14
    2 Star Lounger
    Join Date
    Jun 2001
    Posts
    103
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Macro to Remove Duplicate Entries (MS Word 200

    I did put the full path on the macro and now it opens the document OK. However, there are no entries pasted on Doc3.

  15. #15
    3 Star Lounger
    Join Date
    Apr 2004
    Location
    Boston, Massachusetts, USA
    Posts
    389
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Macro to Remove Duplicate Entries (MS Word 200

    Perhaps I should have been clearer from the get-go that you'd need to modify the code to suit your particular situation (the location of the documents, their names, etc.). I used the ActiveDocument as the document with the list of items in it (what you call "Doc2") because it was easiest while experimenting with the macro.

    More properly, instead of this:
    <pre>Set doc = ActiveDocument</pre>

    you'd want:
    <pre>Set doc = Documents.Open("CocumentsDoc2.doc") ' or whatever the path is for Doc2</pre>

    or
    <pre>Set doc = Documents("Doc2.doc") ' if doc2 is alread open</pre>


    You may want to experiment with the variables to ensure you've correctly "got" both of them:
    <pre>Sub RemoveDupEntries()
    Dim para As Paragraph
    Dim doc2 As Document
    Dim doc3 As Document
    Dim vEntries() As Variant
    Dim vDupEntries() As Variant
    Dim sParaText As String
    ReDim vEntries(0)
    ReDim vDupEntries(0)

    Set doc2 = Documents.Open("CocumentsDoc2.doc")
    Set doc3 = Documents.Open("CocumentsDoc3.doc")

    ' Remove (or comment out) the two lines following this comment
    ' when you're sure things are working correctly:
    Msgbox "doc2 is now: " & doc2.Name & ", and doc3 is now: " & doc3.Name
    Exit Sub

    For Each para In doc2.Paragraphs
    sParaText = Left(para.Range.Text, para.Range.Characters.Count - 1)
    If IsMember(vEntries, sParaText) Then
    If IsMember(vDupEntries, sParaText) = False Then
    Push vDupEntries, sParaText
    End If
    para.Range.Delete
    Else
    Push vEntries, sParaText
    End If
    Next para

    ' And remove (or comment out) the two lines following this comment
    ' when you're sure the list is correctly populating
    MsgBox Join(vDupEntries, vbCr)
    Exit Sub

    doc3.Content.InsertAfter Join(vDupEntries, vbCr)
    End Sub
    </pre>


Page 1 of 2 12 LastLast

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •