  1. #1
    New Lounger
    Join Date
    Mar 2016
    Posts
    8
    Thanks
    3
    Thanked 0 Times in 0 Posts

    macro to find and replace in batch from a MS 2007 Word file using Regular Expressions (regex)

    Hello everyone.

    In a .docx file, I would like to batch-replace groups of several original words with a single term, which must keep the same formatting as the original terms.
    Since the Find and Replace option in MS Word does not support complex regex such as abc(?!(cde\b|ba)), and I need to carry out dozens of such time-consuming changes, sometimes involving just parts of words, I thought a macro supporting regular expressions could do the job.

    Any help would be very much appreciated.
    Many thanks.
    Last edited by REGEX; 2016-03-27 at 09:41. Reason: clarity

  2. #2
    Super Moderator RetiredGeek's Avatar
    Join Date
    Mar 2004
    Location
    Manning, South Carolina
    Posts
    9,436
    Thanks
    372
    Thanked 1,457 Times in 1,326 Posts
    RegEx,

    Welcome to the Lounge as a New Poster!

    Perhaps it is your construction of the Regular Expression.

    For example this RegEx in your Link:
    (\d)([a-z])
    would NOT work in Word. However, with a little manipulation you can get it to work: ([0-9])([a-z])

    Sample Run Before:
    RegEx Before.PNG

    Sample Run After:
    RegEx After.PNG
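
    As a rough sketch of how the same wildcard search could be run from a macro rather than the Find and Replace dialog: the code below uses Word's own Find object with MatchWildcards switched on. The replacement text "\1 \2" (put a space between the digit and the letter) is only a hypothetical example, since the screenshots do not say which replacement was used.

```vba
Sub WildcardReplaceDemo()
    ' Sketch only: runs a Word wildcard Find/Replace over the whole active document.
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "([0-9])([a-z])"       ' Word wildcards: [0-9] stands in for \d
        .Replacement.Text = "\1 \2"    ' hypothetical replacement: digit, space, letter
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop
        .Format = False                ' leave the existing formatting alone
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

    Try it on a copy of the document first, since a Replace All changes every match in one pass.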

    Perhaps if you would provide a description of what you are trying to find / replace along with a sample document we may be able to help.

    HTH
    Last edited by RetiredGeek; 2016-03-25 at 20:52.
    May the Forces of good computing be with you!

    RG

    PowerShell & VBA Rule!

    My Systems: Desktop Specs
    Laptop Specs

  4. #3
    Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    5,055
    Thanks
    2
    Thanked 417 Times in 346 Posts
    For batch processing, you might use an Excel workbook to hold the Find expressions in one column and the Replace expressions in another. For that, you might use code like:
    Code:
    Sub BulkFindReplace()
    Application.ScreenUpdating = False
    Dim xlApp As Object, xlWkBk As Object, StrWkBkNm As String, StrWkSht As String
    Dim iDataRow As Long, xlFList As String, xlRList As String, i As Long
    StrWkBkNm = "C:\Users\" & Environ("Username") & "\Documents\Workbook Name.xls"
    StrWkSht = "Sheet1"
    If Dir(StrWkBkNm) = "" Then
      MsgBox "Cannot find the designated workbook: " & StrWkBkNm, vbExclamation
      Exit Sub
    End If
    On Error Resume Next
    'Start Excel
    Set xlApp = CreateObject("Excel.Application")
    If xlApp Is Nothing Then
      MsgBox "Can't start Excel.", vbExclamation
      Exit Sub
    End If
    With xlApp
      'Hide our Excel session
      .Visible = False
      ' The file is available, so open it.
      Set xlWkBk = .Workbooks.Open(FileName:=StrWkBkNm, ReadOnly:=True, AddToMru:=False)
      ' Restore normal error handling once the open attempt has been made.
      On Error GoTo 0
      If xlWkBk Is Nothing Then
        MsgBox "Cannot open:" & vbCr & StrWkBkNm, vbExclamation
        .Quit
        Exit Sub
      End If
      ' Process the workbook.
      With xlWkBk
        'Ensure the worksheet exists
        If SheetExists(xlWkBk, StrWkSht) = True Then
          With .Worksheets(StrWkSht)
            ' Find the last-used row in column A.
            iDataRow = .Cells(.Rows.Count, 1).End(-4162).Row ' -4162 = xlUp
            ' Capture the F/R data.
            For i = 1 To iDataRow
              ' Skip over empty fields to preserve the underlying cell contents.
              If Trim(.Range("A" & i)) <> vbNullString Then
                xlFList = xlFList & "|" & Trim(.Range("A" & i))
                xlRList = xlRList & "|" & Trim(.Range("B" & i))
              End If
            Next
          End With
        Else
          MsgBox "Cannot find the designated worksheet: " & StrWkSht, vbExclamation
        End If
        .Close False
      End With
      .Quit
    End With
    ' Release Excel object memory
    Set xlWkBk = Nothing: Set xlApp = Nothing
    'Exit if there are no data
    If xlFList = "" Then Exit Sub
    'Process the active document
    With ActiveDocument
      'code to process the document goes here
    End With
    Application.ScreenUpdating = True
    End Sub
     
    Function SheetExists(xlWkBk As Object, SheetName As String) As Boolean
    ' Returns True if the supplied workbook contains a sheet with this name.
    SheetExists = False
    On Error GoTo NoSuchSheet
    If Len(xlWkBk.Sheets(SheetName).Name) > 0 Then SheetExists = True
    NoSuchSheet:
    End Function
    Note that you'll need to add whatever document processing code you want, where indicated. To use the full power of Regular Expression functions in Word, you'd need to use a macro with a reference to the Microsoft vbScript Regular Expressions library. Whilst Word's own wildcard support only gives access to a sub-set of the Regular Expression functions, it does provide functionality the vbScript Regular Expressions library doesn't support, including the ability to specify character/paragraph formatting and languages - for both the find & replace side of things.
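
    To illustrate the trade-off, here is a minimal sketch (not part of the macro above) of one way a VBScript regex could be applied to a Word document while leaving the surrounding formatting in place: the regex only locates the matches, and each match is then replaced through a Word Range. The replacement term "XYZ" is hypothetical, and the sketch assumes a simple document in which character offsets in the document text line up with Range positions, which may not hold where fields or tables are present.

```vba
Sub RegexReplaceInActiveDocument()
    ' Sketch only: late-bound VBScript regex, so no Tools|References entry is needed.
    Dim re As Object, matches As Object, rng As Range, i As Long
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "abc(?!(cde\b|ba))"   ' the pattern from post #1
    re.Global = True                   ' find every match, not just the first
    re.IgnoreCase = False
    Set matches = re.Execute(ActiveDocument.Content.Text)
    ' Work backwards so earlier character offsets stay valid after each edit.
    For i = matches.Count - 1 To 0 Step -1
        ' Assumption: offsets in Content.Text match Range positions (simple body text).
        Set rng = ActiveDocument.Range(matches.Item(i).FirstIndex, _
            matches.Item(i).FirstIndex + matches.Item(i).Length)
        rng.Text = "XYZ"               ' hypothetical replacement term
    Next i
End Sub
```

    Check the result on a copy of the document, and in particular whether the replaced text picks up the formatting you expect.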
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  6. #4
    New Lounger
    Hi Paul Edstein.

    First and foremost, I want to say thanks for your macro.

    I am a newbie, so I do not really understand the 'syntax' appearing in the macro. I'd really appreciate it if you were as kind as to explain in broad terms the different chunks of it.

    an Excel workbook to hold the Find expressions in one column and the Replace expressions in another
    I do not know where in the macro the path of this workbook must be indicated.

    add whatever document processing code you want, where indicated
    I am afraid I do not know what document processing code is and where it's supposed to belong.

    you'd need to use a macro with a reference to the Microsoft vbScript Regular Expressions library
    If the macro you posted does not support Microsoft vbScript Regular Expressions library, then I'd like to know how such reference is to be made.

    Word's own wildcard support only gives access to a sub-set of the Regular Expression functions
    I'd like to know if it's possible for some lines of the macro to be changed so that it supports alternation, when specified, between such sub-set of wildcards and the Microsoft vbScript Regular Expressions library.


    I am sorry for my lack of proper specific terminology.
    I gratefully hope to receive a prompt reply.
    Last edited by REGEX; 2016-03-26 at 04:55. Reason: clarity

  7. #5
    Super Moderator
    Quote Originally Posted by REGEX View Post
    I am a newbie, so I do not really understand the 'syntax' appearing in the macro. I'd really appreciate it if you were as kind as to explain in broad terms the different chunks of it.
    I had expected anyone saying they wanted to use the full Regular Expression functionality and providing an MSDN link for its usage with Visual Studio would have more than a passing knowledge of VBA.

    Quote Originally Posted by REGEX View Post
    I do not know where in the macro the path of this workbook must be indicated.
    As coded, the macro presupposes the Find/Replace expressions would be on 'Sheet1' of a workbook named 'Workbook Name.xls' in the user's 'Documents' folder. All of that is encompassed in just two lines:
    StrWkBkNm = "C:\Users\" & Environ("Username") & "\Documents\Workbook Name.xls"
    StrWkSht = "Sheet1"

    Quote Originally Posted by REGEX View Post
    I am afraid I do not know what document processing code is and where it's supposed to belong.
    If you read the code, you'll see the comment line:
    'code to process the document goes here

    Quote Originally Posted by REGEX View Post
    If the macro you posted does not support Microsoft vbScript Regular Expressions library, then I'd like to know how such reference is to be made.
    The macro does support it, but you must add the library reference, via Tools|References in the VBE.
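
    As a quick way of checking that the reference is in place, a short test routine along these lines could be run from the VBE. It assumes "Microsoft VBScript Regular Expressions 5.5" has been ticked under Tools|References; that is the version normally used, the 1.0 entry being there for backward compatibility. The sample text and the replacement "XYZ" are made up for the test.

```vba
Sub RegexLibraryCheck()
    ' Assumes "Microsoft VBScript Regular Expressions 5.5" is ticked under Tools|References.
    ' Without the reference, use: Dim re As Object: Set re = CreateObject("VBScript.RegExp")
    Dim re As RegExp
    Set re = New RegExp
    re.Pattern = "abc(?!(cde\b|ba))"   ' the pattern from post #1
    re.Global = True
    ' "abc" is replaced unless it is followed by "ba" or by "cde" at the end of a word.
    Debug.Print re.Replace("abcba abccde abccdef abcx", "XYZ")
    ' prints: abcba abccde XYZcdef XYZx
End Sub
```

    The output appears in the Immediate window (Ctrl+G in the VBE). Note that Replace here works on a plain string, so it says nothing about formatting in the document.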

    Quote Originally Posted by REGEX View Post
    I'd like to know if it's possible for some lines of the macro to be changed so that it supports alternation, when specified, between such sub-set of wildcards and the Microsoft vbScript Regular Expressions library.
    Yes, that's possible. However, since I have no idea what you want to do and no first-hand experience using the Microsoft vbScript Regular Expressions library, I can't really help you on that front.
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  8. #6
    New Lounger
    would have more than a passing knowledge of VBA.
    I’ve played around using tools such as dnGREP or powerGREP, so I am aware of the power of regex (http://www.regular-expressions.info/). Still, I would very much like to get an overall understanding of the 'syntax' appearing in the different lines pertaining to the different parts of the macro.

    StrWkBkNm = "C:\Users\" & Environ("Username") & "\Documents\Workbook Name.xls"
    According to the number of ampersands, there seem to be three different paths to fill, namely a) "C:\Users\" b) Environ("Username") c) "\Documents\Workbook Name.xls", so I am not sure exactly to what extent the .docx file path should be jotted down.

    'code to process the document goes here
    Then I suppose I must delete this whole line and paste in some ‘code’ instead; yet again, I do not know what exactly must be added, i.e., what ‘code’ actually signifies.

    you must add the library reference, via Tools|References in the VBE
    Should I add just the 5.5 version, or also the 1.0 as well?

    I have no idea what you want to do
    I’d really appreciate it if you could tell me what I might add to the post to clarify my issue.

  9. #7
    Super Moderator RetiredGeek's Avatar
    REGEX,

    StrWkBkNm = "C:\Users\" & Environ("Username") & "\Documents\Workbook Name.xls"

    According to the number of ampersands, there seem to be three different paths to fill, namely a) "C:\Users\" b) Environ("Username") c) "\Documents\Workbook Name.xls", so I am not sure exactly to what extent the .docx file path should be jotted down.
    Actually, & is the string join operator in VBA, so the statement takes "C:\Users\", then appends the value of the environment variable returned by Environ("Username"), then appends "\Documents\Workbook Name.xls". Thus the only part you would need to change is Workbook Name, replacing it with the actual name of your workbook, which you would place in your Documents folder.
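
    If it helps to see the joined result, a small test routine like the following (a sketch only, separate from the macro) prints the finished path to the Immediate window and reports whether a file exists there. The Sub name is made up for the example.

```vba
Sub ShowWorkbookPath()
    ' Builds the same path as the macro and prints it to the Immediate window (Ctrl+G in the VBE).
    Dim StrWkBkNm As String
    StrWkBkNm = "C:\Users\" & Environ("Username") & "\Documents\Workbook Name.xls"
    Debug.Print StrWkBkNm
    ' Dir returns an empty string when no file matches the path.
    If Dir(StrWkBkNm) = "" Then Debug.Print "Workbook not found at that location."
End Sub
```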

    HTH

  11. #8
    New Lounger
    Thanks for replying RG,

    It would be great if the exact strings in the macro that must be filled in with actual data could be highlighted.

  12. #9
    Super Moderator
    For an example of an elaborate implementation of the code I posted, see:
    http://www.msofficeforums.com/word-v...sion-only.html
    Likewise, for an implementation that processes multiple documents, including in headers & footers, see:
    http://www.msofficeforums.com/word-v...er-footer.html
    As you will see, the scope is limited only by one's understanding of what can be done with VBA. None of these examples have needed recourse to the Microsoft vbScript Regular Expressions library.

    That you've needed to ask the basic questions you've asked so far indicates you have little practical knowledge of VBA or Regular Expressions. I'd recommend starting by learning how to use VBA, at least, before trying to tackle your own programming. Likewise, I don't think we want to turn this thread into an extended VBA/Regular Expressions tutorial.
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  13. #10
    New Lounger
    I don't think we want to turn this thread into an extended VBA/Regular Expressions tutorial.
    O.k.

    I'd like to test the macro as soon as possible, but I still hesitate regarding

    1) StrWkBkNm = "C:\Users\" & Environ("Username") & "\Documents\Workbook Name.xls"
    since I do not store files in the default 'Documents' folder, but in E:\, I'd like to know, if possible, how I can modify such line so that I can just paste the file path as such.

    2) 'code to process the document goes here
    what I am supposed to replace the above line with

    Lastly, I'd like to know whether there are any more lines that I am to fill out.

  14. #11
    Super Moderator
    Quote Originally Posted by REGEX View Post
    1) StrWkBkNm = "C:\Users\" & Environ("Username") & "\Documents\Workbook Name.xls"
    since I do not store files in the default 'Documents' folder, but in E:\, I'd like to know, if possible, how I can modify such line so that I can just paste the file path as such.
    So replace as much as you need after the '=' character with your actual file path & name.
    Quote Originally Posted by REGEX View Post
    2) 'code to process the document goes here
    what I am supposed to replace the above line with
    You replace it with whatever code you need. I've given you links showing what might go there.
    Quote Originally Posted by REGEX View Post
    Lastly, I'd like to know whether there are any more lines that I am to fill out.
    That's entirely up to you. We have no idea what you need, since you haven't actually told us anything about what you're trying to achieve.
    Last edited by macropod; 2016-03-27 at 08:38.
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  15. #12
    New Lounger
    At first I had in mind implementing conditional regular expressions both in the Find column and in the Replace column of the Excel workbook; yet, I've just read that the Microsoft vbScript does not support, among other regex, such conditionals, unlike for example The Microsoft .NET Framework.
    Therefore, I’d like to know if it would be possible to adjust the macro so that it also runs such kind of conditionals on both columns.

    You replace it with whatever code you need. I've given you links showing what might go there.
    In neither link can I find an expression similar to 'code to process the document goes here'; rather, the word 'code' is used as a synonym for what I call a macro.
    code.PNG

    So honestly, I still do not know what type of data I am to replace that line with.
    Last edited by REGEX; 2016-03-27 at 07:12.

  16. #13
    Super Moderator
    Quote Originally Posted by REGEX View Post
    At first I had in mind implementing conditional regular expressions both in the Find column and in the Replace column of the Excel workbook; yet, I've just read that the Microsoft vbScript does not support, among other regex, such conditionals, unlike for example The Microsoft .NET Framework.
    Therefore, I’d like to know if it would be possible to adjust the macro so that it also runs such kind of conditionals on both columns.
    I have already told you:
    I have no idea what you want to do and no first-hand experience using the Microsoft vbScript Regular Expressions library, I can't really help you on that front.
    I'm not about to start playing guessing games.

    Quote Originally Posted by REGEX View Post
    In neither link can I find an expression similar to 'code to process the document goes here'; rather, the word 'code' is used as a synonym for what I call a macro.
    ...
    So honestly, I still do not know what type of data I am to replace that line with.
    Well, what do you suppose a comment line like:
    'Process each word from the F/R List
    which appears in both links, followed by a block of code that processes the document's content, might suggest???
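
    For illustration only, a block of that kind might look something like the sketch below. It is not the code from either link. It assumes the two "|"-delimited strings built by BulkFindReplace (xlFList and xlRList) are passed in, and that column A of the workbook holds Word wildcard patterns; the Sub name is made up for the example.

```vba
Sub ProcessFindReplaceLists(xlFList As String, xlRList As String)
    ' Sketch only. Both lists start with "|", so element 0 of Split() is empty and is skipped.
    Dim FArr() As String, RArr() As String, i As Long
    FArr = Split(xlFList, "|")
    RArr = Split(xlRList, "|")
    'Process each word from the F/R List
    For i = 1 To UBound(FArr)
        With ActiveDocument.Range.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Text = FArr(i)
            .Replacement.Text = RArr(i)
            .MatchWildcards = True     ' assumption: column A holds Word wildcard patterns
            .Forward = True
            .Wrap = wdFindContinue
            .Format = False            ' leave the existing formatting alone
            .Execute Replace:=wdReplaceAll
        End With
    Next i
End Sub
```

    In the posted macro, the body of the loop would sit where the 'code to process the document goes here comment is, using the xlFList and xlRList variables directly.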

    I'm not prepared to waste any more time on this. All you've done so far is to ask for code to implement a particular process, when we have no idea what the objective is and the process you've specified may not even be appropriate to the task. If you want to play that game, you won't find any takers, here or on any other forum.
    Last edited by macropod; 2016-03-27 at 21:57.
    Cheers,

    Paul Edstein
    [MS MVP - Word]
