Page 1 of 2 12 LastLast
Results 1 to 15 of 19
  1. #1
    2 Star Lounger
    Join Date
    Aug 2011
    Posts
    112
    Thanks
    35
    Thanked 3 Times in 3 Posts

    Count occurrences of word

    We get a word document which we have to check for acronyms with a sample below.

    Abbreviations and Acronyms

    AML—Acquisition Master List
    AP—Acquisition Plan
    ASU—Acquisition and Sustainment Unit
    BES—Business and Enterprise Systems
    CAE—Component Acquisition Executive

    With a VBA routine, I would like to search the document for the count of the acronym. So basically everything before the em dash.

    Is it possible to get the count before the acronym? Example...

    (3) AML—Acquisition Master List
    (2) AP—Acquisition Plan
    (6) ASU—Acquisition and Sustainment Unit
    (4) BES—Business and Enterprise Systems
    (2) CAE—Component Acquisition Executive

    As you can see, the list starts right after the title Abbreviations and Acronyms and ends before the title Terms.

    In the case of the acroynm ASU, the macro would have to search case sensitive so it does not pick up "measure".

    This is the best way I can think to do this, but basically what I'm trying to do is search the document for the count which in the case of AP it only occurs twice. If it only occurs twice then it needs to be removed the list of Acronyms and spelled out at the first use in the word document.

  2. #2
    Bronze Lounger
    Join Date
    Jan 2001
    Location
    La Jolla, CA
    Posts
    1,470
    Thanks
    30
    Thanked 62 Times in 58 Posts
    Do you have a sample Word file you could post?

  3. #3
    2 Star Lounger
    Join Date
    Aug 2011
    Posts
    112
    Thanks
    35
    Thanked 3 Times in 3 Posts

    Count occurrences of word

    Here is a stripped down version. I have some macro that count words but it builds a new word file to display the findings that would be fine.

    If it needs to go in that direction, the cherry on the top would be to know the page of the first occurrence.
    Attached Files Attached Files

  4. #4
    Bronze Lounger
    Join Date
    Jan 2001
    Location
    La Jolla, CA
    Posts
    1,470
    Thanks
    30
    Thanked 62 Times in 58 Posts
    I didn't do exactly what you wanted because my VBA skills are very limited. However, I did get your result if you'll accept another option.

    I did a CTRL+A and CTRL+C of your Word doc. Then, pasted that into the 2nd sheet on the attached Excel file.
    The first sheet counts the words based on the acronyms.

    Hope this helps as a quick solution and one that works until one of the VBA heavies gives you another way.

    /Kevin
    Attached Files Attached Files

  5. #5
    Bronze Lounger
    Join Date
    Jan 2001
    Location
    La Jolla, CA
    Posts
    1,470
    Thanks
    30
    Thanked 62 Times in 58 Posts
    This attachment updated the acronym list you had in that document.

    Sorry, I just noticed that.

    You can supplement your own list in my Excel file by adding new ones in the B column and filling the formulas down in the A and C columns.

    The acronym occurs 1+ the real number of times in the document because they're all in the list of acronyms.
    Attached Files Attached Files
    Last edited by kweaver; 2016-04-12 at 20:31.

  6. #6
    2 Star Lounger
    Join Date
    Aug 2011
    Posts
    112
    Thanks
    35
    Thanked 3 Times in 3 Posts
    Hi Kevin,

    Not so sure this solution is going to work for my situation.

  7. #7
    Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    5,055
    Thanks
    2
    Thanked 417 Times in 346 Posts
    Try:
    Code:
    Sub AcronymLister()
    Application.ScreenUpdating = False
    Dim StrTmp As String, StrAcronyms As String, i As Long, j As Long, Rng As Range, Tbl As Table
    StrAcronyms = "Acronym" & vbTab & "Term" & vbTab & "Page" & vbTab & "Count" & vbTab & "Reference Pages" & vbCr
    With ActiveDocument
      With .Range
        With .Find
          .ClearFormatting
          .Replacement.ClearFormatting
          .MatchWildcards = True
          .Wrap = wdFindStop
          .Text = "\([A-Z0-9]{2,}\)"
          .Replacement.Text = ""
          .Execute
        End With
        Do While .Find.Found = True
          StrTmp = Replace(Replace(.Text, "(", ""), ")", "")
          If InStr(1, StrAcronyms, .Text, vbBinaryCompare) = 0 Then
            If .Words.First.Previous.Previous.Words(1).Characters.First = Right(StrTmp, 1) Then
              For i = Len(StrTmp) To 1 Step -1
                .MoveStartUntil Mid(StrTmp, i, 1), wdBackward
                .Start = .Start - 1
                If InStr(.Text, vbCr) > 0 Then
                  .MoveStartUntil vbCr, wdForward
                  .Start = .Start + 1
                End If
                If .Sentences.Count > 1 Then .Start = .Sentences.Last.Start
                If .Characters.Last.Information(wdWithInTable) = False Then
                  If .Characters.First.Information(wdWithInTable) = True Then
                    .Start = .Cells(.Cells.Count).Range.End + 1
                  End If
                ElseIf .Cells.Count > 1 Then
                  .Start = .Cells(.Cells.Count).Range.Start
                End If
              Next
            End If
            StrTmp = Replace(Replace(Replace(.Text, " (", "("), "(", "|"), ")", "")
            StrAcronyms = StrAcronyms & Split(StrTmp, "|")(1) & vbTab & Split(StrTmp, "|")(0) & vbTab & .Information(wdActiveEndAdjustedPageNumber) & vbTab & vbTab & vbCr
          End If
          .Collapse wdCollapseEnd
          .Find.Execute
        Loop
        StrAcronyms = Replace(Replace(Replace(StrAcronyms, " (", "("), "(", vbTab), ")", "")
        Set Rng = .Characters.Last
        With Rng
          If .Characters.First.Previous <> vbCr Then .InsertAfter vbCr
          .InsertAfter Chr(12)
          .Collapse wdCollapseEnd
          .Style = "Normal"
          .Text = StrAcronyms
          Set Tbl = .ConvertToTable(Separator:=vbTab, NumRows:=.Paragraphs.Count, NumColumns:=5)
          With Tbl
            .Columns.AutoFit
            .Rows(1).HeadingFormat = True
            .Rows(1).Range.Style = "Strong"
            .Rows.Alignment = wdAlignRowCenter
          End With
          .Collapse wdCollapseStart
        End With
      End With
      Rng.Start = .Range.Start
      For i = 2 To Tbl.Rows.Count
        StrTmp = ""
        With .Range
          With .Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Format = False
            .Forward = True
            .Text = Split(Tbl.Cell(i, 1).Range.Text, vbCr)(0)
            .MatchWholeWord = True
            .MatchWildcards = False
            .MatchCase = True
            .Execute
          End With
          j = 0
          Do While .Find.Found
            If Not .InRange(Rng) Then Exit Do
            j = j + 1
            If j > 0 Then StrTmp = StrTmp & " " & .Information(wdActiveEndAdjustedPageNumber)
            .Collapse wdCollapseEnd
            .Find.Execute
          Loop
        End With
        Tbl.Cell(i, 4).Range.Text = j
        Tbl.Cell(i, 5).Range.Text = Replace(Trim(StrTmp), " ", ", ")
      Next
    End With
    Set Rng = Nothing: Set Tbl = Nothing
    Application.ScreenUpdating = True
    End Sub
    The code above generates a list of acronyms in a table at the end of the document, with definition page references, page references for all occurrences of the acronyms, and counts. It even builds a list of the acronym terms as far as practicable.
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  8. The Following User Says Thank You to macropod For This Useful Post:

    jrb (2016-04-13)

  9. #8
    2 Star Lounger
    Join Date
    Aug 2011
    Posts
    112
    Thanks
    35
    Thanked 3 Times in 3 Posts
    Thank you Paul. I look forward to testing this at work today.

  10. #9
    2 Star Lounger
    Join Date
    Aug 2011
    Posts
    112
    Thanks
    35
    Thanked 3 Times in 3 Posts
    Hi Paul,

    Thank you for this macro. It has truly shown it's worth/value in our attempts to streamline our editing process.

    I messed around with trying to have the acronym table output go to a new word doc, but evidently don't understand enough to get the right placement.

    Also, you mentioned a slightly more elaborate version of this macro which give other counts and page ranges. Is it possible to see that version?

  11. #10
    Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    5,055
    Thanks
    2
    Thanked 417 Times in 346 Posts
    The more elaborate version of the code is:
    Code:
    Sub AcronymLister()
    Application.ScreenUpdating = False
    Dim StrTmp As String, StrAcronyms As String, i As Long, j As Long, k As Long, Rng As Range, Tbl As Table
    StrAcronyms = "Acronym" & vbTab & "Term" & vbTab & "Page" & vbTab & "Cross-Reference Count" & vbTab & "Cross-Reference Pages" & vbCr
    With ActiveDocument
      With .Range
        With .Find
          .ClearFormatting
          .Replacement.ClearFormatting
          .MatchWildcards = True
          .Wrap = wdFindStop
          .Text = "\([A-Z0-9]{2,}\)"
          .Replacement.Text = ""
          .Execute
        End With
        Do While .Find.Found = True
          StrTmp = Replace(Replace(.Text, "(", ""), ")", "")
          If (InStr(1, StrAcronyms, .Text, vbBinaryCompare) = 0) And (Not IsNumeric(StrTmp)) Then
            If .Words.First.Previous.Previous.Words(1).Characters.First = Right(StrTmp, 1) Then
              For i = Len(StrTmp) To 1 Step -1
                .MoveStartUntil Mid(StrTmp, i, 1), wdBackward
                .Start = .Start - 1
                If InStr(.Text, vbCr) > 0 Then
                  .MoveStartUntil vbCr, wdForward
                  .Start = .Start + 1
                End If
                If .Sentences.Count > 1 Then .Start = .Sentences.Last.Start
                If .Characters.Last.Information(wdWithInTable) = False Then
                  If .Characters.First.Information(wdWithInTable) = True Then
                    .Start = .Cells(.Cells.Count).Range.End + 1
                  End If
                ElseIf .Cells.Count > 1 Then
                  .Start = .Cells(.Cells.Count).Range.Start
                End If
              Next
            End If
            StrTmp = Replace(Replace(Replace(.Text, " (", "("), "(", "|"), ")", "")
            StrAcronyms = StrAcronyms & Split(StrTmp, "|")(1) & vbTab & Split(StrTmp, "|")(0) & vbTab & .Information(wdActiveEndAdjustedPageNumber) & vbTab & vbTab & vbCr
          End If
          .Collapse wdCollapseEnd
          .Find.Execute
        Loop
        StrAcronyms = Replace(Replace(Replace(StrAcronyms, " (", "("), "(", vbTab), ")", "")
        Set Rng = .Characters.Last
        With Rng
          If .Characters.First.Previous <> vbCr Then .InsertAfter vbCr
          .InsertAfter Chr(12)
          .Collapse wdCollapseEnd
          .Style = "Normal"
          .Text = StrAcronyms
          Set Tbl = .ConvertToTable(Separator:=vbTab, NumRows:=.Paragraphs.Count, NumColumns:=5)
          With Tbl
            .Columns.AutoFit
            .Rows(1).HeadingFormat = True
            .Rows(1).Range.Style = "Strong"
            .Rows.Alignment = wdAlignRowCenter
          End With
          .Collapse wdCollapseStart
        End With
      End With
      Rng.Start = .Range.Start
      For i = 2 To Tbl.Rows.Count
        StrTmp = "": j = 0: k = 0
        With .Range
          With .Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Format = False
            .Forward = True
            .Text = "[!\(]" & Split(Tbl.Cell(i, 1).Range.Text, vbCr)(0) & "[!\)]"
            .MatchWildcards = True
            .Execute
          End With
          Do While .Find.Found
            If Not .InRange(Rng) Then Exit Do
            j = j + 1
            If k <> .Duplicate.Information(wdActiveEndAdjustedPageNumber) Then
              k = .Duplicate.Information(wdActiveEndAdjustedPageNumber)
              StrTmp = StrTmp & k & " "
            End If
            .Collapse wdCollapseEnd
            .Find.Execute
          Loop
        End With
        Tbl.Cell(i, 4).Range.Text = j
        StrTmp = Replace(Trim(StrTmp), " ", ",")
        If StrTmp <> "" Then
          'Add the current record to the output list (StrOut)
          StrTmp = Replace(Replace(ParseNumSeq(StrTmp, "&"), ",", ", "), "  ", " ")
        End If
        Tbl.Cell(i, 5).Range.Text = StrTmp
      Next
    End With
    Set Rng = Nothing: Set Tbl = Nothing
    Application.ScreenUpdating = True
    End Sub
    
    Function ParseNumSeq(StrNums As String, Optional StrEnd As String)
    'This function converts multiple sequences of 3 or more consecutive numbers in a
    ' list to a string consisting of the first & last numbers separated by a hyphen.
    ' The separator for the last sequence can be set via the StrEnd variable.
    Dim ArrTmp(), i As Integer, j As Integer, k As Integer
    ReDim ArrTmp(UBound(Split(StrNums, ",")))
    For i = 0 To UBound(Split(StrNums, ","))
      ArrTmp(i) = Split(StrNums, ",")(i)
    Next
    For i = 0 To UBound(ArrTmp) - 1
      If IsNumeric(ArrTmp(i)) Then
        k = 2
        For j = i + 2 To UBound(ArrTmp)
          If CInt(ArrTmp(i) + k) <> CInt(ArrTmp(j)) Then Exit For
          ArrTmp(j - 1) = ""
          k = k + 1
        Next
        i = j - 2
      End If
    Next
    StrNums = Join(ArrTmp, ",")
    StrNums = Replace(Replace(Replace(StrNums, ",,", " "), ", ", " "), " ,", " ")
    While InStr(StrNums, "  ")
      StrNums = Replace(StrNums, "  ", " ")
    Wend
    StrNums = Replace(Replace(StrNums, " ", "-"), ",", ", ")
    If StrEnd <> "" Then
      i = InStrRev(StrNums, ",")
      If i > 0 Then
        StrNums = Left(StrNums, i - 1) & Replace(StrNums, ",", " " & Trim(StrEnd), i)
      End If
    End If
    ParseNumSeq = StrNums
    End Function
    The foregoing, more elaborate code is essentially the same down to:
    j = j + 1

    The simplest way to send the output to a new document would be to replace:
    Code:
    End With
    Set Rng = Nothing: Set Tbl = Nothing
    with:
    Code:
      Dim Doc As Document
      Set Doc = Documents.Add
      Doc.Range.FormattedText = Tbl.Range.FormattedText
      Tbl.Delete
    End With
    Set Rng = Nothing: Set Tbl = Nothing: Set Doc = Nothing
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  12. #11
    2 Star Lounger
    Join Date
    Aug 2011
    Posts
    112
    Thanks
    35
    Thanked 3 Times in 3 Posts
    Hi Paul,

    I love the Cross-Reference Count & Pages. The Cross-Reference fine tunes the discrepancies within the document.

    Running this last macro on the sentences below, PK, LG, IN, and EN do not produce a Term within the table. I thought maybe it has to do with a single term for the acronym; however, LGs Term is Logistics Management.

    1.1. This Acquisition Master List (AML) quantifies manpower required for Program Management (PM) to manage individual programs through the acquisition life cycle. The AFMS defines Contracting (PK) to support a Gen3 Acquisition and Sustainment Unit (ASU) evolved methodology for the Air Force Life Cycle Management Center (AFLCMC) within the Air Force Materiel Command (AFMC). This AFMS is applicable to the acquisition life cycle management of Business Acquisition Systems Support Branch (AFLCMC/HIBB).

    1.2. Full Time Equivalent (FTE) Data Source. Gen3 models were developed using required FTE counts collected during workshop measurements for a Baseline Representative Program (BRP). Required FTEs were determined for each applicable life cycle phase using technical estimates for the processes in the Process Oriented Description (found at the link below). Technical estimates were provided by Subject Matter Experts from the following functions: PM, Financial Management (FM), PK, Logistics Management (LG), Intel (IN) (when applicable), and Engineering (EN).

  13. #12
    Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    5,055
    Thanks
    2
    Thanked 417 Times in 346 Posts
    There are limitations to what the code can do in terms of identifying the expressions that comprise the acronyms. The process the code uses is to first identify the acronym, then look backwards to find one or more words using, as capitals, the letters of the acronym. Since, there's no logical connection between the upper-case letters for the:
    • PK acronym and the word 'Contracting';
    • LG acronym and the words 'Logistics Management';
    • IN acronym and the word 'Intel'; and
    • EN acronym and the word 'Engineering',
    in particular, the final letter of each acronym with the first-preceding word, no match is made. Exactly how one might make such obscure matches without creating other problems isn't entirely clear. Indeed, I'd have to question the rationale for having an acronym for a single word, especially in the case of IN for a word that is itself only 5 letters long.
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  14. #13
    2 Star Lounger
    Join Date
    Aug 2011
    Posts
    112
    Thanks
    35
    Thanked 3 Times in 3 Posts
    Thanks Paul.

    This is some great stuff. Even without the acronym term in all places, the cross-reference count narrows down the scope.

  15. #14
    2 Star Lounger
    Join Date
    Aug 2011
    Posts
    112
    Thanks
    35
    Thanked 3 Times in 3 Posts
    Hi Paul,

    I surely do not want to bludgeon this topic, but with the documenting of the page number, is it possible to document the first occurrence of the term?

    Some users will right their document and use a term on page 1, but they don't develop the acronym until later in the document...maybe page 4 or 5.

    Maybe an additional column in the table for A-Page and T-Page. A for the acronym page number and T for the Term page number for the first use. What about maybe a cross reference to the term? In reality, it should only be in the document one time.

    I tried to step thru the code to figure this out on my own, but this is way above my level of comprehension.

    If this is not practical then no worries. You have helped tremendously to this point.
    Last edited by jrb; 2016-04-16 at 17:31. Reason: Clarification

  16. #15
    Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    5,055
    Thanks
    2
    Thanked 417 Times in 346 Posts
    The macro already outputs the page # of the page on which the acronym is defined and a separate set of references all the pages on which it is otherwise referred to.
    Cheers,

    Paul Edstein
    [MS MVP - Word]

Page 1 of 2 12 LastLast

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •