Page 1 of 2
Results 1 to 15 of 23
  1. #1
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Yilgarn region of Toronto, Ontario
    Posts
    5,453
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Word97SR2: ActiveDocument.Words

    20010428122227 Number of words 2720
    20010428122227 Loop counter 2
    20010428122930 4.0 3.0 416.0
    20010428123157 Number of words 2720
    20010428123157 Loop counter 5
    20010428124826 9.0 10.0 969.0
    20010428130537 10.0 11.0 974.0

    The times above represent runs of the subroutine TEST at the foot of the posting.

    TEST calls each of test1, test2, test3, each of which is to make a specified number of passes through all the words of a document.

    test1 is a straightforward "For Each wrd In ActiveDocument.Words". When I first began using it I thought it was a tad slow, taking 4.0 seconds to loop through all the words just twice. I wondered whether "ActiveDocument.Words" regenerated the set each time, so I wrote test2.

    test2 says "grab ActiveDocument.Words just once and store it in a Variant". Running time is about the same.

    test3 says "let's index through the collection", but that is UNBELIEVABLY slower by a factor of 100!!!


    Whaaaaaaaaaaaaaaaaaaaaat!


    What I need is a fast way of examining each word of a document (or a range or a selection).

    In the meantime I'd be grateful for a practised eye to look over my test code and spot anything that looks really weird or out of place.

    100 times slower? No!



    <pre>' test1: For Each over ActiveDocument.Words, asking Word for the collection on every pass.
    ' Each function returns the elapsed time as a fraction of a day (Time - Time).
    Function test1(icount As Integer)
        Dim starter
        Dim wrd
        Dim j As Integer
        Application.ScreenUpdating = False
        starter = Time
        For j = 1 To icount
            For Each wrd In ActiveDocument.Words
                wrd.Select
            Next wrd
        Next j
        test1 = Time - starter
        Application.ScreenUpdating = True
    End Function

    ' test2: as test1, but the Words collection is stored in a variable before the For Each.
    Function test2(icount As Integer)
        Dim starter
        Dim wrd
        Dim w
        Dim j As Integer
        Application.ScreenUpdating = False
        starter = Time
        For j = 1 To icount
            Set w = ActiveDocument.Words
            For Each wrd In w
                wrd.Select
            Next wrd
        Next j
        test2 = Time - starter
        Application.ScreenUpdating = True
    End Function

    ' test3: walks the stored collection by numeric index instead of For Each.
    Function test3(icount As Integer)
        Dim starter
        Dim w
        Dim j As Integer
        Dim k As Long   ' Long rather than Integer, so documents over 32,767 words do not overflow
        Application.ScreenUpdating = False
        starter = Time
        For j = 1 To icount
            Set w = ActiveDocument.Words
            For k = 1 To w.Count
                w(k).Select
            Next k
        Next j
        test3 = Time - starter
        Application.ScreenUpdating = True
    End Function

    Sub test()
        Dim str1 As String
        Dim icount As Integer
        icount = 5
        ' LogFile and strcApplication are my own helpers and are not shown in this post.
        Call LogFile(strcApplication, "Number of words " & Str(ActiveDocument.Words.Count))
        Call LogFile(strcApplication, "Loop counter " & Str(icount))
        ' Runs until interrupted; each pass prints one line of timings in seconds.
        While True
            str1 = " "
            ' Multiply the fraction of a day by 24 * 60 * 60 to get seconds.
            str1 = str1 & Format(test1(icount) * 24 * 60 * 60, "#,##0.0") & " "
            str1 = str1 & Format(test2(icount) * 24 * 60 * 60, "#,##0.0") & " "
            str1 = str1 & Format(test3(icount) * 24 * 60 * 60, "#,##0.0") & " "
            Debug.Print str1
            ' Call LogFile(strcApplication, str1)
        Wend
    End Sub
    </pre>
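
    A side note on the measurement itself: the figures in the log are whole seconds because Time only resolves to the second. Below is a minimal sketch of the same For Each timing using the built-in Timer function, which returns seconds since midnight as a Single. The name TimeForEach and the idea of summing Len(wrd.Text) as the per-word work are my own assumptions for illustration, not part of the tests above, and the sketch does not handle a run that crosses midnight.

```vba
' Hypothetical helper (not from the original tests): times For Each over the
' words of the active document using Timer instead of Time.
Function TimeForEach(passes As Long) As Single
    Dim t0 As Single
    Dim wrd As Range
    Dim j As Long
    Dim chars As Long

    t0 = Timer                              ' seconds since midnight
    For j = 1 To passes
        For Each wrd In ActiveDocument.Words
            chars = chars + Len(wrd.Text)   ' cheap per-word work, no Select
        Next wrd
    Next j
    TimeForEach = Timer - t0                ' elapsed seconds (assumes no midnight rollover)
End Function
```

    From the Immediate window, something like "? TimeForEach(5)" would print the elapsed seconds for five passes.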


  2. #2
    kelliel
    Guest

    Re: Word97SR2: ActiveDocument.Words

    Is there any way you can modify the code in this post (http://www.wopr.com/cgi-bin/w3t/showthreaded.pl?Cat=&Board=vb&Number=34771&page=0&view=expanded&sb=5&vc=1#Post34771)?

  3. #3
    Super Moderator
    Join Date
    Dec 2000
    Location
    New York, NY
    Posts
    2,970
    Thanks
    3
    Thanked 29 Times in 27 Posts

    Re: Word97SR2: ActiveDocument.Words

    I was going to point to the same post that Lawrence did - a really interesting idea and it would be nice to see something useful done with it.

    As far as For Each vs. a counter, there was a thread on this I think on the old Lounge, and that same factor of 100 was cited - must be about right!
    Now, I guess the interesting question is: why that big a difference? What is there about the way collections are structured, that makes them so much more efficient - anybody willing to take a stab at explaining?

  4. #4
    Super Moderator jscher2000
    Join Date
    Feb 2001
    Location
    Silicon Valley, USA
    Posts
    23,112
    Thanks
    5
    Thanked 93 Times in 89 Posts

    Re: Word97SR2: ActiveDocument.Words

    Well, one thing I wondered was whether using an explicit numeric index visits the words in a different order, but this routine demonstrates that the order is the same:

    Sub CompareIndexes()
        Dim wrd As Range, doc1 As Document, doc2 As Document
        Dim strArr1() As String, strArr2() As String, lng As Long
        Dim colWords As Words
        Set doc1 = ActiveDocument
        ReDim strArr1(1 To doc1.Words.Count)
        ReDim strArr2(1 To doc1.Words.Count)
        ' Pass 1: collect each word's text in For Each order
        lng = 1
        For Each wrd In doc1.Words
            strArr1(lng) = wrd.Text
            lng = lng + 1
        Next
        ' Pass 2: collect each word's text by numeric index
        Set colWords = doc1.Words
        For lng = 1 To colWords.Count
            strArr2(lng) = colWords(lng).Text
        Next
        ' Write both lists side by side into a new document for comparison
        Set doc2 = Documents.Add
        For lng = 1 To UBound(strArr1)
            doc2.Range.InsertAfter strArr1(lng) & vbTab & strArr2(lng) & vbCrLf
        Next
    End Sub

    Apparently "For Each" is optimized. There had to be some reason for adding it.

  5. #5
    Gold Lounger
    Join Date
    Dec 2000
    Location
    New Hampshire, USA
    Posts
    3,386
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Word97SR2: ActiveDocument.Words

    I've seen factors of 160+ with very small coding/design changes.

    The fastest way to move through a document is often the Expand method, used on a Range object rather than on the Selection object.
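
    To make that concrete, here is a rough sketch of walking a document word by word with a single Range and Expand, never touching Selection. The routine name WalkWordsWithRange and the "no progress" guard are my own additions for illustration. I would expect it to behave on plain text, but treat it as a starting point rather than a tested routine, especially around tables.

```vba
' Hypothetical sketch: visit each word with one Range object and Expand.
Sub WalkWordsWithRange()
    Dim rng As Range
    Dim docEnd As Long
    Dim lastEnd As Long
    Dim n As Long

    docEnd = ActiveDocument.Content.End
    Set rng = ActiveDocument.Range(Start:=0, End:=0)   ' collapsed range at the top
    lastEnd = -1
    Do
        rng.Expand Unit:=wdWord                 ' grow to cover the word at this position
        If rng.End <= lastEnd Then Exit Do      ' guard: stop if the range made no progress
        lastEnd = rng.End
        n = n + 1
        ' Examine the word here, e.g. rng.Text, rng.Font, rng.Hyperlinks.Count
        If rng.End >= docEnd Then Exit Do       ' reached the end of the document
        rng.Collapse Direction:=wdCollapseEnd   ' move to the start of the next word
    Loop
    Debug.Print n & " words visited"
End Sub
```

    Because rng is a full Range, the properties of each word (formatting, hyperlinks and so on) are available, not just its text.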

  6. #6
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Yilgarn region of Toronto, Ontario
    Posts
    5,453
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Word97SR2: ActiveDocument.Words

    I'll try. The difference there seems to lie in obtaining the Text property of the object (the document) and, for that particular example, in being able to work on the atomic parts of a string (characters). I suspect that this is not a sufficiently general solution for people who actually need the non-atomic elements, words in my case. If you need to examine a property of a word ("Is it a hyperlink?") then the characters/text won't do you any good.

  7. #7
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Yilgarn region of Toronto, Ontario
    Posts
    5,453
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Word97SR2: ActiveDocument.Words

    >What is there about the way collections are structured, that makes them so much more efficient


    I suspect that my problems (factor x100) are related to the way I'm accessing the collection.

    It's AS IF the collection was being re-built on each pass. An analogy would be re-opening a document each time I needed the next paragraph. There'd be a huge delay during the opening and closing that would add tremendously to the "access next paragraph" work.

  8. #8
    Super Moderator
    Join Date
    Dec 2000
    Location
    New York, NY
    Posts
    2,970
    Thanks
    3
    Thanked 29 Times in 27 Posts

    Re: Word97SR2: ActiveDocument.Words

    >It's AS IF the collection was being re-built on each pass. An analogy would be re-opening a document each time I needed the next paragraph. There'd be a huge delay during the opening and closing that would add tremendously to the "access next paragraph" work

    That's an interesting surmise but I still can't keep from picking at this one: both the For Each...Next and For...Next constructs make use of an index. In the case of For Each it's an internal index that we can't access.

    For an example of For Each's internal index, there's the case where one avoids using For Each ...Next when deleting members of a collection - as explained in MS's own Office 2000 VB Programmer's Guide, this is because each deletion of a member resets the index that For Each is accessing (so they recommend the *much slower* For...Next step backwards method instead). They also recommend sticking with For..Next when you need to work with the count or index - for example if you need to act upon every tenth member of a collection.
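
    For reference, the backwards For...Next pattern mentioned above looks roughly like this. The example task (deleting empty paragraphs) and the name DeleteEmptyParagraphs are my own choices for illustration and are not taken from the Programmer's Guide.

```vba
' Hypothetical sketch: delete empty paragraphs, stepping backwards so that
' removing item i never shifts the items that are still to be visited.
Sub DeleteEmptyParagraphs()
    Dim i As Long

    With ActiveDocument.Paragraphs
        For i = .Count To 1 Step -1
            ' An empty paragraph consists of its paragraph mark only.
            If Len(.Item(i).Range.Text) = 1 Then
                .Item(i).Range.Delete
            End If
        Next i
    End With
End Sub
```

    Note that Word keeps a final paragraph mark in every document, so an empty last paragraph may survive the delete.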

    So here's the Collection object, one of the mainstays of VBA and something we constantly work with. If we need to work directly with the Count property and use a For...Next loop, we're stuck with code that runs 100+ times slower as compared with a For Each..Next loop.

    On the other hand, the For Each..Next loop doesn't allow us to access the collection index nor run the loop backwards.

    Considering the centrality in VBA of looping through collections (and all the concern about efficiency/optimisation) it seems reasonable to wish that the For Each construct provided more means for us to hook into its functionality (or that the For..Next construct could be better optimized). And just exactly how do these two constructs differ in their internal workings?

  9. #9
    5 Star Lounger
    Join Date
    May 2001
    Location
    Stuttgart, Baden-W, Germany
    Posts
    931
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Word97SR2: ActiveDocument.Words

    Being a "dumb" interpreted language, (Visual) Basic steps through the loop
    For index = 1 To n
    Collection.Item(index)
    Next index
    and the interpreter is always surprised by the new Item. Instead of moving from the previous .Item(index-1) (which it has long since forgotten) to .Item(index), it loops through the whole collection to locate that item.

    This takes a time proportional to 1 + 2 + 3 + 4 + ... + (n-1) + n = n(n+1)/2. So that's very bad news if the files are getting big: a loop that uses indexed items gets about 10,000 times slower if the file gets 100 times bigger.
    If you just need an index for bookkeeping, it is much faster to use "For Each Item In Collection" and increment an index (i = i + 1) inside the loop. If you want to avoid deleting an item while the loop is running, you can mark it for deletion with special formatting and remove it afterwards. So I have never found a compelling need to use indexed items.
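
    A minimal sketch of the bookkeeping idea, assuming the goal is to act on every tenth word. The routine name EveryTenthWord and the choice of printing to the Immediate window are only for illustration.

```vba
' Hypothetical sketch: For Each with a hand-maintained counter, so the
' position is known without ever calling Words(i).
Sub EveryTenthWord()
    Dim wrd As Range
    Dim i As Long

    For Each wrd In ActiveDocument.Words
        i = i + 1                       ' manual index kept alongside For Each
        If i Mod 10 = 0 Then
            Debug.Print i, wrd.Text     ' act on every tenth word
        End If
    Next wrd
End Sub
```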

    Another fast way to move around is to use ranges with .MoveEndUntil / .MoveStartWhile and Cset:="..."
    If the ranges that interest me are formatted in a special way, I often put in some unique characters as markers for Cset with a search/replace.
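
    Roughly what that looks like in practice, as a sketch only. The marker character "|" and the routine name WalkToMarkers are assumptions for illustration, and the loop deliberately stops at the first place where no further marker is found (or where two markers sit side by side), so text after the last marker is not reported.

```vba
' Hypothetical sketch: step through a document from marker to marker
' using Range.MoveEndUntil. "|" stands in for whatever unique marker
' character was put in by search/replace.
Sub WalkToMarkers()
    Const MARKER As String = "|"
    Dim rng As Range
    Dim moved As Long

    Set rng = ActiveDocument.Range(Start:=0, End:=0)
    Do
        ' Extend the end of the range up to (not including) the next marker.
        moved = rng.MoveEndUntil(Cset:=MARKER, Count:=wdForward)
        If moved = 0 Then Exit Do               ' no marker ahead, or marker is adjacent
        Debug.Print rng.Text                    ' the stretch of text before the marker
        rng.Collapse Direction:=wdCollapseEnd   ' now sitting just before the marker
        rng.Move Unit:=wdCharacter, Count:=1    ' step over the marker itself
    Loop
End Sub
```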

    Greetings, Klaus

  10. #10
    Super Moderator
    Join Date
    Dec 2000
    Location
    New York, NY
    Posts
    2,970
    Thanks
    3
    Thanked 29 Times in 27 Posts

    Re: Word97SR2: ActiveDocument.Words

    Klaus,

    Interesting thoughts, thanks. Sounds like Chris' take was about right then.

    Incrementing a counter within a For Each loop is an interesting suggestion - never heard that one.

    Now for the $64K question: why does looping through the collection via For Each work so much more quickly, i.e. what is the internal method used, that is so much more efficient?

    Your mention of .MoveEndUntil and .MoveStartWhile reminds me that there is another method provided for moving through some and only some, collections: Previous and Next. Which gives rise to another whiny<g> question: why isn't the Previous and Next method available for more collections?

    (I can add these to my other unanswerable questions:
    Why isn't the Exists method available for more collection objects?
    Why is the Tasks object only available in Word?
    I'm starting to sound like Andy Rooney.)

    Gary

  11. #11
    5 Star Lounger
    Join Date
    May 2001
    Location
    Stuttgart, Baden-W, Germany
    Posts
    931
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Word97SR2: ActiveDocument.Words

    I guess that using a "For Each"-loop, the Basic interpreter knows that it will have to access each Item, and can choose the most effective method to do so.

    .Next and .Previous are very fast if you already know how many Items are in the Collection (and you can loop bottom to top).
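
    As a small illustration, here is a sketch of stepping through paragraphs with the Next method of the Paragraph object, which hands back Nothing once there is no following paragraph. The routine name WalkParagraphsWithNext is made up for the example, and I have only assumed, not checked, that it behaves the same in every Word version.

```vba
' Hypothetical sketch: walk the paragraphs with Paragraph.Next instead of
' indexing into ActiveDocument.Paragraphs.
Sub WalkParagraphsWithNext()
    Dim para As Paragraph
    Dim n As Long

    Set para = ActiveDocument.Paragraphs(1)   ' a document always has at least one paragraph
    Do While Not para Is Nothing
        n = n + 1
        ' Examine para.Range here
        Set para = para.Next                  ' Nothing after the last paragraph
    Loop
    Debug.Print n & " paragraphs visited"
End Sub
```

    Walking from the bottom up would start at the last paragraph and use para.Previous in the same way.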

    Since your questions are unanswerable, I cannot comment on them. I often put questions like these to the gods in Redmond (mswish@microsoft.com); right now I'm waiting to see which of my prayers have been heard (Word2002).

  12. #12
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Yilgarn region of Toronto, Ontario
    Posts
    5,453
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Word97SR2: ActiveDocument.Words

    > .Next and .Previous are very fast if

    ... because the internal code, the VBA interpreter, remembers where it was last time it came in. We can think of that (vaguely) as remembering the last memory address and just needing to load a single 32-bit memory address pointer without re-calculating an incremental number, multiplying by the pointer length, scanning through all the variable-length string items of elements prior to the previous element, adding ITS length, and then calculating the new absolute memory address.

    That's assuming there's any memory left .....

  13. #13
    Platinum Lounger
    Join Date
    Dec 2000
    Location
    Queanbeyan, New South Wales, Australia
    Posts
    3,730
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Word97SR2: ActiveDocument.Words

    A question for those who have the time to investigate...

    If you were going to delete objects from a collection (which is a big reason for using the "for i = collection.count to 1 step -1" construct)

    Would it be more efficient to use a construct like:

    [pre]blnFinished = False
    Do Until blnFinished
        blnFinished = True              ' assume this pass finds nothing to delete
        For Each object In objects.collection
            If conditionMet Then
                delete object
                blnFinished = False     ' the collection changed, so restart the scan
                Exit For
            End If
        Next
    Loop
    [/pre]

    And, if it was really more efficient- would it really matter in the real world?

  14. #14
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Yilgarn region of Toronto, Ontario
    Posts
    5,453
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Word97SR2: ActiveDocument.Words

    >This takes a time proportional to 1 + 2 + 3 + 4 + ... + n-1 + n = n(n+1)/2.

    Which is why my test3 ran so much slower than the first two tests (each of which used the ForEach construct), right?
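
    Plugging my own numbers into that formula, under the assumption (Klaus's model, not something I have verified) that fetching item k by index costs k steps while For Each costs one step per item:

```latex
\[
\sum_{k=1}^{n} k = \frac{n(n+1)}{2},
\qquad
n = 2720 \;\Rightarrow\; \frac{2720 \cdot 2721}{2} = 3{,}700{,}560 \text{ steps per pass}
\]
\[
\frac{\text{indexed steps}}{\text{For Each steps}} = \frac{n(n+1)/2}{n} = \frac{n+1}{2} = 1360.5
\]
```

    The measured ratio was only about 100 (416.0 against 4.0 seconds), so if the model holds, a single step of the indexed scan must be a good deal cheaper than a single For Each step plus its Select.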

  15. #15
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Yilgarn region of Toronto, Ontario
    Posts
    5,453
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Word97SR2: ActiveDocument.Words

    > collections: Previous and Next

    I can't find any reference to these in the Word97SR2 help files. Are you able to provide a clue?
