2005-02-13, 18:46 #1
iterate through all paragraphs (Word/Xp et al.)
I'm amazed at the timing differences in different methods of accessing the paragraphs in a document. The code at the foot of this post is my original effort at documenting to an ASCII text file some basic characteristics of formatting for every paragraph in a document. The task (for each document) falls into two stages:
(1) loop through every paragraph and
(2) document the format of each paragraph.
Client wants the format as a set of tab-delimited strings, so I'm pretty well saddled with "Print #".
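For readers unfamiliar with "Print #", here is a minimal sketch of writing one tab-delimited line to a text file. The procedure name, the file name and the field values are hypothetical and chosen only for illustration; FreeFile, Open, Print #, Close, Join and Environ$ are standard VBA.

<pre>Public Sub WriteTabDelimitedLine()
    ' Hypothetical example: file name and field values are made up.
    Dim intFile As Integer
    Dim strLine As String

    intFile = FreeFile                          ' next unused file number
    Open Environ$("TEMP") & "\formats.txt" For Output As #intFile

    ' Join assembles the tab-delimited string in a single step.
    strLine = Join(Array("Arial", "12", "False"), vbTab)
    Print #intFile, strLine                     ' writes the line and a line break

    Close #intFile
End Sub
</pre>

Passing the file number around as a variable (as the intFile parameter in the posted function suggests) keeps the writer independent of whichever number FreeFile happens to return.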
Some of these documents were taking upwards of six minutes to analyse. They are large documents, but the job (6,000 documents) looked like taking all week, so I started looking at ways to improve the timing.
Version 0 2m12s The initial version of the code ran for 2m12s on a large document. (The GHz of my machine should be irrelevant, since all timings are relative, right?)
Version 1 1m53s I dropped the test for 'should I use the first character if the paragraph formatting is heterogeneous?' and elected to use the first character regardless. I converted my series of successive appends to strResult into a single, rather wide statement that assembles the string in one blow. A saving of 19s; nice, but not impressive.
Version 2 1m53s I used 'With rng', and nested within that 'With .Font' and 'With .ParagraphFormat'; no significant saving.
Version 3 1m56s I switched my loop to read 'For Each prg in doc.Paragraphs' and found that took longer.
Version 4 3m52s I tried looping 'For lng = 1 to doc.Paragraphs.Count' and rolled my eyes. OK. I knew this anyway, but I was trying to be exhaustive in my research.
Version 5 1m55s I removed the 'Application.StatusBar' display, since I'm using audible alerts to monitor progress. It makes little difference.
Version 6 1m55s I brought the 'doc.Range.End' outside the While statement, assigning the value to lngRangeEnd and using lngRangeEnd as the test in the While statement. Clutching at straws here ...
Version 7 1m05s I replaced 'rng.Move wdParagraph, 1' with 'rng.Start = rng.Paragraphs(1).Range.End + 1'. Go figure! In retrospect, Versions 0 and 4 had already shown me that how I access paragraphs can greatly affect the timing. Changing the addressing method to a simple adjustment of a Long value cut my execution time by more than half compared with Version 0. Instead of seven days, think three.
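A minimal sketch of how the Version 6 and Version 7 changes might look together, assuming the file behind intFile is already open for output. The procedure name, the lngNext variable and the Exit Do guard are additions for illustration and are not part of the code as posted; only three of the format fields are written, to keep the sketch short. One thing worth checking before relying on it: because of the "+ 1", a paragraph that consists only of a paragraph mark looks as though it would be stepped over, which rng.Move would not do.

<pre>Public Sub ProcessParagraphsByStart(doc As Document, intFile As Integer, strDelim As String)
    ' Hypothetical name; assumes intFile is already open for output.
    Dim rng As Range
    Dim lngRangeEnd As Long
    Dim lngNext As Long
    Dim strResult As String

    Set rng = doc.Paragraphs(1).Range
    lngRangeEnd = doc.Range.End                 ' Version 6: read the end once

    Do While rng.End < (lngRangeEnd - 1)
        ' One assignment builds the whole line (Version 1), inside a With (Version 2)
        With rng
            strResult = .Font.Name & strDelim & .Font.Size & strDelim & _
                        .ParagraphFormat.Alignment
        End With
        Print #intFile, doc.FullName & strDelim & strResult

        ' Version 7: advance by setting Start instead of rng.Move wdParagraph, 1
        lngNext = rng.Paragraphs(1).Range.End + 1
        If lngNext >= lngRangeEnd Then Exit Do  ' added guard: never set Start past the end
        rng.Start = lngNext                     ' Start > End, so End follows and rng collapses
    Loop
End Sub
</pre>

After the first pass rng is a collapsed range one character into the next paragraph, so the Font and ParagraphFormat values are read at that position rather than across the whole paragraph.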
I'd love to have these findings corroborated by someone else, that simply adjusting the rng.Start might be the fastest way (yet) to iterate through every paragraph of a document.
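For anyone who wants to compare the variants on their own documents, a rough timing harness might look like the sketch below. The procedure name is hypothetical, and the For Each loop is only a placeholder for whichever loop is under test. Timer returns seconds since midnight as a Single, so a run that crosses midnight would report a negative time.

<pre>Public Sub TimeParagraphLoop()
    ' Hypothetical harness: times one pass over the active document.
    Dim sngStart As Single
    Dim prg As Paragraph
    Dim lngCount As Long

    sngStart = Timer                            ' seconds since midnight
    For Each prg In ActiveDocument.Paragraphs   ' swap in the loop being measured
        lngCount = lngCount + 1
    Next prg

    Debug.Print lngCount & " paragraphs in " & Format(Timer - sngStart, "0.00") & "s"
End Sub
</pre>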
<pre>Public Function ProcessNormalStyles(doc As Document, intFile As Integer, strDelim As String)
    ''' Crude method to obtain primary characteristics of a basic style
    Dim rng As Range
    Set rng = doc.Paragraphs(1).Range
    Dim lng As Long
    While rng.End < (doc.Range.End - 1)
        Application.StatusBar = rng.End & " of " & doc.Range.End
        Dim strResult As String
        strResult = ""
        ' Mixed fonts in the range give an empty name, so fall back to the first character
        If rng.Font.Name = "" Then
            strResult = strResult & rng.Characters(1).Font.Name
        Else
            strResult = strResult & rng.Font.Name
        End If
        strResult = strResult & strDelim & rng.Font.Bold
        strResult = strResult & strDelim & rng.Font.Italic
        strResult = strResult & strDelim & rng.Font.Underline
        strResult = strResult & strDelim & rng.Font.Size
        strResult = strResult & strDelim & rng.Font.StrikeThrough
        strResult = strResult & strDelim & rng.ParagraphFormat.Alignment
        strResult = strResult & strDelim & rng.ParagraphFormat.LeftIndent
        strResult = strResult & strDelim & rng.ParagraphFormat.RightIndent
        strResult = strResult & strDelim & rng.ParagraphFormat.SpaceBefore
        strResult = strResult & strDelim & rng.ParagraphFormat.SpaceAfter
        ' Write to the file number passed in, not a hard-coded #1
        Print #intFile, doc.FullName & strDelim & strResult
        rng.Move wdParagraph, 1