  1. #1
    New Lounger
    Join Date
    Apr 2002
    Location
    New South Wales, Australia
    Posts
    11
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Text in Word (W2K)

    Hello,

    We've been using an older version of a product called Outside In (formerly from INSO, now Stellent) to view documents, and to extract text to enable indexing.

    We've just found that certain text (headings, footers, form fields with names, text in textboxes) does not get extracted. This could lead to inaccurate text retrieval.

    "ActiveDocument.Range.Text" gives a similar result. I can save the document as plain text, and that gives some more text, but dropdown form fields and textboxes are still excluded.

    Thanks for any help you might offer.

    David

  2. #2
    Super Moderator jscher2000
    Join Date
    Feb 2001
    Location
    Silicon Valley, USA
    Posts
    23,112
    Thanks
    5
    Thanked 93 Times in 89 Posts

    Re: Text in Word (W2K)

    Are you looking for a way to manually extract all of the text, or a way to coax third-party software to do that? To manually extract the form fields, what if you Unlink all the fields (convert them to text) before copying? In a test with dropdown fields, it does seem to lock in the current value; ditto for text boxes. Headers and footers are a bit more involved, since they can be different for every section. See the recent discussion of sections and StoryRanges (here or on the VBA board) for more information.
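
    As a rough illustration of the StoryRanges idea mentioned above, the sketch below walks every story in the active document (main text, headers, footers, text frames and so on) and collects the text. The procedure name and variable names are made up for this example, and it has not been tested against the poster's documents; some stories, and the current values of dropdown form fields, may still be missed.

    ```vba
    Sub CollectAllStoryText()
        ' Hypothetical helper: gathers text from every story range in the document.
        Dim rngStory As Range     ' first range of each story type
        Dim rngLinked As Range    ' walks the chain of linked stories of that type
        Dim strAll As String

        For Each rngStory In ActiveDocument.StoryRanges
            Set rngLinked = rngStory
            Do While Not rngLinked Is Nothing
                strAll = strAll & rngLinked.Text & vbCr
                ' Move to the next story of the same type (e.g. the next section's header)
                Set rngLinked = rngLinked.NextStoryRange
            Loop
        Next rngStory

        ' Show the result in the Immediate window of the VBA editor
        Debug.Print strAll
    End Sub
    ```

    Following NextStoryRange is what picks up headers and footers that differ from section to section, since For Each on its own only returns the first range of each story type.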

  3. #3
    New Lounger
    Join Date
    Apr 2002
    Location
    New South Wales, Australia
    Posts
    11
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Text in Word (W2K)

    jscher2000,

    Either way would do. Another 3rd party product might also do.

    I'll try unlinking, but I'm not sure how.

    If I can unlink textboxes and dropdown fields, and do a "File Save as Text doc", it might well do what we need to do.

    Thanks for your help.

  4. #4
    Gold Lounger
    Join Date
    Dec 2000
    Location
    Hollywood (sorta), California, USA
    Posts
    2,759
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Text in Word (W2K)

    Have you considered converting the files to pdf?

    If you need to view .doc Word files on PC's without Word, consider the Word Viewer
    Kevin

  5. #5
    Star Lounger
    Join Date
    Jan 2001
    Posts
    68
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Try the generic print driver

    Another way to get all the text out of a Word (or other) document is with the generic print driver. All versions of Windows come with this driver. Set it up as an alternate printer and pick the

  6. #6
    Super Moderator jscher2000
    Join Date
    Feb 2001
    Location
    Silicon Valley, USA
    Posts
    23,112
    Thanks
    5
    Thanked 93 Times in 89 Posts

    Re: Text in Word (W2K)

    > I'll try unlinking- but I'm not sure how.

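    ' Needed only if the form is protected; "MyPassword" is a placeholder
    ActiveDocument.Unprotect "MyPassword"
    ' Replace the fields with their current result text
    ActiveDocument.Fields.Unlink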
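
    Putting the unlink step together with the "save as text" idea from earlier in the thread, a minimal sketch might look like the following. The procedure name and the output path C:\Temp\extract.txt are assumptions for illustration only, as is the placeholder password; text in textboxes may still be left out, as it was when saving as plain text by hand.

    ```vba
    Sub ExportUnlinkedText()
        ' Hypothetical helper: unlink fields, write a plain-text copy,
        ' then close without saving so the original .doc is left unchanged.
        Dim doc As Document
        Set doc = ActiveDocument

        ' Unprotect only when the document is actually protected
        If doc.ProtectionType <> wdNoProtection Then
            doc.Unprotect "MyPassword"   ' placeholder password
        End If

        ' Replace fields with their current result text
        doc.Fields.Unlink

        ' Write the text-only copy (path is an assumption; change as needed)
        doc.SaveAs FileName:="C:\Temp\extract.txt", FileFormat:=wdFormatText

        ' Discard the in-memory changes
        doc.Close SaveChanges:=wdDoNotSaveChanges
    End Sub
    ```

    Because the changes are only ever written to the text copy, the unlinked fields never reach the original file on disk.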

  7. #7
    New Lounger
    Join Date
    Apr 2002
    Location
    New South Wales, Australia
    Posts
    11
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Try the generic print driver

    Sorry, this didn't work for me. I get something which, although almost entirely ASCII characters, has less readable text than viewing the Word document in Notepad. Is it a PostScript file?

  8. #8
    New Lounger
    Join Date
    Apr 2002
    Location
    New South Wales, Australia
    Posts
    11
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Text in Word (W2K)

    Thanks Jefferson. I'm almost there. I now have the contents of the dropdown fields. But I don't have the text from a textbox. That's not the formfield type textbox- it's the one like a frame, from Insert, Textbox.

    Frames get converted to text, textboxes don't. Strange.

    I guess I can get it out by something along the lines of "ActiveDocument.Shapes(1).AlternativeText"- unless there's a way I can do it without VBA.
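
    For what it's worth, AlternativeText normally holds a shape's alternative (web) text rather than what is typed inside it; the typed text is usually reached through the shape's TextFrame. A minimal read-only sketch, with a made-up procedure name, assuming the textboxes are floating shapes in the main document:

    ```vba
    Sub ListTextBoxText()
        ' Hypothetical helper: prints the text held in each textbox shape.
        Dim shp As Shape

        For Each shp In ActiveDocument.Shapes
            ' Skip pictures, lines and other shapes that are not textboxes
            If shp.Type = msoTextBox Then
                If shp.TextFrame.HasText Then
                    Debug.Print shp.TextFrame.TextRange.Text
                End If
            End If
        Next shp
    End Sub
    ```

    This only reads the document and changes nothing, but it still needs VBA.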

    Thanks for your help.

  9. #9
    Super Moderator jscher2000
    Join Date
    Feb 2001
    Location
    Silicon Valley, USA
    Posts
    23,112
    Thanks
    5
    Thanked 93 Times in 89 Posts

    Re: Text in Word (W2K)

    Well, as long as we're trashing your original document (not saving the changes, of course!), how about using something like this to convert any shape containing a textbox into a frame:

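    Dim intCounter As Integer
    With ActiveDocument.Shapes
        ' Walk the collection backwards so removals do not shift unvisited items
        For intCounter = .Count To 1 Step -1
            ' Only shapes that actually hold text are worth converting
            If .Item(intCounter).TextFrame.HasText Then
                .Item(intCounter).ConvertToFrame
            End If
        Next intCounter
    End With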

    I am counting backwards because when I convert a shape to a frame, it may drop out of the Shapes collection. If I am counting forwards, this could cause the loop to skip an object.

  10. #10
    Star Lounger
    Join Date
    Jan 2001
    Posts
    68
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Try the generic print driver

    Well shoot, this used to work! I just tried it on the Win NT system at the office and the output was useless! I used to do this all the time in Win 3.1 and (I'm pretty sure) Win 98. Thought I had done it before on NT, but maybe not.

    It looks like the driver is outputting a character, putting in a carriage return (but not a linefeed), and then putting out the next character. Yuch!

  11. #11
    Plutonium Lounger
    Join Date
    Nov 2001
    Posts
    10,550
    Thanks
    0
    Thanked 7 Times in 7 Posts

    Re: Try the generic print driver

    Strangely, I had exactly the same problem yesterday.

    I used to be able to get text from almost any document by using the printer "Generic / Text only on File", but I haven't used this for ages. When I tried printing an Acrobat document to that printer yesterday it just produced gibberish. I'm not sure if I have ever successfully used this printer on Windows 2000.

    StuartR
