  1. #1
    Star Lounger
    Join Date: Jul 2008 | Posts: 68 | Thanks: 0 | Thanked 0 Times in 0 Posts
    I have 13 Meg txt files opened in Word. Each line in the file ends with a paragraph mark. I am trying to replace each mark with a space by using Word's replace feature (in Word I am using caret p). However, this process is taking forever. Is there a faster way to remove end of line paragraph marks from txt files?

    Thank you in advance for your replies.

  2. #2
    Super Moderator
    Join Date: Jun 2002 | Location: Mt Macedon, Victoria, Australia | Posts: 3,993 | Thanks: 1 | Thanked 45 Times in 44 Posts
    Is the problem that you can't use Replace All with Word's Find and Replace, because you need to keep some of them?
    Are there some of the paragraph marks you need to keep because they really represent paragraphs?
    Regards
    John



  3. #3
    Star Lounger
    Join Date: Jul 2008 | Posts: 68 | Thanks: 0 | Thanked 0 Times in 0 Posts
    Originally posted by John Hutchison:
        Is the problem that you can't use Replace All with Word's Find and Replace, because you need to keep some of them?
        Are there some of the paragraph marks you need to keep because they really represent paragraphs?

    Thank you for the reply.

    No, I am using Find and Replace, Replace All. The problem is the enormous time and resources of processing the file. The sentences/phrases in the file are approximately five words long. Every time a paragraph mark is replaced, all the lines following get "pulled up." I do not need to keep any paragraph marks because I embedded markers in the text before I started the paragraph mark deletion procedure. Is there some kind of application I can use to replace the paragraph marks outside of Word (if that would help any)?

  4. #4
    Super Moderator
    Join Date: May 2002 | Location: Canberra, Australian Capital Territory, Australia | Posts: 5,054 | Thanks: 2 | Thanked 417 Times in 346 Posts
    Hi HowdeeDoodee,

    You might try something using WSH instead of Word. There's some code here: http://www.ericphelps.com/scripting/...AndReplace.txt
    It's coded to work only with files under 1MB, but deleting these lines might work for you:
    Code:
        If objScriptingFile.Size > 999999 Then
          Status "* TOO LARGE: " & objScriptingFile.Path
          Exit Sub
        End If
    You could also change the line:
        strOldText = InputBox("Enter old text you want replaced", "Old Text", "/index.htm"">")
    to:
        strOldText = vbCr
    or:
        strOldText = vbCrLf
    No matter which program you use, though, processing files with 13MB of text is going to take some time. All the WSH script will do is avoid Word's overheads. A fast PC wouldn't hurt, either.
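
    For illustration only, here is a minimal sketch of the same idea in a single routine: read the text file into a string, replace the line endings, and write the result to a new file, so the text never has to be laid out as a Word document. This is not the linked script. The name FlattenTextFile and the file paths are hypothetical, and the sketch assumes an ANSI-encoded file that fits in memory. It is written without type declarations, so it should also run under WSH as VBScript.

    ```vba
    ' Hypothetical helper (not taken from the linked script): writes a copy of a
    ' text file in which every line ending has been replaced by a space.
    ' Assumes an ANSI-encoded file small enough to hold in memory as one string.
    Sub FlattenTextFile(inPath, outPath)
        Const ForReading = 1
        Dim fso, inStream, outStream, fileText

        Set fso = CreateObject("Scripting.FileSystemObject")

        ' Read the whole file into one string (skip ReadAll if the file is empty).
        fileText = ""
        Set inStream = fso.OpenTextFile(inPath, ForReading)
        If Not inStream.AtEndOfStream Then fileText = inStream.ReadAll
        inStream.Close

        ' Handle CR+LF first so a Windows line ending becomes one space, not two.
        fileText = Replace(fileText, vbCrLf, " ")
        fileText = Replace(fileText, vbCr, " ")
        fileText = Replace(fileText, vbLf, " ")

        ' Write to a separate file so the original stays untouched.
        Set outStream = fso.CreateTextFile(outPath, True)
        outStream.Write fileText
        outStream.Close
    End Sub
    ```

    It could be called as, for example, FlattenTextFile "C:\Temp\input.txt", "C:\Temp\output.txt" (both paths are placeholders).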
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  5. #5
    Super Moderator
    Join Date: Jan 2001 | Location: Melbourne, Victoria, Australia | Posts: 3,852 | Thanks: 4 | Thanked 259 Times in 239 Posts
    I would start by putting the file into Normal (Draft) view and placing the cursor at the top of the file to avoid repagination slowdowns. Then try something like this:

    ActiveDocument.Range.Text = Replace(ActiveDocument.Range.Text, vbCr, " ")

    I don't know how big a string is allowed to be but a string function might work well enough.
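
    Put together as a macro, the two steps might look like the sketch below. The macro name is hypothetical, and it assumes the text file is the active document. Because it assigns plain text back to the whole document, any formatting would be lost, which should not matter for a .txt file.

    ```vba
    ' Hypothetical macro combining the two steps: switch to Normal (Draft) view,
    ' move to the top of the file, then replace every paragraph mark in one
    ' string operation instead of a Find/Replace.
    Sub FlattenParagraphsViaString()
        ' Draft view avoids repagination while the text changes.
        ActiveWindow.View.Type = wdNormalView

        ' Put the cursor at the very start of the document.
        Selection.HomeKey Unit:=wdStory

        ' vbCr is the character Word uses for a paragraph mark.
        ActiveDocument.Range.Text = Replace(ActiveDocument.Range.Text, vbCr, " ")
    End Sub
    ```

    Word always keeps one final paragraph mark, so the document still ends with a single ¶ afterwards.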
    Andrew Lockton, Chrysalis Design, Melbourne Australia

  6. #6
    Silver Lounger
    Join Date: Apr 2010 | Location: Montréal | Posts: 1,795 | Thanks: 33 | Thanked 52 Times in 51 Posts
    Hello, HDDD.

    In your case, what I do is go to Find/Replace (Ctrl+H) and replace all double ¶ marks with a placeholder; I use %%. Press Alt+A and they are all replaced. Then open Ctrl+H again, find every single ¶, and replace it with a space. After this is done, go back to the %% and replace it with a double ¶. To enter the ¶ in the Find box, type ^p.

    Look under "More" in the Find/Replace dialog, then click "Special" to pick the mark to remove. The ^p code is just that: Shift+6, then p; or, quicker, the Alt+20 keys.

    Have a good look at that dialog: after you have selected what to find and what to replace it with, you will see at the bottom that the A in "Replace All" is underlined, which tells you that Alt+A will replace them all. If the result is not to your liking, Ctrl+Z will take you back one step.
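
    The three passes could also be run as a macro. The sketch below is only an illustration: the procedure names are hypothetical, and it assumes that %% does not already occur anywhere in the text and that the file is the active document.

    ```vba
    ' Runs one Replace All over the whole document. Codes such as ^p are
    ' interpreted the same way as in the Find and Replace dialog.
    Private Sub ReplaceAllInDocument(ByVal findText As String, ByVal replaceText As String)
        With ActiveDocument.Content.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Text = findText
            .Replacement.Text = replaceText
            .Forward = True
            .Wrap = wdFindStop
            .MatchWildcards = False
            .Execute Replace:=wdReplaceAll
        End With
    End Sub

    ' Hypothetical macro: unwraps hard-wrapped lines but keeps real paragraphs.
    Sub UnwrapLinesKeepingParagraphs()
        ReplaceAllInDocument "^p^p", "%%"   ' 1. park the double marks as a placeholder
        ReplaceAllInDocument "^p", " "      ' 2. turn each remaining single mark into a space
        ReplaceAllInDocument "%%", "^p^p"   ' 3. put the double marks back
    End Sub
    ```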

    Are we having fun yet ? ......................Jean.

  7. #7
    Super Moderator
    Join Date: May 2002 | Location: Canberra, Australian Capital Territory, Australia | Posts: 5,054 | Thanks: 2 | Thanked 417 Times in 346 Posts
    Hi Jean,

    The issue isn't one of replacing single paragraph breaks only with spaces, but of replacing all paragraph breaks with spaces, which requires nothing more than a single Find/Replace.
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  8. #8
    Silver Lounger
    Join Date: Apr 2010 | Location: Montréal | Posts: 1,795 | Thanks: 33 | Thanked 52 Times in 51 Posts
    macropod, greets.

    You are right; read my note and you will see that the procedure I described does just this: it replaces all ¶ with a space. Keep in mind that each true paragraph ends with two ¶ in a row: an unwanted one at the end of the line, and the one to keep. Hence the need to Find and Replace the double ¶¶ first, swapping them for another two characters that are later changed back, so that the formatting of the text is preserved.

    The second step replaces each single ¶ with a space. If this were done first, the Find/Replace would hit every ¶, including the doubled ones that mark real paragraphs. Once the doubles have been swapped for the placeholder, no double ¶¶ remain, and the leftover single ¶ can all be replaced with a space.

    If these are manual line breaks, they are different: the code is ^l (Shift+6, then a lowercase L). I am still looking for the Alt code for them; I think that Alt+28 could be it. You can try replacing them with a space and all will be happy. I hope that you have toggled "show non-printing characters" on, to see what is what. In Word, it is the ¶ button up on the toolbar, and if you hover over it, it reads "Show/Hide ¶".

    I hope that you are having fun, I am...................Jean.

  9. #9
    Silver Lounger
    Join Date: Apr 2010 | Location: Montréal | Posts: 1,795 | Thanks: 33 | Thanked 52 Times in 51 Posts
    HDDD, hello.

    If this text is not too personal, upload it here and I will do the trick for you. When you reply, look just below for Attachments, click Browse, find your .doc, and then click Upload file.

    Kid's stuff. ...............Jean.

  10. #10
    Silver Lounger
    Join Date: Apr 2010 | Location: Montréal | Posts: 1,795 | Thanks: 33 | Thanked 52 Times in 51 Posts
    Me again, macropod.

    >>> The issue isn't one of replacing single paragraph breaks only with spaces, but of replacing all paragraph breaks with spaces, which requires nothing more than a single Find/Replace.

    Doesn't HDDD want to keep the formatting of the whole text? A .doc such as hers is usually the work of an uneducated typist pressing Enter (CR) at the end of every line, as on a typewriter, instead of letting word wrap do its job.

    A FWIW...........Jean.

  11. #11
    Super Moderator
    Join Date: Jun 2002 | Location: Mt Macedon, Victoria, Australia | Posts: 3,993 | Thanks: 1 | Thanked 45 Times in 44 Posts
    Originally posted by Jean Parrot:
        >>> The issue isn't one of replacing single paragraph breaks only with spaces, but of replacing all paragraph breaks with spaces, which requires nothing more than a single Find/Replace.

        Does not HDDD want to keep the format of the whole text ? A .doc such as hers, is done by an uneducated typist doing a CR ( Enter ) at the end of all lines as on a typewriter, instead of letting wordwrap do its trick.

    No, that is not the issue. He genuinely wants to replace all CRs, not just the double CRs. Read the full thread.
    The problem is that the document is enormous (13MB) and Replace All is taking far too long. He knows how to do a Replace All, but is looking for a faster alternative.
    Regards
    John



  12. #12
    Silver Lounger
    Join Date: Apr 2010 | Location: Montréal | Posts: 1,795 | Thanks: 33 | Thanked 52 Times in 51 Posts
    Hello John.

    >>>The problem is that the document is enormous (13MB) and Replace All is taking far too long. He knows how to do a Replace All, but is looking for a faster alternative.

    I do not see why; mine was done here in a wink, and I had 33 ± full pages. But who am I to say? I do not think that there is another alternative within Word. I read the question over again and I have the impression, maybe wrong, that the replacement is being done one ¶ at a time and that Replace All is not being used. (???)

    Be good.............. Jean.

  13. #13
    Super Moderator
    Join Date: Jun 2002 | Location: Mt Macedon, Victoria, Australia | Posts: 3,993 | Thanks: 1 | Thanked 45 Times in 44 Posts
    Quote:
        mine was done here in a wink. I had 33 ± full pages

    13 MB is probably 15,000 pages!

    In the third post in this thread, he says he is using Replace All.
    Regards
    John



  14. #14
    Super Moderator
    Join Date: May 2002 | Location: Canberra, Australian Capital Territory, Australia | Posts: 5,054 | Thanks: 2 | Thanked 417 Times in 346 Posts
    Originally posted by Jean Parrot:
        I do not see why, mine was done here in a wink. I had 33 ± full pages.

    The why is because that's what HowdeeDoodee says he wants to do:

    Quote:
        I have 13 Meg txt files opened in Word. Each line in the file ends with a paragraph mark. I am trying to replace each mark with a space

    Processing around 33 pages is trivial compared to processing 13MB of text with a paragraph break at the end of each line of about 5 words each. That makes for in excess of 450,000 paragraph breaks to replace! Your 33-page document would probably have far fewer than one-third that number of characters, relatively few of which (probably fewer than 1,000) would be paragraph breaks.
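
    As a rough cross-check of that figure, the estimate can be written out as a small function. Everything here is an assumption rather than a measurement: the function name is hypothetical, 13MB is taken as 13,000,000 one-byte characters, words are separated by single spaces, and each line ends with a two-byte CR+LF.

    ```vba
    ' Back-of-envelope estimate of how many line-ending paragraph marks a
    ' plain-text file holds. All inputs are assumptions, not measurements.
    Function EstimateParagraphBreaks(ByVal fileBytes As Double, _
                                     ByVal wordsPerLine As Double, _
                                     ByVal avgWordLength As Double, _
                                     ByVal lineEndingBytes As Double) As Double
        Dim bytesPerLine As Double
        ' letters in the words + single spaces between them + the line ending
        bytesPerLine = wordsPerLine * avgWordLength + (wordsPerLine - 1) + lineEndingBytes
        EstimateParagraphBreaks = fileBytes / bytesPerLine
    End Function

    Sub ShowEstimate()
        ' Five words per line, with an average word length of 4 and then 5 letters.
        Debug.Print EstimateParagraphBreaks(13000000, 5, 4, 2)
        Debug.Print EstimateParagraphBreaks(13000000, 5, 5, 2)
    End Sub
    ```

    With average word lengths of four to five letters, the estimate comes out at roughly 420,000 to 500,000 line endings, which brackets the figure of about 450,000.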
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  15. #15
    Super Moderator
    Join Date: Dec 2000 | Location: New York, NY | Posts: 2,970 | Thanks: 3 | Thanked 29 Times in 27 Posts
    We haven't yet heard back from the original poster to see if any of the suggestions here have helped, but based on this:

    Quote:
        Every time a paragraph mark is replaced, all the lines following get "pulled up."

    it seems they are doing the Find/Replace while in Print Layout view, in which case Andrew's suggestion to first switch to Normal view would probably speed things up significantly.

    Another thing that might help would be to run the Find/Replace via a macro, and start the macro with the line:

    Application.ScreenUpdating = False
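
    A minimal sketch of such a macro follows. The macro name is hypothetical, and it assumes the text file is the active document. It turns screen updating off, runs a single Replace All of ^p with a space, and turns screen updating back on even if the replace fails.

    ```vba
    ' Hypothetical macro: Replace All of paragraph marks with spaces, with
    ' screen redrawing switched off for the duration.
    Sub ReplaceParagraphMarksQuietly()
        Dim errText As String

        On Error GoTo CleanUp
        Application.ScreenUpdating = False

        With ActiveDocument.Content.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Text = "^p"                 ' paragraph mark, as typed in the Find box
            .Replacement.Text = " "
            .Forward = True
            .Wrap = wdFindStop
            .MatchWildcards = False
            .Execute Replace:=wdReplaceAll
        End With

    CleanUp:
        ' Remember any error, then always restore screen updating.
        If Err.Number <> 0 Then errText = Err.Description
        Application.ScreenUpdating = True
        If Len(errText) > 0 Then MsgBox "Replace failed: " & errText
    End Sub
    ```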

    Gary
