  1. #1
    New Lounger
    Join Date
    Nov 2002
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts

    How to find/replace text that includes both hypertext AND plain text?

    I'm using Word 2010 on Windows 7. I've searched Google and everywhere for the answer to this question.

    I have a Word document with text that I copied/pasted from a web page. It includes paragraphs of text separated by two extra "empty" paragraph marks. One paragraph mark is a hyperlink (i.e. it's in hypertext format and if I hover over it it shows a link) and the other is not. I want to replace all instances of this with nothing. That is, I want to use find-and-replace to delete all instances of these two adjacent empty paragraph marks. But when I type ^p^p into the find-and-replace dialog, Word finds nothing in the document. It does not appear to recognize these two adjacent paragraph marks as such. How can I search for a stretch of text that includes BOTH hypertext and plain text?

    Many thanks!

  2. #2
    Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    5,055
    Thanks
    2
    Thanked 417 Times in 346 Posts
    You might have more success with a wildcard Find/Replace, where:
    Find = [^13]{1,}
    Replace = ^p
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  3. #3
    New Lounger
    Join Date
    Nov 2002
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts

    Sorry, that doesn't work

    Sorry, that doesn't work. All it does is replace every paragraph mark with ... itself. No change at all.

    Also, it doesn't answer the general question, which I put in the thread's title but should also have put in the message itself. What about the general case that's not just two paragraph marks?

  4. #4
    Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    5,055
    Thanks
    2
    Thanked 417 Times in 346 Posts
    Actually, what the code does is replace every series of ASCII 13 characters with a single paragraph break. With some electronic content that originates in a non-PC environment, line breaks are achieved by ASCII 13 characters, which look like paragraph breaks but aren't. Are you sure that what you're trying to replace is in fact a paragraph break (i.e. does it appear on screen as ¶)? It is also common for websites to return data with manual line breaks (ASCII 11) instead of paragraph breaks. To process both forms of break, you could use a wildcard Find/Replace, where:
    Find = [^11^13]{1,}
    Replace = ^p
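
As a rough illustration of what that wildcard expression does, here is a minimal Python sketch. It is only an analogy: Python's `re` module is not Word's wildcard engine, and the function name `collapse_breaks` and the sample string are made up for this example. It assumes the document text uses ASCII 13 for paragraph marks and ASCII 11 for manual line breaks, as described above.

```python
import re

# Rough regex counterpart of the Word wildcard pattern [^11^13]{1,}:
# one or more characters that are either ASCII 11 (manual line break)
# or ASCII 13 (paragraph mark).
BREAK_RUN = re.compile(r"[\x0b\r]+")


def collapse_breaks(text: str) -> str:
    """Replace each run of breaks with a single paragraph mark (ASCII 13)."""
    return BREAK_RUN.sub("\r", text)


# Hypothetical sample: three consecutive breaks, then a lone paragraph mark.
sample = "First paragraph.\r\x0b\rSecond paragraph.\rThird."
print(repr(collapse_breaks(sample)))
```

Note that a lone paragraph mark is replaced by itself, so text that contains no consecutive breaks comes out unchanged.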

    As for the general question, text that is part of a hyperlink can usually be found via the Find method. If you'd like to get rid of the hyperlinks, and convert them to plain text, select them, then press Ctrl-Shift-F9.
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  5. #5
    New Lounger
    Join Date
    Nov 2002
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Paul,

    Thanks for the clarification. However, I tried your suggestion and it didn't change anything in the text.

    Also, perhaps I didn't make my general question clear. The problem is at the border between the plain text and hypertext - it appears that a single "find/replace" can't cross that border? I don't want to undo the hyperlink.

    For example, if I had a sentence like this several times in a document ...

    I read The New York Times every day.

    ... where "The New York Times" is hyperlinked to www.nytimes.com, and I wanted to make a global find/replace of "read The New York Times every" with "run three miles", thus crossing the plain/hypertext boundary, then Find/Replace either works once and not again, or not at all. I could attach a test document if someone's interested.

    And I don't want to have to destroy the hyperlink. There are other cases where I need to preserve the hyperlink - e.g. if in the above example I wanted to replace the text with "I log onto The New York Times website".

    This seems complicated, but it's actually a straightforward question: how can I get Find/Replace to cross the plain/hypertext boundary?

  6. #6
    Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    5,055
    Thanks
    2
    Thanked 417 Times in 346 Posts
    Hi ebruskin,

    Yes, it might be helpful if you attach a copy of the document. As I said, "text that is part of a hyperlink can usually be found via the Find method"; however, Find expressions that span hyperlinks and plain text can be problematic.
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  7. #7
    New Lounger
    Join Date
    Nov 2002
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts

    Sample Document Uploaded

    Paul, I've uploaded a small Word document with the text I described to you above (replacing "read The New York Times every" with "run three miles each", where "The New York Times" is a hyperlink). I can't "Find" "read The New York Times every", although I can find, e.g., "read" or "The New York" separately.

    The first line with the xxx is the remnant of some change (I forget what) that I was able to make once but was then unable to repeat.

    -- Eric
    Attached Files

  8. #8
    Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    5,055
    Thanks
    2
    Thanked 417 Times in 346 Posts
    For the example given, unless you want to work with the exposed field code, you could use a wildcard Find/Replace where:
    Find = read*(every)
    Replace = run three times \1
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  9. #9
    New Lounger
    Join Date
    Nov 2002
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Yes, that works! What does \1 do? I tried looking up the backslash in wildcard find-and-replace (I don't use it much).

    However, going back to the example with the paragraph marks, I can't use the special characters like ^p, ^l, etc with wildcard find-and-replace.

    Is there a way to work with the exposed field codes that doesn't use wildcard find-and-replace?

  10. #10
    Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    5,055
    Thanks
    2
    Thanked 417 Times in 346 Posts
    The \1 says to include the first bracketed term in the Find expression in the Replace output. Thus:
    Find = read*(every)
    Replace = run three times \1
    says to find 'read' followed by any number of characters before 'every' and to insert the 'every' back into the replacement text. One could also use:
    Find = read*every
    Replace = run three times every
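
For readers more familiar with regular expressions, the same idea can be sketched in Python. This is an analogy under stated assumptions, not Word's own engine: `.*?` stands in for Word's `*` wildcard (assumed here to match as few characters as possible), `(every)` is the first bracketed group, and `\1` in the replacement puts that group back. The function name `replace_span` is hypothetical.

```python
import re

# Rough regex counterpart of the Word wildcard search  read*(every)
# "read", then as few characters as possible, then "every" captured as group 1.
PATTERN = re.compile(r"read.*?(every)")


def replace_span(text: str) -> str:
    """Replace the matched span, re-inserting group 1 ('every') via \\1."""
    return PATTERN.sub(r"run three times \1", text)


# The example sentence from earlier in the thread, as plain text.
sentence = "I read The New York Times every day."
print(replace_span(sentence))  # prints: I run three times every day.
```

Because the pattern never names the text between "read" and "every", it does not matter in this sketch what sits in between.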
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  11. #11
    New Lounger
    Join Date
    Nov 2002
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts
    So, evidently the wildcard version of find-and-replace can cross the boundary between hypertext and plain text. I might be able to find ways to strategize with this to deal with more generic characters such as paragraph marks.

    If I still have trouble, I will craft a careful example and come back here. I appreciate very much (THANK YOU, if that's how the counter at the left works) the time you've taken with this. Evidently you are the only Word MVP in the world who could figure something out?

    -- Eric

  12. #12
    Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    5,055
    Thanks
    2
    Thanked 417 Times in 346 Posts
    There are only about two dozen MS Word MVPs worldwide, and none of us knows everything about Word. Others who might know how to handle the issue may not have seen the questions about it that you found.
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  13. #13
    New Lounger
    Join Date
    Nov 2002
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Well, that confirms my belief that Australia's got talent far out of proportion to its population: at least 4% of all Word MVPs in a country with well below 1% of the world's population.
