  1. #1
    Star Lounger
    Join Date
    Sep 2001
    Location
    Perth, Western Australia
    Posts
    89
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Looking for Advanced Search and Replace expertise (Word 2003 SP-1)

    Hi all

    I have a couple of "Search and Replace" challenges which I hope someone can help me with. One of my clients has copied a vast amount of publicly available business name and address data from a yellow pages website and pasted it into a Word document, with the chaos you'd expect from such a process. Then they emailed it to me and asked me to organise it and format it into a table for mail merge purposes.

    Here's challenge #1:

    I have been able to tidy up the address lines of each record considerably, putting tabs into most of the right places. But I am now left with 17,000 records that each have the business name on one line (ending with a paragraph mark), followed by a second line containing the address (also ending with a paragraph mark). What I need is an intelligent search and replace macro which will go through the file and replace the first paragraph mark for each record with a tab, thus putting the business name and address on the same line.

    I have no idea how to program this, but I had two ideas for the logic. The first is to replace the first paragraph mark in the file with a tab, and thereafter every second one. But that might trip up if there is an extra one hidden inside. A more robust approach (but more complex to program, I think) is to search for every line which contains no tab characters, then replace the paragraph mark at the end of every such line with a tab. This works because every address line reliably contains tabs, while the business name lines do not.

    Easy, huh?

    And now for challenge #2:

    When I have sorted all this out and got the 17,000 records in a beautiful table, there will be a number of duplicates (maybe 1% of the total), based primarily on the address. But some duplicate addresses will be kosher, e.g. 2 businesses at the same address. So what I hoped to do is run a macro which finds a duplicate address, highlights it in some way, then stops for me to review it and delete it manually if appropriate, after which I would rerun the macro (from a hotkey).

    If anyone out there can help me with either of these two challenges, I'd be very grateful indeed.

    Best regards

    Neil

  2. #2
    Uranium Lounger
    Join Date
    Dec 2000
    Location
    Los Angeles Area, California, USA
    Posts
    7,453
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Looking for Advanced Search and Replace expert

    Hi Neil:
    Issue 1: Getting the addresses in a table
    I assume that the address is more than one line. Nevertheless, Klaus Linke came up with a method. Take a look at the thread surrounding post 176634 for a way to sort the paragraphs. You will also find some ideas to help put them in a table.

    Issue 2: removing duplicates
    Note: I haven't tried this with a table, but let me know if there's trouble when you get this far. Klaus Linke also came up with this.

    Use wildcards:
    Find: ^13(*^13)\1
    Replace: ^p\1
    and repeat the search until nothing more is found.

    Note that the very top of the document can have two identical paragraphs unless you place a paragraph mark at the top. This only works for repeating patterns, so it will not work if the identical paragraphs are separated by non-identical paragraphs. Therefore, you must sort before trying this.
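    To make the "sort first" caveat concrete, here is a minimal Python sketch of the same logic outside Word. It is only an illustration, not the Word method itself: the function name and the sample records are made up, and each paragraph is assumed to be already available as one string.

```python
def drop_adjacent_duplicates(paragraphs):
    """Keep the first paragraph of each run of identical neighbours.

    Like the wildcard search, this only compares a paragraph with the
    one directly before it, so separated duplicates survive.
    """
    kept = []
    for para in paragraphs:
        if not kept or kept[-1] != para:
            kept.append(para)
    return kept


# Hypothetical sample records (name, tab, address).
records = [
    "Acme Pty Ltd\t1 High St",
    "Zed Cafe\t9 Low Rd",
    "Acme Pty Ltd\t1 High St",
]

# Unsorted: the two Acme lines are not neighbours, so nothing is removed.
unsorted_result = drop_adjacent_duplicates(records)

# Sorted: identical lines end up next to each other and collapse to one.
sorted_result = drop_adjacent_duplicates(sorted(records))
```

    Without the sort, all three records are kept; after sorting, the repeated record is reduced to a single copy.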

    Almost forgot. Take a look at Dave Rado's article, also, regarding splitting first name last name at http://www.mvps.org/word/FAQs/TblsFldsFms/...ameLastName.htm.

    If you run into trouble, post back.
    Hope this helps,

  3. #3
    Star Lounger
    Join Date
    Sep 2001
    Location
    Perth, Western Australia
    Posts
    89
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Looking for Advanced Search and Replace expert

    Phil - wow, once again, thanks for the lightning response.

    The key problem is this: for each record, the company name is on one line, and the address on another (In fact, I have split the address lines into tab-delimited fields pretty successfully using techniques that you referred me to).

    Here is a sample record - I have inserted pseudo ^codes within square brackets to show what it actually looks like:

    ------
    Technically Speaking [^p]  <-- It's this para mark that I want to replace with a tab
    PO Box 3101[^t]Yokine[^t]Western Australia[^t]3101[^p]
    ------

    Once I figure this out, I'll have the holy grail: each record on a single line with a fixed number of tab-delimited fields.

    As I mentioned in my initial post, it's a safe bet that none of the business names contain tabs - so if one can do a search / replace using the absence of a defined character or string, this may be the way to go... if only I knew how to program.

    I hope that makes it a little clearer - once again, many thanks indeed for your interest.

    Best regards

    Neil

  4. #4
    Plutonium Lounger
    Join Date
    Mar 2002
    Posts
    84,353
    Thanks
    0
    Thanked 29 Times in 29 Posts

    Re: Looking for Advanced Search and Replace expert

    Make sure that there is a paragraph mark before the first company name (otherwise, the first company name will be skipped).

    Using wildcards in the Find/Replace dialog again, put this in the 'Find what' box:

    <code>^13([!^t]@)^13</code>

    and this in the 'Replace with' box:

    <code>^p\1^t</code>

    Click Replace All.

    Explanation:
    <code>[!^t]</code> means any character but a tab.
    <code>[!^t]@</code> means one or more occurrences of 'any character but a tab'.
    <code>^13([!^t]@)^13</code> means a paragraph (between two paragraph marks) consisting of one or more occurrences of 'any character but a tab'. The part between the parentheses ( ), i.e. the text of the paragraph, is available for further processing as \1.
    <code>^p\1^t</code> means a paragraph mark, followed by the text of the paragraph, followed by a tab.
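    For anyone who wants to try the same idea outside Word, below is a rough Python sketch of the equivalent regular expression. It assumes the document text is available as a string in which paragraph marks are carriage returns (character 13); the second sample record and all names in the code are invented for illustration. Python's <code>+</code> is greedy, so the character class also excludes the carriage return to stop the match at the end of the paragraph.

```python
import re

# A paragraph mark, then one or more characters that are neither a tab
# nor a paragraph mark, then the closing paragraph mark.
TABLESS_PARAGRAPH = re.compile("\r([^\t\r]+)\r")


def join_name_to_address(text):
    """Replace the paragraph mark after each tab-free paragraph with a tab."""
    # "\\1" reinserts the captured paragraph text.
    return TABLESS_PARAGRAPH.sub("\r\\1\t", text)


# Sample with a leading paragraph mark; the second record is hypothetical.
text = (
    "\rTechnically Speaking\r"
    "PO Box 3101\tYokine\tWestern Australia\t3101\r"
    "Second Firm\r"
    "1 Example St\tPerth\tWestern Australia\t6000\r"
)
joined = join_name_to_address(text)

# Without the leading paragraph mark the first company name is skipped.
joined_without_lead = join_name_to_address(text[1:])
```

    With the leading paragraph mark in place, every name ends up on the same line as its address. Dropping it leaves the first name on its own line, which is why the paragraph mark before the first company name matters.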

  5. #5
    Star Lounger
    Join Date
    Sep 2001
    Location
    Perth, Western Australia
    Posts
    89
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Looking for Advanced Search and Replace expert

    Hans

    Many thanks for your prompt & helpful response - I was pretty excited when I saw it because it looked like it was right on target (by the way: is there a source of all knowledge somewhere on the web about these lesser known but extremely powerful Find / Replace options?).

    The bad news is that I followed your instructions precisely (including copying the strings and pasting them into the Word dialogue), but receive zero find instances.

    Can you think of anything I might have done wrong?

    Thanks again

    Neil

  6. #6
    Plutonium Lounger
    Join Date
    Mar 2002
    Posts
    84,353
    Thanks
    0
    Thanked 29 Times in 29 Posts

    Re: Looking for Advanced Search and Replace expert

    I don't have Word 2003, but in older versions, the online help has a pretty good overview of the options available in the Find and Replace dialog.

    The Find/Replace works in a demo document I created - I just tested again, copying the codes from this thread. Are you sure that the lines are separated by paragraph marks and not by manual line breaks? (You can search for a manual line break as ^11)
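    If the text can be pulled out of Word with its control characters intact, a quick count can show which separator is actually in use. This is just a sketch under that assumption: a paragraph mark is character 13 and a manual line break is character 11, and the helper name and sample string are hypothetical.

```python
PARAGRAPH_MARK = "\r"        # character 13, searched for as ^13
MANUAL_LINE_BREAK = "\x0b"   # character 11, searched for as ^11


def count_separators(text):
    """Count paragraph marks and manual line breaks in the text."""
    return {
        "paragraph_marks": text.count(PARAGRAPH_MARK),
        "manual_line_breaks": text.count(MANUAL_LINE_BREAK),
    }


# Hypothetical record whose name line ends in a manual line break.
sample = "Technically Speaking\x0bPO Box 3101\tYokine\r"
counts = count_separators(sample)
```

    A non-zero count of manual line breaks would explain why a search built on <code>^13</code> finds nothing between the name and the address.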

    Umm, you did tick 'Use wildcards' in the Find and Replace dialog, didn't you?

  7. #7
    Star Lounger
    Join Date
    Sep 2001
    Location
    Perth, Western Australia
    Posts
    89
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Looking for Advanced Search and Replace expert

    Hans

    I am deeply embarrassed on two counts: firstly, I did not have the wildcards checked as you instructed. Boy, does checking that box make a big difference! It worked perfectly. Secondly, on searching Word's help on wildcards, I have found that it does indeed list a lot of different options. I'm going to be having a very close look at that.

    Well, they say it has been a wasted day if you don't learn anything - as I've learned a couple of very useful things, I'd say it's been a very valuable day.

    Thanks heaps for your help.

    Cheers

    Neil
