Results 1 to 5 of 5
  1. #1
    New Lounger
    Join Date
    Apr 2002
    Posts
    22
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Relating Tables With Non-Identical Data (Access 2000 (SR-1))

    I've imported two Excel tables that I wish to relate into Access. Company names appear in both lists, but they appear with some variation. For instance, one list says John Henry Company and the other says The John Henry Company. One says Ferrell Companies; the other says Ferrellgas. I want to pull all the companies from the Mass Market List that also appear in the Top 100 list, even if the company name is not stated identically in both cases (as mentioned above and shown in the attached example). How do I structure a query and/or relationship to accomplish this?
    Attached Images Attached Images

  2. #2
    5 Star Lounger
    Join Date
    Jan 2001
    Location
    Vancouver, Br. Columbia, Canada
    Posts
    632
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Relating Tables With Non-Identical Data (Access 2000 (SR-1))

    Try this (no guarantees of success, and it may be too much work for you)

    Create a query as if the two fields are identical using the normal QBE methods. Play around until you get the fields that you want displayed, but don't worry about how many records are displayed. Include the keyfield from each table. Next, view the query in SQL mode -- it's under the View menu if you are unfamiliar with it. Read through the query until you find something like this:
    FROM Table1 INNER JOIN table2 ON Table1.custname1 = table2.custname1;

    Modify the SQL to look something like this:
    FROM Table1 INNER JOIN table2 ON left(Table1.custname1,2) = left(table2.custname1,2);

    This example compares the leftmost 2 characters of each table, and joins the tables where they are equal. You will have to play with the number of comparison characters to achieve your desired goal. Unfortunately, it also makes the recordset read-only, so you won't be able to modify the data. You may have to change the query to a make-table query, and create a temporary table that you can subequently relate back to the original table to do your editing.

    As for getting rid of "the", you may need to go back to the original Excel tables and create a new column that omits the "filler" words. I don't know of any automatic way to do this -- maybe sort the table by company name, and create a formula for stripping off the left four characters???

    As I said... it may be too much work. Hope this helps.
    --------------------------------------------------
    Jack MacDonald
    Vancouver, Canada

  3. #3
    Plutonium Lounger
    Join Date
    Dec 2000
    Location
    Sacramento, California, USA
    Posts
    16,775
    Thanks
    0
    Thanked 1 Time in 1 Post

    Re: Relating Tables With Non-Identical Data (Access 2000 (SR-1))

    You're in for a lot of work. If you're trying to clean up mailing or other kinds of lists to remove *near* duplicates, there is just so much you can do with a query and then you have to start doing the hand work. My suggestion is that you add an autonumber key to both tables if you haven't already done so. Then add a long integer field to each and call it something like ForeignKeyID. Then you can run an update query that finds exact matches and puts the autonumber from one table into the foreignkeyID of the other table with a matching record. After that, you'll need to decide which version of the name you want to be the ultimate survivor. Whichever one that is, you populate flag it and start looking for near matches in the other table and inserting the ultimate name's autonumber key into the foreignkeyID for the other record. I used to have to do this kind of thing with lists with as many as 75,000 names, and it was no fun at all! <img src=/S/sick.gif border=0 alt=sick width=15 height=15>
    Charlotte

  4. #4
    New Lounger
    Join Date
    Apr 2002
    Posts
    22
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Relating Tables With Non-Identical Data (Access 2000 (SR-1))

    This certainly helps. I weeded out the "The" companies by doing a search and replace to remove that prefix from both tables. I added some parameters to the query to narrow the playing field and increase my odds (e.g., requiring a city and state match). I found a LEFT function of about 6 characters to be optimal. There were still a few that didn't exactly match when I checked (e.g. Bradford Companies vs. Bradford White) but I also caught a couple that I didn't see when first eyeballing the list. Definitely an imprecise proposition when two different data sources don't follow identical naming conventions.

  5. #5
    Silver Lounger GARYPSWANSON's Avatar
    Join Date
    Aug 2001
    Location
    Frederick, Maryland, USA
    Posts
    1,788
    Thanks
    0
    Thanked 2 Times in 2 Posts

    Re: Relating Tables With Non-Identical Data (Access 2000 (SR-1))

    Another option is to create a table that stores the names and assign an id to each name. Use this ID to a key in each table storing names and then join the tables.

    For example,

    - Run a query against the first set of data to get distinct names. Write this to a table that has the id (autonumber), Name, and NameID. Input a unique ID or number for each name.
    - Run a query against the second set of data to get distinct names. Write this to the prior table and update the NameID's. If John Jones Comp is ID 1 , then set all alias names such as The John Jones Comp to ID 1 as well.

    - Run a query that then looks for the name and updates the correct Name ID to both tables based on the name. (You will need to add the NameID field to both tables)
    - You can then join the tables based on the Name ID's that are equal in both tables.

    Of course, you would then need to maintain a lookup table for new names.

    HTH
    Regards,

    Gary
    (It's been a while!)

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •