Results 1 to 7 of 7
  1. #1
    3 Star Lounger
    Join Date
    Jul 2007
    Location
    United States
    Posts
    220
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Data Cleaning Function (MS Excel 2003)

    I have two database lists that I want clean so I can make a clean comparison and eventually merge the data sets in MS Access. The first database has a listing of titles. The second database has many of the same titles, but the title is preceded by data in parenthesis. Various kinds of data is in the parenthesis like Author, Date of Publishing, commas, Publisher). Essentially, I am trying to play with the RIGHT, MID, FIND functions to get rid of the parenthesis and everything inside the parenthesis so I can match up just the titles. Here is the layout below:

    Database 1
    A1 = Title 1
    A2 = Title 2
    A3 = Title 3
    A4 = Title 4

    Database 2
    A1 = (Author) Title 1
    A2 = (Date) Title 2
    A3 = (Author) Title 3
    A4 = (Publisher) Title 4

  2. #2
    Platinum Lounger
    Join Date
    Feb 2002
    Location
    A Magic Forest in Deepest, Darkest Kent
    Posts
    5,681
    Thanks
    0
    Thanked 1 Time in 1 Post

    Re: Data Cleaning Function (MS Excel 2003)

    Hi Amy

    There will be a few other additions to this thread but here is a starter for you

    Assuming the Database 2 items start in A1, in B1 type the following function in:

    =MID(A1,FIND(")",A1)+2,LEN(A1)-2)


    It basically uses the Find function to find the position number of the first ) and uses the number to calculate the position where the Mid function should start, I had added 2 to this number to go "past" the space and start at the first letter of the word you want. This can be copied down the column
    Jerry

  3. #3
    WS Lounge VIP rory's Avatar
    Join Date
    Dec 2000
    Location
    Burwash, East Sussex, United Kingdom
    Posts
    6,280
    Thanks
    3
    Thanked 191 Times in 177 Posts

    Re: Data Cleaning Function (MS Excel 2003)

    You could also just copy the data into a new column, then do a straight Find/Replace (Ctrl+H) putting <code>*) </code> in the Find box and leaving the Replace box blank.
    Regards,
    Rory

    Microsoft MVP - Excel

  4. #4
    3 Star Lounger
    Join Date
    Jul 2007
    Location
    United States
    Posts
    220
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Data Cleaning Function (MS Excel 2003)

    Rory,
    Unfortunately, the Find/Replace method has some limitations since many of the titles themselves contain parenthesis beyond just the preceding set. For example: (Author) Title (1).

    I've found some other useful formulas that eventually do the same thing as the solution first identified.

    =TRIM(RIGHT(A2,LEN(A2)-LEN(LEFT(A2,FIND(")",A2,1)))))

    =TRIM((REPLACE(A2,1(SEARCH(")",A2,1)),""))

    Can you identify other threads that have useful data cleaning formulas? That would be great.

  5. #5
    3 Star Lounger
    Join Date
    Mar 2003
    Location
    Beddau, Mid Glamorgan, Wales
    Posts
    289
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Data Cleaning Function (MS Excel 2003)

    Hi Amy:

    In certain circumstances (not necessarily the one you've outlined in your posts) I have found the Data/Text to Columns... command useful in cleaning data, especially as you can specify which character to use to split up the text into new cells.

    If you do use this method, make sure there sufficient empty columns to the right of the existing data.
    Regards,

    Tony
    [s] [/s]
    www.SylviArtist.com

  6. #6
    Star Lounger
    Join Date
    Jul 2006
    Posts
    71
    Thanks
    1
    Thanked 0 Times in 0 Posts

    Re: Data Cleaning Function (MS Excel 2003)

    When I do this sort of thing, I usually move the data to Word and use global search and replace to insert a TAB character to the right of the right parenthesis. Pasted back into a new spreadsheet, you will then have the information in parentheses in column A and the title in column B. If you have a few records with multiple sets of parentheses -- where you'd have more than one tab character inserted -- you can do a quick check for them in Word by highlighting the data and setting a couple of tabs to see what lines respond to them, or sorting the target spreadsheet on columns C, D, E, or whatever, and correcting those few lines.

  7. #7
    Star Lounger
    Join Date
    Jul 2006
    Posts
    71
    Thanks
    1
    Thanked 0 Times in 0 Posts

    Re: Data Cleaning Function (MS Excel 2003)

    Sorry, Amy, I think my last post wasn't perfectly clear ....
    1) highlight Database 2 and copy to clipboard.
    2) Switch to Word and paste.
    3) You'll have a Word table. Highlight the table and Go to Table / Convert / Table to text. Separate by paragraph signs, if it asks you to choose.
    4) You'll have rows of data. Go to Edit / Replace and tell it to replace a right parenthesis with right parenthesis + ^t, the tab character.
    5) Once replaced, highlight everything, copy back to the clipboard, switch back to Excel, and paste to a new spreadsheet. The inserted tab will cause the data to paste to two columns. The data within the parentheses should be in Column A and the Title of the work should be in Column B.
    6) You can sort the sheet on columns C, D, E, etc. if this wholesale global search-and-replace found more than one right parenthesis in a line, and fix those few records -- probably not many.

    Hope this helps, Ann

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •