# Thread: Extracting text from the middle of the Cell

1. ## Extracting text from the middle of the Cell

I have been the recipient of a lot of help over the years, and I thank everyone that has helped. I would like to give back with something.

I was given a list of chemicals and the Chemical Abstract Service Number (CAS No.) for each item. Unfortunately, the list was given as a simple number. Two examples would be
• Formaldehyde 50000
• Ammonia 7664417

The normal way of showing a CAS No. is with some dashes:
• Formaldehyde 50-00-0
• Ammonia 7664-41-7

I was looking for a way to extract this data into the typical format. It is complicated by the fact that the first set of numbers can be from two to six digits. A friend of mine came up with a really elegant way of doing this.

Note that the chemical name and the CAS No. are in two separate cells.

I put the CAS numbers into column E. Then I used the following formula, using the LEFT, RIGHT, and LEN functions:
=LEFT(E2,LEN(E2)-3)&"-"&LEFT(RIGHT(E2,3),2)&"-"&RIGHT(E2,1)

As this is sort of a tutorial for those unfamiliar with ranges, I will detail what this does. It can be quite useful to extract information from inside the contents of a cell. It also shows how formulas can be nested.

First, the items between two quotation marks are inserted into the cell as text. Between each different part in the cell, an ampersand (&) must be used.

Then, starting at the end of the formula:
RIGHT(E2,1) means I start at the right of the cell E2, and take only the first character.

LEFT(RIGHT(E2,3),2) is really elegant. The innermost function in the nest is RIGHT(E2,3), which means the formula extracts the three characters from the right of E2. Then, the LEFT function comes into play, choosing only the first two digits found.

LEFT(E2,LEN(E2)-3) takes into account the varying lengths of the strings. LEN(E2)-3 starts by counting the number of characters in E2, and subtracts 3 from that number. That removes the final three characters described above. Then the LEFT function chooses the number of characters starting at the beginning of the string.

2. If you want to keep it as a number, you could also just use a custom number format to display the number in the CAS format: 0-00-0

Steve
PS a shorter alternative formula for generating the text string is:
=REPLACE(REPLACE(E2,LEN(E2),0,"-"),LEN(E2)-2,0,"-")

PPS. and an even shorter one is not to insert but use the custom format:
=TEXT(E2,"0-00-0")

3. ## The Following User Says Thank You to sdckapr For This Useful Post:

JohnD1 (2014-04-07)

4. Very Nice JohnD1. You could make a user defined function using your formula that would make it very easy to use:

Enter the formula in cell D1 =CAS(B1) then copy down

where A1 is the chemical name and B1 is the mal-formatted CAS number

Code:
Public Function CAS(rng As Range)
Application.Volatile
CAS = Left(rng, Len(rng) - 3) & "-" & Left(Right(rng, 3), 2) & "-" & Right(rng, 1)
End Function

5. Originally Posted by sdckapr
If you want to keep it as a number, you could also just use a custom number format to display the number in the CAS format: 0-00-0

Steve
PS a shorter alternative formula for generating the text string is:
=REPLACE(REPLACE(E2,LEN(E2),0,"-"),LEN(E2)-2,0,"-")

PPS. and an even shorter one is not to insert but use the custom format:
=TEXT(E2,"0-00-0")
That last one is really nice. Thank you.

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•