How to Pull Certain Number from a Text String

Adroit

New Member
Joined
Jul 16, 2012
Messages
16
Hello,

I'm trying to pull a specific 10 digit number from a text string. The ID # is always 10 digits however sometimes there are dates in the data that need to be excluded as well as other numbers such as "I-9". The 10 digit number will always start with 1 or 8. Is there a macro or Formula to extract this data? Please find the example dataset below. Thanks in advance!


Edit: I should specify I want the "100118416" number from the first cell example or the "800011320" number from the 10th example. Thanks again.

NDM Abxx, Kexxxxx 100118416 2-29-12.pdf</SPAN>
VPG I9 Aberxxxxx, Dexxx 100113048 1-1-12.pdf</SPAN>
I-9 Abxxx, Rexx 100119966 4-5-12.pdf</SPAN>
NDM I9 Achxxxn, Paxx 100119008 3-15-2012.pdf</SPAN>
VPG I-9 Acxxxxx, Toxx 100112782 01-01-12.pdf</SPAN>
I-9 Acxxxr, Gxxx 100117854 2-27-12.pdf</SPAN>
NDM Acxxx, Angxxxxx 1001118015 3.3.12.pdf</SPAN>
ADXXX, DEXXX 100118892 I9.pdf</SPAN>
I-9 Adxxx, Chxxxxxxx 100119633 3-28-12.pdf</SPAN>
I-9 Adxxxx, Daxxx 800011320 6-14-12.pdf</SPAN>
I-9 Adxxxx,Jenxxxx 100121079 5 25 2012.pdf</SPAN>
VPG I9 Axxxx, Lxxxx E 100113621 1-1-12.pdf</SPAN>
NDM I9 Adaxx, Mxxx 100117676 2-16-2012.pdf</SPAN>
NDM I9 Adxxxx, Pxxx100117778 2-18-2012.pdf</SPAN>
ADXXX, PXXX T 100117778.pdf</SPAN>

<TBODY>
</TBODY>
 
Last edited:
Ben's example was great at finding descrepencies in the data.

Rick, when I'm using yours it pulls the data out great but correct me if I'm wrong, it only pulls out the first 9 digits. For the one example with 10 digits it's only pulling the first 9. Is there any workaround for this or perhaps I'm missing something?
 
Upvote 0

Excel Facts

Can Excel fill bagel flavors?
You can teach Excel a new custom list. Type the list in cells, File, Options, Advanced, Edit Custom Lists, Import, OK
Yes, I just substituted x's for some of the names since this is actual data and I didn't want to publish people's names. Note that there aren't always commas present so text to columns and reverse concatenation won't work here :/
 
Last edited:
Upvote 0
Correction for the 9 or 10 digits:
=MAX(IFERROR(MID($A1,ROW(INDIRECT("1:"&LEN(A1))),{9,10})+0,""))

So the actuall name is:
Acxxxr, Gxxx

not only Gxxx?

Are the limited number of prefixes(NDM,I-9) or can be anything?
 
Upvote 0
They can be anything, some don't have any and start immediately with the name, others start with VPG and various other things. The name in your example would be Acxxxr, Gxxx yes. Gxxx would only be the first name. If necessary I can repost the data table with fake names.
 
Upvote 0
Rick, when I'm using yours it pulls the data out great but correct me if I'm wrong, it only pulls out the first 9 digits. For the one example with 10 digits it's only pulling the first 9. Is there any workaround for this or perhaps I'm missing something?
Sorry, I missed the 10-digit number in there. Try this formula (coupled with my UDF)...

=TRIM(GetPattern(SUBSTITUTE(A16,".",""),"#########[ 0-9]"))
 
Upvote 0
They can be anything, some don't have any and start immediately with the name, others start with VPG and various other things. The name in your example would be Acxxxr, Gxxx yes. Gxxx would only be the first name. If necessary I can repost the data table with fake names.

NDM Abxx, Kexxxxx 100118416 2-29-12.pdf </SPAN>
100118416</SPAN>
Abxx,Kexxxxx</SPAN>
VPG I9 Aberxxxxx, Dexxx 100113048 1-1-12.pdf </SPAN>
100113048</SPAN>
Aberxxxxx,Dexxx</SPAN>
I-9 Abxxx, Rexx 100119966 4-5-12.pdf </SPAN>
100119966</SPAN>
Abxxx,Rexx</SPAN>
NDM I9 Achxxxn, Paxx 100119008 3-15-2012.pdf </SPAN>
100119008</SPAN>
Achxxxn,Paxx</SPAN>
VPG I-9 Acxxxxx, Toxx 100112782 01-01-12.pdf </SPAN>
100112782</SPAN>
Acxxxxx,Toxx</SPAN>
I-9 Acxxxr, Gxxx 100117854 2-27-12.pdf </SPAN>
100117854</SPAN>
Acxxxr,Gxxx</SPAN>
NDM Acxxx, Angxxxxx 1001118015 3.3.12.pdf </SPAN>
1001118015</SPAN>
Acxxx,Angxxxxx</SPAN>
ADXXX, DEXXX 100118892 I9.pdf </SPAN>
100118892</SPAN>
ADXXX,DEXXX</SPAN>
I-9 Adxxx, Chxxxxxxx 100119633 3-28-12.pdf </SPAN>
100119633</SPAN>
Adxxx,Chxxxxxxx</SPAN>
I-9 Adxxxx, Daxxx 800011320 6-14-12.pdf </SPAN>
800011320</SPAN>
Adxxxx,Daxxx</SPAN>
I-9 Adxxxx,Jenxxxx 100121079 5 25 2012.pdf </SPAN>
100121079</SPAN>
Adxxxx,Jenxxxx</SPAN>
VPG I9 Axxxx, Lxxxx E 100113621 1-1-12.pdf </SPAN>
100113621</SPAN>
Axxxx,Lxxxx E</SPAN>
NDM I9 Adaxx, Mxxx 100117676 2-16-2012.pdf </SPAN>
100117676</SPAN>
Adaxx,Mxxx</SPAN>
NDM I9 Adxxxx, Pxxx100117778 2-18-2012.pdf </SPAN>
100117778</SPAN>
Adxxxx,Pxxx</SPAN>
ADXXX, PXXX T 100117778.pdf </SPAN>
100117778</SPAN>
ADXXX,PXXX T</SPAN>

<TBODY>
</TBODY>

When you have got the numbers alredy extracted in B1 use this in C1,Drag down:

=TRIM(RIGHT(SUBSTITUTE(TRIM(LEFT(" "&A1,FIND(",",A1)))," ",REPT(" ",99)),99))&","&TRIM(RIGHT(SUBSTITUTE(TRIM(LEFT(A1,FIND(B1,A1)-1)),",",REPT(" ",99)),99))
 
Upvote 0
Sorry, I missed the 10-digit number in there. Try this formula (coupled with my UDF)...

=TRIM(GetPattern(SUBSTITUTE(A16,".",""),"#########[ 0-9]"))

There was supposed to be a space between the double quotes in the above formula...

=TRIM(GetPattern(SUBSTITUTE(A16,"."," "),"#########[ 0-9]"))
 
Upvote 0
When you have got the numbers alredy extracted in B1 use this in C1,Drag down:

=TRIM(RIGHT(SUBSTITUTE(TRIM(LEFT(" "&A1,FIND(",",A1)))," ",REPT(" ",99)),99))&","&TRIM(RIGHT(SUBSTITUTE(TRIM(LEFT(A1,FIND(B1,A1)-1)),",",REPT(" ",99)),99))

Robert this works pretty well the only issue I seem to have is that the data appears to be dependent on the extracted ID numbers. So if No ID # was extracted then consequently no name will be extracted. In some of my strings there are no ID #s but there are names.
 
Upvote 0
Also there's an issue with Commas after the first name in some of my data it appears. For example:

Alvxxxx, Brxxx, 100116788.pdf
Alvxxxx, Hxxxx, 100116313.pdf
Arxxx, Ivxxxx, 100116284.pdf

These return only the last name and I'm guessing it's due to the comma after the first name and before the ID numbers.</SPAN>

<TBODY>
</TBODY><COLGROUP><COL></COLGROUP>
</SPAN>

<TBODY>
</TBODY><COLGROUP><COL></COLGROUP>
</SPAN>

<TBODY>
</TBODY><COLGROUP><COL></COLGROUP>
 
Upvote 0
Also there's an issue with Commas after the first name in some of my data it appears. For example:


Alvxxxx, Brxxx, 100116788.pdf

Alvxxxx, Hxxxx, 100116313.pdf
Arxxx, Ivxxxx, 100116284.pdf

These return only the last name and I'm guessing it's due to the comma after the first name and before the ID numbers.

<tbody>
</tbody>

<tbody>
</tbody>

<tbody>
</tbody>

If you are using my GetPattern UDF to get the number, then you can use it in the following formula to retrieve the name...

=TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND(",",A1&",")-1)," ",REPT(" ",99)),99))&LEFT(GetPattern(A1,",?*[!A-Za-z]"),LEN(GetPattern(A1,",?*[!A-Za-z]"))-1)
 
Upvote 0

Forum statistics

Threads
1,214,908
Messages
6,122,186
Members
449,071
Latest member
cdnMech

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top