Multiple Text & Number String Extraction Without VBA

jawnathin

New Member
Joined
Nov 27, 2013
Messages
8
Alright Excel friends, I had a problem that was solved that has just grown once more. I need to find and extract multiple strings from a text field. These strings consist of two letters followed by three numbers and possibly an extra letter. It could be XX000 or XX000X (e.g. AB123 or BA123Z). There are hundreds (thousands?) of rows of data I need to grab these strings from.

The current array formula I'm using (provided by a member here) is:
=IF(ISBLANK(S3),MID(R3,MATCH(TRUE,ISNUMBER(1*MID(R3,ROW(INDIRECT("1:"&LEN(Q3))),1)),0)-2,5),S3)


Sample data below:

CreatedCommentDesired Extraction
1/1/2013Bananas BA426 Them more textBA426
5/1/2013Apples AP654 Bananas BA525 More additional textAP654 AND BA525
6/3/2013More Fruit MF055 AP162 BA223X Even more textMF055 AND AP162 AND BA223X
7/5/2013More Text Before Numbers MT323 and this one is going to be really long to prove a point. :wink:MT323

<tbody>
</tbody>

Any ideas?

My original thread is here: http://www.mrexcel.com/forum/excel-...action-without-visual-basic-applications.html

Thanks to everyone who checks this out! :)

____________
Jawnathin
 
First of all, you guys are awesome. I really appreciate the help with this! My title may be a bit misleading. I'm trying to avoid the use of VBA, but it's not a dealbreaker if I need to use it. Also, a note on the text strings: the only format I'm finding is XX000 and/or XX000X. The formula or macro should ideally not extract strings that don't match exactly. For example, if there's also a string in there that says "CARS123456" I don't want it to extract "RS123" -- it's not the string I'm looking for.

Matt: Your solution is good. As I'm receiving more data, I'm noticing that the text strings (XX000) are not limited to a maximum of three. The comment field can have none, 1-3, or even more stuck in there. Your solution can scale to more (I'll just need to nest the formula in adjacent columns), but it may get messy.

Pup: Your formula that built on Matt's returns 3 copies of the first string in the cell. Not sure what I'm doing wrong.

Rick: Your solution is great! Unfortunately I'm getting a Subscript out of Range error when I run it on the line "NextAND = 1 + Len(ANDS(0))".

As far as the output goes, in retrospect rather than a bold "AND" (which seems silly to me now) it might be best to use a semicolon as the delimiter between extracted strings.

Again, thank you all for your help!

--
Jawnathin
 
Upvote 0

Excel Facts

Difference between two dates
Secret function! Use =DATEDIF(A2,B2,"Y")&" years"&=DATEDIF(A2,B2,"YM")&" months"&=DATEDIF(A2,B2,"MD")&" days"
HI
With regards to the different errors, your receiving, is it possible to paste a large data sample, in that way we can probably sort out the issues between us 25 lines or so, or even upload a spreadsheet to dropbox (its free and paste the link like I did), for us to work from, I think I speak for all of us, we like to help and may well all have different approachs to your request, but I'm 99% confident that you will get a solution
 
Upvote 0
Hi.

Just 4 fun ...!

<table border='1' cellspacing='0' cellpadding='2' valign='middle' colspan='8' style='font-family:Arial; color:#000000; background-color:#FFFFFF; font-size:10px; font-weight:normal; font-style:normal; '><colgroup><col width='28pt'><col width='267pt'><col width='151,5pt'><col width='60pt'><col width='60pt'><col width='60pt'><col width='12pt'><col width='195,75pt'></colgroup><tr style='background-color:#cacaca'><td> </td><td align='middle'>A</td><td align='middle'>B</td><td align='middle'>C</td><td align='middle'>D</td><td align='middle'>E</td><td align='middle'>F</td><td align='middle'>G</td></tr><tr><td style='background-color:#cacaca' align='middle'>1</td><td align='left' style='font-family:Calibri; font-size:11px; ' >Bananas BA426 Them more text</td><td align='left' style='font-family:Calibri; color:#222222; font-size:11px; ' >010000000000000</td><td align='left' style='font-family:Calibri; color:#222222; font-size:11px; ' >BA426</td><td align='left' style='font-family:Calibri; color:#222222; font-size:11px; ' > </td><td align='left' style='font-family:Calibri; color:#222222; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='left' style='font-family:Calibri; font-size:11px; ' >BA426</td></tr><tr><td style='background-color:#cacaca' align='middle'>2</td><td align='left' style='font-family:Calibri; font-size:11px; ' >Apples AP654 Bananas BA525 More …</td><td align='left' style='font-family:Calibri; color:#222222; font-size:11px; ' >010100000000000</td><td align='left' style='font-family:Calibri; color:#222222; font-size:11px; ' >AP654</td><td align='left' style='font-family:Calibri; color:#222222; font-size:11px; ' >BA525</td><td align='left' style='font-family:Calibri; color:#222222; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='left' style='font-family:Calibri; font-size:11px; ' >AP654, BA525</td></tr><tr><td style='background-color:#cacaca' align='middle'>3</td><td align='left' style='font-family:Calibri; font-size:11px; ' >More Fruit MF055 AP162 BA223X Even …</td><td align='left' style='font-family:Calibri; color:#222222; font-size:11px; ' >001110000000000</td><td align='left' style='font-family:Calibri; color:#222222; font-size:11px; ' >MF055</td><td align='left' style='font-family:Calibri; color:#222222; font-size:11px; ' >AP162</td><td align='left' style='font-family:Calibri; color:#222222; font-size:11px; ' >BA223X</td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='left' style='font-family:Calibri; font-size:11px; ' >MF055, AP162, BA223X</td></tr><tr><td style='background-color:#cacaca' align='middle'>4</td><td align='left' style='font-family:Calibri; font-size:11px; ' >More Text Before Numbers MT323 and…</td><td align='left' style='font-family:Calibri; color:#222222; font-size:11px; ' >000010000000000</td><td align='left' style='font-family:Calibri; color:#222222; font-size:11px; ' >MT323</td><td align='left' style='font-family:Calibri; color:#222222; font-size:11px; ' > </td><td align='left' style='font-family:Calibri; color:#222222; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='left' style='font-family:Calibri; font-size:11px; ' >MT323</td></tr><tr><td style='background-color:#cacaca' align='middle'>5</td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td></tr><tr><td style='background-color:#cacaca' align='middle'>6</td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='left' style='font-family:Calibri; font-size:11px; ' >Column hide</td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td></tr></table><br><table border='3' cellspacing='0' cellpadding='2' valign='middle' colspan='2' style='table-layout:auto; color:#000000; background-color:#FFFFFF; font-family:Arial; font-size:10px;'><colgroup><col width='40pt'><col></colgroup><tr style='background-color:#eeaaaa'><td>Cell</td><td>Formula</td></tr><tr><td>B1</td><td><Span style='color:#222222'>=SUBSTITUTE</Span><Span style='color:#0000DD'>(9*10^15+SUMPRODUCT</Span><Span style='color:#222222'>(--ISNUMBER</Span><Span style='color:#0000DD'>(--MID</Span><Span style='color:#222222'>(" "&A1,SEARCH</Span><Span style='color:#0000DD'>("#",SUBSTITUTE</Span><Span style='color:#222222'>(" "&A1," ","#",ROW</Span><Span style='color:#0000DD'>($A$1:$E$15)</Span><Span style='color:#222222'>)</Span><Span style='color:#0000DD'>)</Span><Span style='color:#222222'>+3,3)</Span><Span style='color:#0000DD'>)</Span><Span style='color:#222222'>*10^</Span><Span style='color:#0000DD'>(15-ROW</Span><Span style='color:#222222'>($A$1:$E$15)</Span><Span style='color:#0000DD'>)</Span><Span style='color:#222222'>)</Span><Span style='color:#0000DD'>,9,)</Span><Span style='color:#222222'></Span></td></tr><tr><td>C1</td><td><Span style='color:#222222'>=IFERROR</Span><Span style='color:#0000DD'>(TRIM</Span><Span style='color:#222222'>(MID</Span><Span style='color:#0000DD'>(SUBSTITUTE</Span><Span style='color:#222222'>($A1," ",REPT</Span><Span style='color:#0000DD'>(" ",999)</Span><Span style='color:#222222'>)</Span><Span style='color:#0000DD'>,</Span><Span style='color:#222222'>(SEARCH</Span><Span style='color:#0000DD'>(2,SUBSTITUTE</Span><Span style='color:#222222'>($B1,1,2,COLUMN</Span><Span style='color:#0000DD'>(A1)</Span><Span style='color:#222222'>)</Span><Span style='color:#0000DD'>)</Span><Span style='color:#222222'>-1)</Span><Span style='color:#0000DD'>*999+1,999)</Span><Span style='color:#222222'>)</Span><Span style='color:#0000DD'>,"")</Span><Span style='color:#222222'></Span></td></tr><tr><td>G1</td><td><Span style='color:#222222'>=SUBSTITUTE</Span><Span style='color:#0000DD'>(TRIM</Span><Span style='color:#222222'>(C1&" "&D1&" "&E1)</Span><Span style='color:#0000DD'>," ",", ")</Span><Span style='color:#222222'></Span></td></tr></table>
 
Upvote 0
Or a monster-formula ... :LOL:

<table border='1' cellspacing='0' cellpadding='2' valign='middle' colspan='7' style='font-family:Arial; color:#000000; background-color:#FFFFFF; font-size:10px; font-weight:normal; font-style:normal; '><colgroup><col width='28pt'><col width='266,25pt'><col width='66,75pt'><col width='66,75pt'><col width='66,75pt'><col width='8,25pt'><col width='144pt'></colgroup><tr style='background-color:#cacaca'><td> </td><td align='middle'>A</td><td align='middle'>B</td><td align='middle'>C</td><td align='middle'>D</td><td align='middle'>E</td><td align='middle'>F</td></tr><tr><td style='background-color:#cacaca' align='middle'>1</td><td align='left' style='font-family:Calibri; font-size:11px; ' >Bananas BA426 Them more text</td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' >BA426</td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' > </td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' > </td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' > </td><td align='left' style='font-family:Calibri; font-size:11px; ' >BA426</td></tr><tr><td style='background-color:#cacaca' align='middle'>2</td><td align='left' style='font-family:Calibri; font-size:11px; ' >Apples AP654 Bananas BA525 More …</td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' >AP654</td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' >BA525</td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' > </td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' > </td><td align='left' style='font-family:Calibri; font-size:11px; ' >AP654, BA525</td></tr><tr><td style='background-color:#cacaca' align='middle'>3</td><td align='left' style='font-family:Calibri; font-size:11px; ' >More Fruit MF055 AP162 BA223X …</td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' >MF055</td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' >AP162</td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' >BA223X</td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' > </td><td align='left' style='font-family:Calibri; font-size:11px; ' >MF055, AP162, BA223X</td></tr><tr><td style='background-color:#cacaca' align='middle'>4</td><td align='left' style='font-family:Calibri; font-size:11px; ' >More Text Before Numbers MT323 and…</td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' >MT323</td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' > </td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' > </td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' > </td><td align='left' style='font-family:Calibri; font-size:11px; ' >MT323</td></tr><tr><td style='background-color:#cacaca' align='middle'>5</td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='middle' style='font-family:Calibri; font-size:11px; ' > </td><td align='middle' style='font-family:Calibri; color:#222222; font-size:11px; ' > </td><td align='middle' style='font-family:Calibri; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td></tr><tr><td style='background-color:#cacaca' align='middle'>6</td><td align='left' style='font-family:Calibri; font-size:11px; ' >Maximum 35 words in column A !!!</td><td align='middle' style='font-family:Calibri; font-size:11px; ' > </td><td align='middle' style='font-family:Calibri; font-size:11px; ' > </td><td align='middle' style='font-family:Calibri; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td><td align='right' style='font-family:Calibri; font-size:11px; ' > </td></tr></table><br><table border='3' cellspacing='0' cellpadding='2' valign='middle' colspan='2' style='table-layout:fixed; color:#000000; background-color:#FFFFFF; font-family:Arial; font-size:10px;'><colgroup><col width='40pt'><col></colgroup><tr style='background-color:#eeaaaa'><td>Cell</td><td>Formula</td></tr><tr><td>B1</td><td><Span style='color:#222222'>=IFERROR</Span><Span style='color:#0000DD'>(TRIM</Span><Span style='color:#222222'>(MID</Span><Span style='color:#0000DD'>(SUBSTITUTE</Span><Span style='color:#222222'>($A1," ",REPT</Span><Span style='color:#0000DD'>(" ",999)</Span><Span style='color:#222222'>)</Span><Span style='color:#0000DD'>,</Span><Span style='color:#222222'>(SEARCH</Span><Span style='color:#0000DD'>(2,SUBSTITUTE</Span><Span style='color:#222222'>(SUBSTITUTE</Span><Span style='color:#0000DD'>(9*10^15+SUMPRODUCT</Span><Span style='color:#222222'>(--ISNUMBER</Span><Span style='color:#0000DD'>(--MID</Span><Span style='color:#222222'>(" "&$A1,SEARCH</Span><Span style='color:#0000DD'>("#",SUBSTITUTE</Span><Span style='color:#222222'>(" "&$A1," ","#",ROW</Span><Span style='color:#0000DD'>($1:$15)</Span><Span style='color:#222222'>)</Span><Span style='color:#0000DD'>)</Span><Span style='color:#222222'>+3,3)</Span><Span style='color:#0000DD'>)</Span><Span style='color:#222222'>*10^</Span><Span style='color:#0000DD'>(15-ROW</Span><Span style='color:#222222'>($1:$15)</Span><Span style='color:#0000DD'>)</Span><Span style='color:#222222'>)</Span><Span style='color:#0000DD'>,9,)</Span><Span style='color:#222222'>,1,2,COLUMN</Span><Span style='color:#0000DD'>(A1)</Span><Span style='color:#222222'>)</Span><Span style='color:#0000DD'>)</Span><Span style='color:#222222'>-1)</Span><Span style='color:#0000DD'>*999+1,999)</Span><Span style='color:#222222'>)</Span><Span style='color:#0000DD'>,"")</Span><Span style='color:#222222'></Span></td></tr><tr><td>F1</td><td><Span style='color:#222222'>=SUBSTITUTE</Span><Span style='color:#0000DD'>(TRIM</Span><Span style='color:#222222'>(B1&" "&C1&" "&D1)</Span><Span style='color:#0000DD'>," ",", ")</Span><Span style='color:#222222'></Span></td></tr></table>
 
Upvote 0
Drag the formula C1 to the right and down ...

...duh. Sorry, it's been a busy morning and I haven't had enough coffee!

That's getting closer! I love that it's a formula-based solution too and I can scale it to multiple columns. It is, however, extracting a lot of extraneous information that doesn't match the XX000 / XX000X string. Here's some sample data along with the results from your formula:

SummaryString1String2String3StringListNotes
LOS ANGELES AK059 hvlaca1-ars3 US 11/7-5/3 THIS is more TEXTAK059USAK059, USExtra info
Pacoima AC147 and AC146 vnnyca2-ars14 US 12/8-4/0 HSD, VIDEO and PHONE OfflineAC147AC146USAC147, AC146, USExtra info
Rowland Heights - RH003 & RH015X - breaca1-ars5 US 11/0-5/2 - HSO & Phone - OfflineRH003RH015XUSRH003, RH015X, USExtra info
WEST HOLLYWOOD WH007 snmnca1-ars8 3/8.0 - 4/3.0 HSD/VOIP OfflineWH0073/8.0-4/3.0WH007, 3/8.0, -, 4/3.0Extra info
ARTESIA AA021 AA022 AA023 AA024 lkwdca1-bsr2 1/0/1/0 - 1/2/3/0 HSD, VOIP OfflineAA021AA022AA023AA024AA021, AA022, AA023, AA024Perfect!
Thousand Oaks TO117 agorca1-ars5 3/0.0-3/3.0 HSO and Phone OfflineTO1173/0.0-3/3.0TO117, 3/0.0-3/3.0Extra info
Los Angeles AH252 AH253 AH254 AH341 stcyca1-ars1 HSD/VOIP OfflineAH252AH253AH254AH341AH252, AH253, AH254, AH341Perfect!
Carson CA022, CA041, CA042X, CA044 crsnca1-ars2 12/16-4/4 All Services OfflineCA022,CA041,CA042X,CA044CA022,, CA041,, CA042X,, CA044Almost perfect - extra commas
Compton CP023 CP019 CP015 cmtnca1-bsr8 1/1/0/0 1/2/3/0 All services OfflineCP023CP019CP0151/1/0/0CP023, CP019, CP015, 1/1/0/0Extra info
Thousand Oaks TO119, TO117 agorca1-ars4-ars5 1/4.0 1/7.0 Hsd, Voice OfflineTO119,TO1171/4.01/7.0TO119,, TO117, 1/4.0, 1/7.0Extra info
Carson CA045 crsnca1-ars2.socal.rr.com 5/8.0 5/11.0 HSD, Voice OfflineCA0455/8.05/11.0CA045, 5/8.0, 5/11.0Extra info
Port Hueneme NV001, NV002, and NV005 oxnrca1-ars3 4/8.0 thru 4/11.0 HSO and Voice OfflineNV001,NV002,NV0054/8.0NV001,, NV002,, NV005, 4/8.0Extra info
Los Angeles AH017, AH111, AH112 egrkca1-ars3 2/0.0 2/3.0 HSD, Voice OfflineAH017,AH111,AH1122/0.0AH017,, AH111,, AH112, 2/0.0Extra commas, extra info
WOODLAND HILLS WO001 WO030 cnpkca3-ars7 1/2 HSD VOIP OFFLINEWO001WO0301/2WO001, WO030, 1/2Extra info
Tujunga AD041 AD042 AD046 tjgaca1-ars1 3/8 3/10 HSD Phone OfflineAD041AD042AD0463/10AD041, AD042, AD046, 3/10Extra info
South Pasadena San Marino PM013 PM016 SO002 SO006 spsdca1-ars1 Multiple blades HSOand voice OfflinePM013PM016SO002SO006PM013, PM016, SO002, SO006Perfect!
ARCADIA AR023 AR024 mnrvca1-ars3 US 13/8-2/10 HSO Phone OfflineAR023AR024USAR023, AR024, USExtra info
Pico Rivera PR098 PR096 whtrca1-ars3 1/8 1/11 HSD PHone OfflinePR098PR0961/11PR098, PR096, 1/11Extra info
Inglewood IW058 IW059 IW060I W063 igwdca1-ars1 Multiple blades HSO and voice OfflineIW058IW059IW060IIW058, IW059, IW060IMissing last item - original data entered wrong
Willowbrook WB009 cmtnca1-bsr2 1/3/0/0 1/3/3/0 HSD, Voice OfflineWB0091/3/0/01/3/3/0WB009, 1/3/0/0, 1/3/3/0Extra info
Los Anglese AI382X bwlaca1-ars17 7/8.0 thru 7/11.0 HSO and Voice OfflineAI382X7/8.07/11.0AI382X, 7/8.0, 7/11.0Extra info
Santa Monica - SM047,49,51,52,5,6,7,8,9,10,13, - snmnca1-ars1 - HSO & Phone - OfflineSM047,49,51,52,5,6,7,8,9,10,13,SM047,49,51,52,5,6,7,8,9,10,13,Extra info

<colgroup><col><col><col><col span="2"><col><col><col></colgroup><tbody>
</tbody>

<colgroup><col><col><col><col span="2"><col><col><col></colgroup><tbody>
</tbody>
Thanks!

--
Jawnathin
 
Upvote 0
Jawnathin,

I'm sorry, but it's too complex (for me)
Too many numbers ...

:oops:

Good luck and all the best !
 
Upvote 0
Rick: Your solution is great! Unfortunately I'm getting a Subscript out of Range error when I run it on the line "NextAND = 1 + Len(ANDS(0))".
That is probably because your data is not in the cell range I assumed it would be in (you did not tell us where it is in your original message). So, where is your data... what column are they in and what row contains your first piece of data (I'll modify the code to adjust around it)?
 
Last edited:
Upvote 0

Forum statistics

Threads
1,215,066
Messages
6,122,948
Members
449,095
Latest member
nmaske

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top