Extract text after specific char in a string

WmK

New Member
Joined
Jul 17, 2012
Messages
5
HI
Probably simple I just cant figure it out.

I have cell content of a variable length string and I need to extract all text after "///" an example is below:

897/C///Amarnath Leena CD20110316 09105348
I need to pull everything after the "///"

Another example would be:
744/I/373632///SUVARNA, VINOD K MD=20110926 231597
Again I need to pull everything after the "///"

Right, Left, and Mid wont work since the position is different any thoughts?

<tbody>
</tbody>

<tbody>
</tbody>
 
Wow! Mind is blown! LoL!

But really, I'm truly amazed István! This is a really fast reply (well to me that is ;)
Like I said I'm no expert, so I estimated it would take someone some more effort then this.
Very impressed that you pull this out of your head. Thank you SO MUCH!

I'm very intrigued as to what the different commands in the formula do. Would love to be able to understand and pass this on.
Could you explain please?

And as promised, my forever gratefulness ;)
Thanks again!!
 
Upvote 0

Excel Facts

Excel Can Read to You
Customize Quick Access Toolbar. From All Commands, add Speak Cells or Speak Cells on Enter to QAT. Select cells. Press Speak Cells.
=TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",99)),FIND("%",SUBSTITUTE(A1," ",REPT(" ",99)))-99,198*(1/2+LEN(A1)-LEN(SUBSTITUTE(A1,"%","")))))

Thanks for the feedback.
Extracting is a regular task recurring on these pages and there are some „standard„ tricks which make solving these kind of tasks easier.

One of them is converting a space separating the words into spaces with known number of repeats, say 99, before processing. When you find the start of a word, you may need the length of that word for many functions, for MID function in this case, and the conversion above makes this easy, as each word is surrounded by a number of spaces (99 spaces, a word, 99 spaces), so even if the word is extra long, say 20 characters, will still be in the middle of a „space field” and the subsequent TRIM function, which delete surplus spaces, produces the word itself. The part substituting each space with 99 spaces is colored red.

This part:

FIND("%",SUBSTITUTE(A1," ",REPT(" ",99)))

finds the first occurrence of „%” in the extended string, and subtracting 99 from this value, we are somewhere in the space field in front of the first number followed by %, this is the starting point where the extraction begins (MID function start_num). Of course, MID function text part (in front of the FIND function) is not a1, but the extended string (colored blue).
The part starting with 198 (2*99) is the length of the string to be extracted (the number of characters for the MID function); here the difference of the two lengths (green) gives the number of „%”-s in the string, from this the whole length of the string to be extracted can be calculated.
Oops, I realize now that for 4-element compositions my formula extracts 1 plus word, so please let me know and correct the formula if you have strings with more than 3 constituents to process.
 
Last edited:
Upvote 0
Corrected:

=TRIM(MID(SUBSTITUTE(" "&A1," ",REPT(" ",99)),FIND("%",SUBSTITUTE(" "&A1," ",REPT(" ",99)))-99,99*(1+2*(LEN(A1)-LEN(SUBSTITUTE(A1,"%",""))))))
 
Upvote 0
Ok I was a fool for thinking I'd understand :eek: LoL

Strangely enough it worked fine for all 4-element compositions and even some 5-element compositions.
But for the bold font below, I had to remove the space between 10% and Alpace before showing the result seen on the right. Initially it only showed the 10%

Y10000667 Type 25% Viscose, 20% Polyamide, 50% Cotton, 5% Alpaca25% Viscose, 20% Polyamide, 50% Cotton, 5% Alpaca
Y10000667 Type 25% Viscose, 20% Polyamide, 50% Cotton, 5% Alpaca25% Viscose, 20% Polyamide, 50% Cotton, 5% Alpaca
Y10000286 Type 82% Viscose, 18% Polyamide82% Viscose, 18% Polyamide
Y10000603 Woolen 70% Acrylic, 20% Lambs wool, 10%Alpaca70% Acrylic, 20% Lambs wool, 10%Alpaca
Y10000603 Woolen 70% Acrylic, 20% Lambs wool, 10%Alpaca
70% Acrylic, 20% Lambs wool, 10%Alpaca



<colgroup><col><col></colgroup><tbody>
</tbody>
F10005761 Solid 48% Wool, 40% Polyester, 5% Polyamide, 4% Acrylic, 3% Viscose Flannel 360GSM Woven48% Wool, 40% Polyester, 5% Polyamide, 4% Acrylic, 3% Viscose

<tbody>
</tbody>


It still does that by the way but just with these two lines, when using the corrected formula.
 
Upvote 0
Well, of course, the formula will work for strings of similar structure, that is, where the number is directly followed by the %. Just to make sure, select the range of cells to process, go to /Editing/Replace by pressing Ctrl-H, and replace „ %” (space %) (without quotation marks) with „%”. This will delete the possible space between the number and the % character. Or you can use this modified formula to handle this kind of problem without prior replacing:

=TRIM(MID(SUBSTITUTE(" "&SUBSTITUTE(A1," %","%")," ",REPT(" ",99)),FIND("%",SUBSTITUTE(" "&SUBSTITUTE(A1," %","%")," ",REPT(" ",99)))-99,99*(1+2*(LEN(A1)-LEN(SUBSTITUTE(A1,"%",""))))))

Let me know if it works correctly now.
 
Upvote 0
It says there are no spaces to replace.
The modified formula works just like the ones before, somehow only for those two lines it cuts it off.

F10005339 Checks 65% Polyester, 30% Viscose, 5% Elastane Double Weave 420GSM Woven65% Polyester, 30% Viscose, 5% Elastane
F10005763 Yarn Dye 49% Polyester, 27% Cotton, 22% Polyamide, 2% Elastane Twill 370GSM Knitted Fabric49% Polyester, 27% Cotton, 22% Polyamide, 2% Elastane
F10005365 75% Acrylic, 15% Polyester, 10% Viscose Jacquard 370g Knitted Fabric75% Acrylic, 15% Polyester, 10% Viscose
Y10000667 Type 25% Viscose, 20% Polyamide, 50% Cotton, 5% Alpaca25% Viscose, 20% Polyamide, 50% Cotton, 5% Alpaca
Y10000667 Type 25% Viscose, 20% Polyamide, 50% Cotton, 5% Alpaca25% Viscose, 20% Polyamide, 50% Cotton, 5% Alpaca
Y10000286 Type 82% Viscose, 18% Polyamide82% Viscose, 18% Polyamide
Y10000603 Woolen 70% Acrylic, 20% Lambs wool, 10% Alpaca70% Acrylic, 20% Lambs wool, 10%
Y10000603 Woolen 70% Acrylic, 20% Lambs wool, 10% Alpaca70% Acrylic, 20% Lambs wool, 10%
Y10000731 Woolen 18% Wool, 72% Polyamide, 10% Mohair 2/1618% Wool, 72% Polyamide, 10% Mohair
Y10000661 Type 50% Acrylic, 30% Wool, 10% Alpaca, 10% Polyamide50% Acrylic, 30% Wool, 10% Alpaca, 10% Polyamide

<colgroup><col><col></colgroup><tbody>
</tbody>
 
Upvote 0
It says there are no spaces to replace.
The modified formula works just like the ones before, somehow only for those two lines it cuts it off.

F10005339 Checks 65% Polyester, 30% Viscose, 5% Elastane Double Weave 420GSM Woven65% Polyester, 30% Viscose, 5% Elastane
F10005763 Yarn Dye 49% Polyester, 27% Cotton, 22% Polyamide, 2% Elastane Twill 370GSM Knitted Fabric49% Polyester, 27% Cotton, 22% Polyamide, 2% Elastane
F10005365 75% Acrylic, 15% Polyester, 10% Viscose Jacquard 370g Knitted Fabric75% Acrylic, 15% Polyester, 10% Viscose
Y10000667 Type 25% Viscose, 20% Polyamide, 50% Cotton, 5% Alpaca25% Viscose, 20% Polyamide, 50% Cotton, 5% Alpaca
Y10000667 Type 25% Viscose, 20% Polyamide, 50% Cotton, 5% Alpaca25% Viscose, 20% Polyamide, 50% Cotton, 5% Alpaca
Y10000286 Type 82% Viscose, 18% Polyamide82% Viscose, 18% Polyamide
Y10000603 Woolen 70% Acrylic, 20% Lambs wool, 10% Alpaca70% Acrylic, 20% Lambs wool, 10%
Y10000603 Woolen 70% Acrylic, 20% Lambs wool, 10% Alpaca70% Acrylic, 20% Lambs wool, 10%
Y10000731 Woolen 18% Wool, 72% Polyamide, 10% Mohair 2/1618% Wool, 72% Polyamide, 10% Mohair
Y10000661 Type 50% Acrylic, 30% Wool, 10% Alpaca, 10% Polyamide50% Acrylic, 30% Wool, 10% Alpaca, 10% Polyamide

<tbody>
</tbody>

In the samples originally provided, the material names contained only one word, for example Polyester, Cotton (compared to Lambs wool, which contains two). If there are 2 (or more)-word material names in the strings, the task becomes not defined as the formula nor any other program can determine which words belong to the material name and which to other parts of the text, e.g.:

Y10000603 Woolen 70% Acrylic, 20% Alpaca, 10% Lambs(?) wool(?) and(?) other text

This problem can be overcome only if a separate list of material names is available, or the material names having more than one word are not joined with a space, but with another character, e.g. CHAR(160) or #; Lambs#wool, for example.
 
Upvote 0
Ahh of course!!
It's not a biggie, those couple of lines that I might have to manually adjust is the least of my worries but I just found it curious.
But this makes sense :)

Thank you so much again! Very much appreciated!

Until my next Excel adventure ;)
 
Upvote 0
It's not a biggie, those couple of lines that I might have to manually adjust is the least of my worries but I just found it curious.
The following formula, although some 2-plus times larger than István's formula, should work correctly for all of your possible values...

=TRIM(RIGHT(SUBSTITUTE(TRIM(LEFT(A1,FIND("%",A1)-1))," ",REPT(" ",99)),198))&MID(A1,FIND("%",A1),FIND("|",SUBSTITUTE(A1,"%","|",LEN(A1)-LEN(SUBSTITUTE(A1,"%",""))))-FIND("%",A1)+1)&" "&TRIM(LEFT(SUBSTITUTE(TRIM(TRIM(MID(A1,FIND("|",SUBSTITUTE(A1,"%","|",LEN(A1)-LEN(SUBSTITUTE(A1,"%",""))))+1,99)))," ",REPT(" ",99)),99))
 
Upvote 0
The following formula, although some 2-plus times larger than István's formula, should work correctly for all of your possible values...

=TRIM(RIGHT(SUBSTITUTE(TRIM(LEFT(A1,FIND("%",A1)-1))," ",REPT(" ",99)),198))&MID(A1,FIND("%",A1),FIND("|",SUBSTITUTE(A1,"%","|",LEN(A1)-LEN(SUBSTITUTE(A1,"%",""))))-FIND("%",A1)+1)&" "&TRIM(LEFT(SUBSTITUTE(TRIM(TRIM(MID(A1,FIND("|",SUBSTITUTE(A1,"%","|",LEN(A1)-LEN(SUBSTITUTE(A1,"%",""))))+1,99)))," ",REPT(" ",99)),99))


I must convert each formula into continental system before using/evaluating it and this conversion may enter errors into a formula – that said, I got with Rick’s above formula for:

F10005537 100 % Polyester 78gsm2 Woven

this:

F10005537 100% Polyester

This formula otherwise can handle „Lambs wool” in middle position, but can not in end position (see the example with red question marks created in post #47).
 
Upvote 0

Forum statistics

Threads
1,215,692
Messages
6,126,227
Members
449,303
Latest member
grantrob

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top