Extraction of multiple occurrences of a string

Iliketrekking

New Member
Joined
Feb 16, 2023
Messages
3
Office Version
  1. 365
Platform
  1. Windows
Dear all,
I searched among the past discussions if someone else made the same question, but I was not able to find it. Sorry if my search was not accurate.
This is my problem: I have a cell (actually a column, but it is the same) containing some text and i'm trying to extract from it each occurrence of a string with a fixed part and a variable part (the length of the string remains the same). Example:
Content cell A1: "This is an example of a CODEWORD_1011 in which the same word CODEWORD_3131 is replicated in several rows like CODEWORD_0001.
No matter ho many CODEWORD_4001 times you see it, it is meaningless"

I'm trying to extract all CODEWORD_XXXX.
Basically the output should be: "CODEWORD_1011, CODEWORD_3131, CODEWORD_0001, CODEWORD_4001"

Any help will be really appreciated and sorry again if this has already been solved in a past discussion.

kind regards,

ILT
 

Excel Facts

Can Excel fill bagel flavors?
You can teach Excel a new custom list. Type the list in cells, File, Options, Advanced, Edit Custom Lists, Import, OK
Nice question!

Book1
AB
1This is an example of a CODEWORD_1011 in which the same word CODEWORD_3131 is replicated in several rows like CODEWORD_0001. No matter ho many CODEWORD_4001 times you see it, it is meaninglessCODEWORD_1011,CODEWORD_3131,CODEWORD_0001,CODEWORD_4001
2A 2nd row with a CODEWORD_9999, and CODEWORD_1111CODEWORD_9999,CODEWORD_1111
Sheet1
Cell Formulas
RangeFormula
B1:B2B1=MAP(A1:A2,LAMBDA(x,TEXTJOIN(",",,TEXTSPLIT(x,TEXTSPLIT(x,"CODEWORD_"&TEXT(ROW(1:10000)-1,"0000"),,1)))))
Dynamic array formulas.
 
Upvote 1
Solution
Dear JvdV,
thank you very much for your quick reply.
I tried using your formula (first I had to translate it because I'm using an italian version of excel) but it's not working properly.
Some important work deadlines stopped me investigating on the problem...
I'll check again to see if it is something i wrongly translated and i'll be soon back.
thanks again
 
Upvote 0
Dear JvdV,
it works!!!
as i thought it was only a matter of wrong translation.
Thank you very much,

ILT
 
Upvote 0
Hi, apologies for resurrecting an older thread, but I have a very similar query that isn't working, I'm assuming due to the format of the string that I'm searching for. Instead of CODEWORD_xxxx my strings that I'm trying to find are in the format (6.#.#) or (6.#.#.#), where each # is a number between 0 and 30. I've tried replicating this in the TEXT format function, but I'm guessing it doesn't like multiple "." characters as a format. Can anyone suggest an alternative?
 
Upvote 0
Hi, apologies for resurrecting an older thread, but I have a very similar query that isn't working, I'm assuming due to the format of the string that I'm searching for. Instead of CODEWORD_xxxx my strings that I'm trying to find are in the format (6.#.#) or (6.#.#.#), where each # is a number between 0 and 30. I've tried replicating this in the TEXT format function, but I'm guessing it doesn't like multiple "." characters as a format. Can anyone suggest an alternative?
What version of Excel have you got. Access to PY() function perhaps?
 
Upvote 0
What version of Excel have you got. Access to PY() function perhaps?
Thanks for getting back to me, it's 365, so probably should be available... however the organisation I work for is quite risk averse and tend to frown upon use of VBA etc so I'm guessing that would also apply to Python...
 
Upvote 0
Welcome to the MrExcel board!

We don't know just what your actual data is like or how varied it is so if this is not what you need, please give us (preferably with XL2BB so we can easily copy for testing) a smallish set of sample data with the expected results manually filled in. Add any further written clarification if possible.

23 11 20.xlsm
AB
1abc (6.1.3) def (6.25.23.30) ghi (7.2.3) jkl (6.0.0.0)(6.1.3), (6.25.23.30), (6.0.0.0)
2czv 
3mno(62.3.4)pqr(6.22.22.22)stu(6.22.22.22)
Extract strings
Cell Formulas
RangeFormula
B1:B3B1=LET(ts,TEXTSPLIT(A1,{"(",")"}),s,TEXTJOIN("), (",1,FILTER(ts,LEFT(ts,2)="6.","")),IF(s="","","("&s&")"))
 
Upvote 0
Well, you don't need VBA since PY() is another function but it might not yet be available to you. It allows for some rather simple Python script to be executed. Here for example I see a lot of opportunities for a regular expression. Here I've used the sample data set out by user @Peter_SSs to show you what I meant:

Book1
AB
1abc (6.1.3) def (6.25.23.30) ghi (7.2.3) jkl (6.0.0.0)(6.1.3), (6.25.23.30), (6.0.0.0)
2czv
3mno(62.3.4)pqr(6.22.22.22)stu(6.22.22.22)
Sheet1
Cell Formulas
RangeFormula
B1:B3B1=PY("import re [', '.join(re.findall(r'(\(6(?:\.\d+){2,3}\))', s)) for s in xl(""A1:A3"")[0]]",0)
Dynamic array formulas.


Not sure if the formula used is captured properly here. If you open '=PY(', you'll notice you can start typing script before CTRL+Enter. The proper formatting for the script was:

Python:
import re
[', '.join(re.findall(r'(\(6(?:\.\d+){2,3}\))', s)) for s in xl("A1:A3")[0]]
 
Last edited:
Upvote 0

Forum statistics

Threads
1,213,549
Messages
6,114,261
Members
448,558
Latest member
aivin

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top