Extracting Specific Text within a variable text string

Chris8630

New Member
Joined
Sep 15, 2021
Messages
5
Office Version
  1. 2016
Platform
  1. Windows
Hi all
I'm working on a way to automate a invoice checking which my colleague was doing manually. We get a PDF copy of the invoice which I now convert to Excel via Adobe. Unfortunately the output looks horrific so I'm trying to go through and extra certain bits of information, some I'm fine with however there are some I have no idea how to extra out, made worse by the variability of the cell contents. So for example I have a cell which contains the following text sting (data references changed for security but I've left the number of characters the same)

AAAAA 150 BBB 150 BBB CCCCC-DDD

So the AAAAA is not needed data, the first 150 BBB is the ordered amount and the second 150 BBB is the delivered amount. Some of the cells then end of with a product code (CCCCC-DDD in this case) but not always. I'm after the underlined number and I don't need any of the rest of the stuff in the cell. As the invoices can cover 100's of products manually extracting the data doesn't really fix my issue. Is there any way I can extract this specific bit of text from the middle of all of this please?
Thank you all in advance for your help.
 

Excel Facts

Return population for a City
If you have a list of cities in A2:A100, use Data, Geography. Then =A2.Population and copy down.
Will it always be the 2nd number in the cell?
 
Upvote 0
Will it always be the 2nd number in the cell?
It should be, the first bit of the cell is normally a mixture of letters and numbers. The BBB's are letters and the part number at the end is normally a mixture of letters and numbers
 
Upvote 0
Ok, how about
+Fluff 1.xlsm
AB
1
2AAAAA 150 BBB 155 BBB CCCCC-DDD155
Lists
Cell Formulas
RangeFormula
B2B2=FILTERXML("<k><m>"&SUBSTITUTE(A2," ","</m><m>")&"</m></k>","//m[.=number()][2]")
 
Upvote 0
Or with regex

VBA Code:
Function jec(cell As String)
 With CreateObject("VBscript.RegExp")
   .Global = True
   .Pattern = "\d{1,10}"
   jec = .Execute(cell)(1)
 End With
End Function

Excel Formula:
=jec(A1)
 
Upvote 0
Another option if the number you want is always between the 3rd & 4th space
Excel Formula:
=REPLACE(LEFT(A2,FIND("^",SUBSTITUTE(A2," ","^",4))-1),1,FIND("^",SUBSTITUTE(A2," ","^",3)),"")+0
 
Upvote 0
Another option

Excel Formula:
=TRIM(MID(SUBSTITUTE(A2," ",REPT(" ",199)),599,99))+0
 
Upvote 0
Ok, how about
+Fluff 1.xlsm
AB
1
2AAAAA 150 BBB 155 BBB CCCCC-DDD155
Lists
Cell Formulas
RangeFormula
B2B2=FILTERXML("<k><m>"&SUBSTITUTE(A2," ","</m><m>")&"</m></k>","//m[.=number()][2]")
Hi Fluff, thanks for this, one the first example I've got in my file this works spot on. I'll run it through the rest of the file and check it grabs the info on the rest. Really appreciate the help :)
 
Upvote 0
Glad we could help & thanks for the feedback.
 
Upvote 0

Forum statistics

Threads
1,214,979
Messages
6,122,557
Members
449,088
Latest member
davidcom

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top