Extract Multiple Number Strings from Within Text (no uniform format)

Sarahsodapop

New Member
Joined
Jul 19, 2011
Messages
17
Hello!

I'm stumped -- I need to extract certain identifying numbers from a cell with a LOT of text, plus other miscellaneous numbers & years... The numbers I need all start with 34, but then can vary from 6-14 numbers, might end in zeroes, and can be followed immediately by either a space or letter or punctuation. I would love to get them all extracted into individual cells, and add zeroes onto the end of some (to make each one consist of 14 numbers). I'm not sure of what the maximum # of instances of 34xxx may be within any given cell.

I've found what might be a similar UDF here: http://www.mrexcel.com/forum/excel-questions/900604-extracting-multiple-numbers-text-string.html, but I'm not skilled enough at them (super newbie) to customize to my situation. Here are some examples of the "Notes" cells, along with what my goal columns are (API 1, 2, 3, etc).

82968eb4d3.jpg


Any help is greatly appreciated, as always -- you guys (and/or gals) are the best!!

Thanks,
Sarah
 

Excel Facts

How to create a cell-sized chart?
Tiny charts, called Sparklines, were added to Excel 2010. Look for Sparklines on the Insert tab.
Does this macro work for you...
Code:
[table="width: 500"]
[tr]
	[td]Sub ExtractNumbersBeginingWith34()
  Dim R As Long, X As Long, Max As Long, Data As Variant, Nums() As String
  Data = Range("A1", Cells(Rows.Count, "A").End(xlUp))
  For R = 1 To UBound(Data)
    For X = 1 To Len(Data(R, 1))
      If Not Mid(Data(R, 1), X, 1) Like "#" Then Mid(Data(R, 1), X) = " "
    Next
    Nums = Split(Application.Trim(Data(R, 1)))
    If UBound(Nums) > Max Then Max = UBound(Nums)
    For X = 0 To UBound(Nums)
      If Left(Nums(X), 2) <> 34 Or Len(Nums(X)) < 8 Or Len(Nums(X)) > 16 Then
        Nums(X) = ""
      Else
        Nums(X) = Replace(Format(Nums(X), "!@@@@@@@@@@@@@@@@"), " ", 0)
      End If
    Next
    Data(R, 1) = Application.Trim(Join(Nums))
    Cells(R, "B").Value = Data(R, 1)
  Next
  Columns("B").Resize(, Max + 1).NumberFormat = "0000000000000000"
  Columns("B").TextToColumns , xlDelimited, , , 0, 0, 0, 1
  Columns("B").Resize(, Max + 1).AutoFit
End Sub[/td]
[/tr]
[/table]
 
Upvote 0
Thanks so much for the reply! It did not, unfortunately. I got a "Run-time error '1004': No data was selected to parse." However, a couple of caveats: I am not positive that I ran it correctly. In the Developer tab, I opened Macros, created a new one, and pasted your string in as-is. Hit "save", closed the window with the code, then hit "run". Not sure if that is the right way to do it. Also, I don't know enough about macros to know if I need to customize at all -- i.e. reference certain columns, tell it where to put the data, etc. For example, the column that the first 34xxx number appears in is "T". Do I need to change any reference above from "A" to "T"?

Also, one change that I hope won't be too difficult -- a new set of data now has two columns (T & U) which may (or may not) contain the 34xxx numbers. Is there a way to have it look in both of those, or should I make an effort to merge the cells first?

Thanks so much!
 
Upvote 0
For example, the column that the first 34xxx number appears in is "T". Do I need to change any reference above from "A" to "T"
Of course the references need to be changed. The reason I set the references to Column A is because that is where you showed the text to be in the picture you posted in Message #1. There is no way I could figure your data was actually in Column T after I looked at that. I'll modify the code to work for you once you answer the question below.


Also, one change that I hope won't be too difficult -- a new set of data now has two columns (T & U) which may (or may not) contain the 34xxx numbers. Is there a way to have it look in both of those, or should I make an effort to merge the cells first?
No, I can have the code do that automatically in memory, but you have to tell me where you want the output to go to.... Columns V, W, X, etc. or Columns B, C, D, etc. where you showed the output going in your original message?
 
Upvote 0
Thanks, Rick. I'm just not familiar enough with macros or code to know what is required and what isn't. Now I know, so thank you for the knowledge. :) In this spreadsheet, I would want the output to go to V,W,X, etc... In your reply, would you mind pointing out what portion(s) would need to be changed in case I used this in a different spreadsheet with different columns?
Thanks,
Sarah
 
Upvote 0
Thanks, Rick. I'm just not familiar enough with macros or code to know what is required and what isn't. Now I know, so thank you for the knowledge. :)
It is almost always a bad idea to simplify your data or its layout when asking a question in a forum... if you VB coding skills are on the weakfish side, then it is imperative that you not misrepresent your data or its layout as making changes to the code to account for the misrepresentation will almost always prove to be challenging.



In your reply, would you mind pointing out what portion(s) would need to be changed in case I used this in a different spreadsheet with different columns?
I set the code up so that you can specify the column letter designation for the first text column, second text column and column letter designation for the first column where output will start at... you would specify these in the code lines highlighted in red.
Code:
[table="width: 500"]
[tr]
	[td]Sub ExtractNumbersBeginingWith34()
  Dim R As Long, X As Long, Max As Long, LastRow As Long
  Dim DataT As Variant, DataU As Variant, Nums() As String
[B][COLOR="#FF0000"]  Const ColLetter1 As String = "T"
  Const ColLetter2 As String = "U"
  Const FirstOutputColumn As String = "V"[/COLOR][/B]
  LastRow = Application.Max(Cells(Rows.Count, ColLetter1).End(xlUp).Row, Cells(Rows.Count, ColLetter2).End(xlUp).Row)
  DataT = Range(Cells(1, ColLetter1), Cells(LastRow, ColLetter1))
  DataU = Range(Cells(1, ColLetter2), Cells(LastRow, ColLetter2))
  For R = 1 To LastRow
    DataT(R, 1) = DataT(R, 1) & DataU(R, 1)
    For X = 1 To Len(DataT(R, 1))
      If Not Mid(DataT(R, 1), X, 1) Like "#" Then Mid(DataT(R, 1), X) = " "
    Next
    Nums = Split(Application.Trim(DataT(R, 1)))
    If UBound(Nums) > Max Then Max = UBound(Nums)
    For X = 0 To UBound(Nums)
      If Left(Nums(X), 2) <> 34 Or Len(Nums(X)) < 8 Or Len(Nums(X)) > 16 Then
        Nums(X) = ""
      Else
        Nums(X) = Replace(Format(Nums(X), "!@@@@@@@@@@@@@@@@"), " ", 0)
      End If
    Next
    DataT(R, 1) = Application.Trim(Join(Nums))
    Cells(R, FirstOutputColumn).Value = DataT(R, 1)
  Next
  Columns(FirstOutputColumn).Resize(, Max + 1).NumberFormat = "0000000000000000"
  Columns(FirstOutputColumn).TextToColumns , xlDelimited, , , 0, 0, 0, 1
  Columns(FirstOutputColumn).Resize(, Max + 1).AutoFit
End Sub[/td]
[/tr]
[/table]
 
Last edited:
Upvote 0
Woo! We're getting there! I popped it in, ran it, and got some results! Some comments:
1. I did still get an error message, "Run-time error '13': Type mismatch". When I clicked Debug, it highlighted this line: DataT(R, 1) = DataT(R, 1) & DataU(R, 1). I have no idea if this is significant or not...
2. The resulting numbers are formatted for 16 characters instead of 14 -- would I reduce the number of @@ by two, and remove two 00's from the NumberFormat = "0000000000000000" portion? Anywhere else?
3. Apologies for not anticipating this, but if the same # occurs in both Columns T & U, is there a way to only extract it once instead of twice?
 
Upvote 0
1. I did still get an error message, "Run-time error '13': Type mismatch". When I clicked Debug, it highlighted this line: DataT(R, 1) = DataT(R, 1) & DataU(R, 1). I have no idea if this is significant or not...
After you click the Debug button, hover your mouse cursor over the R variable and note the number that is displayed, then look at Column T and U for that row number and tell me what is in those cells (whatever is in one or both of them is what is triggering the error).


2. The resulting numbers are formatted for 16 characters instead of 14 -- would I reduce the number of @@ by two, and remove two 00's from the NumberFormat = "0000000000000000" portion? Anywhere else?
When you said this in Message #1, "The numbers I need all start with 34, but then can vary from 6-14 numbers", it sounded like the "34" was not to be counted in with the 6 to 14 digits. Yes, delete two of the @ symbols to reduce the output by two digits.


3. Apologies for not anticipating this, but if the same # occurs in both Columns T & U, is there a way to only extract it once instead of twice?
That will require additional code (I understand you did not anticipate it, but it is much easier to write code when we know all of the requirements up front... saves time tearing apart just written code in order to squeeze in new functionality without destroying existing functionality)... I'll look into patching the code a little bit later.
 
Upvote 0

Forum statistics

Threads
1,215,741
Messages
6,126,591
Members
449,320
Latest member
Antonino90

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top