Parse AlphaNumeric String

drmingle

Board Regular
Joined
Oct 5, 2009
Messages
229
Objective: To create a list of numbers based a cell which are present in an alpha-numeric string

Issues: "-" shows up at times without the context of numbers (i.e. MS-DRG)
Issues: I am not interested in float numbers (i.e. 37.63-37.66)

'Example data in cells (row 1 = A1, row 2 =A2, row 3 = A3, row 4 = A4)
MS-DRGs 767-768, 774-775
Rev 115, 125, 135, 145, 155, 655-656
Rev 174
MS-DRG 001 with ICD-9 37.63-37.66, 37.52

'Desired Results (row 1 = B1, row 2 =B2, row 3 = B3, row 4 = B4)
767,767,768,774,775
115, 125, 135, 145, 155, 655, 656
174
001

Any help would be appreciated.
 

Excel Facts

Convert text numbers to real numbers
Select a column containing text numbers. Press Alt+D E F to quickly convert text to numbers. Faster than "Convert to Number"
Try this on a copy of your worksheet:
Code:
Sub GetWholeNumbers()
Dim lR As Long, c As Range, R As Range, x As Variant, vA As _
    Variant, cOut As String

lR = Range("A" & Rows.Count).End(xlUp).Row
Set R = Range("A1", "A" & lR)
R.Offset(0, 1).NumberFormat = "@"
For Each c In R
    x = Replace(c.Value, ",", "")
    vA = Split(x, " ")
    For i = LBound(vA) To UBound(vA)
        If CStr(vA(i)) Like "*[A-Z]-*" Then vA(i) = " "
    Next i
    x = Join(vA, " ")
    x = Replace(x, "-", " ")
    x = Trim(x)
    vA = Split(x, " ")
    For i = LBound(vA) To UBound(vA)
        If IsNumeric(vA(i)) And InStr(vA(i), ".") = 0 Then
            cOut = cOut & ", " & CStr(vA(i))
        End If
    Next i
    c.Offset(0, 1).Value = Right(cOut, Len(cOut) - 2)
    cOut = ""
Next c
End Sub
 
Upvote 0
Issues: "-" shows up at times without the context of numbers (i.e. MS-DRG)
Issues: I am not interested in float numbers (i.e. 37.63-37.66)

'Example data in cells (row 1 = A1, row 2 =A2, row 3 = A3, row 4 = A4)
MS-DRGs 767-768, 774-775
Rev 115, 125, 135, 145, 155, 655-656
Rev 174
MS-DRG 001 with ICD-9 37.63-37.66, 37.52

'Desired Results (row 1 = B1, row 2 =B2, row 3 = B3, row 4 = B4)
767,767,768,774,775
115, 125, 135, 145, 155, 655, 656
174
001

Any help would be appreciated.
Two questions...

1) When floating point numbers are present, will they always be the last numbers in the cell (no whole numbers following them)?

2) See the text I highlighted in red (for text in A4)... why did you not show the whole number 9 as being outputted somewhere?
 
Upvote 0
Two questions...

1) When floating point numbers are present, will they always be the last numbers in the cell (no whole numbers following them)?
Yes. You can assume that the floating point numbers will be at the end of the cell and that no whole numbers will follow them.

2) See the text I highlighted in red (for text in A4)... why did you not show the whole number 9 as being outputted somewhere?
This is because I am only interested in numeric to numeric relationshipships rather than alpha to numeric relationships. For example, in the case you mentioned ICD-9 (the first part "ICD" is alpha bridged by "-" and then the ending part is numeric "9") that is not what I am needing. However, in the case of 755-798 this would be considered a numeric to numeric relationship since the begining "755" is numeric and the end "798" is numeric.

Please ask me any other question you may have...
 
Upvote 0
Try this on a copy of your worksheet:
Code:
Sub GetWholeNumbers()
Dim lR As Long, c As Range, R As Range, x As Variant, vA As _
    Variant, cOut As String

lR = Range("A" & Rows.Count).End(xlUp).Row
Set R = Range("A1", "A" & lR)
R.Offset(0, 1).NumberFormat = "@"
For Each c In R
    x = Replace(c.Value, ",", "")
    vA = Split(x, " ")
    For i = LBound(vA) To UBound(vA)
        If CStr(vA(i)) Like "*[A-Z]-*" Then vA(i) = " "
    Next i
    x = Join(vA, " ")
    x = Replace(x, "-", " ")
    x = Trim(x)
    vA = Split(x, " ")
    For i = LBound(vA) To UBound(vA)
        If IsNumeric(vA(i)) And InStr(vA(i), ".") = 0 Then
            cOut = cOut & ", " & CStr(vA(i))
        End If
    Next i
    c.Offset(0, 1).Value = Right(cOut, Len(cOut) - 2)
    cOut = ""
Next c
End Sub

JoeMo -

I appreciate the effort, however I am running into a Run-Time error on this line:

"c.Offset(0, 1).Value = Right(cOut, Len(cOut) - 2)"
 
Upvote 0
The below code gets me pretty close to what I am needing.

However, it has the following issues:
  • It displays floating point numbers
  • It displays alpha-numeric relationships (for example, ICD-9)
  • It also does not have a component for expanding code ranges (doesn't make use of upperbound and lowerbound numbers). For example, 655-657 should display as 655, 656,657.


Code:
Function GetNumbers(s As String)
Dim temp As String, i As Long, oMatches
With CreateObject("vbscript.regexp")
    .Global = True
    .Pattern = "\d+"
    If .test(s) Then
        Set oMatches = .Execute(s)
        For i = 0 To oMatches.Count - 1
            temp = temp & oMatches(i) & ","
        Next i
        GetNumbers = Left(temp, Len(temp) - 1)
    End If
End With
End Function

Source: http://www.mrexcel.com/forum/excel-questions/365575-separating-alpha-numeric.html
 
Upvote 0
Give this macro a try...
Code:
Sub GetNumericRanges()
  Dim X As Long, Y As Long, Z As Long, LastRow As Long, FirstDigit As Long, FirstLetter As Long
  Dim Pages As String, Numbers() As String, sRange() As String
  LastRow = Cells(Rows.Count, "A").End(xlUp).Row
  For X = 1 To LastRow
    Pages = Cells(X, "A").Value
    FirstDigit = Evaluate("MIN(FIND(ROW(1:10)-1,""" & Pages & """&""0123456789""))")
    For Z = FirstDigit To Len(Pages) + 1
      If Mid(Pages & "x", Z, 1) Like "[a-z]" Then
        FirstLetter = Z
        Exit For
      End If
    Next
    Pages = Replace(Mid(Pages, FirstDigit, FirstLetter - FirstDigit), " ", "")
    Numbers = Split(Pages, ",")
    For Z = 0 To UBound(Numbers)
      If Numbers(Z) Like "*-*" Then
        sRange = Split(Numbers(Z), "-")
        Numbers(Z) = ""
        For Y = sRange(0) To sRange(1)
          Numbers(Z) = Numbers(Z) & "," & Y
        Next
        Numbers(Z) = Mid(Numbers(Z), 2)
      End If
    Next
    Cells(X, "B").Value = "'" & Join(Numbers, ", ")
  Next
End Sub
 
Upvote 0
JoeMo -

I appreciate the effort, however I am running into a Run-Time error on this line:

"c.Offset(0, 1).Value = Right(cOut, Len(cOut) - 2)"
I'm guessing you may have some cell(s) with nothing numeric to output and I failed to anticipate that. See if this works:
Code:
Sub GetWholeNumbers()
Dim lR As Long, c As Range, R As Range, x As Variant, vA As _
    Variant, cOut As String

lR = Range("A" & Rows.Count).End(xlUp).Row
Set R = Range("A2", "A" & lR)
R.Offset(0, 1).NumberFormat = "@"
For Each c In R
    x = Replace(c.Value, ",", "")
    vA = Split(x, " ")
    For i = LBound(vA) To UBound(vA)
        If CStr(vA(i)) Like "*[A-Z]-*" Then vA(i) = " "
    Next i
    x = Join(vA, " ")
    x = Replace(x, "-", " ")
    x = Trim(x)
    vA = Split(x, " ")
    For i = LBound(vA) To UBound(vA)
        If IsNumeric(vA(i)) And InStr(vA(i), ".") = 0 Then
            cOut = cOut & ", " & CStr(vA(i))
        End If
    Next i
    If cOut <> "" Then
        c.Offset(0, 1).Value = Right(cOut, Len(cOut) - 2)
    End If
    cOut = ""
Next c
End Sub
 
Upvote 0
I'm guessing you may have some cell(s) with nothing numeric to output and I failed to anticipate that. See if this works:
Code:
Sub GetWholeNumbers()
Dim lR As Long, c As Range, R As Range, x As Variant, vA As _
    Variant, cOut As String

lR = Range("A" & Rows.Count).End(xlUp).Row
Set R = Range("A2", "A" & lR)
R.Offset(0, 1).NumberFormat = "@"
For Each c In R
    x = Replace(c.Value, ",", "")
    vA = Split(x, " ")
    For i = LBound(vA) To UBound(vA)
        If CStr(vA(i)) Like "*[A-Z]-*" Then vA(i) = " "
    Next i
    x = Join(vA, " ")
    x = Replace(x, "-", " ")
    x = Trim(x)
    vA = Split(x, " ")
    For i = LBound(vA) To UBound(vA)
        If IsNumeric(vA(i)) And InStr(vA(i), ".") = 0 Then
            cOut = cOut & ", " & CStr(vA(i))
        End If
    Next i
    If cOut <> "" Then
        c.Offset(0, 1).Value = Right(cOut, Len(cOut) - 2)
    End If
    cOut = ""
Next c
End Sub
Your function picks up the floating point values whereas the OP said he did not want them (see the OP's desired output in Message #1). Also, your function only picks up the numbers on either side of the dashes whereas the OP wanted the dashed numbers to indicate a range of numbers and he wanted those middle numbers filled in (see Message #6).
 
Upvote 0
Your function picks up the floating point values whereas the OP said he did not want them (see the OP's desired output in Message #1). Also, your function only picks up the numbers on either side of the dashes whereas the OP wanted the dashed numbers to indicate a range of numbers and he wanted those middle numbers filled in (see Message #6).
my code excludes numbers containing a decimal point. as for numbers between the bookmarks of the
dash, that wasn't requested in the original post.
 
Upvote 0

Forum statistics

Threads
1,215,563
Messages
6,125,568
Members
449,237
Latest member
Chase S

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top