Extract Number from Alphanumeric String

dwgnome

Active Member
Joined
Dec 18, 2005
Messages
441
Looking for assistance in figuring out the best formula to extract the VOC content value from variable length alphanumeric string. The VOC content is the number just before the "g/L".

I need to extract just the number portion just before the units “g/L”. Most of the time there is a space before the number and after the number and before “g/L”. However, there are times when there may not be any spaces. In either case, I would like to extract the numbers with decimals if it has any.

See the following examples:


Excel Workbook
D
1SEALANT-RETAINING COMPOUND SINGLE COMPONENT ANAEROBIC, SCR40372A, *Loctite 242, 10 ML BOTTLE,146 g/L VOC
2RESIN-EPOXY ALUMINUM FILLED 4 OUNCE PATCH KIT\ * * HYSOL 6C10 g/L VOC
3SEALANT-THREAD LOCK *HIGH STR *MIL-S-22473 *GRADE AA *50 ML BOTTLE * * *SCR802269A\ * Loctite 089 Weld Sealant125.3 g/L VOC
4SEALANT-RETAINING COMPOUND ANAEROBIC MIL-S-22473 * * *GRADE C 50 ML BOTTLE * *SCR802266A\ *Loctite 84 * 167.4 g/L VOC
5RESIN-LAMINATING SYSTEM FOR REPAIR OF 250/350 DEG F CURING ADVANCED COMPOSITES ONE QUART KIT\ * EPOCAST 35A *12 g/L VOC
6CLEANER-THRUST REVERSER 55 GAL DRUM\ *TURCO 5805 * *0 g/L VOC
7SOLVENT-TETRAHYDROFURAN REAGENT GRADE IN 4-LITER * * JT Baker JT9450-5 * * 887 g/L VOC
8CLEANER-AIRCRAFT PARTS * HOT TANK * UNTHICKENED *55 GALLON DRUM\ * CALA 805 *25g/L VOC
9CLEANER\ * Fine Organics FO-585 * * *0 g/L VOC
10SEALANT-THREAD LOCK HIGH STRENGTH MIL-S-22473 *GRADE AV, *50 ML BOTTLE, *SCR802270A, 175.2 g/L VOC
11RESIN-EPOXY FIBERGLASS SELF EXTINGUISHING KIT BMS8-201TYII, *1 Qt Kit, *Epocast 50-A1, * *357 g/L VOC
12CLEANER-AIRCRAFT INTERIOR UTILITY 55 GAL DRUM\ * *AEROCLEAN X-410Q * *20 g/L VOC
13RESIN-PASTE PINHOLE FILLER FOR RADOMES 50 GRAM KIT\ * * * EPIBOND 156 A/B * * 10 g/L VOC
14SEALANT-RETAINING COMPOUND ANAEROBIC MIL-R-46082 TY 1 *250 ML BOTTLE * *SCR18016A\ * Loctite 675 141.7g/L *VOC
15RESIN-EPOXY CLEAR 3.35 OUNCE PATCH KIT\ * * *Hysol 0151 *Chemical Database uses UMS *ADH3002-96 * 34. g/L *VOC
Sheet1


The results I am looking for:


Excel Workbook
H
1146
210
3125.3
4167.4
512
60
7887
825
90
10175.2
11357
1220
1310
14141.7
1534
Sheet1


Any help would be appreciated.
 

Excel Facts

Workdays for a market open Mon, Wed, Friday?
Yes! Use "0101011" for the weekend argument in NETWORKDAYS.INTL or WORKDAY.INTL. The 7 digits start on Monday. 1 means it is a weekend.
Hi dwgnome,

I agree with Jon (post 18) in that, where possible, the use of native excel functions are better than a udf. The problem here is that the data is not always consistent so the solution for me was to write a udf.

That in mind, here's a revamped version on my original udf which still struggles with the line SEALANT-RETAINING COMPOUND ANAEROBIC MIL-R-46082 TY 1 *250 ML BOTTLE * *SCR18016A\ * Loctite 675 141.7g/L *VOC as it (perhaps correctly?) returns 675,141.70:

Code:
Function MyNumberExtract(rngCell As Range) As Currency

    Dim intChrCnt As Integer
    Dim varTempString As Variant
    
    'If there is no 'g/L' and no 'VOC' in the cell, go straight to the _
    'MyNumberExtractResult' label which will in turn return a zero.
    If InStr(rngCell.Value, "g/L") + InStr(rngCell.Value, "VOC") = 0 Then
        GoTo MyNumberExtractResult
    End If
    
    'Start the string search from the immediate left of where 'g/L' resides then _
    search the string from right to left, but return the allowable characters _
    left to right to the 'varTempString' variable
    For intChrCnt = InStr(rngCell.Value, "g/L") - 1 To 1 Step -1
        Select Case Mid(rngCell, intChrCnt, 1)
            'Add allowable characters to the 'varTempString' variable.
            Case 0 To 9, ".", " "
                varTempString = Trim(Mid(rngCell, intChrCnt, 1)) & varTempString
            'Exit the routinue if an unallowable character is found.
            Case Else
                Exit For
        End Select
    Next intChrCnt
    
MyNumberExtractResult:
    
    'If the 'varTempString' variable is null, then...
    If varTempString = "" Then
        '...set the variable to zero.  This stops the udf returning '#VALUE!".
        varTempString = 0
    End If
    
    'Convert the 'varTempString' variable to a currency variable.
    MyNumberExtract = CCur(varTempString)
       
End Function

Post back re how it goes.

Regards,

Robert
 
Upvote 0
It is nice to see others taking an interest to learn more on pgc's fabulous formula. While I agree that it is the best solution, I was hoping to get more familiar with UDFs - hence my interest in Trebor76's solution.

Trebor76 your amended UDF actually got worse in that it picked up additional numbers that are in front of the VOC value.

For example the 4th row resulted in 84167.4, where it should only be 167.4.
Row 6 resulted in 58050, where it should only be 0.
Row 7 resulted in 5887, where it should 887.
Row 8 resulted in 850525, where it should be 25.

I hope you see the pattern without me having to repost the same table. Others are similar. I think where ever there is a number before the VOC value even with many spaces, the UDF insists on picking up the numbers.

Perhaps a different tweak would help.

Thank you again.
 
Upvote 0
Hi dwgnome,

Based on the data you originally posted I don't the results you've just mentioned :confused:

The only strange result for me is Row 14 which returns 675,141.70 (as I previously mentioned).

HTH

Robert
 
Upvote 0
hi all
it is really intersting solutions
Gurus Gurus Gurus
MrExce.com "Great Site"
 
Upvote 0
Robert, sorry I wasn't online, but I could not explain it better than Barry, and Matty had it also correct.

dwgnome

Here is another vba alternative:

Code:
Function ExtractNumber(s As String) As Double
 
With CreateObject("VBScript.RegExp")
    .Pattern = ".*[^\.\d](\d+\.?|\d*\.\d+)\s*g/l.*"
    .IgnoreCase = True
    If .test(s) Then ExtractNumber = .Replace(s, "$1")
End With
End Function
 
Last edited:
Upvote 0
Also, more efficient:

Code:
Function ExtractNumber(s As String) As Double
Dim rexMatches As Object
 
With CreateObject("VBScript.RegExp")
    .Pattern = "(\d+\.?|\d*\.\d+)\s*g/l"
    .IgnoreCase = True
    Set rexMatches = .Execute(s)
    If rexMatches.Count > 0 Then ExtractNumber = rexMatches(0).SubMatches(0)
End With
End Function
 
Upvote 0
One more with Regex....

Code:
Function ExtNum(stInp As String) As Double
    Dim Temp, oMatches As Object
    Temp = 0
    With CreateObject("VBScript.RegExp")
        .Global = True
        .Pattern = "\d+\.?\d* ?g\/L"
        If .Test(stInp) Then
            Set oMatches = .Execute(stInp)
            Temp = Trim(Replace(oMatches(0), "g/L", ""))
        End If
    End With
    ExtNum = Temp
End Function
<input id="gwProxy" type="hidden"><!--Session data--><input *******="jsCall();" id="jsProxy" type="hidden"><input id="gwProxy" type="hidden"><!--Session data--><input *******="jsCall();" id="jsProxy" type="hidden">
 
Upvote 0
Again thanks to all for your contributions.

Pgc01, your UDF worked perfectly. :)

Robert, I tested your UDF again and I do get the inconsistent results as stated in my previous post. Not sure why you don't. I did notice a lot of asteriks where there should be spaces when using ExcelJeanie. Not sure if your data set had them or not. However, your help did not go unappreciated.

Sandeep, your UDF works great. Lightly tested for VOC content of .001 g/L, your result showed 1.000. If I put .7 g/L, result is 7. If I put a zero in front as it probably should, it works. Very minor and perhaps isolated case.:confused:


Keep on Exceling..
 
Upvote 0

Forum statistics

Threads
1,215,231
Messages
6,123,756
Members
449,120
Latest member
Aa2

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top