just extract the alphanumeric characters

vacation

Board Regular
Joined
Dec 6, 2003
Messages
56
Hi,

In column C, there are over 30,000 items of data that look like this:

...
R#CS2719==
R#CS2653=
R#CS2720=
R#CS2654=
R#CS2770===
BMHH235.5===#
BMFA63023.1=%
BMFA63023===)
BMHP286.1===$
BMHP286=====!
BMRS192A1===~
BMFD280.1===#
BMFD281.1===#
BMHD196======
BMRS283=====(
BMH34641====~
...

They contain all kinds of non-alphanumeric characters.
Note: All possible non-alphanumeric characters are not shown above.
The above is just a small sample.

I would like column E to contain data from the corresponding cells in
column C but exclude all the non-alphanumeric characters.
So the above column C data would appear in column E as:

...
RCS2719
RCS2653
RCS2720
RCS2654
RCS2770
BMHH2355
BMFA630231
BMFA63023
BMHP2861
BMHP286
BMRS192A1
BMFD2801
BMFD2811
BMHD196
BMRS283
BMH34641
...


What combination of nested functions should I assign to each cell in column E to achieve this?


Thanks.
 

Excel Facts

When they said...
When they said you are going to "Excel at life", they meant you "will be doing Excel your whole life".
This one is well suited to using the parsing techniques of the Regular Expression Object.

The code below replaces all non alphanumeric characters with ""

Sub GetAlpha()
Dim Myrange As Range
Dim Cel As Range
Dim RegExp
Dim Reg
Set Myrange = Intersect(ActiveSheet.UsedRange, ActiveSheet.Columns("C"))
Set RegExp = CreateObject("vbscript.regexp")
RegExp.Global = True
RegExp.ignorecase = True
'exclude anything that isn't a letter or number
RegExp.Pattern = "[^\w+]"
For Each Cel In Myrange
Cel.Offset(0, 2).Value = RegExp.Replace(Cel.Value, "")
Next
End Sub

Cheers

Dave
 
Upvote 0
OK Brett :oops:
I tested your code and it worked great :oops: ... and I didn't think it would ... reason being I'm not understanding the "vbscript.regexp" .. could you please explain that for me ... pleassssse :biggrin: :biggrin: :pray:
 
Upvote 0
Aloha...

I'm not a programmer, but I have come up with a light-weight response for you.

Rather than trying to do something in the formula bar, I wrote a short code that will go in a module, and will be called from the worksheet formula bar.

It has no error correction, and it does not use Option Explicit or Dim the variables as I customarily do.

In E1, for example, you would enter

= RemoveSymbols(A1)

Copy it down.

I have another version that does not require you to enter it in the cells. It's a one-time shot.

Is your data dynamic or you just have these 30,000 items and that's it?

I see now that somebody posted some code for you while I was working on this. But I offer you the product of my efforts anyway. :coffee:

Code:
Function RemoveSymbols(cellValue As String) As String
        valueLen = Len(cellValue)
        
        For pos = 1 To valueLen
            cellChar = UCase(Left(cellValue, 1))
            If cellChar = "A" Or cellChar = "B" Or cellChar = "C" Or cellChar = "D" Or cellChar = "E" Or cellChar = "F" Or cellChar = "G" Or _
                cellChar = "H" Or cellChar = "I" Or cellChar = "J" Or cellChar = "K" Or cellChar = "L" Or cellChar = "M" Or cellChar = "N" Or _
                cellChar = "O" Or cellChar = "P" Or cellChar = "Q" Or cellChar = "R" Or cellChar = "S" Or cellChar = "T" Or cellChar = "U" Or _
                cellChar = "V" Or cellChar = "W" Or cellChar = "X" Or cellChar = "Y" Or cellChar = "Z" Or IsNumeric(cellChar) Then
            
                newCellValue = newCellValue & cellChar
            End If
            valueLen = valueLen - 1
            cellValue = Right(cellValue, valueLen)
        Next pos
            RemoveSymbols = newCellValue
            newCellValue = ""
End Function
Tana-Lee
 
Upvote 0
OK Brett
I tested your code and it worked great ... and I didn't think it would ... reason being I'm not understanding the "vbscript.regexp" .. could you please explain that for me ... pleassssse

Hi Nimrod

I've been playing with Regular Expressions at Experts Exchange for a while now. The RegExp object is used for parsing strings, I think it was developed in the 50's. I'm happy to post or send you some examples that I've used to solve problems at EE (ie convert any nth cell reference in a range of formulas to absolute, parse names, reverse name order etc)

As to understanding RegExp I'd start looking at these two links
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/script56/html/vsobjregexp.asp
http://www.electrified.net/dev/RegexEvaluate.aspx

The first is a microsoft explanatory article on the rather arcane pattern matching, the second is an excellent pattern tester.

As for this code. The \w+ pattern is equivalent to a-zA-Z0-9, ie any alphanumeric charcter. By using the [] brackets with a ^, I am saying find any character that is NOT alphanumeric. The Replace method then replaces any non-alphanumeric character with a "".

If my pattern had been "\w+" rather than "[^\w+]" it would have stripped the alphanumerics out instead.

Cheers

Dave
 
Upvote 0
Aloha...

Thanks for the extra info on RegExp. I didn't know there was such a thing in Windows/VB. I thought it was for Unix/Linux using Perl, etc.

Tana-Lee
 
Upvote 0
No problem. :)

VBscript RegExp isn't widely used with Excel, I've had to experiment a fair bit as there hasn't been much to learn from. There is some decent info at the Microsoft site about how to access the technique itself but you'll find most of the good practical examples (ie validating email adresses) are in Perl.

I think that the main issue to get to grips with is how to use the SubMatches collection which are generated by the () in a pattern. The example below shows how to access two of four possible submatches

If you run it over a selected range it converts every second range reference to absolute. The code could be prettier but it is a useful example of the power of RegExp

Cheers

Dave

Code:
Sub Main()
' David Brett 9th Dec 2003
Dim Myrange    As Range
Dim Cel        As Range
    On Error Resume Next
    ' Look at formulas only, the error check prevents an error if there are no formulas in the selection
    Set Myrange = Intersect(Selection, Selection.Cells.SpecialCells(xlFormulas))
    On Error GoTo 0
    ' Leave Main sub if there are no formulas
    If Myrange Is Nothing Then Exit Sub
    With Application
        .ScreenUpdating = False
        .Calculation = xlCalculationManual
    End With
    For Each Cel In Myrange.Cells
        ' Four valid arguments
        ' "Rel" = Relative Referenced
        ' "Abs" = Absolute Referenced
        ' "Row" = Absolute Row Referenced
        ' "Col" = Absolute Column Referenced

        'This example converts every second cell reference of each formula in the selection to Absolute
        ' ReplaceRef Function = (Pattern to find formulas, Desired Reference, formula ref to be changed)
        Cel.Formula = ReplaceRef(Cel.Formula, "([$]{0,1})([A-Z]{1,2})([$]{0,1})(\d{1,5})", "Abs", 2)
    Next
    With Application
        .ScreenUpdating = True
        .Calculation = xlCalculationAutomatic
    End With
End Sub

Function ReplaceRef(str1 As String, Patrn As String, replStr As String, FormulaRef As Integer)
Dim Reg
Dim RegEx
Dim Subm
Dim NewForm    As String

    Set RegEx = CreateObject("vbscript.regexp")       ' Create regular expression.
    RegEx.Global = True
    RegEx.Pattern = Patrn                             ' Set pattern.
    RegEx.IgnoreCase = True                           ' Make case insensitive.
    Set Reg = RegEx.Execute(str1)
    If Reg.Count < FormulaRef Then                    'If the formula is not long enough then exit
        ReplaceRef = str1
        Exit Function
    End If
    'the first submatch is (0) so submatch n = Reg(n-1)
    Set Subm = Reg(FormulaRef - 1).submatches
    Select Case Application.WorksheetFunction.Proper(replStr)
        Case "Rel"
            NewForm = Subm(1) & Subm(3)
        Case "Abs"
            NewForm = "$" & Subm(1) & "$" & Subm(3)
        Case "Col"
            NewForm = "$" & Subm(1) & Subm(3)
        Case "Row"
            NewForm = Subm(1) & "$" & Subm(3)
        Case Else
            'conversion input was invalid
            ReplaceRef = str1
            Exit Function
    End Select
    ReplaceRef = Application.WorksheetFunction.Replace(str1, Reg(FormulaRef - 1).FirstIndex + 1, Len(Reg(FormulaRef - 1)), NewForm)
End Function
 
Upvote 0
Hi vacation:

In addition to beautiful contribution from brettdj, you may also want to try ...
Code:
Sub yWriteAlphaNumericOnlyFromColumnCtoColumnE()
    For Each cell In Intersect(ActiveSheet.UsedRange, Columns("c"))
        For j = 1 To Len(cell)
            If Asc(Mid(cell, j, 1)) > 47 And Asc(Mid(cell, j, 1)) < 59 Or _
                Asc(Mid(cell, j, 1)) > 64 And Asc(Mid(cell, j, 1)) < 91 Then _
                ylet = ylet & Mid(cell, j, 1)
            cell.Offset(0, 2) = ylet
        Next
        ylet = ""
    Next
End Sub
 
Upvote 0
Thanks again, brettdj. Excellent info.

I'd tried using [A-Z] at some point, but of course it didn't work.

Yogi! That's the kind of function I was looking for before I wrote my code!!

I kept thinking if there is an IsNumeric function there ought to be an IsAlpha function...No such thing. I can't imagine why.

This turned out to be a very informative post for me. I hope vacation got the help s/he needs.

Thank you very much.

Merry CHRISTmas!!

Tana-Lee
 
Upvote 0
As Dave demonstrated, an overlooked capability is the use of Regular Expressions. To use a regexp embedded in a UDF see my post in http://www.mrexcel.com/board2/viewtopic.php?t=52283&start=10

You can then use those functions in an XL formula such as =RegExpSubstitute(A1,"[^\w+]","") to get the result you want.
vacation said:
Hi,

{snip}

I would like column E to contain data from the corresponding cells in
column C but exclude all the non-alphanumeric characters.
So the above column C data would appear in column E as:

{snip}
What combination of nested functions should I assign to each cell in column E to achieve this?


Thanks.
 
Upvote 0

Forum statistics

Threads
1,214,967
Messages
6,122,503
Members
449,090
Latest member
RandomExceller01

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top