Formula to extract upper case words in a text string

spbcson

New Member
Joined
Feb 11, 2013
Messages
15
IS THERE A FORMULA TO EXTRACT A UPPER CASE FROM A TEXT STRING IN A CELL? Example in cell A1 i have this text string:
//-- y MICROSOFT EXCEL computer software June 23489​

How can I get only the words "MICROSOFT EXCEL"
 
To allow numbers, just add the 0-9 that I show in red below...
Code:
Function UpperCaseWords(S As String) As String
  Dim X As Long, TempText As String
  TempText = " " & S & " "
  For X = 2 To Len(TempText) - 1
    If Mid(TempText, X, 1) Like "[!A-Z[B][COLOR="#FF0000"]0-9[/COLOR][/B] ]" Or Mid(TempText, X - 1, 3) Like "[!A-Z][A-Z][!A-Z]" Then
      Mid(TempText, X) = " "
    End If
  Next
  UpperCaseWords = Application.Trim(TempText)
End Function
Hi Rick,

I know this post was from a long time ago, but this VBA code is very similar to what I need for my problem.

In a cell, I have a list of genes (capital letters, sometimes also containing numbers in random locations) with their descriptions after them. I need to extract only the full gene names, separated by something (like a space) so I can parse them into individual cells with other excel functions. For example, I have the following:

[CACNG2 - calcium channel, voltage-dependent, gamma subunit 2, STXBP5L - syntaxin binding protein 5-like, SLC8A3 - solute carrier family 8 (sodium/calcium exchanger), member 3, RIMS1 - regulating synaptic membrane exocytosis 1, AKAP12 - a kinase (prka) anchor protein 12

after extraction should look like:

CACNG2 STXBP5L SLC8A3 RIMS1 AKAP12

the current VBA code (incorrectly) yields:

CACNG2 2 STXBP5 5 SLC8 3 8 3 RIMS1 1 AKAP12

Is there anyway to modify the code to extract these gene names?

Thank you so much for any help you can give.
 
Upvote 0

Excel Facts

Select all contiguous cells
Pressing Ctrl+* (asterisk) will select the "current region" - all contiguous cells in all directions.
@nickfiore, Does this function (note I changed the function's name for you) do what you want...
VBA Code:
Function Genes(S As String) As String
  Dim X As Long, TempText As String, Arr() As String
  TempText = " " & S & " "
  For X = 2 To Len(TempText) - 1
    If Mid(TempText, X, 1) Like "[!A-Z0-9]" Or Mid(TempText, X - 1, 3) Like "[!A-Z][A-Z][!A-Z]" Then Mid(TempText, X) = " "
  Next
  Arr = Split(Application.Trim(TempText))
  For X = 0 To UBound(Arr)
    If Not Arr(X) Like "*[!0-9]*" Then Arr(X) = ""
  Next
  Genes = Application.Trim(Join(Arr))
End Function
 
Last edited:
Upvote 0
@nickfiore, Does this function (note I changed the function's name for you) do what you want...
VBA Code:
Function Genes(S As String) As String
  Dim X As Long, TempText As String, Arr() As String
  TempText = " " & S & " "
  For X = 2 To Len(TempText) - 1
    If Mid(TempText, X, 1) Like "[!A-Z0-9]" Or Mid(TempText, X - 1, 3) Like "[!A-Z][A-Z][!A-Z]" Then Mid(TempText, X) = " "
  Next
  Arr = Split(Application.Trim(TempText))
  For X = 0 To UBound(Arr)
    If Not Arr(X) Like "*[!0-9]*" Then Arr(X) = ""
  Next
  Genes = Application.Trim(Join(Arr))
End Function
Hi Rick,

Thanks for responding so quickly! It seems to work with most genes, but the script seems to exclude any letters after a number (if a number is present in the gene). For instance, with the an input of: "STXBP5L - syntaxin binding protein 5-like, SLC8A3 - solute carrier family 8 (sodium/calcium exchanger)"

I am getting this as a result: "STXBP5 SLC8", where the "L" in the first gene and the "A3" in the second gene are chopped off the ends of the full gene name.

Other than this issue, the code works beautifully.
 
Upvote 0
Actually, I think I may have to rework the code more than I originally thought. Let me think about this a little bit (I'm about to have my lunch right now) and get back to you.
 
Upvote 0
Hi Rick,

I really appreciate the hard work, I don't have enough words to thank you. Enjoy your lunch! Looking forward to hearing back.
 
Upvote 0
Okay, give this function a try...
VBA Code:
Function Genes(S As String) As String
  Dim X As Long, TempText As String, Arr() As String
  TempText = " " & S & " "
  For X = 2 To Len(TempText) - 1
    If Mid(TempText, X, 1) Like "[!A-Z0-9]" Or Mid(TempText, X - 1, 3) Like "[!A-Z0-9][A-Z0-9][!A-Z0-9]" Then Mid(TempText, X) = " "
  Next
  Arr = Split(Application.Trim(TempText))
  For X = 0 To UBound(Arr)
    If Not Arr(X) Like "*[!0-9]*" Then Arr(X) = ""
  Next
  Genes = Application.Trim(Join(Arr))
End Function
 
Upvote 0
Okay, give this function a try...
VBA Code:
Function Genes(S As String) As String
  Dim X As Long, TempText As String, Arr() As String
  TempText = " " & S & " "
  For X = 2 To Len(TempText) - 1
    If Mid(TempText, X, 1) Like "[!A-Z0-9]" Or Mid(TempText, X - 1, 3) Like "[!A-Z0-9][A-Z0-9][!A-Z0-9]" Then Mid(TempText, X) = " "
  Next
  Arr = Split(Application.Trim(TempText))
  For X = 0 To UBound(Arr)
    If Not Arr(X) Like "*[!0-9]*" Then Arr(X) = ""
  Next
  Genes = Application.Trim(Join(Arr))
End Function
It works! Thank you for everything. Much appreciated.
 
Upvote 0
Hi,

Wondering how to extract uppercase words with a sign "&" or "." for example "Receive 10000 P.I.E INDUSTRIAL at 3.220 per lot" or "Send 2200 D&O GREEN TECH at 5.900 per lot"

Thank you so much for any help you can give.
 
Upvote 0
Hi,

Wondering how to extract uppercase words with a sign "&" or "." for example "Receive 10000 P.I.E INDUSTRIAL at 3.220 per lot" or "Send 2200 D&O GREEN TECH at 5.900 per lot"

Thank you so much for any help you can give.
Show us the answer you expect from that text string.
 
Upvote 0
"Receive 10000 P.I.E INDUSTRIAL at 3.220 per lot" the answer should be P.I.E INDUSTRIAL and "Send 2200 D&O GREEN TECH at 5.900 per lot" should be D&O GREEN TECH
Thanks
 
Upvote 0

Forum statistics

Threads
1,216,172
Messages
6,129,289
Members
449,498
Latest member
Lee_ray

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top