VBA to return capitalized words or phrases

yits05

Board Regular
Joined
Jul 17, 2020
Messages
56
Office Version
  1. 2016
Platform
  1. Windows
Hello,

I have found a few answers on here that get me 75% of the way to what I need, but was hoping somebody could help me get it to 100%. I am looking for VBA script that will scan a single cell text string and return all words or phrases containing capital letters.

The rules I need are:
- IF the capitalized word is directly adjacent to another capitalized word(s), group them as a phrase.
- IF the capitalized word is on its own (i.e. not directly neighboring another capital word) then it is parsed on its own
- IF the word comes after a period (.), ignore.

I would also like each word or phrase to be placed in its own cell. So, for example, the result should look like this:

John Smith is joining ACME Solutions as its new CTO. He will replace Susan Johnson who is departing for ACME2John SmithACME SolutionsCTOSusan JohnsonACME2

This is the VBA I found that gets me partially there:

VBA Code:
Option Explicit
Function StrExtract(Str As String, Optional IgnoreFirst As Boolean = False) As String
Dim tmp As Variant, tmp2 As String, x As Long, y As Long

If IgnoreFirst Then x = 1
    tmp = Split(Str, " ")
For y = x To UBound(tmp)
    If tmp(y) = StrConv(tmp(y), vbProperCase) Then tmp2 = tmp2 & tmp(y) & ", "
Next
    If IgnoreFirst And tmp2 = vbNullString Then
        StrExtract = tmp(0)
    Else
        StrExtract = Replace(Left(tmp2, Len(tmp2) - 2), "?", "")
    End If
End Function

Many thanks!
 

Excel Facts

Will the fill handle fill 1, 2, 3?
Yes! Type 1 in a cell. Hold down Ctrl while you drag the fill handle.

Eric W

MrExcel MVP
Joined
Aug 18, 2015
Messages
11,555
These kind of macros are really tricky. There are ALWAYS exceptions and quirks, and it's nearly impossible to allow for them all. Nevertheless, here's a stab at it:

VBA Code:
Option Explicit

Function StrExtract(Str As String) As String
Dim tmp As Variant, wk As String, i As Long, cap As Boolean, cap2 As Boolean, GotPeriod As Boolean

    tmp = Split(WorksheetFunction.Trim(Str))
    
    GotPeriod = True
    wk = ""
    For i = 0 To UBound(tmp)
        cap = IIf(tmp(i) Like "[A-Z]*", True, False)
        cap2 = False
        If i < UBound(tmp) Then
            If tmp(i + 1) Like "[A-Z]*" Then cap2 = True
        End If
        If cap2 Then GotPeriod = False
        If cap And Not GotPeriod And tmp(i) <> "I" Then
            wk = wk & Replace(tmp(i), ".", "") & IIf(cap2 And InStr(tmp(i), ".") = 0, "|", " ")
        End If
        GotPeriod = IIf(Right(tmp(i), 1) = ".", True, False)
    Next i
    
    StrExtract = Replace(Replace(WorksheetFunction.Trim(wk), " ", ", "), "|", " ")

End Function

This gives the following results compared to your original function:

Book2
ABC
1John Smith is joining ACME Solutions as its new CTO. He will replace Susan Johnson who is departing for ACME2John, Smith, Solutions, He, Susan, JohnsonJohn Smith, ACME Solutions, CTO, Susan Johnson, ACME2
2I hate to work.I 
3I always love ABC Deals. They work so hard for me.I, Deals., , TheyABC Deals
4Julia Roberts worked with Richard Gere.Julia, Roberts, Richard, Gere.Julia Roberts, Richard Gere
5Nobody works as hard as I do.Nobody, I 
6I hired Mr. Smith to work in the Accounting department.I, Mr., Smith, AccountingMr, Accounting
Sheet6
Cell Formulas
RangeFormula
B1:B6B1=strextract1(A1)
C1:C6C1=strextract(A1)


It seems to be better, but it still has trouble with things like Mr. within the sentences.
 

Forum statistics

Threads
1,144,693
Messages
5,725,796
Members
422,640
Latest member
KazPL

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Top