Remove list of keywords in Sentence

goble

New Member
Joined
Oct 7, 2010
Messages
19
Office Version
  1. 2019
Platform
  1. Windows
Hello,

this has been asked before but my problem is a bit different. I want to get rid of some words in a column using a Macro (or formulae)

textrevised textText to remove
This is a sentence containing keywords. For Analysis, for general.sentence,keywords,analysis,generalis
this is a General Analysisgeneral,analysisthis
a
containing
for

Notes:
  1. Revised text should not contain spaces i.e word1,word2
  2. "This" (starting with capital letter) has been removed even though "this" lower case has been specified
  3. Column "Text to remove" includes the list of predefined words that will need to be removed
  4. "Analysis" has not (or part of it) been removed even though it includes "is"
  5. "." and "," should be removed but result should be delimited with ","
  6. Revised text should all be in lower case.
P.S I know I'm asking a lot ! But thanks in advance ;)
 

Peter_SSs

MrExcel MVP, Moderator
Joined
May 28, 2005
Messages
48,457
Office Version
  1. 365
Platform
  1. Windows
Is this closer to the mark?

VBA Code:
Function Remove(s As String, rRemove As Range, rSpecial As Range) As String
  Dim RX As Object, m As Object
  
  Set RX = CreateObject("VBScript.RegExp")
  RX.Global = True
  RX.IgnoreCase = True
  RX.Pattern = "\b(" & Join(Application.Transpose(rSpecial), "|") & ")\b"
  For Each m In RX.Execute(s)
    s = Replace(s, m, Replace(m, " ", "#"), 1, -1, 1)
  Next m
  RX.Pattern = "[^A-Za-z#0-9\-]"
  s = LCase(RX.Replace(s, " "))
  RX.Pattern = "\b(" & Join(Application.Transpose(rRemove), "|") & ")\b"
  Remove = Replace(Replace(Application.Trim(RX.Replace(s, "")), " ", ","), "#", " ")
End Function

goble.xlsm
ABCD
1textText to removeSpecial words
2This is a sentence containing keywords. For Analysis, for general.islas vegassentence,keywords,analysis,general
3this is a General Analysisthislos angelesgeneral,analysis
44k desktop wallpaperasalt lake city4k,desktop,wallpaper
5state-of-the-art techniquecontainingstate-of-the-art,technique
6Building in Las Vegasforbuilding,las vegas
7For Los Angeles and Las Vegas or in Salt Lake Cityinlos angeles,and,las vegas,or,salt lake city
Sheet1
Cell Formulas
RangeFormula
D2:D7D2=Remove(A2,B$2:B$7,C$2:C$4)
 

Excel Facts

Easy bullets in Excel
If you have a numeric keypad, press Alt+7 on numeric keypad to type a bullet in Excel.

goble

New Member
Joined
Oct 7, 2010
Messages
19
Office Version
  1. 2019
Platform
  1. Windows
That's incredible, thank you so much.

But I would have preferred getting them the 2 process separately. Let me tell you why.

The generated Tags have to be somewhat manually adjusted. For example:

textresult 1result 2Special WordsTo replace by
Las Vegas Flowerlas vegas,flowerlas vegas,flowersflowerflowers
Flowers in Los Angelesflowers,los angelesflowers,los angelestulipflowers
Tulip in New Yorktulip,new yorkflowers,new york

Even though the texts are correct in column A (singular, plural and a kind of flower). There shouldn't be a Tag for "flowers", one for "flower" another one for "tulip". This is where the "Replace by" Column become handy and a human intervention is needed there.

Actually when someone click on flowers, he will get a list of pages which are tagged under flowers. This is done to prevent generating too many tags on the website.
 

Peter_SSs

MrExcel MVP, Moderator
Joined
May 28, 2005
Messages
48,457
Office Version
  1. 365
Platform
  1. Windows
But I would have preferred getting them the 2 process separately.
Perhaps like this?

VBA Code:
Function Remv(s As String, rkeys As Range) As String
  With CreateObject("VBScript.RegExp")
    .Global = True
    .Pattern = "[^A-Za-z0-9'\-]"
    Remv = LCase(.Replace(s, " "))
    .Pattern = "\b(" & Join(Application.Transpose(rkeys), "|") & ")\b"
    Remv = Replace(Application.Trim(.Replace(Remv, "")), " ", ",")
  End With
End Function

Function Repl(s As String, rRepl As Range) As String
  Dim a As Variant
  Dim i As Long
  Dim tmp As String
  
  a = rRepl.Value
  tmp = "," & s & ","
  For i = 1 To UBound(a)
    tmp = Replace(tmp, "," & a(i, 1) & ",", "," & a(i, 2) & ",")
  Next i
  Repl = Mid(tmp, 2, Len(tmp) - 2)
End Function

goble.xlsm
ABCDEF
1textText to removeresult 1Special WordsTo replace byresult 2
2This is a sentence containing keywords. For Analysis, for general.issentence,keywords,analysis,generallas,vegaslas vegassentence,keywords,analysis,general
3this is a General Analysisthisgeneral,analysislos,angeleslos angelesgeneral,analysis
44k desktop wallpapera4k,desktop,wallpapersalt,lake,citysalt lake city4k,desktop,wallpapers
5state-of-the-art techniquecontainingstate-of-the-art,techniquewallpaperwallpapersstate-of-the-art,technique
6Building in Las Vegasforbuilding,las,vegasbuilding,las vegas
7For Los Angeles and Las Vegas or in Salt Lake Cityinlos,angeles,and,las,vegas,or,salt,lake,citylos angeles,and,las vegas,or,salt lake city
8This is John-Paul O'Connor's analysis!john-paul,o'connor's,analysisjohn-paul,o'connor's,analysis
9What is this? Is it a 6?what,it,6what,it,6
Sheet1
Cell Formulas
RangeFormula
C2:C9C2=Remv(A2,B$2:B$7)
F2:F9F2=Repl(C2,D$2:E$5)
 

goble

New Member
Joined
Oct 7, 2010
Messages
19
Office Version
  1. 2019
Platform
  1. Windows
Beautiful ! Thank you !

I'm incredibly amazed by the level of support you provide ! Again thank you so much !
 

Peter_SSs

MrExcel MVP, Moderator
Joined
May 28, 2005
Messages
48,457
Office Version
  1. 365
Platform
  1. Windows
You're welcome. Thanks for the follow-up. :)
 

Watch MrExcel Video

Forum statistics

Threads
1,129,279
Messages
5,635,283
Members
416,850
Latest member
Sidddharth

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Top