Find "brand" keywords in free text fields

TrimFunction

New Member
Joined
Jan 9, 2018
Messages
18
background: I'm building a reference file of branded product names. My aim is to have a clean list that can be used to search through a free text fields from a CRM system.

I want to find brand-names, but do not want common words like "account" or "management", since that will return almost every entry. the list of terms i search for is usually about 300+, using {=INDEX($B$2:$B$300,MATCH(1,COUNTIF(I3,"*"&$A$2:$A$300&"*"),0))}.

Question: What is the best way to "scrub" out the brand names from free text?

my current process is start with the list:

Original Product Name
Accounts Receivable, JD Edwards EnterpriseOne
IBM Kenexa Employee Assessments
Catalog Management, PeopleSoft
SAP Ariba Invoice Management

Text to columns, sort, then manually read through to determine which are "brand" names:

Column A Column B
AccountsSKIP-Common
AribaUSE
AssessmentsSKIP-Common
CatalogSKIP-Common
EdwardsUSE
EmployeeSKIP-Common
EnterpriseOneUSE
IBMUSE
InvoiceSKIP-Common
JDUSE
KenexaUSE
ManagementSKIP-Common
Management,SKIP-Common
PeopleSoftUSE
Receivable,SKIP-Common
SAPUSE

<colgroup><col><col></colgroup><tbody>
</tbody>

I've been using countifs to determine how frequent the word occurs and LEN to quickly go through short words, but this is not efficient.

Is there a better way to do this?

<colgroup><col></colgroup><tbody>
</tbody>
 

Excel Facts

Can you AutoAverage in Excel?
There is a drop-down next to the AutoSum symbol. Open the drop-down to choose AVERAGE, COUNT, MAX, or MIN
Is there a better way to do this?
Probably.

But from what you have posted, I can't really tell what you have, where, and exactly what you are trying to do.

Can you give us some small dummy sample data and the expected results and clearly show or describe where various data is and where the results should go?
Better (clearer) ways to present your sample data are given in a link in my signature block below.
 
Upvote 0
How about PowerQuery (Get&Transform) ?

Table1Table2 (user defined)Result table
Product nameCommonKeywords
Accounts Receivable, JD Edwards EnterpriseOneAccountsAriba
IBM Kenexa Employee AssessmentsAssessmentsEdwards
Catalog Management, PeopleSoftCatalogEnterpriseOne
SAP Ariba Invoice ManagementEmployeeIBM
InvoiceJD
ManagementKenexa
ReceivablePeopleSoft
SAP

Table1: source
Table2: user-defined words that should not be taken into account

for table1:
Code:
[SIZE=1]let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Replaced Value" = Table.ReplaceValue(Source,",","",Replacer.ReplaceText,{"Product name"}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Replaced Value", {{"Product name", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Product name")
in
    #"Split Column by Delimiter"[/SIZE]
for result table:
Code:
[SIZE=1]let
    Source = Table.NestedJoin(Table1,{"Product name"},Table2,{"Common"},"Table3",JoinKind.LeftAnti),
    #"Removed Other Columns" = Table.SelectColumns(Source,{"Product name"}),
    #"Sorted Rows" = Table.Sort(#"Removed Other Columns",{{"Product name", Order.Ascending}}),
    #"Renamed Columns" = Table.RenameColumns(#"Sorted Rows",{{"Product name", "Keywords"}})
in
    #"Renamed Columns"[/SIZE]

example file
 
Upvote 0
I forgot about duplicates, so

for result table
Code:
[SIZE=1]let
    Source = Table.NestedJoin(Table1,{"Product name"},Table2,{"Common"},"Table3",JoinKind.LeftAnti),
    #"Removed Other Columns" = Table.SelectColumns(Source,{"Product name"}),
    #"Sorted Rows" = Table.Sort(#"Removed Other Columns",{{"Product name", Order.Ascending}}),
    #"Removed Duplicates" = Table.Distinct(#"Sorted Rows"),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Duplicates",{{"Product name", "Keywords"}})
in
    #"Renamed Columns"[/SIZE]
 
Upvote 0

Forum statistics

Threads
1,214,827
Messages
6,121,812
Members
449,048
Latest member
greyangel23

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top