Fuzzy matching on textual data in Power Query

absherzad

New Member
Joined
Jun 19, 2017
Messages
42
Hello everyone,
Is there a way to incorporate Microsoft Fuzzy Lookup Add-In for Excel in Power Query to perform fuzzy matching on textual data in Power Query? Or is it possible to calculate string similarity in Power Query?
 
let
Table1 = #table(type table [value = text],
{ {"spruce"}, {"cedar"}, {"birch"}, {"pine"}, {"poplar"}, {"aspen"},
{"baobab"}, {"larch"}, {"willow"} }),

Table2 = #table(type table [word = text],
{ {"sprce"}, {"cidar"}, {"pane"}, {"aspern"}, {"babab"} }),

AddCol = Table.AddColumn(Table2, "Custom", each Table1),
Expand = Table.ExpandTableColumn(AddCol, "Custom", {"value"}, {"value"}),

FuzzyCalc = Table.AddColumn(Expand, "Prct", each
2 * List.Count( List.Intersect(
{ Text.ToList([word]), Text.ToList([value]) } ) )
/ (List.Count(Text.ToList([word])) + List.Count(Text.ToList([value])))),

SortRows = Table.Sort(FuzzyCalc,{{"word", Order.Ascending}, {"Prct", Order.Descending}}),
AddIndex = Table.AddIndexColumn(SortRows, "Index", 1, 1),
RemoveDupls = Table.Distinct(AddIndex, {"word"})
in
RemoveDupls
 
Upvote 0

Excel Facts

Is there a shortcut key for strikethrough?
Ctrl+S is used for Save. Ctrl+5 is used for Strikethrough. Why Ctrl+5? When you use hashmarks to count |||| is 4, strike through to mean 5.
Andvg - another approach using Cartesian product.

let
Src_A = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Src_B = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
AddCol_Cartesian = Table.AddColumn(Src_B, "Custom", each Src_A),
Expand = Table.ExpandTableColumn(AddCol_Cartesian, "Custom", {"Words1"}, {"Words1"}),

FuzzyCalc = Table.AddColumn(Expand, "Prct", each 2 * List.Count(
List.Intersect( { Text.ToList([Words2]), Text.ToList([Words1]) }
) ) / (List.Count(Text.ToList([Words2])) + List.Count(Text.ToList([Words1])))),

SortRows = Table.Sort(FuzzyCalc,{{"Words2", Order.Ascending}, {"Prct", Order.Descending}}),
AddIndex = Table.AddIndexColumn(SortRows, "Index", 1, 1),
RemoveDups = Table.Distinct(AddIndex, {"Words2"}),
RemoveCol = Table.RemoveColumns(RemoveDups,{"Index"})
in
RemoveCol
 
Upvote 0

Forum statistics

Threads
1,214,646
Messages
6,120,717
Members
448,985
Latest member
chocbudda

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top