Fuzzy matching on textual data in Power Query

absherzad

New Member
Joined
Jun 19, 2017
Messages
42
Hello everyone,
Is there a way to incorporate Microsoft Fuzzy Lookup Add-In for Excel in Power Query to perform fuzzy matching on textual data in Power Query? Or is it possible to calculate string similarity in Power Query?
 
let
Table1 = #table(type table [value = text],
{ {"spruce"}, {"cedar"}, {"birch"}, {"pine"}, {"poplar"}, {"aspen"},
{"baobab"}, {"larch"}, {"willow"} }),

Table2 = #table(type table [word = text],
{ {"sprce"}, {"cidar"}, {"pane"}, {"aspern"}, {"babab"} }),

AddCol = Table.AddColumn(Table2, "Custom", each Table1),
Expand = Table.ExpandTableColumn(AddCol, "Custom", {"value"}, {"value"}),

FuzzyCalc = Table.AddColumn(Expand, "Prct", each
2 * List.Count( List.Intersect(
{ Text.ToList([word]), Text.ToList([value]) } ) )
/ (List.Count(Text.ToList([word])) + List.Count(Text.ToList([value])))),

SortRows = Table.Sort(FuzzyCalc,{{"word", Order.Ascending}, {"Prct", Order.Descending}}),
AddIndex = Table.AddIndexColumn(SortRows, "Index", 1, 1),
RemoveDupls = Table.Distinct(AddIndex, {"word"})
in
RemoveDupls
 
Upvote 0

Excel Facts

Show numbers in thousands?
Use a custom number format of #,##0,K. Each comma after the final 0 will divide the displayed number by another thousand
Andvg - another approach using Cartesian product.

let
Src_A = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Src_B = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
AddCol_Cartesian = Table.AddColumn(Src_B, "Custom", each Src_A),
Expand = Table.ExpandTableColumn(AddCol_Cartesian, "Custom", {"Words1"}, {"Words1"}),

FuzzyCalc = Table.AddColumn(Expand, "Prct", each 2 * List.Count(
List.Intersect( { Text.ToList([Words2]), Text.ToList([Words1]) }
) ) / (List.Count(Text.ToList([Words2])) + List.Count(Text.ToList([Words1])))),

SortRows = Table.Sort(FuzzyCalc,{{"Words2", Order.Ascending}, {"Prct", Order.Descending}}),
AddIndex = Table.AddIndexColumn(SortRows, "Index", 1, 1),
RemoveDups = Table.Distinct(AddIndex, {"Words2"}),
RemoveCol = Table.RemoveColumns(RemoveDups,{"Index"})
in
RemoveCol
 
Upvote 0

Forum statistics

Threads
1,214,793
Messages
6,121,619
Members
449,039
Latest member
Mbone Mathonsi

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top