Fuzzy matching on textual data in Power Query

absherzad

New Member
Joined
Jun 19, 2017
Messages
42
Hello everyone,
Is there a way to incorporate Microsoft Fuzzy Lookup Add-In for Excel in Power Query to perform fuzzy matching on textual data in Power Query? Or is it possible to calculate string similarity in Power Query?
 
let
Table1 = #table(type table [value = text],
{ {"spruce"}, {"cedar"}, {"birch"}, {"pine"}, {"poplar"}, {"aspen"},
{"baobab"}, {"larch"}, {"willow"} }),

Table2 = #table(type table [word = text],
{ {"sprce"}, {"cidar"}, {"pane"}, {"aspern"}, {"babab"} }),

AddCol = Table.AddColumn(Table2, "Custom", each Table1),
Expand = Table.ExpandTableColumn(AddCol, "Custom", {"value"}, {"value"}),

FuzzyCalc = Table.AddColumn(Expand, "Prct", each
2 * List.Count( List.Intersect(
{ Text.ToList([word]), Text.ToList([value]) } ) )
/ (List.Count(Text.ToList([word])) + List.Count(Text.ToList([value])))),

SortRows = Table.Sort(FuzzyCalc,{{"word", Order.Ascending}, {"Prct", Order.Descending}}),
AddIndex = Table.AddIndexColumn(SortRows, "Index", 1, 1),
RemoveDupls = Table.Distinct(AddIndex, {"word"})
in
RemoveDupls
 
Upvote 0

Excel Facts

Save Often
If you start asking yourself if now is a good time to save your Excel workbook, the answer is Yes
Andvg - another approach using Cartesian product.

let
Src_A = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Src_B = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
AddCol_Cartesian = Table.AddColumn(Src_B, "Custom", each Src_A),
Expand = Table.ExpandTableColumn(AddCol_Cartesian, "Custom", {"Words1"}, {"Words1"}),

FuzzyCalc = Table.AddColumn(Expand, "Prct", each 2 * List.Count(
List.Intersect( { Text.ToList([Words2]), Text.ToList([Words1]) }
) ) / (List.Count(Text.ToList([Words2])) + List.Count(Text.ToList([Words1])))),

SortRows = Table.Sort(FuzzyCalc,{{"Words2", Order.Ascending}, {"Prct", Order.Descending}}),
AddIndex = Table.AddIndexColumn(SortRows, "Index", 1, 1),
RemoveDups = Table.Distinct(AddIndex, {"Words2"}),
RemoveCol = Table.RemoveColumns(RemoveDups,{"Index"})
in
RemoveCol
 
Upvote 0

Forum statistics

Threads
1,214,927
Messages
6,122,311
Members
449,080
Latest member
jmsotelo

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top