Power Query - Alternative to hard-coding a translation table?

macfuller

Active Member
Joined
Apr 30, 2014
Messages
319
Office Version
  1. 365
Platform
  1. Windows
I frequently use a handy function (from Chris Webb's site as I recall) to translate data codes from source data into human-readable values. Like many people I get data from ERP and OLTP sources with inscrutable identifiers.
The translation values are hard-coded in the function. So if my data has "N" and I want the output column to return "North"


Code:
(Lookup as text) => 
let
    values = {
    {"N", "North"},
    {"E", "East"},
    {"S", "South"},
    {"W", "West"},
    {Lookup, "Other"}
     },
     Result = List.First(List.Select(values, each _{0}=Lookup)){1}
in
    Result

And it's called within the primary query by
AddNewDirectionCol = Table.AddColumn(RenameOldDirection, "Direction", each fnTranslateCompass([CompassPoint]), type text)


I can have 30+ rows for the translation values, with separate translation tables for separate queries. These values change more often than I'd like. I thought that if I maintain the data in a local table within the spreadsheet I can modify the spreadsheet data once and the queries will pick it up without having to adjust each query...

Code:
(SourceTable as text, LookupCode as text) => 
let
//    Get the translation table
    Source = Excel.CurrentWorkbook(){[Name=SourceTable]}[Content],
//    Rename the columns in case they aren't standard
    RenameCols = Table.RenameColumns(Source,{{Table.ColumnNames(Source){0}, "Index"}, {Table.ColumnNames(Source){1}, "Value"}}),
     Result = List.First(Table.Column(Table.SelectRows(RenameCols, each [Index] = LookupCode),"Value"))
in
    Result
This works fine as a generic function, but of course it's going to load the translation table for every line of the source data where the translation is happening. That will push the query time into next week for million+ data sources.

Is there a way to call the translation table just once so that the function within the query can re-use the table for every line of the primary query? Or am I stuck with hard-coding each separate translation function?
 

Excel Facts

What do {} around a formula in the formula bar mean?
{Formula} means the formula was entered using Ctrl+Shift+Enter signifying an old-style array formula.
:)
Yes, I can load the table in the primary query and have the function parameter as a table instead of a string. But I was kind of hoping I could have a self-contained function to keep the logic entirely in one place.
 
Upvote 0
Hi macfuller,

Regardless of how you use functions, I would think a Merge Queries operation (Table.NestedJoin) would be the most efficient way of looking up values for every row of your original table.

You would have to do this for every translation you have to perform, so you would need a set of translation tables.

If you wanted to you could write a function that works something like:

Code:
TableWithTranslatedColumn = fn_TranslateColumn( OriginalTable, OriginalColumnName, TranslationTable)
Please post back if needed - you may well have a good solution already! :)

Regards,
Owen
 
Upvote 0
Merges seem to take a very long time on large record sets, though I haven't done a test to compare.

As part of that question I would be curious to know if a query processes all operations on a row before moving to the next, thus having to iterate through the record set only once, instead of iteration for each alteration. For example, if I'm creating a calendar table and I have a couple of added columns for month name (Sep) and workday (Mon-Fri, Y/N), even though the query steps would suggest it iterates once for each column it would be tons faster if it did both operations at the same time. In that way, I think a function to add a lookup value would be faster than a merge.
 
Upvote 0
5.6 million as of this week. There are quite a few other operations performed on the table as well so I haven't split out just the initial merge operation. And with 3 or 4 lookup activities for the various columns coming with codes from the ERP system that would be 3-4 merges I believe.

Another question on the merge is how it would handle the - occasional - missing values. With the lookup function I can enter a default such as "Other" but with a merge I think the values are blanks? On some of the columns a blank could imply an unknown code, but on other columns blanks indicate something else so I would need a default for when a code value is unknown rather than missing.
 
Upvote 0

Forum statistics

Threads
1,214,812
Messages
6,121,699
Members
449,048
Latest member
81jamesacct

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top