I've inherited a spreadsheet for a golf pool that I'm trying to generalize. Since the golfer and participant names change, I'd like to reference columns by their index number for certain functions instead of relying on hard-coded names.
Participants select who they're backing. I have generic participant names here for the latest Open golfers instead of "Bob", "Fred", etc.
I use this as a connection. Before bumping against the data from The Open - Golf Leaderboard and Results - ESPN I want to eliminate all the golfers whom nobody picked. If no one picked Aaron Jarvis why keep him in the query? In other words, where the values for every player column in a row are null, delete that row. I do this by creating a new column that merges the values from the player columns, and rows with nothing but null values are removed.
The M wizard hard-codes the column names. I'd like to be able to replace [Player 1] - or [Bob] for that matter - with something along the lines of Table.Columns {1} so that the participant name doesn't matter. I've tried variations of Table.ColumnNames() but it just gives me the column name.
Yes, I could unpivot the participant names, which I in fact do at a later stage, but I'd like to adapt this technique to non-golf larger data sets. If I can drastically reduce the number of rows before running a join query that would greatly help query speed on much larger data.
Participants select who they're backing. I have generic participant names here for the latest Open golfers instead of "Bob", "Fred", etc.
2022 MCC Masters.xlsx | ||||||
---|---|---|---|---|---|---|
C | D | E | F | |||
4 | Golfer | Player 1 | Player 2 | Player 3 | ||
5 | Aaron Jarvis (a) | |||||
6 | Aaron Rai | |||||
7 | Aaron Wise | |||||
8 | Abraham Ancer | |||||
9 | Adam Scott | x | x | |||
10 | Adri Arnaus | |||||
11 | Adrian Meronk | |||||
12 | Aldrich Potgieter (a) | x | ||||
13 | Alex Wrigley | |||||
14 | Alexander Björk | |||||
15 | Anthony Quayle | |||||
Player Selection |
I use this as a connection. Before bumping against the data from The Open - Golf Leaderboard and Results - ESPN I want to eliminate all the golfers whom nobody picked. If no one picked Aaron Jarvis why keep him in the query? In other words, where the values for every player column in a row are null, delete that row. I do this by creating a new column that merges the values from the player columns, and rows with nothing but null values are removed.
Power Query:
MergeCols = Table.AddColumn(ChangeColTypes, "AllSelect",
each Text.Combine(
{ Text.From([Player 1], "en-US"),
Text.From([Player 2], "en-US"),
Text.From([Player 3], "en-US"),
Text.From([Player 4], "en-US"),
Text.From([Player 5], "en-US"),
Text.From([Player 6], "en-US"),
Text.From([Player 7], "en-US"),
Text.From([Player 8], "en-US"),
Text.From([Player 9], "en-US"),
Text.From([Player 10], "en-US"),
Text.From([Player 11], "en-US"),
Text.From([Player 12], "en-US"),
Text.From([Player 13], "en-US"),
Text.From([Player 14], "en-US")
},
""),
type text),
ReplaceSpaces = Table.ReplaceValue(MergeCols," ",null,Replacer.ReplaceValue,{"AllSelect"}),
RemoveUnselectedGolfers = Table.SelectRows(ReplaceSpaces, each [AllSelect] <> null and [AllSelect] <> ""),
The M wizard hard-codes the column names. I'd like to be able to replace [Player 1] - or [Bob] for that matter - with something along the lines of Table.Columns {1} so that the participant name doesn't matter. I've tried variations of Table.ColumnNames() but it just gives me the column name.
Yes, I could unpivot the participant names, which I in fact do at a later stage, but I'd like to adapt this technique to non-golf larger data sets. If I can drastically reduce the number of rows before running a join query that would greatly help query speed on much larger data.