Difference vs Previous Row & DistinctCount

ag007

New Member
Joined
Apr 8, 2015
Messages
36
I want to calculate a Delta Weeks column in Power Query WeekNum[current row] - WeekNum[previous row]
I found a way to do it using the [Index] column, but it is painfully slow, and my table is 100k rows.

Code:
[COLOR=#000000][FONT=inherit]let 
[/FONT][/COLOR][COLOR=#660066][FONT=inherit]    Source[/FONT][/COLOR][COLOR=#666600][FONT=inherit]=[/FONT][/COLOR][COLOR=#660066][FONT=inherit]Excel[/FONT][/COLOR][COLOR=#666600][FONT=inherit].[/FONT][/COLOR][COLOR=#660066][FONT=inherit]CurrentWorkbook[/FONT][/COLOR][COLOR=#666600][FONT=inherit](){[[/FONT][/COLOR][COLOR=#660066][FONT=inherit]Name[/FONT][/COLOR][COLOR=#666600][FONT=inherit]=[/FONT][/COLOR][COLOR=#008800][FONT=inherit]"Table3"[/FONT][/COLOR][COLOR=#666600][FONT=inherit]]}[[/FONT][/COLOR][COLOR=#660066][FONT=inherit]Content[/FONT][/COLOR][COLOR=#666600][FONT=inherit]],[/FONT][/COLOR][COLOR=#000000][FONT=inherit]
    [/FONT][/COLOR][COLOR=#880000][FONT=inherit]#"Changed Type" = Table.TransformColumnTypes(Source,{{"Customer", type text}, {"Product", type text}, {"WeekNum", Int64.Type}}),[/FONT][/COLOR][COLOR=#000000][FONT=inherit]

    [/FONT][/COLOR][COLOR=#880000][FONT=inherit]#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),[/FONT][/COLOR][COLOR=#000000][FONT=inherit]
    [/FONT][/COLOR][COLOR=#880000][FONT=inherit]#"Added Custom" = Table.AddColumn(#"Added Index", "Delta Weeks", each try Source[WeekNum]{[Index]} - Source[WeekNum]{[Index]-1} otherwise 0)[/FONT][/COLOR][COLOR=#000000][FONT=inherit]

[/FONT][/COLOR][COLOR=#000088][FONT=inherit]in
[/FONT][/COLOR][COLOR=#880000][FONT='inherit']#"Added Custom"[/FONT][/COLOR]



Also, after this, I need another column who would count the distinct values from the beginning up to that row.
Most of the weeks are consecutive, so basically the distinct count will increase when they are not.
(I don't know how to do this in Power Query).

 
Last edited:

Excel Facts

Is there a shortcut key for strikethrough?
Ctrl+S is used for Save. Ctrl+5 is used for Strikethrough. Why Ctrl+5? When you use hashmarks to count |||| is 4, strike through to mean 5.
Hi there,

I don't normally attempt these sorts of Power Query questions, but had a go at this one.
Also not quite sure why you need to build the table this way, but that's fine.

No idea if this is the best way, but performance should be acceptable on 100k rows :)

  • To get the previous row's WeekNum, I used the old "merge table with itself" trick.
  • To get the 'cumulative distinct count' of Delta Weeks, I counted how many "first occurrences" of Delta Weeks have Indices on or before the current row.
Code:
let 
    Source=Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Customer", type text}, {"Product", type text}, {"WeekNum", Int64.Type}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),

    // Add an "Index-1" column, floored at zero
    AddedIndexMinus1 = Table.AddColumn(#"Added Index", "Index-1", each List.Max({0,[Index]-1}), Int64.Type),

    // Merge the table with itself by joining on "Index" = "Index-1"
    MergedWithSelf = Table.NestedJoin(AddedIndexMinus1,{"Index-1"},AddedIndexMinus1,{"Index"},"NewColumn",JoinKind.LeftOuter),

    // Expand Previous Week Number and calculate Delta Weeks
    ExpandedPrevWeekNum = Table.ExpandTableColumn(MergedWithSelf, "NewColumn", {"WeekNum"}, {"PrevWeekNum"}),
    AddedDeltaWeeks = Table.AddColumn(ExpandedPrevWeekNum, "Delta Weeks", each [WeekNum] - [PrevWeekNum], Int64.Type),

    // Clean up
    RemovedColumns = Table.RemoveColumns(AddedDeltaWeeks,{"Index-1", "PrevWeekNum"}),

    // Sort by Index & buffer to prepare for Cumulative Distinct
    SortedByIndex = Table.Buffer(Table.Sort(RemovedColumns,{{"Index", Order.Ascending}})),

    // Get the first index of each Delta Weeks value & buffer
    FirstIndicesExcludingZero = List.Buffer(List.Difference( Table.Group(SortedByIndex, {"Delta Weeks"}, {{"FirstIndex", each List.Min(_[Index]), Int64.Type}})[FirstIndex], {0})),

    // Calculate Cumulative Distinct and correct row with Index=0 to match your sample output
    AddedCumulativeDistinct = Table.AddColumn(
       SortedByIndex,
       "Count Distinct",
       each if [Delta Weeks]=0 then 1 else let CurrentIndex = [Index] in List.Count(List.Select(FirstIndicesExcludingZero, each _ <= CurrentIndex)),
       Int64.Type
    )
in
    AddedCumulativeDistinct

Any Power Query experts out there care to optimize? :)

Owen
 
Upvote 0

Forum statistics

Threads
1,215,984
Messages
6,128,110
Members
449,421
Latest member
AussieHobbo

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top