PowerPivot model: Duplicate rows adding up to over inflated total.

HarryR1

New Member
Joined
Dec 8, 2014
Messages
18
Hi all
I am new to PowerPivot and data models, and I am querying an SSAS cube in a PowerPivot data model. I have a problem with duplicate names inflating figures. The duplicates come from a Project Server SSAS cube.
RowProject NameCost
11004 - Parent Project£500
21004 - Sub Project£200
31004 - Sub Project with same name as Parent Project£300

<tbody>
</tbody>
Rows 2 and 3 are both sub projects that make up the whole project that is totalled up in row 1. However, when looking a the results in PowerPivot, the model is adding all these three rows separately, producing a total figure of £1000.
How can I get the model to a)recognise the hierarchy from Project Server. it does if querying the cube directly in excel into a pivot chart but not in a PowerPivot model.
Is there a DAX or MDX calculation I can use to only select or count the highest figure of any duplicates rows with the same name?
I understand it's a pretty complex question but hoping someone can help.
Thanks in advance.
Harry
 

Excel Facts

Ambidextrous Undo
Undo last command with Ctrl+Z or Alt+Backspace. If you use the Undo icon in the QAT, open the drop-down arrow to undo up to 100 steps.
So, in the your sample, there is just NOTHING that makes it clear there is a parent/child relationship between those rows. You have two choices... (at least :))
1) What you are referencing... create a calculated column which only has a value if it's the highest value of its grouping.
2) actually bring in some parent/child info. (You seem to have a Row# column is there a ParentRow you can bring in?)

the first should generally be easier, but ... does have some risks. Negative #'s could really mess w/ you and ... a parent w/ just 1 child... both would have the same value. Need to be careful not to include both in that scenario. (So, logic that is like... "only include this node if its the largest in the set..." would typically include both which isn't what you want).

We can probably sleeze our way around the 2nd issue.... the negative issues seems harder.
Parent 500
Kid1 600
Kid2 -100

We would choose Kid1, since it is largest.

Is that an issue for you?
 
Upvote 0
So, in the your sample, there is just NOTHING that makes it clear there is a parent/child relationship between those rows. You have two choices... (at least :))
1) What you are referencing... create a calculated column which only has a value if it's the highest value of its grouping.
2) actually bring in some parent/child info. (You seem to have a Row# column is there a ParentRow you can bring in?)

the first should generally be easier, but ... does have some risks. Negative #'s could really mess w/ you and ... a parent w/ just 1 child... both would have the same value. Need to be careful not to include both in that scenario. (So, logic that is like... "only include this node if its the largest in the set..." would typically include both which isn't what you want).

We can probably sleeze our way around the 2nd issue.... the negative issues seems harder.
Parent 500
Kid1 600
Kid2 -100

We would choose Kid1, since it is largest.

Is that an issue for you?

scottsen

Thanks for your reply.

The row numbers are just for illustrative purposes, there actually isn't anything else apart from the project title that identifies a project (lazy cube work in my opinion).

Creating a calculated column that only picks the highest of the rows with the same name sounds ideal, except I don't know yet how to write a formula that compares the names of a rows, identifies duplicates and only shows the value of the highest one in the set.

Could you point me in the direction of a function/formula or resource area where I can research how to do it?

Thanks
 
Upvote 0
Yep, I can help there :)

So, no issues w/ negative numbers or "just 1 sub-project" to worry about?

Something like:
Code:
=MyTable[Cost] = 
     CALCULATE(MAX(MyTable[Cost]), ALL(MyTable), MyTable[Project Name] = EARLIER(MyTable[Project Name]))
 
Upvote 0
I had a similar issue last week with an event log and duplicate values for coverage_id's. I made the two Measures below.

MaxTime:=MAX(Events[TimingMinutes])
DistinctAvgTimeMinutes:=AVERAGEX(DISTINCT(Events[coverage_id]),[MaxTime])
 
Upvote 0
Yep, I can help there :)

So, no issues w/ negative numbers or "just 1 sub-project" to worry about?

Something like:
Code:
=MyTable[Cost] = 
     CALCULATE(MAX(MyTable[Cost]), ALL(MyTable), MyTable[Project Name] = EARLIER(MyTable[Project Name]))

Thanks a lot

I am going to give those a whirl and will update the thread with my findings. Thanks again.
 
Upvote 0
Yep, I can help there :)

So, no issues w/ negative numbers or "just 1 sub-project" to worry about?

Something like:
Code:
=MyTable[Cost] = 
     CALCULATE(MAX(MyTable[Cost]), ALL(MyTable), MyTable[Project Name] = EARLIER(MyTable[Project Name]))

Hi Scottsen

That worked great in that it marked a 'true' next to unique names and a 'false' next to repating project names with a lower value.

Now I just need to call these into another table but only the ones that are marked as true. Could you advise me how to do that?

BTW: This has inspired me to dive into learning as much DAX as I can :).


Thanks

Harry
 
Upvote 0
To be more specific:

I am using this solution in a table that acts as a lookup for a column in another table.

How can I amend the lookup in my target table's calculated column to only fetch the value from the project name marked with a 'True' value?

Lookup Table.
Project A: 100
Project A: 50

Target Table.
Project A: 100

Thanks again.
 
Last edited:
Upvote 0
I get pretty nervous when you start copying data around between tables. Sooome reference to that in my most recent blog post... it's a vlookup/old-skool-excel thing to do.

The theory is that you just need to apply a filter on that new column... to only include the ones that are marked true. The application of said theoy will depend on what you are trying to do :)

LOOKUPVALUE will probaby work for you. It can take multiple columns as input... one of which would be the "IsTheMax" column with a value of True.
 
Upvote 0

Forum statistics

Threads
1,214,788
Messages
6,121,575
Members
449,039
Latest member
Arbind kumar

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top