Is using Data from Excel slowing my data model refresh?

bigck2

Board Regular
Joined
Feb 20, 2014
Messages
147
Hello community,

I'm curious if building a data model off data directly from a table in excel could cause my data model to perform slower than if I got the data directly from a SQL Server query?

I have a data model that I built for work, it has one Fact table that is 48 Columns and about 10,000 rows. I then have 3 Dimension tables, each one is relatively small. A dates table that covers 5 years, and then two other dim tables that are less than 100 rows. Lastly I have about 50 floating tables without relationships to the Fact of Dimension tables. I set these up to associate different calculated fields together.

Each month, I manually add some data to the Fact table and may have to change a few values in one of the dimension tables. After I make the changes I hit the 'update all' button and then have to wait for like 10 minutes for the data model to calculate.

I have built other data models with Fact and Dimension tables based on SQL queries, and sometimes those Fact tables are much much larger (~1m rows). When I refresh the query and refresh the data model it doesn't seem to take nearly as long.

So I'm just trying to confirm if building a data model from data in excel is slower? Maybe there is something else wrong with my data model. Or maybe this is normal, and my expectations are unrealistic, ie it should take this long to re-calculate everything.

Thanks for any suggestions.

-Chris
 

Excel Facts

Return population for a City
If you have a list of cities in A2:A100, use Data, Geography. Then =A2.Population and copy down.
Hey Matt,

Thank you for the suggestion, and the link resource.

The data I am using is Financial data, and each column represents a particular GL Account. Here is a small example, but in my real model there are about 50 columns:

id_propdt_dateTotal_RevenueTotal_ExpenseTotal_NOI
ab11907/01/2017500000100000400000
bb12307/01/201725000075000175000

<tbody>
</tbody>

This structure made sense to me because when I work with regular pivot table it is usually best for the data to be organized like this. I am used to thinking about "tidy data" principles: Each variable forms a column. Each observation forms a row. Each type of observational unit forms a table.

This data format was based on a VBA script that cleans up an income statement report in excel, but is not the way the data is stored in our database. In the database the data is stored like in a "tall" fashion with fewer columns and more rows:


id_propdt_dateid_acctdec_amount
ab1197/1/201729542580.35
ab1197/1/2017305715001.25
ab1197/1/2017332733.75
ab1197/1/201729918100.00

<tbody>
</tbody>


The way the data is presented in this table above doesn't seem "tidy" to me because multiple variables are stored in one column.

However, as I have learned more about PowerPivot this structure seems better because it uses a tall table with more rows than columns. To get the financial totals I need I can use measures:

Total_Revenue:= CALCULATE( SUM( dec_amount ), Fact_Data[id_acct] = 2589 )


For this second structure of the data/Fact Table, is this what you would recommend for financial data like this?


In my data model there aren't that many calculated columns. I currently have 4, but even if I change the data structure from wide to tall, I think I will need 2. I need to calculate my companies proportionate share of these amounts:


id_propdt_dateid_acctdec_amountownershippro_rata_amount
ab1197/1/201729542580.350.551419.19
ab1197/1/2017305715001.250.558250.68
ab1197/1/2017332733.750.5518.56
ab1197/1/201729918100.000.554455

<tbody>
</tbody>


This way I can create a measure to calculate proportionate / pro rata amounts:

Pro_Rata_Total_Revenue:= CALCULATE( SUM( pro_rata_amount ), Fact_Data[id_acct] = 2589 )


I don't think it is possible to get these numbers correct without using calculated columns. Although it does worry me a little bit based on your article that this would be very computationally expensive to have this calculated column, especially considering it would be a much taller table with more rows, if I use this second data format.


Thanks again for your input, and all your posts online. I can't tell you how many times your articles have helped me.

Cheers,

Chris
 
Upvote 0
Long and thin tables are the way to go. Don't worry about "tidy", just worry about compression, efficiency and the DAX. I would always put GL data with an account column indicating the account, and a single value column.

2 Calc columns in a long thin table is fine. But maybe you only need 1 - not sure. Is the ownership always 0.55? Anyway, I think 2 wont hurt, so don't worry about it. 50 is a problem, 6 is is not. And don't worry about 'computational expensive' for calc columns - this is not the issue. The issue is compression. The reason you WOULD use a calc column is if it were computational expensive. in other words, if it is slow to calc at run time, then a calc column maybe preferred.
 
Upvote 0

Forum statistics

Threads
1,216,739
Messages
6,132,442
Members
449,728
Latest member
teodora bocarski

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top