Is upgrading to Office 2013 Pro the answer to my memory problems?

masplin

Active Member
Joined
May 10, 2010
Messages
413
Hi

I have 64-bit 16GB Ram machine that is constantly running out of memory on a big messy project I'm working on. The underlying data is coming form software written in Pascal and it a right told mess!!! i am having to do an enormous amount of filter calculation which I assume where the problem lies. I have read that the new Office 2013 is a massive improvement specifically in this area.

I was thinking I had to buy a much more powerful PC, but sounds like I should be trying the Office 2013 version first. I understand as an at home consultant this is going to be tricky as i have no interest in paying £10 a month for Office 365. So the question is could I go down the volume licensing route mentioned in another post even though I am based in the UK? Has anything changed on how a lowly individual can get hold of this?

Thanks for any advice

Mike
 
Apologies in that this is partially my fault - I was focused on getting you that single table I wasn't really thinking about the wider picture!

Fundamentally 1.5M records is a data set that PowerPivot should be able to handle with ease but when its say 20 columns wide then it runs into problems - have a look at Rob's post on this issue: Less Columns, More Rows = More Speed! « PowerPivotPro

Your fact table (the 1.5m rows) needs to have as few columns as possible with the other information sat in separate table e.g. the only customer info you have in the fact table is customercode, this is then related to a customer table on the customer code that has one record per customer with all the customer data residing in this 'look-up' table.

Obviously to make this work you are going to have to get access to the data in a slightly different way which sounds like it could be problematic.
 
Upvote 0

Excel Facts

Repeat Last Command
Pressing F4 adds dollar signs when editing a formula. When not editing, F4 repeats last command.
Hi Jacob I obviosuly need some education on reorgansiing the data as best I can within my limitiations. Before I try and explain what i have I need to walk the dogs. However quick question. Why does it take 15 minutes and 10GB of RAM for me to hide or unhide a sheet? This makes no sense as sure this doesn't involve any calculations. Is this reasonable no matter how rubbish the data structure is? I'm trying to work out if the problems are all in the workbook or something structural. I'm sure 2010 doesn't recalculate the whole workbook every time you make changes to the powerpivot sheet so is there an option to turn this off?

mike
 
Upvote 0
Ok so maybe i have gone on a bit of a wild goose chase with what i asked them to do for me. If you coudl give me some advice on the best way to arrange the data that would be appreciated.

So as background this data is all to do with car servicing. the table i have at the moment consists of individual invoice lines with each invoice containing up to 20 lines for parts etc. There is no customer table as all the information is buried in the invoices. So in this table I have a list of non-unique invoice numbers and then say 10 columns of transaction data that is unique to that line i.e. a part and its price/quantity. Then I have maybe 10-15 more columns of invoice specific data such as customers name, address, car reg num etc so this data is duplicated say 20 times on a 20 line invoice. A lot of the columns contain rubbish so I also have calculated columns to reformat and clean up postcodes/mobiles/email etc. So this table is getting pretty wide.

To create a customer list I have concatenated surname and postcode to create a sort of unique identifier. I made pivot table of this calcualted field and then reimported it as a new table. I have then extracted using a filter function the customer static data form the big table. So again massive duplication of all this static data on both tables.

So should I be trying to get the big table split so I have one table of 500,000 invoice lines but only the 10 columns that are changing i.e. not the static data for the invoice. Then a second table also with invoice number but only a single line with the static data maybe 100,000 rows. My problem is that i know there are some duplicated invoice numbers so I wont be able to create a relationship between the transaction table and the static data table. This probably torpedoes this as an idea?

Thanks for any suggestions

Mike
 
Upvote 0
Mike, you've basically got the right idea.

Based on the info you've provided so far I would bet that there is a customer table you just haven't seen it - the main table you are looking at is a 'view' which has probably been setup to summarise a bunch of other tables in a easier to digest way however it is extremely unlikely that the data was stored like this in the first place. Its worth asking again - if you get no joy its actually possible to create a customer table from the fact table using SQL.

If INVNUM is going to be the common field you should create a separate table of invoice numbers to use as an intermediate table to link the other two to (i.e when you drag invoice number into a pivot you would use the one in this table). The SQL would look something like:

Code:
select
   distinct(INVNUM)
from dbo.[COLOR=#333333]VW_InOutStock[/COLOR]

Doing your calculated columns in the SQL will also help - there are basically a load of functions in SQL that you can use to manipulate text strings etc that are pretty simple and this could have a HUGE impact on the performance of your model!

Hope this helps somewhat.............
Jacob
 
Upvote 0
Based on your description of the data you import, I suggest you simple SQL queries to reduce the amount of data you import and have a better data model.

You might want to have a Customers table, that includes any field that are specific to customers: fields whose value does not depend on date, invoice, or any other entity ...

If a customer only ever has one address then you might want to include it in the table. If it depends on the order then you might want to have this information with the invoiced table.

If you do not need to have the address in your model, do not import it. Same thing for first name, last name, ... (full name might be ok for your requirements)

If I understood correctly the second table represents the item list corresponding to an invoice.

Whereas a pure star-schema is usually recommended, in that case, you might want to go for a more straightforward model:
Invoice Details
-> Invoice
Invoice
-> Customer
-> Date
Customer

As mentioned above, the SQL for getting the data in the required format will follow this simple pattern:
SELECT DISTINCT
CTfirstN
, CTSurname
...
FROM YourTable

This shoudl greatly simplify your calculations as well. Most measures only require very simple DAX expressions when the data model is right.
 
Upvote 0
Hi Laurent

I know nothing about SQL!!! OK so seems you are suggestion instead of dumping the whole of this InoutStock table in one query I can do 2 queries, one that lists all 500k rows of transactions, but without all the duplicated customer details columns. Then I do a query to extract just the unique invnum details into a 2nd table and create a relationship between the InvNum on the transaction table and Invnum on the new table (which might only have 100,000 rows).

I'm a little unclear on the query as what I need is a new table with InvNum (unique), surname, address, phone etc all of which may be duplicated (customer came in twice). I'm assuming the query above gives a list of unique InvNum, but how do I get the rest of the data for that InvNum? Do I just add the other columns like this


select distinct(INVNUM)
CTFirstN
CTSurname
from dbo.VW_InOutStock

One small hitch is I'm told there are duplicated invNum used for different transactions.
The only way be sure it is actually unique is to join the InvNum with the Date. is it
possible to make the distinct (InvNum) a distinct combination and then get the other fields?

My big problem is there is no such concept as customer and the data is patchy i.e. sometimes no surname or address. Does the 2nd query
SELECT DISTINCT find a distinct entry for all the fields you enter below i.e. distinct combination? If I had all surnames and post codes I would say a unique customer is postcode-surname, but i don't.

Thanks again

mike
 
Upvote 0
What I suggested was to have 3 tables in your model: Customer, Invoice, Invoice Details.

As Jacob mentioned, there is probably a Customer table somewhere, so check if you can access it. Ask your IT if need be.

Note that IT departments are used to provide views, with repeated information if need be, so that business users do not have to JOIN tables in queries.

Since you use PowerPivot, you do not need to write JOIN queries, and you do not need to have a single view that pulls every piece of information. Don't be afraid to ask your IT department if they can provide these 3 views. (Note their DB schema probably follows this schema)

Otherwise, some SQL will be required (assuming you access these data through a database system.)

You do not need to know much SQL though.

You can use a DISTINCT query to get one row per customer:
SELECT DISTINCT
CustomerIdentifier
, CTFirstN
, CTSurname
...
FROM dbo.VW_InOutStock

At the risk of repeating myself, if you do not need some columns, then do not import them.

Do not add the invoice number to that query!! The goal is to have one row per customer. You should have a field that uniquely identifies a customer in there. If there is no technically available key, then you will have to make up one. What to take is what most appropriate for your data. If there are rows where the customer is unknown, then let it be unknown.

For the Invoice table, do not import first name and last name: just import the identifier for your customer.

The Invoice number should go into the invoice table. If you need to concatenate the date with the invoice number to build a key, then do it. You can do it as a calculated column by the way.

As you fear, a SELECT DISTINCT query, will indeed list all combinations. That means, for example, if the phone number for a customer was not filled then such a query, with the telephone number would return two row for this customer.

Some cleansing would be required. The required SQL is no witchcraft, but might be intimidating if you have never touched SQL before.

Letting your IT department deliver cleansed data should be the first choice. This will require you to provide some detailed input about what you want to get, of course. If they cannot, then see if you can hire a qualified consultant for a short-term contract.

If none is possible, then you might have to do it by yourself, and learn some SQL.
 
Upvote 0
Hi Laurent. The big issue is there is no IT department!!!! There is also no unique customer identifier so I make one postcode&surname, however if no postcode then surane&carregnum, if no surname postcode&carregnum etc etc. Basically I created a pivot table of these unqiue codes, copied nto a table and then reimported ot powerivot then extracted the detaials by filtering on the unique code. all very memory intensive. Sounds like i'm gonig to need ot leanr some SQL to able ot cretae a uniquecusotmer identifier that can then be pulled down into a customer table and an invoice table!!!
 
Upvote 0
Sounds so.

Since we had started talking about SQL, I assumed you pulled your data from a server, with IT guys managing it.

Remember that calculated columns are only calculated upon updating your model. Once updated, they do not require more work than if they had been imported, even when they are very complex. So, if it just about adding one column key (which you will need anyway), then PP might be enough.

However, since neither PowerPivot or Excel is made for extensive data cleansing, you may want to use Access or SQL Express if you require more than that.
 
Upvote 0

Forum statistics

Threads
1,215,436
Messages
6,124,869
Members
449,192
Latest member
MoonDancer

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top