Optimal Amount of Data
Results 1 to 5 of 5

Thread: Optimal Amount of Data
Thanks Thanks: 0 Likes Likes: 0

  1. #1
    New Member
    Join Date
    Oct 2016
    Posts
    31
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Optimal Amount of Data

    Hi All -

    I have been attempting to put together a tool and was wondering how much data and the best way to structure it for re usability.

    I have several clients and am trying to measure each client employee base on how prepared they are for retirement. My initial thought was to handle this in a one-off scenario, using the individual client (company) employee data and then changing the data each time I run the analysis.

    Would it make more sense to put together a database of every employee tied to each employer and then pull that data into the data model, filtering by employer to get the correct data set?

    Since I am starting from scratch, I could create this in excel, access, or even on sharepoint - allowing the clients to maintain their own employee base.

    The other issue for consideration is updating the data on a regular basis. For each client there could be a different provider for the retirement plan - meaning different data downloads and formats for each employee group - some have fields ordered in different ways, etc. To run the calculations, I need accurate account balance and deferral information for each employee. What would be the best format for storing the data so this could be updated on a regular basis (ideally by someone other than me). I'm assuming I would have to do this manually, employee by employee. As the number of employees is relatively small right now, this isn't an issue, but I would need to consider that the number could be in excess of 100,000 unique employees within a few years, making data updates very time consuming.

    I apologize if this is a little disjointed - just need some advice on where to start housing the data so I can make use of powerpivot for the analysis.

    John

    Edit: One thing I forgot - I use a sharepoint server, so whatever I use to store the data, I would prefer to keep there (revision history, etc.). I know the "apps" in access have less functionality than a regular database, but that might be easier to use for others in my organization.
    Last edited by jgedwardsv; Dec 29th, 2016 at 10:59 AM.

  2. #2
    MrExcel MVP
    Join Date
    Dec 2014
    Location
    Sydney, Australia
    Posts
    1,179
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Optimal Amount of Data

    Yep, there's a lot in this question .


    There are no right or wrong answers and it depends on what you need. Based on your description of the problem, here are my thoughts.


    Power Pivot or Power BI is the right reporting tool. There is no amount of data that is too big, and it seems your needs are modest. So don't make a decision on how much to load based on size - it can handle it.


    Power Pivot and Power BI are not data storage tools, they are reporting tools. So you need somewhere else to store and enter data. SharePoint is a good option for small data and when you need distributed solutions. A database is better for large data but distributed sharing can be a challenge. Given you have lots of different clients with different systems, I am thinking store the data in Excel format any way you can get it. The closer it is to what is easy for the client the better. That way it is easy for them to update (just give you a new extract).


    I think there is value in having all data in one place at some stage in the process. Imagine the bench marking benefit of being able to say to Client A how they compare with "like clients" etc. This implies you would need a standardised end format for the data. The good news is you can use Power Query to manage the transformation from a non standard starting point to a common end point.


    Based on what I understand, this is what I would do
    1 gather data from each client in one or more Excel files as needed. Make it simple for them so it is easy for them to refresh
    2 store these excel files in a SharePoint library and use the version control and online editing features to manage data entry etc
    3 build a transformation tool for each client using power query in Power BI Desktop to get the data into the require data format
    4 build out the reporting in Power BI and distribute via the service. Apply Row Level Security if necessary
    5 later copy this desktop file and repurpose for the next client. All that is needed is to change the power queries to deliver the same starting data shape as before. The starting point is irrelevant other than you need access to the same data.
    6 even later, if you want to benchmark, you can combine the queries from multiple clients into a single workbook and bring the data together for cross industry reporting.
    Last edited by Matt Allington; Dec 29th, 2016 at 03:37 PM.
    Matt Allington is a professional Power BI Consultant, Power BI Trainer and Microsoft MVP based in Sydney Australia. Matt is author of the Power BI book Supercharge Power BI and the Power Pivot Book Supercharge Excel.
    What is Power BI?

  3. #3
    New Member
    Join Date
    Oct 2016
    Posts
    31
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Optimal Amount of Data

    Thanks Matt -

    Also have to say that you blew my mind by adding comparison tools for the report... with enough data on a large scale that would be extremely helpful and valuable to the process.

    I think I have a good place to start now and can start gathering the data I need through sharepoint from each client. Most of what I need I can get from the recordkeeper data file, the census data would be something they would input.

    Along those lines - could I split the data needed? For instance, have the client fill out the census information (name, DOB, salary) and then match that with recordkeeper data (account balance, deferral amount)?

    I'm guessing I need to learn to use PowerQuery next...

  4. #4
    MrExcel MVP
    Join Date
    Dec 2014
    Location
    Sydney, Australia
    Posts
    1,179
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Optimal Amount of Data

    You can definitely split those things out and rejoin them later. Power Query is an incredible tool and easy to learn. I have started building my online training course but it won't be ready for a couple of months. Here is a insight of what it will be like.

    Power Query for Excel and Power BI Online Training - Excelerator BI

    the timing may not be great for you though
    Matt Allington is a professional Power BI Consultant, Power BI Trainer and Microsoft MVP based in Sydney Australia. Matt is author of the Power BI book Supercharge Power BI and the Power Pivot Book Supercharge Excel.
    What is Power BI?

  5. #5
    New Member
    Join Date
    Oct 2016
    Posts
    31
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Optimal Amount of Data

    Perhaps not - I just registered for the updates... By the time it comes around, I should have a solid foundation of PowerPivot and how I want to structure the data to make it worthwhile.

    As the defacto tech person in the firm (defacto because I am the youngest by far - I beat everyone by at least 15 years) I have a lot of time to work through all these things. Short of someone with the skills already coming in and taking this project up with me, there is no way I will be done in the next few months. Luckily, since we are starting from the bottom here and I have pulled a lot of input from other industry people into the creation of this, I don't think it will be a problem taking my time to ensure this is put together correctly and the process can be repeated easily many times over. Also, luckily for me, this is a way of looking at data that the industry has largely neglected over the last few years - there are existing tools out there, but all of them were developed by a mutual fund company with the express purpose of pushing more sales to said mutual fund company.

Some videos you may like

User Tag List

Like this thread? Share it with others

Like this thread? Share it with others

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •