Is this the better way to go instead of MASS Index/Match formulas?

Agrajag · Jul 31, 2012

I have a spreadsheet I created (and a dozen or more tabs) to analyze sports data. The main sheet is a master sheet of the raw player data and it includes a unique player ID.

However, since all the positions are in one list and each position needs to be analyzed differently I end up having to populate each sheet with a slew of formulas that look like this:

=IF(ISNA(INDEX(Master!A$2:A$6500,MATCH("RB"&" "&$A2,Master!$CE$2:$CE$6500,0))),"",INDEX(Master!A$2:A$6500,MATCH("RB"&" "&$A2,Master!$CE$2:$CE$6500,0)))

With that the manual copying and pasting is just endless. However, with this the sheet is now slow and sucks memory. Changes take seconds per change.

The above is just to grab the NAME of the player and repeats down the rows for about 500 lines. Then I do that again for every attribute I need to track. One sheet could have literally thousands of these.

Pivot tables by default wouldn't easily let me fill SHEET 2 from the Master sheet. Essentially I want to say, "On this sheet, the RB sheet, look at all the data on the Master sheet and create an entry for every player that has a POS of RB." I also need to be able to apply apply formulas and sort the data once done.

Will this work and improve things for me?

Thanks.

Agrajag · Jul 31, 2012

That should say "Without that the manual copying and pasting is just endless".

Laurent C · Aug 2, 2012

Can you provide a link to the file, or some simplified version of it? Skydrive would be perfect.

Agrajag · Aug 2, 2012

Sure. I'm not sure I did this right as I don't really use Skydrive but here you go:

http://sdrv.ms/MBqEY4

The key to remember is that I don't want to have to do a lot of work on the Master sheet. Ultimately it'd be optimal if I could just limit it to just the raw CSV file dump when I want to update. The only thing that happens now to it is I add one final column at the far right to give each player a Position ID. Each player already has a unique player ID so I suspect I can use that better.

I've spent a huge amount of time getting this the way it is now and I'm still tweaking (mainly working on standard deviation issues and weight balancing). Any input you can provide I'd greatly appreciate. Thanks so much.

NOTE: There's also a Documentation tab near the end of the tabs explaining much of the approach and reasoning.

Laurent C · Aug 4, 2012

Looking at your spreadsheet, I have the following recommendations to improve performance and also readability.

So, if you go the classic Excel formula road:
- to avoid copy & paste, you can create a connection to the csv file
- define all metrics in the master sheet, in columns next to your original data, whether they are relevant for the position or not
- THEN filter & choose which metrics should be displayed (instead of INDEX/MATCH formulas, you can as well use pivot tables)
- define dynamic ranges for your parameters:.
For example, you could replace the different references to the Settings!B:B column by a named range:
Denominator := CHOOSE(MATCH($C2,{"C","CB","DT", ... },0), Settings!$B$3, Settings!$B$8, ... )
Note the range definition should be relative to the current row.

Using Powerpivot, you will have to :
- create a table for your settings with the player's position as a key for this table - I also suggest that for the old-school Excel road
- create a relation between your two tables, with the Pos column as a key
- create a calculated column for each metric that you want use - whether the metric is relevant or not for the position - just as suggested above. Use the RELATED function to reference the relevant parameters.

Either way will require some work. You can also try to replace the last argument of your MATCH functions and set it to 1, if your data is sorted in ascending by the Pos column. Excel then uses a binary search which means it will not have to scan all the rows to find the correct one. If you use Excel 2007 or later, try to replace your IF(ISNA(INDEX(...), "", INDEX(...)) formulas with IFERROR(INDEX(...),"") formulas.

Agrajag · Aug 4, 2012

Thanks! This is some really effective insight. I'm tackling this item-by-item so hopefully you won't mind some back-and-forth questions:

1. The data connection option is amazing. The one issue I have is the provider limits the file size for non-paying members. I pay but the URL approach doesn't have a username/password option so by default it assumes I don't and won't provide the full list. I'm working with them to see if we can get around that.

2. I assume, when you say "next to your original data" that you mean the metrics would appear ALL TOGETHER but off the right right of the CSV that I'd be importing automatically. Otherwise, if Column A was from the CSV and Column B was a metric then I don't see how #1 would work (just clarifying that's what you meant).

3. The problem I can't figure out with this approach is that the weightings for each attribute (for example, Speed) are different depending on the position. It's not important at all to some positions, like a Kicker but then has a different importance/calc for a running back than it has for, say, a quarterback. That's why I ended up going with different tabs for each unique position. This way, I suspect I'm right back to Index/Match to deal with the different calcs for each position. No?

The Dynamic Range I'm still digesting so I may have more questions on that later.

Thanks again. Great stuff in here.

Agrajag · Aug 6, 2012

Also, the CSV file isn't static. Is anything you recommended above no longer valid now that I've mentioned that, during a season, this file can add new players and remove retired players? In other words, the rows of the CSV can change quite a bit from week to week.

Laurent C · Aug 6, 2012

2. You are right: I did not mean you should mix imported and calculated columns. Even though this might work in some import scenarios, that is not good practise.
3. You will indeed still have to make your weights depandant on the Pos column. Note however, that by putting your formulas on the same sheet, you will eliminate some formulas, as well as the need to use INDEX / MATCH to find the position of the current player. Using named ranges is just a way to make your formulas more readable. Just note they can use relative references.
4. The variable number of rows should not be an issue. Classic CSV connection: you check the option "Fill down formulas in columns adjacent to data" (Data Range Properties). Import through MS Query: the import will return a (ListObject) table with this behaviour by default.

Agrajag · Aug 6, 2012

Well, I ran into my first log-jam. The issue is "virtual positions". If you look at the sheet I sent, you'll see that on the master sheet there is a position called "DE". Just in case you don't know American Football, that's Defensive End. However, there is no DE tab in the workbook. Why? Because in the game itself there's a major distinction between a Left Defensive End and a Right Defensive End (more detail in the Documentation tab). There is an LDE and RDE tab. The LDE tab takes all the DE's and copies them to the LDE tab but then all the formulas are specific to what's important for an LDE. Then on the RDE tab I copy all the DE's yet again but here their formulas are very different. Doing this all on the Master sheet I'm not sure how I can filter for seeing RDE's and LDE's when the POS column only has the generic DE. Follow?

So with your approach I'd have a column for say, STRENGTH WEIGHTING and it would do the very useful (thanks) CHOOSE(MATCH($C2,{"CB"....) and get the weighting from the Settings tab. That worked great for the (in alphabetical order) C, then the CB but then when I got to the DE rows I realized I was stuck.

Centers (C) all have one weighting for the attribute Strength. Same for Cornerbacks (CB). Nice and simple. Not so for DE's (RDE, LDE), Safeties (SS, FS), Linebackers (MLB, SLB, WLB). Ultimately there could be a lot more. I'm considering breaking out Tight Ends (TE) into blocking TE's and pass-catching TE's. Some people have literally 8 different types of RB's. Power, speed, pass-catching, blocking, etc.

Laurent C · Aug 8, 2012

How have you made the distinction between LDE and RDE until now? Was there any logic you used based on the data, or did you just use your own knowledge of the players?

Is this the better way to go instead of MASS Index/Match formulas?

Agrajag

New Member

Excel Facts

Agrajag

New Member

Laurent C

Board Regular

Agrajag

New Member

Laurent C

Board Regular

Agrajag

New Member

Agrajag

New Member

Laurent C

Board Regular

Agrajag

New Member

Laurent C

Board Regular

Similar threads

Forum statistics

Share this page

Is this the better way to go instead of MASS Index/Match formulas?

New Member

Excel Facts

New Member

Board Regular

New Member

Board Regular

New Member

New Member

Board Regular

New Member

Board Regular

Similar threads

Forum statistics

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock