Creating List of Distinct Values with 1 condition for very large data set

Jayy2121

New Member
Joined
Nov 5, 2013
Messages
11
Hello,
I've used this site and forum extensivly to solve issues for a dashboard I'm building, but can't find a good solution to the following:

Creating a list of distinct values (removing all duplicates) when a condition is met. I know there are ways to do this with Index formulas, but my data has 170,000+ data sets so the calculations I've tried get bogged down when trying to run it.

Since this will be a dynamic scorecard sent to many people, it can't contain vba, pivot tables, or filters that the recipient would need to run or update. My goal is that once updated data is dropped into the dashboard, the formulas will do the rest.

The formula should look at the result of a market drop down list on another sheet to define the market to get the list for.

Example: For the data below I will have selected San Diego market and need to generate a list of all distinct TSMs in that market, keeping in mind there are 170,000+ rows. If I select a different market, the results wil update.

I hope I havn't reached the limits of what Excel can do.

I'm a first time poster so please be kind if I didn't follow standard protocal.
Thanks in advance!!!

Jayy


Sample data:

Market</SPAN>TSM</SPAN>
Los Angeles Mkt</SPAN> TSM: JOHN MARKARIAN</SPAN>
San Diego Mkt</SPAN> TSM: CARLOS GONZALEZ-CEJA</SPAN>
San Diego Mkt</SPAN> TSM: CARLOS GONZALEZ-CEJA</SPAN>
San Diego Mkt</SPAN> TSM: CARLOS GONZALEZ-CEJA</SPAN>
San Diego Mkt</SPAN> TSM: CARLOS GONZALEZ-CEJA</SPAN>
San Diego Mkt</SPAN> TSM: SHAEFER WALL</SPAN>
San Diego Mkt</SPAN> TSM: CARLOS GONZALEZ-CEJA</SPAN>
San Diego Mkt</SPAN> TSM: CARLOS GONZALEZ-CEJA</SPAN>
San Diego Mkt</SPAN> TSM: SHAEFER WALL</SPAN>
San Diego Mkt</SPAN> TSM: SACHARY_CR NARANJO</SPAN>
San Diego Mkt</SPAN> TSM: SHAEFER WALL</SPAN>
San Diego Mkt</SPAN> TSM: SHAEFER WALL</SPAN>
San Diego Mkt</SPAN> TSM: SHAEFER WALL</SPAN>
Inland Empire Mkt</SPAN> TSM: ALPHA OWENS</SPAN>
Las Vegas Mkt</SPAN> TSM: JAMARR OWENS</SPAN>

<TBODY>
</TBODY><COLGROUP><COL span=2></COLGROUP>
 

Excel Facts

Quick Sum
Select a range of cells. The total appears in bottom right of Excel screen. Right-click total to add Max, Min, Count, Average.
I know you said you used Index formulas before but it should be the fastest way to avoid VBA and Pivot Tables. Even though an output with VBA is going to beat it every time.

So I made this example for you its in Google Docs because its easier for me to share. https://drive.google.com/file/d/0B2fr4qZAYHuRQUlBTy1ZVG40WE0/edit?usp=sharing

Formulas Below

The column to find where the market exists =IFERROR(MATCH($D$1,OFFSET($A$1:$A$200,D2,0),0)+D2," ")

The column to pull in names where the Name is present : =IFERROR(INDEX($A$1:B200,$D3,MATCH($E$2,$A$1:$B$1,0)),"")

Try it out with your data, (TEST DATA first). I didn't make it pretty but it works so mess with it how you can to make it fit your style. This should work, the column just needs to be long enough to fit all the names, same for the column referencing the row. (You can hide this onto a helper sheet and just have what you want on the scorecard?
 
Upvote 0
Thanks Slizer, I'll give it a shot and let you know.
I know VBA is the best way to go, but I can't trust my audience to use it propoerly (or not disable it when opening).

Thanks again!!
 
Upvote 0
Not a problem. Let me know if you get errors or if you want me to explain the formulas more. I actually got this formula idea from a poster a few months ago who helped me on something so passing on the knowledge feels good.
 
Upvote 0
Slizer, your first formula worked great. The second one didn't seem to work when I applied it to my worksheet.
But I realized that your results did not remove duplicate which is something I'm hoping to acheive. What I'm working with is a huge list of 170,000 accounts, their markets, and TSMs in each market. I'm hoping to get a dynamic list of the distinct TSMs depending on which region is selected. My plan is to feed another dropdown with this list of distinct TSMs which will, in turn, feed more formulas.

I feel like I'm about 10 levels beyond my ability here so your help is appreciated.
 
Upvote 0
So you like the way it pulled all the information but when you select something like...San Diego Mkt. It will only pull unique names from the list. So it may be something like 400 different names. Basically all the formula needs is a way to filter out the results before posting. This can be done with a helper worksheet if you are into making a hidden sheet. Basically the helper sheet would post the results then the sheet which would be presented to whomever would only see the unique names.

A few questions
-Would this sheet just be given to someone to select a market and then they would get the results they desire or are you using it to create the list then adding that information elsewhere?
-If you can lock a hidden sheet a helper sheet will do all this for you and teach you a few new tricks
-Can you provide an example of the kind of output that would be desired? other than the one you posted (Of course removing an confidential information)

I will work on this more on my downtime.
 
Upvote 0
Slizer,
I have a 25mb data pull of 170,000 rows and 26 columns. I'm trying to design a front page dashboard that would allow market supervisors to pick their market and TSM to see sales opportunities they should persue with that TSM. I have the formulas built to pull the opportunities and they are tied to a spot for each drop down (market and TSM). But the TSM cell is static until I can build a drop down that allows them to choose their TSM.

There are 62 markets and 800 TSMs and I can't create a static list of TSMs for each market as the TSMs are liable to change with every data pull depending on personelle assignments. So my goal is to have a list of distinct TSMs generated based off the region selected (on a different sheet than the actual data or dashboard) that can feed my TSM drop down list.

One other note if it helps is that while each Market has 10-15 TSMs, each TSM only reports to a single market.

I can certainly add helper pages and already use a scratch sheet to feed formulas. But the file size and calculation time have to stay somewhat reasonable (I'm already taking about 10 seconds to calculate)

If you really want the full sheet, it would take me some time to scrub out confidential info and it would be big, but if that's the only way, I'll do it.

Let me know your thoughts and thanks again.
 
Upvote 0
Hello,
I've used this site and forum extensivly to solve issues for a dashboard I'm building, but can't find a good solution to the following:

Creating a list of distinct values (removing all duplicates) when a condition is met. I know there are ways to do this with Index formulas, but my data has 170,000+ data sets so the calculations I've tried get bogged down when trying to run it.

Since this will be a dynamic scorecard sent to many people, it can't contain vba, pivot tables, or filters that the recipient would need to run or update. My goal is that once updated data is dropped into the dashboard, the formulas will do the rest.

The formula should look at the result of a market drop down list on another sheet to define the market to get the list for.

Example: For the data below I will have selected San Diego market and need to generate a list of all distinct TSMs in that market, keeping in mind there are 170,000+ rows. If I select a different market, the results wil update.

I hope I havn't reached the limits of what Excel can do.

I'm a first time poster so please be kind if I didn't follow standard protocal.
Thanks in advance!!!

Jayy


Sample data:

MarketTSM
Los Angeles Mkt TSM: JOHN MARKARIAN
San Diego Mkt TSM: CARLOS GONZALEZ-CEJA
San Diego Mkt TSM: CARLOS GONZALEZ-CEJA
San Diego Mkt TSM: CARLOS GONZALEZ-CEJA
San Diego Mkt TSM: CARLOS GONZALEZ-CEJA
San Diego Mkt TSM: SHAEFER WALL
San Diego Mkt TSM: CARLOS GONZALEZ-CEJA
San Diego Mkt TSM: CARLOS GONZALEZ-CEJA
San Diego Mkt TSM: SHAEFER WALL
San Diego Mkt TSM: SACHARY_CR NARANJO
San Diego Mkt TSM: SHAEFER WALL
San Diego Mkt TSM: SHAEFER WALL
San Diego Mkt TSM: SHAEFER WALL
Inland Empire Mkt TSM: ALPHA OWENS
Las Vegas Mkt TSM: JAMARR OWENS

<tbody>
</tbody>

Will the output be restricted to just one output sheet? The output would look like autofiltering: a market chosen, the associated distinct list created. When another market chosen, the new list will replace the old in the same area.
 
Upvote 0
Correct Aladin. There will be only one drop down location for market and that will feed only one distinct TSM list.
To be fair, there are 8 Regions and each region will have it's specific Market list. But I'm hoping once I solve for one Region, the same formulas will apply to the other regions, just pointing to a different Market drop down.


Let me know any other questions and thanks again.
 
Upvote 0
Correct Aladin. There will be only one drop down location for market and that will feed only one distinct TSM list.
To be fair, there are 8 Regions and each region will have it's specific Market list. But I'm hoping once I solve for one Region, the same formulas will apply to the other regions, just pointing to a different Market drop down.


Let me know any other questions and thanks again.

Market
TSM
0
San Diego Mkt
Los Angeles Mkt
TSM: JOHN MARKARIAN
3
San Diego Mkt
TSM: CARLOS GONZALEZ-CEJA
1
List
San Diego Mkt
TSM: CARLOS GONZALEZ-CEJA
TSM: CARLOS GONZALEZ-CEJA
San Diego Mkt
TSM: CARLOS GONZALEZ-CEJA
TSM: SHAEFER WALL
San Diego Mkt
TSM: CARLOS GONZALEZ-CEJA
TSM: SACHARY_CR NARANJO
San Diego Mkt
TSM: SHAEFER WALL
2
San Diego Mkt
TSM: CARLOS GONZALEZ-CEJA
San Diego Mkt
TSM: CARLOS GONZALEZ-CEJA
San Diego Mkt
TSM: SHAEFER WALL
San Diego Mkt
TSM: SACHARY_CR NARANJO
3
San Diego Mkt
TSM: SHAEFER WALL
San Diego Mkt
TSM: SHAEFER WALL
San Diego Mkt
TSM: SHAEFER WALL
Inland Empire Mkt
TSM: ALPHA OWENS
Las Vegas Mkt
TSM: JAMARR OWENS

<TBODY>
</TBODY>

A:B houses the data of interest.

C1 must house a 0.

G1 displays a choice from the data validation list on that spot.

C2, just enter and copy down:

=IF($A2=$G$1,IF(ISNUMBER(MATCH($B2,$B$1:B1,0)),"",LOOKUP(9.99999999999999E+307,$C$1:C1)+1),"")

G2, just enter:

=LOOKUP(9.99999999999999E+307,C:C)

G4, just enter and copy down as far as needed:

=IF(ROWS($G$4:G4)<=$G$2,LOOKUP(ROWS($G$4:G4),C:C,B:B),"")

Note. Actually, a simple VBA code should look at G2 and copy the above formula G2 times from G4 on downwards.
 
Upvote 0

Forum statistics

Threads
1,214,584
Messages
6,120,387
Members
448,956
Latest member
JPav

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top