Tricky counting formula

mjones

Board Regular
Joined
Oct 27, 2007
Messages
94
Office Version
  1. 365
Platform
  1. Windows
Hi All,

At > 6000 rows, it’s impossible to manually count so I hope a formula genius out there can help.

All cells are general format with text.

Count how many situations occur with rows with the same text in columns B, D & E that also have a row with x in column G and another row with y in column G.

The case below would count 1 for rows 1 and 2 matching the scenario, but rows 3 and 4 do not count because even though B, D & E match, there are not two rows where one has an x and one has a y in column G.

B D E G
1 AB CD EF x
2 AB CD EF y
3 GH KL TY z
4 GH KL TY y

The above is in the 'Songs in Playlists'! tab, but the count/answer will be in another tab.

Thank you for considering my formula,

Michele
 
Thanks for the note. You might discover the issue using the method I described. If not, feel free to post back.
 
Upvote 0

Excel Facts

Show numbers in thousands?
Use a custom number format of #,##0,K. Each comma after the final 0 will divide the displayed number by another thousand
Michele,
I just noticed an error in the formula (sorry I missed it earlier). When the various columns of the source data are being assigned to variable names, the G column is the 6th column in src ('Songs in Playlists'!$B$1:$G$12). So the "s" variable definition is correct, as it extracts columns 1,3,4,6 from src (you want matches in columns B, D, E, and G). The problem is that when "s" is used to define variables b, d, e, and g, I made a mistake when defining "g"...it should be the 4th column of "s", not the 6th...in other words, it should be CHOOSECOLS(s,4) rather than CHOOSECOLS(s,6). I fixed this on some more recent versions of the formula like the one in post #7, but failed to recognize the issue with the earlier version. So before doing any more complicated debugging (which might not be necessary), give this corrected version a try:
Excel Formula:
=LET(src,'Songs in Playlists'!$B$1:$G$12, s,CHOOSECOLS(src,1,3,4,6), b,CHOOSECOLS(s,1), d,CHOOSECOLS(s,2), e,CHOOSECOLS(s,3), g,CHOOSECOLS(s,4),
fres,FILTER(s,(b=A3)*(d=B3)*(e=C3),EXPAND("",1,COLUMNS(s),"")), lcol,CHOOSECOLS(fres,4), xs,SUM(--(lcol=F3)), ys,SUM(--(lcol=G3)), IF(AND(xs>0,ys>0),MIN(xs,ys),0))
And again, the F3 and G3 in the formula are where your search terms ("Rock", "80's", etc.) would be found, either hardwired into the formula or placed in cells that are referenced by the formula as in this example.
 
Upvote 0
One more version that ultimately constructs a two column array: one column where the B,D,E, and G-1st term criteria either match (1's will be shown) or do not match (0's will be shown), and the other column where the B,D,E, and G-2nd term criteria either match (1's) or do not (0's). This two-column array is operated on by a BYCOL LAMBDA function to sum each of these columns, and then we take the minimum of those two sums. That would be the number of pairs. I'm showing two different implementations: one where the source data G column search terms are referenced from other cells (F8:G8 in the formula, the other where you would hardwire the search terms into an array, so the names are each enclosed by double quotes and separated with a comma, and the entire array is surrounded by curly brackets. The second line of the formula is where the calculations are performed...the first line defines the source data and assigns variable names.
MrExcel_20240301.xlsx
ABCDEFG
7CatBCatDCatEPair Count
8ABCDEF1Oldies (Pre 1970)Easy Listening/'Smooth' Jazz
9GHKLTY2
mjones_tab
Cell Formulas
RangeFormula
D8D8=LET(src,'Songs in Playlists'!$B$1:$G$12, s,CHOOSECOLS(src,1,3,4,6), b,CHOOSECOLS(s,1), d,CHOOSECOLS(s,2), e,CHOOSECOLS(s,3), g,CHOOSECOLS(s,4), MIN(BYCOL((b=A8)*(d=B8)*(e=C8)*(g=F8:G8),LAMBDA(c,SUM(c)))))
D9D9=LET(src,'Songs in Playlists'!$B$1:$G$12, s,CHOOSECOLS(src,1,3,4,6), b,CHOOSECOLS(s,1), d,CHOOSECOLS(s,2), e,CHOOSECOLS(s,3), g,CHOOSECOLS(s,4), MIN(BYCOL((b=A9)*(d=B9)*(e=C9)*(g={"Oldies (Pre 1970)","Easy Listening/'Smooth' Jazz"}),LAMBDA(c,SUM(c)))))
 
Upvote 0
One more variant on the last idea to use BYCOL: We really do not need the intermediate "s" array. That was done earlier to make it easier to see how the relevant columns of the source data could be extracted into a separate array before its four columns were each assigned to variables b, d, e, and g. That makes the formula a little longer but I find it easier to understand. If you want to take a shortcut, the formula could use direct references back to the original source data without assigning those separate variables b, d, e, and g. And to make the formula even shorter, instead of using the CHOOSECOLS function, we can use the slightly shorter INDEX function (fewer letters, but we need an extra comma in the arguments). So the shortened version of the last post is shown in the upper block below.

An additional note: this type of problem is well suited for using the COUNTIFS function as suggested by @Tetra201, but the "IF(s)" group of functions (SUMIF, SUMIFS, COUNTIF, COUNTIFS for example) take a range reference, not an array for each of the criteria range arguments. And because this formula is on a different worksheet than the source data, the formula would need to have the worksheet name included with the column references, and the column G comparison would be done differently to obtain counts for each of the search terms...and then take the minimum of those two counts to determine the number of pairs. This version is shown in the lower block below.

In both cases, two implementations are shown: 1) the search terms are located in cells referenced by the formula, 2) the search terms are hardwired in the formula as text strings inside an array.
MrExcel_20240301.xlsx
ABCDEFG
12CatBCatDCatEPair Count
13ABCDEF1Oldies (Pre 1970)Easy Listening/'Smooth' Jazz
14GHKLTY2
15GHKLAB0Oldies (Pre 1970)Easy Listening/'Smooth' Jazz
16
17CatBCatDCatEPair Count
18ABCDEF1Oldies (Pre 1970)Easy Listening/'Smooth' Jazz
19GHKLTY2
20GHKLAB0Oldies (Pre 1970)Easy Listening/'Smooth' Jazz
mjones_tab
Cell Formulas
RangeFormula
D13,D15D13=LET(s,'Songs in Playlists'!$B$1:$G$12,MIN(BYCOL((INDEX(s,,1)=A13)*(INDEX(s,,3)=B13)*(INDEX(s,,4)=C13)*(INDEX(s,,6)=F13:G13),LAMBDA(c,SUM(c)))))
D14D14=LET(s,'Songs in Playlists'!$B$1:$G$12,MIN(BYCOL((INDEX(s,,1)=A14)*(INDEX(s,,3)=B14)*(INDEX(s,,4)=C14)*(INDEX(s,,6)={"Oldies (Pre 1970)","Easy Listening/'Smooth' Jazz"}),LAMBDA(c,SUM(c)))))
D18,D20D18=MIN(COUNTIFS('Songs in Playlists'!$B:$B,A18,'Songs in Playlists'!$D:$D,B18,'Songs in Playlists'!$E:$E,C18,'Songs in Playlists'!$G:$G,F18:G18))
D19D19=MIN(COUNTIFS('Songs in Playlists'!$B:$B,A19,'Songs in Playlists'!$D:$D,B19,'Songs in Playlists'!$E:$E,C19,'Songs in Playlists'!$G:$G,{"Oldies (Pre 1970)","Easy Listening/'Smooth' Jazz"}))
 
Upvote 0
Hi KRice,

You’re getting too complicated for my level of understanding.

Here’s a link to the file where the first two tabs are of interest - www.quality-computing.com/?url=/Michele's iTunes Music.xlsx.

On the Playlist Totals tab in B24, you’ll see I was trying to get the total of “Easy Listening” and "70's Music".

Good luck and thank you again very much!

Michele
 
Upvote 0
Okay...thanks. Fundamental misunderstanding on my part. So you don't actually specify the song name (b), artist (d), or album (e) on the 'Songs in Playlists' worksheet? You simply want to know that if all entries in the 'Songs in Playlists' worksheet were formed into groups where each grouping had b,d,e in common, and also had either of the "g" search terms (that you do specify), then how many pairs are there, where each pair has a representative for each of the g search terms? My description is complicated...but let me know if you need clarification. If this is the case, the earlier formulas are appropriate. I'll look into it.
 
Upvote 0
Yes, only a number is needed. For example, how many songs have both "easy listening" and "70's music" in column G. The size of the number will show if that combo has enough songs to make a separate playlist or if there are too many songs and there should be a separate list. This is especially important because the source data is expected to double or more.

Since I can't get the formula to work, I hope you can look at the spreadsheet for an idea.

Thanks again . . . Michele
 
Upvote 0
I'll post just an excerpt here. Try this in the yellow cells to see if the results are what you expect. The numbers do not quite reflect the column heading (Songs in Playlists Tab), since they represent the number of pairs of songs where the Name, Artist, and Album are the same within each pair, and one of the pair meets the "x" criteria in the Opt. Playlist, while the other of the pair meets the "y" criteria in the Opt. Playlist (referring to columns B, D, E, and G on the 'Songs of Playlists' worksheet). After seeing your worksheet layout, I think it would be prudent to revise the row headings slightly so that they can be used directly by the formula. To do that, I inserted a colon after Easy Listening and then ensured that whatever followed the colon matched exactly with the categories in column G on the 'Songs of...' worksheet...for example, Easy Listening: Oldies (Pre 1970). The formula splits this text at the ": " and uses the two "x" and "y" strings for matching.
I'm not entirely happy with this solution, as it is slow, requiring about 10-15 seconds on my computer to update the numbers. This is mainly due to the repeated BYROW functions. As a spot check, I filtered the Playlist songs for the criteria in cell A26 and then manually scrolled through to confirm that there were 12 pairs where B,D,E were common within each pair, and G was split between the x and y matches. If these numbers are correct, then I'll look into other ways of doing it faster.
Michele's iTunes Music.xlsx
AB
22Easy Listening300
23Easy Listening: Oldies (Pre 1970)34
24Easy Listening: 70's Music28
25Easy Listening: 80's18
26Easy Listening: 90's & Newer12
Playlist Totals
Cell Formulas
RangeFormula
B22B22=COUNTIF('Songs in Playlists'!G$2:G$5748,A22)
B23:B26B23=LET(src,'Songs in Playlists'!B$2:G$5748, x,TEXTBEFORE(A23,": "), y,TEXTAFTER(A23,": "), b,INDEX(src,,1), d,INDEX(src,,3), e,INDEX(src,,4), g,INDEX(src,,6), su,UNIQUE(CHOOSECOLS(src,1,3,4)), resx, BYROW(su,LAMBDA(r,SUM(--(FILTER(g,(b=INDEX(r,1))*(d=INDEX(r,2))*(e=INDEX(r,3)),"")=x)))), resy, BYROW(su,LAMBDA(r,SUM(--(FILTER(g,(b=INDEX(r,1))*(d=INDEX(r,2))*(e=INDEX(r,3)),"")=y)))), SUM(BYROW(HSTACK(resx,resy),LAMBDA(r,MIN(INDEX(r,1),INDEX(r,2))))))
 
Upvote 0
Yeah! It works!
View attachment 107867
But you're correct. It's super slow. I'd have to remove the formulas to use the worksheet and add them back when needed. Ways to do it faster would be great!

They don't total to 300 because I'm still working on adding decades to all the songs but they will eventually.

Thank you again!

Michele
 
Upvote 0
Songs in Playlists2.JPG
 
Upvote 0

Forum statistics

Threads
1,215,077
Messages
6,122,991
Members
449,094
Latest member
masterms

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top