Finding the most frequent combination

excelNewbie22

Well-known Member
Joined
Aug 4, 2021
Messages
510
Office Version
  1. 365
Platform
  1. Windows
hi!
how to find the most frequent combination of 1's and 0's (or just 1's), which bigger then 30,
could be 31 or 40, and etc....
it can skip columns, for instance, 30 and 39 and 40 to 42, or 1-9 and etc,
now, the data below is just a sample, the real data is way more big, like 2000 rows and 160 columns,
(which in this case i'll lookup to find combo which bigger then 120)
the possible combinations is soooooo biggggg, like hundres of millions,
i find it hard for excel to handle, but i thought maybe there's a clever way to approach it?

test
ABCDEFGHIJKLMNOPQRSTUVWXYZAAABACADAEAFAGAHAIAJAKALAMANAOAPAQARAS
1123456789101112131415161718192021222324252627282930313233343536373839404142434445
2111111111111111111111111111111111111111111111
3111111111111011111111111111111111111111111110
4111111111111111111111111111111111111111111111
5111111111111111111111111111111111110111111111
6111111111111111111111111111111111111111111111
7111111111111111111111111111111111111111111110
8111111111101111111111111110111111111111111111
9111111111111111111111111111111111111111111111
10111111111110111111111111111111111111111111111
11111111111111111110111111111111111011111111111
12111111111111111111111111111111111111111111111
13111111111101111111111111111111111110111111111
14111111111111111111111111111111111111111111111
15111111111101111111111111111111111111111111111
16111111111111111111111111111111111111111111111
17111111111111111111111111111111111111111111111
test
 

Excel Facts

Lock one reference in a formula
Need 1 part of a formula to always point to the same range? use $ signs: $V$2:$Z$99 will always point to V2:Z99, even after copying
Just an idea I have no time to test it:
Multiply each row by the row number up to 126 then start again from 1
Change this number to it's ascii character, concatenate a, b c etc for multiples of 127.
Concatenate all the rows together at the bottom of each column you can the count the character strings
 
Upvote 0
I would insert a column and use the CONCAT function to join together all cells of each row. Then create a pivot table of just the new column. Drag the field to both the row and the value area and sort on the value field (Descending).
 
Upvote 0
offthelip - thank you but i didn't understand, maybe someone can demonstrate ?
jkpieterse - if i understood correctly you talking about all the values joined? if so, it isn't my intention, if not, didn't understand you too
sample will help me to understand
 
Upvote 0
'/*******************************************************************\
'
' copy the below Sub and function into a module in your workbook
'
' In the below change the line of code
' If Cells(x, countones) >= 20 Then
' to suit the count of ones your testing for, I used 20
'
' run the below code to fill in the ConcatRow for all rows
'
' create a pivot table based on these 2 columns
' Rows = ConcatRow , Values = Count of countofones
' sort the pivot table on 'Count of countofones' Largest to Smallest
'
'\*******************************************************************/

Sub concatrow()
Dim rowno As Long, colno As Long
Dim countones As Long
Dim x As Long, y As Long
Dim newstr As String

Range("A1").Activate

rowno = LastRowColumn(ActiveSheet, "R")
colno = LastRowColumn(ActiveSheet, "C")

Cells(1, colno + 1) = "CountofOnes"
Cells(1, colno + 2) = "ConcatRow"

countones = colno + 1


For x = 2 To rowno ' exclude header row
newstr = "" ' reset string

' this counts the number of ones in the row
Cells(x, countones).FormulaR1C1 = "=COUNTIF(RC[-" & colno & "]:RC[-1],1)"

If Cells(x, countones) >= 20 Then


For y = 1 To colno ' for all columns
newstr = newstr & Cells(x, y)
Next y
Cells(x, y + 1) = newstr
End If
Next x

End Sub





Function LastRowColumn(sht As Worksheet, RowColumn As String) As Long
'PURPOSE: Function To Return the Last Row Or Column Number In the Active Spreadsheet
'INPUT: "R" or "C" to determine which direction to search

Select Case LCase(Left(RowColumn, 1)) 'If they put in 'row' or column instead of 'r' or 'c'.
Case "c"
LastRowColumn = sht.Cells.Find("*", LookIn:=xlFormulas, SearchOrder:=xlByColumns, _
SearchDirection:=xlPrevious).Column
Case "r"
LastRowColumn = sht.Cells.Find("*", LookIn:=xlFormulas, SearchOrder:=xlByRows, _
SearchDirection:=xlPrevious).Row
Case Else
LastRowColumn = 1
End Select

End Function
 
Upvote 0
I_know_nuffin - so sorry for the late response, i didn't login for a long time, and even when i did, a few days ago, i didn't saw that,
anyway, did as you instructed and it doesn't make any sense, cause the concat rows returned wrong 1's and 0's (in a different db), for example instead of 41 cells = 11111101111111111111111111111111111110011 it returned 11111101111111100000000000000000000000000
 
Upvote 0
Not sure If this is what you're after exactly. But how about this?

EXCEL
ABCDEFGHIJKLMNOPQRSTUVWXYZAAABACADAEAFAGAHAIAJAKALAMANAOAPAQARASATAUAVAW
1123456789101112131415161718192021222324252627282930313233343536373839404142434445CombinationCount
21111111111111111111111111111111111111111111114592
311111111111101111111111111111111111111111111012,31103
411111111111111111111111111111111111111111111145112
511111111111111111111111111111111111011111111135,9121
611111111111111111111111111111111111111111111145152
711111111111111111111111111111111111111111111044171
811111111110111111111111111011111111111111111110,15,18181
911111111111111111111111111111111111111111111145241
1011111111111011111111111111111111111111111111111,33311
1111111111111111111011111111111111101111111111117,15,11331
1211111111111111111111111111111111111111111111145341
1311111111110111111111111111111111111011111111110,24,9351
1411111111111111111111111111111111111111111111145441
1511111111110111111111111111111111111111111111110,34458
1611111111111111111111111111111111111111111111145
1711111111111111111111111111111111111111111111145
Data
Cell Formulas
RangeFormula
AV1:AW15AV1=LET(ts,TEXTSPLIT(TEXTJOIN(",",,AT2:AT17),,",")+0,u,SORT(UNIQUE(ts)),VSTACK({"Combination","Count"},HSTACK(u,BYROW(u,LAMBDA(br,SUM(--(br=ts)))))))
AT2:AT17AT2=TEXTJOIN(",",,LET(f,FREQUENCY(IF(A2:AS2=1,COLUMN(A2:AS2)),IF(A2:AS2<>1,COLUMN(A2:AS2))),FILTER(f,f>0)))
Dynamic array formulas.
Cells with Conditional Formatting
CellConditionCell FormatStop If True
A2:AS17Cell Value=0textNO
AV2:AW15Expression=MOD(ROW(),2)=0textYES
 
Upvote 0
lrobbo314 - appreciate the try, but it isn't that

here's more detail example,
in col at there's the first combination of 34 columns and in au the second....
so i would like to find if there is 10% and above of duplicates of any combinations from col at,
then from the second combination in col au,
and so on all the way up to millions

and ofcourse if there is, save it with the column numbers aka first row headers like:
1-0-0-1-0-1-0-1-0-0-0-0-1-1-1-1-1-1-1-1-1-1-0-0-1-1-0-1-0-1-1-1-1-1
1-5-6-7-8-9-10-11-12-13-14-15-16-17-18-19-20-21-22-23-24-25-26-27-28-29-30-31-32-33-34-35-36-37

but excel can't contain it,
so i thought if there is a smart way to do so

test
ABCDEFGHIJKLMNOPQRSTUVWXYZAAABACADAEAFAGAHAIAJAKALAMANAOAPAQARASATAUAVAWAXAY
1123456789101112131415161718192021222324252627282930313233343536373839404142434445first combination of 34 columnssecond combination of 34 columnsthird combination of 34 columns
2111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
3111111111111011111111111111111111111111111110111111111111011111111111111111111111111111111101111111111111111111111111111111110111111111111111111111
4111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
5111111111111111111111111111111111110111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110
6111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
7111111111111111111111111111111111111111111110111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
8111111111101111111111111110111111111111111111111111111101111111111111110111111111111111110111111111111111011111111111111111011111111111111101111111
9111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
10111111111110111111111111111111111111111111111111111111110111111111111111111111111111111111011111111111111111111111111111111101111111111111111111111
11111111111111111110111111111111111011111111111111111111111111110111111111111111011111111111111111011111111111111111111111111111111101111111111111111
12111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
13111111111101111111111111111111111110111111111111111111101111111111111111111111111111111110111111111111111111111111111111111011111111111111111111110
14111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
15111111111101111111111111111111111111111111111111111111101111111111111111111111111111111110111111111111111111111111111111111011111111111111111111111
16111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
17111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
test
Cell Formulas
RangeFormula
AT2:AT17AT2=TEXTJOIN("",,A2:AH2)
AU2:AU17AU2=TEXTJOIN("",,A2:AG2)&AI2
AV2:AV17AV2=TEXTJOIN("",,A2:AG2)&AJ2
 
Upvote 0
I'm so confused as to what you are wanting to do.
 
Upvote 0
hi johnny,
i want to check the first combination of zero's and one's, of 34 out of 41,
across 12.000 rows,
if there's a combination which repeats itself more then 10%, save it

then the second combination, then the third, all the way up to 22,000,000
 
Upvote 0

Forum statistics

Threads
1,216,096
Messages
6,128,807
Members
449,468
Latest member
AGreen17

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top