Sampling of data using formula or macros

Sailadarohit

New Member
Joined
Sep 7, 2022
Messages
39
Office Version
  1. 365
  2. 2021
  3. 2019
  4. 2016
Platform
  1. Windows
Book1
ABC
1U.ID(Not Unique)Main ID (Unique)Priority
2110High
3111Medium
4212Medium
5313Low
6514High
7415Low
8716Medium
9517High
10718Medium
11119Medium
12220Low
13821High
14922Low
15623Medium
16524High
17425Medium
18226Medium
19827Low
20128High
21629Low
22930Medium
23331High
24632Medium
25933Medium
26134Low
27135High
28236Low
29337Medium
30538High
31439Medium
32740Medium
33541Low
34742High
35143Low
36244Medium
37845High
38946Medium
39647Medium
40548Low
41449High
42250Low
43851Medium
44152High
45653Medium
46954Medium
47355Low
48656High
49957Low
Sheet1


There is a use case where we have U.ID which is not unique, main ID which is unique and priority which covers high, medium and low. In a general scenerio we have mutiple Main ID's associated with one U.ID . For audit purpose we require a sample of this data which will have only 3 main ID's associated with each U.ID.
Please note in real scenario this is a huge data set so manually doing this is not possible.
Can anyone help with an excel formula (preferred) or a macro
 

Excel Facts

Show numbers in thousands?
Use a custom number format of #,##0,K. Each comma after the final 0 will divide the displayed number by another thousand
Hi, is this what you mean ? (assuming I used cell "E1" to insert which U. ID I wanted to filter ?)

Book1
ABCDEFG
1U.ID(Not Unique)Main ID (Unique)Priority1
2110High110High
3111Medium111Medium
4212Medium119Medium
5313Low
6514High
7415Low
8716Medium
9517High
10718Medium
11119Medium
12220Low
13821High
14922Low
15623Medium
16524High
17425Medium
18226Medium
19827Low
20128High
21629Low
22930Medium
23331High
24632Medium
25933Medium
26134Low
27135High
28236Low
29337Medium
30538High
31439Medium
32740Medium
33541Low
34742High
35143Low
36244Medium
37845High
38946Medium
39647Medium
40548Low
41449High
42250Low
43851Medium
44152High
45653Medium
46954Medium
47355Low
48656High
49957Low
Sheet1
Cell Formulas
RangeFormula
E2:G4E2=TAKE(FILTER(A2:C49,A2:A49=E1),3,)
Dynamic array formulas.
 
Upvote 0
Hi, is this what you mean ? (assuming I used cell "E1" to insert which U. ID I wanted to filter ?)

Book1
ABCDEFG
1U.ID(Not Unique)Main ID (Unique)Priority1
2110High110High
3111Medium111Medium
4212Medium119Medium
5313Low
6514High
7415Low
8716Medium
9517High
10718Medium
11119Medium
12220Low
13821High
14922Low
15623Medium
16524High
17425Medium
18226Medium
19827Low
20128High
21629Low
22930Medium
23331High
24632Medium
25933Medium
26134Low
27135High
28236Low
29337Medium
30538High
31439Medium
32740Medium
33541Low
34742High
35143Low
36244Medium
37845High
38946Medium
39647Medium
40548Low
41449High
42250Low
43851Medium
44152High
45653Medium
46954Medium
47355Low
48656High
49957Low
Sheet1
Cell Formulas
RangeFormula
E2:G4E2=TAKE(FILTER(A2:C49,A2:A49=E1),3,)
Dynamic array formulas.
Hi RobP, thanks for the response. Yes the solution is correct but i want the results such that it has all combined the UID's and 3 random main ID's at one place, one below the other. Because in my scenario i have 1000's of UID's so I cannot manually do this
 
Upvote 0
How about, for 365
Fluff.xlsm
ABCDEFG
1U.ID(Not Unique)Main ID (Unique)Priority
2110High110High
3111Medium111Medium
4212Medium119Medium
5313Low212Medium
6514High220Low
7415Low226Medium
8716Medium313Low
9517High331High
10718Medium337Medium
11119Medium415Low
12220Low425Medium
13821High439Medium
14922Low514High
15623Medium517High
16524High524High
17425Medium623Medium
18226Medium629Low
19827Low632Medium
20128High716Medium
21629Low718Medium
22930Medium740Medium
23331High821High
24632Medium827Low
25933Medium845High
26134Low922Low
27135High930Medium
28236Low933Medium
29337Medium
30538High
31439Medium
32740Medium
33541Low
34742High
35143Low
36244Medium
37845High
38946Medium
39647Medium
40548Low
41449High
42250Low
43851Medium
44152High
45653Medium
46954Medium
47355Low
48656High
49957Low
Sheet7
Cell Formulas
RangeFormula
E2:G28E2=LET(d,FILTER(A2:C1000,A2:A1000<>""),r,ROWS(d),SORT(FILTER(d,MMULT((SEQUENCE(r)>=SEQUENCE(,r))*(INDEX(d,,1)=TOROW(INDEX(d,,1))),SEQUENCE(r,,,0))<=3)))
Dynamic array formulas.
 
Upvote 0
How about, for 365
Fluff.xlsm
ABCDEFG
1U.ID(Not Unique)Main ID (Unique)Priority
2110High110High
3111Medium111Medium
4212Medium119Medium
5313Low212Medium
6514High220Low
7415Low226Medium
8716Medium313Low
9517High331High
10718Medium337Medium
11119Medium415Low
12220Low425Medium
13821High439Medium
14922Low514High
15623Medium517High
16524High524High
17425Medium623Medium
18226Medium629Low
19827Low632Medium
20128High716Medium
21629Low718Medium
22930Medium740Medium
23331High821High
24632Medium827Low
25933Medium845High
26134Low922Low
27135High930Medium
28236Low933Medium
29337Medium
30538High
31439Medium
32740Medium
33541Low
34742High
35143Low
36244Medium
37845High
38946Medium
39647Medium
40548Low
41449High
42250Low
43851Medium
44152High
45653Medium
46954Medium
47355Low
48656High
49957Low
Sheet7
Cell Formulas
RangeFormula
E2:G28E2=LET(d,FILTER(A2:C1000,A2:A1000<>""),r,ROWS(d),SORT(FILTER(d,MMULT((SEQUENCE(r)>=SEQUENCE(,r))*(INDEX(d,,1)=TOROW(INDEX(d,,1))),SEQUENCE(r,,,0))<=3)))
Dynamic array formulas.
Thanks Fluff for the solution, is there any way the formula picks up random 3 Main ID instead of first 3. Currently, it is taking only top 3, which might make the samples skewed
 
Upvote 0
How about
Excel Formula:
=LET(d,FILTER(A2:C1000,A2:A1000<>""),r,ROWS(d),s,SORTBY(d,RANDARRAY(r)),SORT(FILTER(s,MMULT((SEQUENCE(r)>=SEQUENCE(,r))*(INDEX(s,,1)=TOROW(INDEX(s,,1))),SEQUENCE(r,,,0))<=3)))
Although this is volatile & will recalculate whenever any cell is changed or calculates.
 
Upvote 0
How about
Excel Formula:
=LET(d,FILTER(A2:C1000,A2:A1000<>""),r,ROWS(d),s,SORTBY(d,RANDARRAY(r)),SORT(FILTER(s,MMULT((SEQUENCE(r)>=SEQUENCE(,r))*(INDEX(s,,1)=TOROW(INDEX(s,,1))),SEQUENCE(r,,,0))<=3)))
Although this is volatile & will recalculate whenever any cell is changed or calculates.
sure will try this but when i am trying to run this formula for 5 lakh rows its throwing an error stating "Excel ran out of resources while attempting to calculate one or more formulas"
 
Upvote 0
For that number of rows, I'm not surprised. You will need a macro.
 
Upvote 0
I don't do much with VBA anymore, but will have a look if I get time & nobody else steps in.
 
Upvote 0

Forum statistics

Threads
1,215,069
Messages
6,122,958
Members
449,096
Latest member
Anshu121

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top