20% Random Sample

ARW17

Board Regular
Joined
Oct 31, 2016
Messages
109
I need to create a sample of "work that needs to be graded". For each person, in each month, I want to pull a 20% random sample.

"Person" "Month" "Number of Total Workloads"
1 5 10
1 6 5
2 5 20
2 6 5

For person 1, I need 2 random workloads for May and 1 for June to review.
For person 2, I need 4 random workloads for May and 1 for June to review.

I know I can add a random number and sort on this and then I could take the top N records, but as far as I see, I can't figure out how to take the top X%. I can calculate what would be 20% for each person for each month, but I don't see how to use that number as the selection for how many records to pull per person per month.
 
Can you post a copy of your database with enough data to illustrate the issue? I need some data to work with. It's been 6+ years since I posted the example.
 
Upvote 0

Excel Facts

Shade all formula cells
To shade all formula cells: Home, Find & Select, Formulas to select all formulas. Then apply a light fill color.
There's no way to configure or otherwise adjust SELECT TOP X PERCENT in SQL that I have ever heard of (i.e., to round it up or down).
I think (assuming you have your groups) you can solve the "at least one" problem with:
SQL:
select top 1 from MyGroup order by x
union
select top 20 percent from MyGroup order by x
This would give the first record and the first 20 percent of records and because union removes duplicates it would not repeat the first record if the two sets overlap.

As far as getting the result by actor by month, that's exactly what you will need to do - get the results by actor by month and from that select the top 20% from each of those groups. Generally (for MSAccess) you will have to combine some sql and vba to accomplish your goal - I don't think you will be able to write a straight up SQL query in MSAccess for this.
 
Upvote 0
If you can't post a db (I don't even know if it's allowed here?) then post raw data as a table. Pics don't help too much. I'm not going to create and fill my own tables based on pictures, but I might take a whack at it if I had data I could copy. I suspect that at least some of us are of that opinion.
 
Upvote 0
Agree totally with Micron and xenou.
You will have to try xenou's suggestion and let us know the outcome.
Good luck.
 
Upvote 0
I solved it by first adding the random number:
SELECT *
FROM (SELECT b.ACTOR, b.ENTRYDATE_MONTH, b.regbase, (select count(b.random_number) from entry_errors a where a.random_number >= b.random_number and a.actor=b.actor and a.entrydate_month=b.entrydate_month) AS ranking FROM (ENTRY_TINITIALS AS a LEFT JOIN entry_errors AS b ON a.TInitials = b.actor) LEFT JOIN archetypes_list AS c ON b.actor = c.[t-initials] WHERE b.cred_days <30 and c.archetype in (1,3) ORDER BY b.actor, b.entrydate_month) AS x
ORDER BY actor, entrydate_month, ranking;

then used a sort of ceiling function to get the NumberOfRowsDesired
SELECT actor, entrydate_month, -int(max(ranking)*-.2) AS numberofrowsdesired
FROM step1
GROUP BY actor, entrydate_month;

then
SELECT c.*
FROM (step1 AS a LEFT JOIN step2 AS b ON (a.actor = b.actor) AND (a.entrydate_month = b.entrydate_month)) LEFT JOIN entry_errors AS c ON (a.actor = c.actor) AND (a.regbase = c.regbase)
WHERE a.ranking <= b.numberofrowsdesired;

I appreciate the ideas!
 
Upvote 0
Glad you have a solution.

Here's a free sql formatter that may help with readability.

Code:
SELECT *
FROM (
    SELECT b.ACTOR
        ,b.ENTRYDATE_MONTH
        ,b.regbase
        ,(
            SELECT count(b.random_number)
            FROM entry_errors a
            WHERE a.random_number >= b.random_number
                AND a.actor = b.actor
                AND a.entrydate_month = b.entrydate_month
            ) AS ranking
    FROM (
        ENTRY_TINITIALS AS a LEFT JOIN entry_errors AS b
            ON a.TInitials = b.actor
        )
    LEFT JOIN archetypes_list AS c
        ON b.actor = c.[t-initials]
    WHERE b.cred_days < 30
        AND c.archetype IN (
            1
            ,3
            )
    ORDER BY b.actor
        ,b.entrydate_month
    ) AS x
ORDER BY actor
    ,entrydate_month
    ,ranking;
 
Upvote 0

Forum statistics

Threads
1,214,670
Messages
6,120,831
Members
448,990
Latest member
rohitsomani

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top