Calculate probability based on changing values

ColdplayCoaster · Mar 14, 2021

Greetings all,

I am stuck in a quandary and could use some assistance. I have values being calculated and would like to find the value based on my seniority.

The lowest daily value days should give the highest percentage, but currently it's backwards. Any time I change it, it messes up if values are less than others.

"My Seniority" = a number that could be greater/less than the daily value
"Daily Value" = a positive number, no more than the "Total" value
"Multiplier" = this is simply a qualifier to determine if a day should calculate or not.

AA Tracker.xlsx

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

AA

AB

AC

AD

AE

AF

AG

1

APRIL

SUN

MON

TUE

WED

THU

FRI

SAT

2

TOTAL RESERVES

Mar 28

0

0

0

0

1

1

1

3

871

637

676

732

4

5

MY SENIORITY

1

1

1

1

1

1

1

6

523

741

625

407

367

376

525

670

7

8

JUNIOR TO ME

1

1

1

1

1

1

1

9

348

664

580

385

343

423

551

612

10

11

% SENIORITY

1

1

1

1

1

1

1

12

60.05%

626

495

393

370

408

470

600

13

14

TOTAL BIDDERS

1

0

0

0

1

1

1

15

4,623

579

431

409

448

480

560

16

17

TARGETED LINE

0

0

0

0

0

0

0

18

3,752

19

0.00%

RSV Bid Strategy

Cell Formulas
Range	Formula
J2,AD17,Z17,V17,R17,N17,J17,AD14,Z14,V14,R14,N14,J14,AD11,Z11,V11,R11,N11,J11,AD8,Z8,V8,R8,N8,J8,AD5,Z5,V5,R5,N5,J5,AD2,Z2,V2,R2,N2	J2	=F2+1
F4,AD19,Z19,V19,R19,N19,J19,F19,AD16,Z16,V16,R16,N16,J16,F16,AD13,Z13,V13,R13,N13,J13,F13,AD10,Z10,V10,R10,N10,J10,F10,AD7,Z7,V7,R7,N7,J7,F7,AD4,Z4,V4,R4,N4,J4	F4	=IFERROR(1-IF(I2=2,1,(IF(F3,(IF($A$6<=F3,(((F3-$A$6)/F3)I2),(($A$6-F3)/F3)I2)),""))),0)
F5,F17,F14,F11,F8	F5	=AD2+1
A9	A9	=A3-A6
A12	A12	=A6/A3
A18	A18	=A15-A3

KRice · Mar 14, 2021

You've omitted a lot of detail that is necessary to advise further. For the terms that you've mentioned, are the following assumptions correct?

1. "My Seniority" is a single number you enter into cell A6, currently 523, which represents your seniority relative to a larger population consisting of 871 "total reserves" shown in cell A3.

2. The "Daily Value" is a set of numbers that you input, one for each day of the month...but the numbers are not computed here. The Daily Value is found immediately below the date for each day of the month...for example, for 1 April, the Daily Value is 637, and for 30 April, it is 480.

3. The "Multiplier" is a set of numbers that you input, one for each day of the month...and again, the numbers are not computed here. The Multiplier is found immediately to the right of the date for each day of the month...for example, for 1 April, the Multiplier is 1, and for 28 April, it is 0 (a blank). It appears, based on your formula, that the Multipliers might take on a value of 2 sometimes.

It appears that you are defining the seniority percentage on the basis of how far you are from the top of a seniority list. In other words, the most senior person has a seniority of 0 % and the most junior person has a seniority of 100 %. Because in your example, you have 60 % seniority when you are 523rd out of 871 (and there are 348 people more junior to you).

So if the Daily Values represent your somehow daily-varying position within this population of 871, you should expect larger Daily Values will have higher seniority percentages, and they do. Similarly, smaller Daily Values will have lower seniority percentages, and they do.

What am I not understanding?

ColdplayCoaster · Mar 14, 2021

KRice said:
You've omitted a lot of detail that is necessary to advise further. For the terms that you've mentioned, are the following assumptions correct?

1. "My Seniority" is a single number you enter into cell A6, currently 523, which represents your seniority relative to a larger population consisting of 871 "total reserves" shown in cell A3.

2. The "Daily Value" is a set of numbers that you input, one for each day of the month...but the numbers are not computed here. The Daily Value is found immediately below the date for each day of the month...for example, for 1 April, the Daily Value is 637, and for 30 April, it is 480.

3. The "Multiplier" is a set of numbers that you input, one for each day of the month...and again, the numbers are not computed here. The Multiplier is found immediately to the right of the date for each day of the month...for example, for 1 April, the Multiplier is 1, and for 28 April, it is 0 (a blank). It appears, based on your formula, that the Multipliers might take on a value of 2 sometimes.

It appears that you are defining the seniority percentage on the basis of how far you are from the top of a seniority list. In other words, the most senior person has a seniority of 0 % and the most junior person has a seniority of 100 %. Because in your example, you have 60 % seniority when you are 523rd out of 871 (and there are 348 people more junior to you).

So if the Daily Values represent your somehow daily-varying position within this population of 871, you should expect larger Daily Values will have higher seniority percentages, and they do. Similarly, smaller Daily Values will have lower seniority percentages, and they do.

What am I not understanding?

Hello KRice,

My apologies for the omission:

The end goal is to determine the "probability" of having a certain day off. I.E. a day with a lower Daily Value should give me a higher percentage change of having the day off as less people are required. In theory, the bottom 348 (people junior) would fill those spots first (on a day requiring more than 348).

April 14 should have the second highest (behind April 26-28) probability because only 343 are needed, yet there are 348 people junior to me. April 4th should have the lowest probability because 741 (the highest value) of people are needed (the 348 junior and then counting up from there). Hopefully that makes more sense.

The Daily Values should be inverted so that a larger daily value (requiring more people) would mean I have less of a chance of holding the day off. The issue I've run into is a Daily Value less than the "junior to me".

1. Correct. The total population is 4,623 (cell A15), but only comparing it to the Reserve population (871).
2. Also correct
3. The "multiplier" is either a 1 or 0. If it is 0, then I am guaranteed the day off (100% likelihood) and don't need to calculate. If it is a 1, then it should do the calculation mentioned above.

Does that help clear it up a bit? Thank you!

KRice · Mar 14, 2021

Thank you, that helps to clear up some things. So the Daily Value represents the number of people who are required to perform some mission on a given day...and because there are 348 people more junior than you, smaller Daily Values improve your chance of not being needed?

If that is the case, then for your column A calculations, if you rank 523 out of 871, meaning there are 348 people more junior...then why is your "Seniority" 60 % rather than 40 %? If all 871 people were available on a given day, then your seniority rank would be 40 % from the bottom, right? (871 - 523) / 871 = 348 / 871 = 40 %
Yet the column A computation shows 60 % , which doesn't seem correct to me.

I'm still somewhat puzzled by the objective. How does this computation relate to probability? If there is a need for 400 people on some day and you are ranked 523 out of 871, there are 348 people junior to you who would presumably get the call to report first, then aren't you nearly certain to also get the call to report because an additional 77 people are still needed? Where is the probability component?

KRice · Mar 17, 2021

I'm curious if you made any further progress with this. In my last post, I was attempting to understand why the probability calculations you've described would be considered valid...I just don't see it. If you are told what the Daily Value is, and if the staffing positions are filled based on seniority (lowest seniority reports for duty first), then it would seem that any Daily Value over 348 ensures that you need to work.

There are two probabilistic considerations that occur to me. One is based on the distribution of Daily Values that are given to you. Based on the DV's in your calendar for April-May in your first post, only 1 out of 31 DVs is less than 348 (April 14 with DV=343). Presumably, on that day, the staffing needs were met entirely by people who were more junior to you, and you would have had that day off. The next smallest DV is greater than 348 (April 7 with DV=367), so for that day and all of the other days, it appears that you would have to work. My question about "where" the probability considerations are doesn't seem to involve the equation you've been using. Instead, it would be the distribution of DVs that matters, and where your seniority ranking is relative to that distribution. As a first order estimate, I would say that the distribution of DVs is such that about 1/31 or about 3.2 % of the time, a DV will be declared that allows for full staffing with more junior people, and you would have the day off. That's an observation taken directly from the Post #1 data.

Ideally, you would like to have a much larger data set and attempt to identify a probability distribution function that reasonably well resembles the actual DV data points. I looked a little further at the 31 DV points available. A normal distribution model does not appear to be great match, but also not horrible. A log-normal distribution was slightly better, but unless more data were available and more effort put into identifying a better model, it's difficult to go further.

Let's assume that more data shows the normal distribution is a reasonably good descriptor for the DVs. Then a probability estimate can be made: What is the probability that a DV will be delivered that is less than or equal to the number of people who are more junior to me? And the answer would be =NORM.DIST(# junior, mean, std dev, 1)
In this example, the number more junior = 348, and the statistics of the 31 data points in your post #1, assuming a normal distribution, give a mean = 510, and a standard deviation = 111. The 4th parameter of NORM.DIST says to return the cumulative normal distribution. So =NORM.DIST(348, 510, 111, 1) = 7.3 %, suggesting that there is about a 7 % chance that you would have the day off. This estimated percentage doesn't change from day to day, unless either the distribution statistics change (mean and standard deviation), or the number of people junior to you changes.

The only other probabilistic consideration that occurs to me involves what happens when word is delivered to all of the people who need to report for service. Is their response rate 100 %? If not, then there is another probability component that could be examined, especially if statistics were available on the response rates over time. The implication is that a DV somewhat less than 348 (which above I said probably meant that you would not have to work), may indeed mean that you still might have to work that day, if the response rate of those more junior is too low.

Calculate probability based on changing values

ColdplayCoaster

New Member

Excel Facts

KRice

Well-known Member

ColdplayCoaster

New Member

KRice

Well-known Member

KRice

Well-known Member

Forum statistics

Share this page

Calculate probability based on changing values

ColdplayCoaster

New Member

Excel Facts

KRice

Well-known Member

ColdplayCoaster

New Member

KRice

Well-known Member

KRice

Well-known Member

Forum statistics

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock