Count number of occurrences values increase by X within N days in absolute and % terms

Barque

New Member
Joined
Nov 25, 2020
Messages
15
Office Version
  1. 365
I am trying to extract some information out of a data series. I am first trying to count the number of positive and negative streaks and the magnitude of each move. The 2nd step will be to count the frequency of moves by their size and their date.

I have tried looking at some nested IF/AND functions on other threads but am struggling with the best path. I am happy to use helper columns, as many as needed, to keep things simple and not hog too much system resources.

The box on the right is what a summary table might look like. I will be needing both regular and % changes. I work with Climate data.

Any suggestions are appreciated. I have become a big fan of Excel 365 new functions like Filter, Xlookup, etc but haven't reached a solution yet with anything.

Climate Data.xlsx
ABCDEFGH
1DateTempChange% Change +/-Start Date3/1/2021
24/16/202110000.00%0End Date4/16/2021
34/15/202199-1-1.00%0Lookback
44/14/202198-1-1.01%0
54/13/202197-1-1.02%0# of Moves > 10%
64/12/202196-1-1.03%0# of Moves > 5%
74/9/202110044.17%1# of Moves < 10%
84/8/202195-5-5.00%0# of Moves < 5%
94/7/202190-5-5.26%0Shortest Move > 10%
104/6/202185-5-5.56%0
114/5/202180-5-5.88%0
124/2/20211002025.00%1
134/1/202190-10-10.00%0
143/31/202180-10-11.11%0
153/30/202170-10-12.50%0
163/29/202160-10-14.29%0
173/26/20211004066.67%1
183/25/202185-15-15.00%0
193/24/202170-15-17.65%0
203/23/202155-15-21.43%0
213/22/202140-15-27.27%0
223/19/202110060150.00%1
233/18/202180-20-20.00%0
243/17/202160-20-25.00%0
253/16/202140-20-33.33%0
263/15/202120-20-50.00%0
273/12/2021301050.00%1
283/11/2021401033.33%1
293/10/2021501025.00%1
303/9/2021601020.00%1
313/8/2021701016.67%1
323/5/2021902028.57%1
333/4/20211001011.11%1
343/3/202150-50-50.00%0
353/2/202110050100.00%1
363/1/202110000.00%0
Sheet1
Cell Formulas
RangeFormula
E2:E36E2=IF(C2>0,"1","0")
H1H1=MIN(A2:A36)
H2H2=MAX(A2)
C3:C36C3=B3-B2
D3:D36D3=(B3-B2)/B2
B3:B6B3=B2-1
B8:B11B8=B7-5
B13:B16B13=B12-10
B18:B21B18=B17-15
B23:B26B23=B22-20
B27:B31B27=B26+10
 

Excel Facts

What do {} around a formula in the formula bar mean?
{Formula} means the formula was entered using Ctrl+Shift+Enter signifying an old-style array formula.
You could do something like this to offer some flexibility in setting the criteria for the counts:
Book1
GHIJ
1Start Date3/1/2021
2End Date4/16/2021
3Lookback
4CriteriaThresholdCount
5# of Moves>10%11
6# of Moves>20%8
7# of Moves<10%24
8# of Moves<1%23
9Shortest Move > 10%
Sheet6
Cell Formulas
RangeFormula
H1H1=MIN(A2:A36)
H2H2=MAX(A2)
J5:J8J5=COUNTIF($D$2:$D$36,H5&I5)


Two questions:
1) Could you clarify what is meant by, say, a move <10%? Does that mean you are looking for all positive changes less than 10% (which will include all negative changes regardless of their magnitude), or are you looking for changes of x, where -0.1<x<0.1?
2) For the Shortest Move >10%, how are you defining "shortest"? There are two parts to this answer: one involves whether you are looking for the absolute value of the change (related to question #1), and the other involves whether you are basing "shortest" on the percentage movement or the absolute magnitude movement (i.e., column C vs. column D).
 
Upvote 0
I should have made the data Chronological. Below table is updated and should be easier to speak about examples with:

Climate Data.xlsx
ABCDE
1DateTempChange% Change +/-
23/1/20211000
33/2/202198-2-2.00%0
43/3/202196-2-2.04%0
53/4/202194-2-2.08%0
63/5/202192-2-2.13%0
73/8/202110088.70%1
83/9/202110555.00%1
93/10/202111054.76%1
103/11/202111554.55%1
113/12/202112054.35%1
123/15/2021100-20-16.67%0
133/16/202190-10-10.00%0
143/17/202180-10-11.11%0
153/18/202170-10-12.50%0
163/19/202160-10-14.29%0
173/22/20216000.00%0
183/23/20216335.00%1
193/24/20216634.76%1
203/25/20216934.55%1
213/26/20217234.35%1
223/29/20217534.17%1
233/30/20217834.00%1
243/31/20218133.85%1
254/1/20218433.70%1
264/2/20218733.57%1
274/5/20219033.45%1
284/6/20219555.56%1
294/7/20211202526.32%1
304/8/202170-50-41.67%0
314/9/20217557.14%1
324/12/202174-1-1.33%0
334/13/202173-1-1.35%0
344/14/202172-1-1.37%0
354/15/202171-1-1.39%0
364/16/20211002940.85%1
Sheet1
Cell Formulas
RangeFormula
E2:E36E2=IF(C2>0,"1","0")
C3:C36C3=B3-B2
D3:D36D3=(B3-B2)/B2
B3:B6B3=B2-2
B8:B11B8=B7+5
B12B12=B11-20
B13:B16B13=B12-10
B18:B26B18=B17+3
B32:B35B32=B31-1




1.) Good question. I do wish to separate moves into positive and negative buckets first.
2.) Shortest move would be the least amount of days to yield a change of >10%, etc. I probably should have made the changes a bit smaller on the example sheet. I would eventually like to capture absolute magnitude, bucketed by positive or negative, and percentage movement.

One of the main goals is to count and quantify streaks by magnitude and amount of time. For example, A2 to A6, 3/1/2021 - 3/5/2021, was 5 full days. The Temperature change over that time period was 8 degrees, and also was an 8% decrease. So I would like to capture both of those data points somehow. If those happened to be the largest and quickest temperate changes in my dataset, that would be very valuable to know.

I imagine I might need to create helper columns to help loop through the data somehow. I have to answer questions like "What was the fastest temperature rise or fall over 5 days? and by percentage? "The quickest 3 day rise in temperature each Spring going back 5 years?" "What was the largest swing in temperature overnight when humidity was below 60%?"

I am having a tough time with "basing" or anchoring. I think the 1st step of the logic function is something like IF B3<B2, then start counting this as a negative streak. Find the last negative day, stop there, and calculate the absolute move, percentage move, and how long the move was.

Thanks for your response and I hope that clarifies things.
 
Upvote 0
Some of your objectives are clearer...thank you. It sounds as if you plan to perform additional queries that were not initially described. For those that were, my earlier posting offers one way to perform the counts of time steps where sequential steps "exceed" some threshold. For example, if one step has a metric value of x, then exceeding that metric by more than 10% means the subsequent step is >1.1x, and exceeding that metric by more than -10% means the subsequent step is <0.9x. In other words, the step size of the measured value is larger than 10% of the original step's value in some direction. The "count" approach described earlier uses inputs of > and < and some percentage value.

I completely misinterpreted your initial description of a "shortest move > 10%". I assumed that meant which adjacent time steps yielded a movement of >10%, and of those, which was closest to 10%. Your clarification says that for each time step, look at the metric (apparently a temperature) and find the first date on which the metric has been exceeded by at least 10%. The working example below shows one way to do this with helper columns. Not all of the helper columns are necessary, and there are other ways to accomplish everything shown, especially with Excel 365 (which I do not have). Your input is the blue shaded cell, where a positive percentage means you are looking for the next time step where the current temperature is exceeded by that percentage. A negative percentage says to find the next time step where the temperature has decreased by at least that percentage. In this example, I used -10%, so I am looking for the date when the temperature has decreased 10% relative to my current date's temperature. Each time step potentially generates an answer to this question, so you then need to look through this list of dates, determine the time duration between those dates (I used a simple difference to indicate the number of intervals...if you want the end dates included, then 1 needs to be added), and then find the smallest duration. In this case, the helper columns establish T_threshold (the temperature threshold that must be exceeded based on the current date's temperature)(see column G), the date on which T_threshold is first exceeded (col H), the temperature on that date (col I), and the number of days elapsed before T_threshold was exceeded (col J). Since there may be multiple occasions when this occurs, I2:J3 determine the smallest duration and perform a count of the occurrences. In this case, we learn that the smallest duration is only 1 day and it occurs 4 different times. The formulas in cols L:M build this list of those 4 occasions. Again, there are more efficient ways to build that list with Excel 365's improved function set.

Please see father below some additional commentary about temperature.
MrExcel20210420.xlsx
ABCDEGHIJKLM
1
2>+ or <-count-->4List of date pairs
3-10%smallest d-->1-->with shortest time
4DateTempChange% Change +/-T_thresholddate exceeding thresholdTemp on datedaysFromTo
53/1/20211000903/17/202180163/16/20213/17/2021
63/2/202198-2-0.02088.23/17/202180153/17/20213/18/2021
73/3/202196-2-0.020408163086.43/17/202180143/18/20213/19/2021
83/4/202194-2-0.020833333084.63/17/202180134/7/20214/8/2021
93/5/202192-2-0.021276596082.83/17/20218012  
103/8/202110080.0869565221903/17/2021809  
113/9/202110550.05194.53/16/2021907  
123/10/202111050.0476190481993/16/2021906  
133/11/202111550.0454545451103.53/15/20211004  
143/12/202112050.04347826111083/15/20211003  
153/15/2021100-20-0.1666666670903/17/2021802  
163/16/202190-10-0.10813/17/2021801  
173/17/202180-10-0.1111111110723/18/2021701  
183/18/202170-10-0.1250633/19/2021601  
193/19/202160-10-0.142857143054     
203/22/20216000054     
213/23/20216330.05156.7     
223/24/20216630.047619048159.4     
233/25/20216930.045454545162.1     
243/26/20217230.043478261164.8   
253/29/20217530.041666667167.5   
263/30/20217830.04170.24/8/2021709
273/31/20218130.038461538172.94/8/2021708
284/1/20218430.037037037175.64/8/2021707
294/2/20218730.035714286178.34/8/2021706
304/5/20219030.0344827591814/8/2021703
314/6/20219550.055555556185.54/8/2021702
324/7/2021120250.26315789511084/8/2021701
334/8/202170-50-0.416666667063   
344/9/20217550.071428571167.5   
354/12/202174-1-0.013333333066.6   
364/13/202173-1-0.013513514065.7   
374/14/202172-1-0.01369863064.8   
384/15/202171-1-0.013888889063.9   
394/16/2021100290.408450704190   
Sheet1
Cell Formulas
RangeFormula
J2J2=COUNTIF($J$5:$J$39,J3)
J3J3=SMALL(J$5:J$39,1)
E5:E39E5=IF(C5>0,"1","0")
G5:G39G5=$B5*(1+G$3)
H5:H39H5=IFERROR(INDEX($A5:$A$39,AGGREGATE(15,6,(ROW($B5:$B$39)-ROW($B4))/(IF($G$3>0,$B5:$B$39>G5,$B5:$B$39<G5)),1)),"")
I5:I39I5=IFERROR(VLOOKUP(H5,$A$5:$B$39,2),"")
J5:J39J5=IFERROR(H5-$A5,"")
L5:L23L5=IFERROR(INDEX($A$5:$A$39,AGGREGATE(15,6,(ROW($J$5:$J$39)-ROW($J$4))/($J$5:$J$39=$J$3),IF(ROWS(L$5:L5)<=$J$2,ROWS(L$5:L5),""))),"")
M5:M23M5=IFERROR(L5+$J$3,"")
C6:C39C6=B6-B5
D6:D39D6=(B6-B5)/B5
B6:B9B6=B5-2
B11:B14B11=B10+5
B15B15=B14-20
B16:B19B16=B15-10
B21:B29B21=B20+3
B35:B38B35=B34-1
Press CTRL+SHIFT+ENTER to enter array formulas surrounded with curly braces.


I will offer some additional commentary not directly related to the Excel questions. It appears that you are using temperatures as the metrics and have established some percentage of movement in those temperatures as being important. I would advise caution with that approach. The choice of temperature scale will likely change your answers. I don't know what your actual data look like, but if they are based on a non-thermodynamic temperature scale, then it would be prudent to investigate why the analysis should use that scale. Typically, a thermodynamic scale, such as the Kelvin or Rankine scales are used to avoid this issue. As an example, consider the notional example below with evenly spaced time steps and a steady increasing temperature of 2 degrees C per step. For time1, with a temperature of 16 C, we see that its temperature is exceeded at time2, whose temperature is 18 C. But if we do nothing more than convert temperatures from degrees C to Kelvin, then we see that time1's temperature of 289.15 K is exceeded at time13. From a thermodynamics perspective, the latter would be considered the correct answer.

Cell Formulas
RangeFormula
AC5:AC20AC5=AB5+273.15
AD5:AD20AD5=$AB5*(1+AD$3)
AE5:AE20AE5=$AC5*(1+AE$3)
AF5:AG20AF5=IFERROR(INDEX($AA$5:$AA$20,MATCH(AD5,AB$5:AB$20,1)+1),"")
 
Upvote 0

Forum statistics

Threads
1,213,536
Messages
6,114,213
Members
448,554
Latest member
Gleisner2

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top