Count of Occurences From First Occurence

holzie

New Member
Joined
Apr 23, 2015
Messages
19
I am sure this might be an easy one as I refer to this site often and helps me out a lot!

I have large data that comes in with number of occurrences by date. What I am trying to do is try to find trends faster by counting the number of occurrences within the first 30 days of first occurrence. For example if the first occurrence happened Sep 1, and there were 3 more in September I would like to have this output for this particular Item in a separate column. See below for example of how I would like it to look. The first table is an example of data that I look at. The next table would be a separate table of a report I would like to see out of the data. Any ideas on how to create something like this?

ItemOccurrence Date
A7/11/2017
A7/14/2017
A7/17/2017
A8/25/2017
A8/27/2017
B1/2/2017
B2/5/2017
B3/6/2017
B4/9/2017
B5/15/2017
C3/6/2017
C3/8/2017
C3/9/2017
C3/10/2017
C3/12/2017

<tbody>
</tbody><colgroup><col><col></colgroup>


ItemFirst 30 Days Since First AppearanceFirst 60 Days Since First AppearanceFirst 90 Days Since First Appearance
A35
B122
C5

<tbody>
</tbody><colgroup><col><col><col><col></colgroup>

<tbody>
</tbody>
 
Yes that was a bit rubbishy I didn't look to hard at the results because I wasn't sure what the answer should be.
Try this:-
Again if the results are not correct please show an example of the correct results.
Code:
[COLOR="Navy"]Sub[/COLOR] MG11Oct12
[COLOR="Navy"]Dim[/COLOR] Rng [COLOR="Navy"]As[/COLOR] Range, Dn [COLOR="Navy"]As[/COLOR] Range, n [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long,[/COLOR] dif [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long,[/COLOR] Dt [COLOR="Navy"]As[/COLOR] Date, Q [COLOR="Navy"]As[/COLOR] Variant, c [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] K [COLOR="Navy"]As[/COLOR] Variant, Al [COLOR="Navy"]As[/COLOR] Variant, nRay [COLOR="Navy"]As[/COLOR] Variant
[COLOR="Navy"]Set[/COLOR] Rng = Range(Range("A2"), Range("A" & Rows.Count).End(xlUp))
c = 1
[COLOR="Navy"]With[/COLOR] CreateObject("scripting.dictionary")
.CompareMode = vbTextCompare
[COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] Dn [COLOR="Navy"]In[/COLOR] Rng
    [COLOR="Navy"]If[/COLOR] Not .Exists(Dn.Value) [COLOR="Navy"]Then[/COLOR]
        ReDim ray(1 To 1)
        ray(1) = Dn.Offset(, 1).Value
        .Add Dn.Value, ray
    [COLOR="Navy"]Else[/COLOR]
        Q = .Item(Dn.Value)
        ReDim Preserve Q(1 To UBound(Q) + 1)
        Q(UBound(Q)) = Dn.Offset(, 1).Value
        .Item(Dn.Value) = Q
    [COLOR="Navy"]End[/COLOR] If
[COLOR="Navy"]Next[/COLOR]
Columns("D:H").ClearContents
Range("D1").Resize(, 5).Value = Array("Item", "Start Date", "First 30 Days Since First Appearance", "First 60 Days Since First Appearance", "First 90 Days Since First Appearance ")
[COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] K [COLOR="Navy"]In[/COLOR] .keys
    [COLOR="Navy"]Set[/COLOR] Al = CreateObject("System.Collections.ArrayList")
    [COLOR="Navy"]For[/COLOR] n = 1 To UBound(.Item(K))
        [COLOR="Navy"]If[/COLOR] Not Al.Contains(.Item(K)(n)) [COLOR="Navy"]Then[/COLOR] Al.Add .Item(K)(n)
    [COLOR="Navy"]Next[/COLOR] n
    Al.Sort: .Item(K) = Al.ToArray
    Dt = .Item(K)(0)
   
    c = c + 1
    Cells(c, 4) = K
    Cells(c, 5) = Dt
        [COLOR="Navy"]If[/COLOR] UBound(.Item(K)) > 1 [COLOR="Navy"]Then[/COLOR]
                [COLOR="Navy"]For[/COLOR] n = 1 To UBound(.Item(K))
                    dif = DateDiff("d", Dt, .Item(K)(n))
                    [COLOR="Navy"]If[/COLOR] dif <= 30 [COLOR="Navy"]Then[/COLOR] Cells(c, 6) = Cells(c, 6) + IIf(n = 1, 2, 1)
                    [COLOR="Navy"]If[/COLOR] dif <= 60 [COLOR="Navy"]Then[/COLOR] Cells(c, 7) = Cells(c, 7) + IIf(n = 1, 2, 1)
                    [COLOR="Navy"]If[/COLOR] dif <= 90 [COLOR="Navy"]Then[/COLOR] Cells(c, 8) = Cells(c, 8) + IIf(n = 1, 2, 1)
                [COLOR="Navy"]Next[/COLOR] n
        [COLOR="Navy"]Else[/COLOR]
            Cells(c, 6) = 1
        [COLOR="Navy"]End[/COLOR] If
[COLOR="Navy"]Next[/COLOR] K
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]With[/COLOR]
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]Sub[/COLOR]
Regards Mick
 
Upvote 0

Excel Facts

How to show all formulas in Excel?
Press Ctrl+` to show all formulas. Press it again to toggle back to numbers. The grave accent is often under the tilde on US keyboards.
You are truly a wizard. The goal for me if this were to work is that I could sort on the 30,60, or 90 day column to find the highest frequency items and that will be my trigger to understand these occurrences. The bottom chart is using the latest macro and the top chart is what I would expect it to look.

ItemOccurrence DateItemStart DateFirst 30 Days Since First AppearanceFirst 60 Days Since First AppearanceFirst 90 Days Since First Appearance
224627/6/2017MX2647/6/2017112
224629/27/2017

<tbody>
</tbody><colgroup><col><col><col span="2"><col><col><col><col span="2"></colgroup>

ItemOccurrence DateItemStart DateFirst 30 Days Since First AppearanceFirst 60 Days Since First AppearanceFirst 90 Days Since First Appearance
224627/6/2017MX2643/27/20176710
224629/27/2017

<tbody>
</tbody><colgroup><col><col><col span="2"><col><col><col><col span="2"></colgroup>
 
Upvote 0
Before you go too much further, I have a few comments/questions

1. When there have been 2 (or more) suggestions, it would be good if you clarified which one you are referring to when you make a statement like "I was able to make it work .." :)

2. When I suggested my code I was basing it on text in the first column (like your original sample). Given that there are (mostly) numerical values in column A of your actual data, my code would need a slight tweak for that but I'll wait pending clarification of other questions below..

3. Your original post result table implied that "First 60 Days Since First Appearance" included the count from the "First 30 days". That is, for Item A you had a count of 5 for "First 60 days". This five must include the 3 you counted for the "First 30 days". To be consistent then, your result for "First 90 days" should also be 5. If the results for 60 and 90 days should not include the earlier results, then surely your result for 60 days should have only been 2. Can you please clarify how those counts should work?

4. You seem to be implying that Mick's latest code is working for you. However, when I run that code it produces a blank in the "First 30" column for some items (eg on your post 1 data it produces a blank for item B). I'm wondering how that can be when surely the very first date of every item will be within 30 days of itself? At least that is what your given result for item B in post 1 indicates. Can you please clarify whether that earliest date should be included in the count (like post 1 example), or excluded from the count (like I think Mick's code is doing). If it is correct to exclude that first date, what should happen if there are multiple instances of that earliest date? For example, if the earliest date is 26 May 2017 and there are 3 instances of that date for the item, should the first 30 day count include none of those dates, 1 of those dates or all 3 of those dates?
 
Last edited:
Upvote 0
Before you go too much further, I have a few comments/questions

1. When there have been 2 (or more) suggestions, it would be good if you clarified which one you are referring to when you make a statement like "I was able to make it work .." :) Ok, my apologies. I initially thought it was working but the more I looked the more it wasn't working.

2. When I suggested my code I was basing it on text in the first column (like your original sample). Given that there are (mostly) numerical values in column A of your actual data, my code would need a slight tweak for that but I'll wait pending clarification of other questions below..

I didn't realize the code would be affected by text or numbers. I apologize. Column A will always contain letters and numbers combined. I took the letters out of most of the items in column A to not share too much from the company but Column A will always consist of numbers and letters.

3. Your original post result table implied that "First 60 Days Since First Appearance" included the count from the "First 30 days". That is, for Item A you had a count of 5 for "First 60 days". This five must include the 3 you counted for the "First 30 days". To be consistent then, your result for "First 90 days" should also be 5. If the results for 60 and 90 days should not include the earlier results, then surely your result for 60 days should have only been 2. Can you please clarify how those counts should work?



4. You seem to be implying that Mick's latest code is working for you. However, when I run that code it produces a blank in the "First 30" column for some items (eg on your post 1 data it produces a blank for item B). I'm wondering how that can be when surely the very first date of every item will be within 30 days of itself? At least that is what your given result for item B in post 1 indicates. Can you please clarify whether that earliest date should be included in the count (like post 1 example), or excluded from the count (like I think Mick's code is doing). If it is correct to exclude that first date, what should happen if there are multiple instances of that earliest date? For example, if the earliest date is 26 May 2017 and there are 3 instances of that date for the item, should the first 30 day count include none of those dates, 1 of those dates or all 3 of those dates?

Ideally, if it has been only 30 days since the first occurrence and there were 5 different occurrences for the first 30 days I would like the table to look like the results table to look like.

ItemFirst Appearance30 Days60 Days90 Days
AM144457/21/2017500

<tbody>
</tbody><colgroup><col><col><col span="3"></colgroup>



If there were 5 more instances between 30-60 days I would like to see the results like this.


ItemFirst Appearance30 Days60 Days90 Days
AM144457/21/20175100

<tbody>
</tbody><colgroup><col><col><col span="3"></colgroup>


And if 5 more instances from 60-90 days

ItemFirst Appearance30 Days60 Days90 Days
AM144457/21/201751015

<tbody>
</tbody><colgroup><col><col><col span="3"></colgroup>


If there were 5 instances in the first 30 days and no more instances for longer than 90 days

First Appearance30 Days60 Days90 Days
AM144457/21/2017555

<tbody>
</tbody>


Yes, the earliest date should be included in the count.



Thanks for the questions and sorry for not being clear. I did not realize the code needed to be so specific. I do appreciate the help and will provide large impact as I was told by others that it would be impossible to do what I am asking. I hope my answers provide clarity.
 
Last edited by a moderator:
Upvote 0
Sorry, I hadn't seen that part of your answers were accidentally included in your quote of my post. I have removed that now & I'll take another look.
 
Last edited:
Upvote 0
Ideally, if it has been only 30 days since the first occurrence and there were 5 different occurrences for the first 30 days I would like the table to look like the results table to look like.

ItemFirst Appearance30 Days60 Days90 Days
AM144457/21/2017500

<tbody>
</tbody><colgroup><col><col><col span="3"></colgroup>


If there were 5 instances in the first 30 days and no more instances for longer than 90 days

First Appearance30 Days60 Days90 Days
AM144457/21/2017555

<tbody>
</tbody>
These 2 examples seem inconsistent to me. If the last example is correct why isn't the first table also 5, 5, 5 at that point?
 
Upvote 0
These 2 examples seem inconsistent to me.
Assuming the last example is the correct one, give this a try. Don't forget to adjust the date format in the code line indicated.

Rich (BB code):
Sub CountEm()
  Dim dMin As Object, dRow As Object
  Dim a As Variant, b As Variant, aDays As Variant
  Dim i As Long, j As Long
  
  aDays = Array(30, 60, 90) '<- The 'day periods' you want to count
  
  ReDim Preserve aDays(1 To UBound(aDays) - LBound(aDays) + 1)
  Set dMin = CreateObject("Scripting.Dictionary")
  Set dRow = CreateObject("Scripting.Dictionary")
  a = Range("A2", Range("B" & Rows.Count).End(xlUp)).Formula
  For i = 1 To UBound(a)
    If dMin.Exists(a(i, 1)) Then
      If a(i, 2) < dMin(a(i, 1)) Then
        dMin(a(i, 1)) = a(i, 2)
      End If
    Else
      dMin(a(i, 1)) = a(i, 2)
      dRow(a(i, 1)) = dMin.Count
    End If
  Next i
  With Range("D1:E1")
    .Value = Array("Item", "First")
    With .Offset(1).Resize(dMin.Count)
      .Columns(1).NumberFormat = "0"
      .Columns(2).NumberFormat = "d/mm/yyyy"   '<- Adjust this to suit your preferred date format
      .Value = Application.Transpose(Array(dMin.keys, dMin.items))
    End With
    .EntireColumn.AutoFit
  End With
  ReDim b(1 To dMin.Count, 1 To UBound(aDays))
  For i = 1 To UBound(a)
    For j = 1 To UBound(aDays)
      If a(i, 2) - dMin(a(i, 1)) <= aDays(j) Then
        b(dRow(a(i, 1)), j) = b(dRow(a(i, 1)), j) + 1
      End If
    Next j
  Next i
  With Range("F1").Resize(, UBound(b, 2))
    .NumberFormat = "0"" Days"""
    .Value = aDays
    .Offset(1).Resize(dMin.Count).Value = b
    .EntireColumn.AutoFit
  End With
End Sub
 
Upvote 0
The first example would be assuming there hasn't been more than 30 days since the first occurrence and the macro was run.

The second example would be assuming more than 90 days was surpassed but all of the instances happened in the first 30 days and never had another incident. So if you don't have another instance happen after the first 30 days, you still had 5 instances within 60 days and 5 instances within 90 days.

Does that clear up the thought process?
 
Upvote 0
I have checked a few so far and the new code looks like it is a winner! You are truly amazing.
 
Upvote 0
I have checked a few so far and the new code looks like it is a winner!
So long as you are happy after you have checked carefully. :)


But I'm still a little unsure about this:
The first example would be assuming there hasn't been more than 30 days since the first occurrence and the macro was run.
Whatever count there is for the first 30 days (and we've established that there will always be at least one), my code will produce at least that number in the 60 and 90 day columns too. That is, the 60 & 90 days can never be zero with my code.

Are you implying that we need to check today's date as well? So say if the earliest date was 1 October 2017 and we ran the code today, there should be zero in the 60 & 90 day columns because the earliest date is less than 30 days before today?
 
Upvote 0

Forum statistics

Threads
1,215,219
Messages
6,123,680
Members
449,116
Latest member
HypnoFant

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top