Simple Formula or VBA for Verifying Duplicates + one Criteria

bradmsg

New Member
Joined
Jan 30, 2023
Messages
32
Office Version
  1. 365
Platform
  1. Windows
Hi all,

I have been racking my brain for a couple weeks and need some help on this. Sometimes I over look over simple answer and I need someone to put me out of my misery.

Problem:

I have a 300,000 record long list. There are multiple duplicate numbers in column A and a date in column B. I need to check if ALL duplicates are within a certain date range (Between D2 and D4).

1675122551945.png
 
That was exactly what I need it to do!!!!!!! I cant thank you enough! I have been banging my head against the wall for weeks trying to get this functionality! I owe you a beer! or 10!
No problem. Glad I could help!
 
Upvote 0

Excel Facts

Excel motto
Not everything I do at work revolves around Excel. Only the fun parts.
YOU ARE A ****ING GENIOUS!!!!!!!!

That was exactly what I need it to do!!!!!!! I cant thank you enough! I have been banging my head against the wall for weeks trying to get this functionality! I owe you a beer! or 10!
Welcome to the MrExcel board!

😵‍💫 Have you tried it on your 300,000 rows?

My test data has about 35,000 rows (much smaller than yours) with about 500 different dupes that have all dates in the date range.
Both the code from post #9 (after I edited the number of rows to 35,000) and the code below produced the same results but that one took 50 seconds on my machine whereas this one took 0.08 seconds!

I have assumed that there will be less than 65,000 rows in the results. If it could be more then my code would need a slight adjustment.

VBA Code:
Sub Dupes_In_Range()
  Dim d1 As Object, d2 As Object
  Dim a As Variant
  Dim dStart As Date, dEnd As Date
  Dim i As Long
 
  Set d1 = CreateObject("Scripting.Dictionary")
  Set d2 = CreateObject("Scripting.Dictionary")
  a = Range("A2", Range("B" & Rows.Count).End(xlUp)).Value
  dStart = Range("D2").Value
  dEnd = Range("D4").Value
  For i = 1 To UBound(a)
    If d2.exists(a(i, 1)) Then
      If d1.exists(a(i, 1)) Then d1.Remove a(i, 1)
    Else
      If a(i, 2) < dStart Or a(i, 2) > dEnd Then
        d2(a(i, 1)) = 1
      Else
        d1(a(i, 1)) = 1
      End If
    End If
  Next i
  If d1.Count > 0 Then
    With Range("F2").Resize(d1.Count)
      .Value = Application.Transpose(d1.Keys)
      .Sort Key1:=.Cells(1), Order1:=xlAscending, Header:=xlNo
    End With
  Else
     Range("F2").Value = "N/A"
  End If
End Sub
 
Last edited:
Upvote 1
Welcome to the MrExcel board!

😵‍💫 Have you tried it on your 300,000 rows?

My test data has about 35,000 rows (much smaller than yours) with about 500 different dupes that have all dates in the date range.
Both the code from post #9 and the code below produced the same results but that one took 50 seconds on my machine whereas this one took 0.08 seconds!

I have assumed that there will be less than 65,000 rows in the results. If it could be more then my code would need a slight adjustment.

VBA Code:
Sub Dupes_In_Range()
  Dim d1 As Object, d2 As Object
  Dim a As Variant
  Dim dStart As Date, dEnd As Date
  Dim i As Long
 
  Set d1 = CreateObject("Scripting.Dictionary")
  Set d2 = CreateObject("Scripting.Dictionary")
  a = Range("A2", Range("B" & Rows.Count).End(xlUp)).Value
  dStart = Range("D2").Value
  dEnd = Range("D4").Value
  For i = 1 To UBound(a)
    If d2.exists(a(i, 1)) Then
      If d1.exists(a(i, 1)) Then d1.Remove a(i, 1)
    Else
      If a(i, 2) < dStart Or a(i, 2) > dEnd Then
        d2(a(i, 1)) = 1
      Else
        d1(a(i, 1)) = 1
      End If
    End If
  Next i
  If d1.Count > 0 Then
    With Range("F2").Resize(d1.Count)
      .Value = Application.Transpose(d1.Keys)
      .Sort Key1:=.Cells(1), Order1:=xlAscending, Header:=xlNo
    End With
  Else
     Range("F2").Value = "N/A"
  End If
End Sub


Good job. Yeah, I forgot about the 300k set.
I mean, mine would work, though it would be a big slow, and they would need to change the value from 1000 to 300000+
Yours is better :)
 
Upvote 0
Welcome to the MrExcel board!

😵‍💫 Have you tried it on your 300,000 rows?

My test data has about 35,000 rows (much smaller than yours) with about 500 different dupes that have all dates in the date range.
Both the code from post #9 (after I edited the number of rows to 35,000) and the code below produced the same results but that one took 50 seconds on my machine whereas this one took 0.08 seconds!

I have assumed that there will be less than 65,000 rows in the results. If it could be more then my code would need a slight adjustment.

VBA Code:
Sub Dupes_In_Range()
  Dim d1 As Object, d2 As Object
  Dim a As Variant
  Dim dStart As Date, dEnd As Date
  Dim i As Long
 
  Set d1 = CreateObject("Scripting.Dictionary")
  Set d2 = CreateObject("Scripting.Dictionary")
  a = Range("A2", Range("B" & Rows.Count).End(xlUp)).Value
  dStart = Range("D2").Value
  dEnd = Range("D4").Value
  For i = 1 To UBound(a)
    If d2.exists(a(i, 1)) Then
      If d1.exists(a(i, 1)) Then d1.Remove a(i, 1)
    Else
      If a(i, 2) < dStart Or a(i, 2) > dEnd Then
        d2(a(i, 1)) = 1
      Else
        d1(a(i, 1)) = 1
      End If
    End If
  Next i
  If d1.Count > 0 Then
    With Range("F2").Resize(d1.Count)
      .Value = Application.Transpose(d1.Keys)
      .Sort Key1:=.Cells(1), Order1:=xlAscending, Header:=xlNo
    End With
  Else
     Range("F2").Value = "N/A"
  End If
End Sub
Wow thanks for your reply as well!!!!

I having been working on modifying it to work on a couple of different sheets. I was going to log back in and try to get a couple more comments so that I could figure out which parts were pulling from where. My start and end dates are on one sheet, the look up values (numbers w/duplicates and dates) on another and the return values I would like on yet another. 3 sheets total. I got it to run without errors but, it is pulling one unique number for my entire data set. LOL

Any ideas/code/comments would be greatly appreciated! You guys are fantastic! On the road now but once I can get back to it I can upload screenshots
 
Upvote 0
I mean, mine would work, though it would be a big slow,
It is actually much worse than I reported before. When I did that test I only changed the "i" loop from 1,000 to 35,000. When I also changed the two "x" loops and the "q" loop to 35,000 the code took almost 30 minutes! Imagine 300,000!! :eek:

3 sheets total.
Should be able to cope with that. Try this version. You will need to check the sheet names and the various ranges in the code.
For now I have assumed that the numbers and dates are in adjacent columns as per the original sample. If those columns are apart, please give details.
VBA Code:
Sub Dupes_In_Range_v2()
  Dim d1 As Object, d2 As Object
  Dim a As Variant
  Dim dStart As Date, dEnd As Date
  Dim i As Long
 
  Set d1 = CreateObject("Scripting.Dictionary")
  Set d2 = CreateObject("Scripting.Dictionary")
  With Sheets("Data") '<-Check sheet name & columns
    a = .Range("A2", .Range("B" & Rows.Count).End(xlUp)).Value
  End With
  With Sheets("StartEnd") '<-Check sheet name & cell addresses
    dStart = .Range("D2").Value
    dEnd = .Range("D4").Value
  End With
  For i = 1 To UBound(a)
    If d2.exists(a(i, 1)) Then
      If d1.exists(a(i, 1)) Then d1.Remove a(i, 1)
    Else
      If a(i, 2) < dStart Or a(i, 2) > dEnd Then
        d2(a(i, 1)) = 1
      Else
        d1(a(i, 1)) = 1
      End If
    End If
  Next i
  With Sheets("Results").Columns("F")  '<-Check sheet name & column for results
    .EntireColumn.ClearContents
    .Cells(1).Value = "Results"
    If d1.Count > 0 Then
      With .Resize(d1.Count).Offset(1)
        .Value = Application.Transpose(d1.Keys)
        .Sort Key1:=.Cells(1), Order1:=xlAscending, Header:=xlNo
      End With
    Else
       .Cells(2).Value = "N/A"
    End If
  End With
End Sub
 
Upvote 0
Welcome to the MrExcel board!

😵‍💫 Have you tried it on your 300,000 rows?

My test data has about 35,000 rows (much smaller than yours) with about 500 different dupes that have all dates in the date range.
Both the code from post #9 (after I edited the number of rows to 35,000) and the code below produced the same results but that one took 50 seconds on my machine whereas this one took 0.08 seconds!

I have assumed that there will be less than 65,000 rows in the results. If it could be more then my code would need a slight adjustment.

VBA Code:
Sub Dupes_In_Range()
  Dim d1 As Object, d2 As Object
  Dim a As Variant
  Dim dStart As Date, dEnd As Date
  Dim i As Long
 
  Set d1 = CreateObject("Scripting.Dictionary")
  Set d2 = CreateObject("Scripting.Dictionary")
  a = Range("A2", Range("B" & Rows.Count).End(xlUp)).Value
  dStart = Range("D2").Value
  dEnd = Range("D4").Value
  For i = 1 To UBound(a)
    If d2.exists(a(i, 1)) Then
      If d1.exists(a(i, 1)) Then d1.Remove a(i, 1)
    Else
      If a(i, 2) < dStart Or a(i, 2) > dEnd Then
        d2(a(i, 1)) = 1
      Else
        d1(a(i, 1)) = 1
      End If
    End If
  Next i
  If d1.Count > 0 Then
    With Range("F2").Resize(d1.Count)
      .Value = Application.Transpose(d1.Keys)
      .Sort Key1:=.Cells(1), Order1:=xlAscending, Header:=xlNo
    End With
  Else
     Range("F2").Value = "N/A"
  End If
End Sub
Thank you for the warm welcome!

I tried the code on my test set of data

1675141002216.png


It is returning the 1 as the first set of code did. One of the "1's" is out of the start and end date?

how would i go about modifying the code to verify that all of the duplicates are within the date range and only return those values?

thanks!
 
Upvote 0
It is actually much worse than I reported before. When I did that test I only changed the "i" loop from 1,000 to 35,000. When I also changed the two "x" loops and the "q" loop to 35,000 the code took almost 30 minutes! Imagine 300,000!! :eek:


Should be able to cope with that. Try this version. You will need to check the sheet names and the various ranges in the code.
For now I have assumed that the numbers and dates are in adjacent columns as per the original sample. If those columns are apart, please give details.
VBA Code:
Sub Dupes_In_Range_v2()
  Dim d1 As Object, d2 As Object
  Dim a As Variant
  Dim dStart As Date, dEnd As Date
  Dim i As Long
 
  Set d1 = CreateObject("Scripting.Dictionary")
  Set d2 = CreateObject("Scripting.Dictionary")
  With Sheets("Data") '<-Check sheet name & columns
    a = .Range("A2", .Range("B" & Rows.Count).End(xlUp)).Value
  End With
  With Sheets("StartEnd") '<-Check sheet name & cell addresses
    dStart = .Range("D2").Value
    dEnd = .Range("D4").Value
  End With
  For i = 1 To UBound(a)
    If d2.exists(a(i, 1)) Then
      If d1.exists(a(i, 1)) Then d1.Remove a(i, 1)
    Else
      If a(i, 2) < dStart Or a(i, 2) > dEnd Then
        d2(a(i, 1)) = 1
      Else
        d1(a(i, 1)) = 1
      End If
    End If
  Next i
  With Sheets("Results").Columns("F")  '<-Check sheet name & column for results
    .EntireColumn.ClearContents
    .Cells(1).Value = "Results"
    If d1.Count > 0 Then
      With .Resize(d1.Count).Offset(1)
        .Value = Application.Transpose(d1.Keys)
        .Sort Key1:=.Cells(1), Order1:=xlAscending, Header:=xlNo
      End With
    Else
       .Cells(2).Value = "N/A"
    End If
  End With
End Sub
Thanks again for all of the help!

Ill get more specific about my actual data set because i am having issues implementing the above.

Sheet 1- Start date- B4, End date B6, List of numbers (these numbers are to be checked against sheet 2)(These numbers represent Buildings)
1675143895248.png


Sheet 2- List that we are checking. Column F contains the list that has duplicates. Each of these duplicates must fall into the start/end range(sheet 1) to be counted. The date for each record is in Column P.

1675144073106.png



Sheet 3- just a blank sheet for now where i would like a list of results.

I hope this makes it easier to understand. thank you again!!!!!!
 
Upvote 0
It is returning the 1 as the first set of code did. One of the "1's" is out of the start and end date?

how would i go about modifying the code to verify that all of the duplicates are within the date range and only return those values?
Yep, mistake on my part, I was missing a line of code. For that basic, one-sheet version, please test this version (new line highlighted).

I will look at the 3-sheet issue separately.

Rich (BB code):
Sub Dupes_In_Range_2()
  Dim d1 As Object, d2 As Object
  Dim a As Variant
  Dim dStart As Date, dEnd As Date
  Dim i As Long
 
  Set d1 = CreateObject("Scripting.Dictionary")
  Set d2 = CreateObject("Scripting.Dictionary")
  a = Range("A3", Range("B" & Rows.Count).End(xlUp)).Value
  dStart = Range("D2").Value
  dEnd = Range("D4").Value
  For i = 1 To UBound(a)
    If d2.exists(a(i, 1)) Then
      If d1.exists(a(i, 1)) Then d1.Remove a(i, 1)
    Else
      If a(i, 2) < dStart Or a(i, 2) > dEnd Then
        d2(a(i, 1)) = 1
        If d1.exists(a(i, 1)) Then d1.Remove a(i, 1)
      Else
        d1(a(i, 1)) = 1
      End If
    End If
  Next i
  If d1.Count > 0 Then
    With Range("F2").Resize(d1.Count)
      .Value = Application.Transpose(d1.Keys)
      .Sort Key1:=.Cells(1), Order1:=xlAscending, Header:=xlNo
    End With
  Else
     Range("F2").Value = "N/A"
  End If
End Sub
 
Upvote 0
Re: The 3-sheet scenario

List of numbers (these numbers are to be checked against sheet 2)
I did not understand that.

Also, could we please have some small sample data for Sheet1 and Sheet2 with XL2BB so we can easily copy for testing?
I cannot really read Sheet2 and in any case there is way too much typing to set that up here to test with.
 
Upvote 0
Re: The 3-sheet scenario


I did not understand that.

Also, could we please have some small sample data for Sheet1 and Sheet2 with XL2BB so we can easily copy for testing?
I cannot really read Sheet2 and in any case there is way too much typing to set that up here to test with.
Sorry, that list is not necessary. It’s another puzzle I’m trying to solve though. If it’s not too big of a modification, that would be perfect.

Sorry I’m having trouble with that program. It can be any data, as long as F:F are numbers with some duplicates and P:P are dates. Sorry i cant provide test data....
 
Upvote 0

Forum statistics

Threads
1,216,094
Messages
6,128,785
Members
449,468
Latest member
AGreen17

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top