How do I identify identical rows in Excel or VBA

twinkle99

Board Regular
Joined
Aug 7, 2005
Messages
240
I have a large data file which has thousands of rows of data.

In Column A is a text string which is the reference number followed by an element. In Column C is the amount of that element and Column D is the date of that element.

I have written a macro which reads through the data file and finds the data which is relevant to the reference number and date both of which are inputted on another sheet (not supplied).

The problem I have is that in Column A, there may be rows which are identical. The macro looks for the text string, however, I dont know how to get the Macro to distinguish between the identical rows - i.e. 1021#INCV.

An example of the VBA for one input is:

With Application.WorksheetFunction
a.Range("C10") = .Index(b.Range("C:C"), .Match(b.Range("C11").Value & "#INCV", s, 0))

C:C - is the result that I want - the amount
C11 - is the reference number
s - is Column A:A where the text string will be

Basically, the macro should find the text string, check whether the date in D:D is before the calculation date (input on another sheet) and if it isnt, it should move to the next string and check the date etc until it finds the one which is before. However, it doesnt work because the macro just finds the first line every time (because the other INCV's are identical)

Is there a way of either writing an excel formula in Column B which will rename the identical rows when they occur i.e. 1021#INCV2, then 1021#INCV3 etc. Note that the elements for each member can appear in any row and are not necessarily in the same place, The dates and amounts will also be different. So members may return more rows than others. If this is possible I can just look for the different text strings in the macro.

Alternatively, is there a way of doing something in the Macro?

I have supplied a very short piece of the data file.

Thanks
vba test.xls
ABCD
11021#INCV160018/09/2005
21021#CA042018/09/2005
31021#ADP62018/09/2005
41021#INCV145220/09/2004
51021#CA041520/09/2004
61021#ADP8820/09/2004
71021#CA049017/09/2003
81021#ADP3317/09/2003
91021#INCV120017/09/2003
101021#INCV116616/08/2002
111021#INCV102015/08/2001
121021#INCV95016/01/2000
131569#INCV156018/09/2005
141569#CA0520018/09/2005
151569#ADP61518/09/2005
161569#INCV150020/09/2004
171569#CA0515020/09/2004
181569#ADP59520/09/2004
191569#CA059017/09/2003
201569#ADP50017/09/2003
211569#INCV140017/09/2003
221569#INCV133312/08/2002
Sheet1
 

Excel Facts

Test for Multiple Conditions in IF?
Use AND(test, test, test, test) or OR(test, test, test, ...) as the logical_test argument of IF.
this is reg identifying the identical rows.
in an empty column inex to data n a row for e.g. row no. 1
type
=a1&b1&c1&d1..........
then sort the whole data according to the helper column
the identical rows will be together.

venkat
 
Upvote 0
ok, that will work but how do i then distinguish the identical rows from each other. I would prefer to have the latest row as say 1021#INCV and then working back 1021#INCV2, 1021#INCV3 etc
 
Upvote 0
It is not clear what is meant by “identical rows”. The identical rows are those rows where the element in each column is same. There is no such identical row atleast within your data. In column A there is only INCV there is no INCV2 or INCV3
Anyhow see the spreadsheet below. Does it give any insight? The formulas in column E and F and G may be seen.

give your response please. you can add in column H what you expect.


venkat
Book3
ABCDEFG
11021#INCV95016/01/20001021#INCV95016/01/2000INCV95016/01/2000INCV9501
21021#INCV160018/09/20051021#INCV160018/09/2005INCV160018/09/2005INCV1600
31569#INCV156018/09/20051569#INCV156018/09/2005INCV156018/09/2005INCV1560
41569#INCV150020/09/20041569#INCV150020/09/2004INCV150020/09/2004INCV1500
51021#INCV145220/09/20041021#INCV145220/09/2004INCV145220/09/2004INCV1452
61569#INCV140017/09/20031569#INCV140017/09/2003INCV140017/09/2003INCV1400
71569#INCV133312/8/20021569#INCV133312/8/2002INCV133312/8/2002INCV1333
81021#INCV120017/09/20031021#INCV120017/09/2003INCV120017/09/2003INCV1200
91021#INCV116616/08/20021021#INCV116616/08/2002INCV116616/08/2002INCV1166
101021#INCV102015/08/20011021#INCV102015/08/2001INCV102015/08/2001INCV1020
111569#CA059017/09/20031569#CA059017/09/2003CA059017/09/2003CA059017
121569#CA0520018/09/20051569#CA0520018/09/2005CA0520018/09/2005CA052001
131569#CA0515020/09/20041569#CA0515020/09/2004CA0515020/09/2004CA051502
141021#CA049017/09/20031021#CA049017/09/2003CA049017/09/2003CA049017
151021#CA042018/09/20051021#CA042018/09/2005CA042018/09/2005CA042018
161021#CA041520/09/20041021#CA041520/09/2004CA041520/09/2004CA041520
171021#ADP8820/09/20041021#ADP8820/09/2004ADP8820/09/2004ADP8820/
181021#ADP62018/09/20051021#ADP62018/09/2005ADP62018/09/2005ADP62018
191569#ADP61518/09/20051569#ADP61518/09/2005ADP61518/09/2005ADP61518
201569#ADP59520/09/20041569#ADP59520/09/2004ADP59520/09/2004ADP59520
211569#ADP50017/09/20031569#ADP50017/09/2003ADP50017/09/2003ADP50017
221021#ADP3317/09/20031021#ADP3317/09/2003ADP3317/09/2003ADP3317/
23sortedaccordingtoColFdescending
Sheet1
 
Upvote 0
Hi

The identical Rows are in Column A, this is the problem i have. The data is exported from a database and this is how it is presented. My problem is that the Macro I have written will only read the first occurance of (for example) 1021#INCV and hence when it outputs the result, it is always 1600.

I think I have two options:

1. Write something in the Macro which will go to the next row where 1021#INCV exists (if the first occurance does not meet the criteria). Is there a next function in VBA?, this is abit beyond me to be honest.

2. Somehow manipulate the data in Column A. This appears to be the route you have taken but I dont think this will work for me.

If I sort Column A in the Macro I will get the following (for example):

1021#INCV
1021#INCV
1021#INCV
1021#INCV
1021#INCV

These are in date order as the data is exported in that way.

What I now need to do is manipulate the data somehow in Column A, along the lines:

1021#INCV1 - this being the latest one
1021#INCV2
1021#INCV3
1021#INCV4
1021#INCV5 - this being the earliest one

Or something very similar, where I have now seperated the identical strings.

I can then re write the Macro so that it starts with 1021#INCV1 and then moves to the next INCV (1021#INCV2 etc) for that members until it matches.

Hope this makes sense.
 
Upvote 0
I hope I have understood.
See the she spreadsheet below AFTER running the macro.
I have utilized the columns B and E assuming they are empty. If they have data you take two columns after the last column. In that case the macro needs slight modification in . the addresses of the cells.
see comment lines in the macro.
see column E

you can run the macro again once or twice to see whetehr you get what you want.
there is rpovision in the macro to delete the range B2 down and E2 down and then tor run the othre ocde satements.



Venkat .
book1.xls
ABCDE
1H1H2H3H4H5
21021#ADPADP8820/09/2004ADP_1
31569#ADPADP59520/09/2004ADP_2
41569#ADPADP61518/09/2005ADP_3
51021#ADPADP62018/09/2005ADP_4
61021#ADPADP3317/09/2003ADP_5
71569#ADPADP50017/09/2003ADP_6
81021#CA04CA041520/09/2004CA04_1
91021#CA04CA042018/09/2005CA04_2
101021#CA04CA049017/09/2003CA04_3
111569#CA05CA0515020/09/2004CA05_1
121569#CA05CA0520018/09/2005CA05_2
131569#CA05CA059017/09/2003CA05_3
141021#INCVINCV145220/09/2004INCV_1
151569#INCVINCV150020/09/2004INCV_2
161569#INCVINCV156018/09/2005INCV_3
171021#INCVINCV160018/09/2005INCV_4
181021#INCVINCV120017/09/2003INCV_5
191569#INCVINCV140017/09/2003INCV_6
201021#INCVINCV116616/08/2002INCV_7
211021#INCVINCV95016/01/2000INCV_8
221021#INCVINCV102015/08/2001INCV_9
231569#INCVINCV133312/8/2002INCV_10
Sheet1


THE MACRO IS GIVEN BELOW
Code:
Sub test()
Dim rng, c As Range
Dim i As Integer
i = 0
'the next two lines is to clear the result of the macro
'so that you can test the macro once or twice
Range(Range("B2"), Range("b2").End(xlDown)).Clear
Range(Range("e2"), Range("e2").End(xlDown)).Clear
'the actual macro starts
Set rng = Range(Cells(2, 1), Cells(Rows.Count, "a").End(xlUp))
'now introduce in column B the alphabet part of cells in column A
'if column B is not free change to a column adjacent to the data
For Each c In rng
Cells(c.Row, "b") = Mid(c.Value, 6, Len(c.Value) - 5)
Next
'now sort according col A ascending, col D descending
'column C ascending
'set rng as the full data including the new column
Set rng = Range(Cells(1, 1), Cells(Rows.Count, "d").End(xlUp))
rng.Sort order1:=xlAscending, key1:=Range("b1"), order2:=xlDescending, key2:=Range("d1"), _
    order3:=xlAscending, key3:=Range("c1"), header:=xlYes
    
'now according to date you add order no. to cells in col B and park it in column E
Set rng = Range(Range("B2"), Range("B2").End(xlDown))
For Each c In rng
If c = c.Offset(1, 0) Then
i = i + 1
Cells(c.Row, "e") = c.Value & "_" & i
Cells(c.Offset(1, 0).Row, "e") = c.Value & "_" & i + 1
Else
i = 0
End If
Next

End Sub
 
Upvote 0
does this help?
Code:
Sub test()
Dim a, i As Long, dic As Object, txt As String, myNum As Long
a = Range("a1",Range("a" & Rows.Count).End(xlUp)).Value
Set dic = CreateObject("Scripting.Dictionary")
dic.CompareMode = vbTextCompare
With CreateObject("VBScript.RegExp")
   For i = 1 To UBound(a,1)
      If Not IsEmpty(e) Then
         .Pattern = "\d+$"
         txt = .replace(Trim(a(i,1)),"")
         If .test(a(i,1)) Then
            myNum = .execute(Trim(a(i,1)))(0)
         Else
            myNum = 0
         End If
         If Not dic.exists(txt) Then
            dic.add txt, myNum
         Else
            a(i,1) = txt & dic(txt) + 1
            dic(txt) = dic(txt) + 1
         End If
      End If
   Next
End With
Range("a1").Resize(UBound(a,1),UBound(a,2)).Value = a
Set dic = Nothing
End Sub
P.S my understanding is different from Venkat, so if this is not what you after just ignore.
 
Upvote 0
Hi venkat1926, Hi Jindon

Thanks you both for your input with this, it is much appreciated.

Jindon, I cant get your code to work, what is it supposed to do?, I am not getting any results.

Venkat, your code works, I made a few tweaks because the reference number (in column A) is not always 4 characters so when i tested it on one with 5 characters it didnt work correctly (see below).
I also didnt want the header and i wanted the result in column B (sorry but i removed your notes).


Sub test3()
Dim rng, c As Range
Dim i As Integer
i = 0
Range(Range("B2"), Range("b2").End(xlDown)).Clear
Range(Range("e2"), Range("e2").End(xlDown)).Clear
Set rng = Range(Cells(1, 1), Cells(Rows.Count, "a").End(xlUp))
For Each c In rng
Cells(c.Row, "b") = c.Value
Next
Set rng = Range(Cells(1, 1), Cells(Rows.Count, "d").End(xlUp))
rng.Sort order1:=xlAscending, key1:=Range("a1"), header:=xlNo
Set rng = Range(Range("A1"), Range("A1").End(xlDown))
For Each c In rng
If c = c.Offset(1, 0) Then
i = i + 1
Cells(c.Row, "b") = c.Value & i
Cells(c.Offset(1, 0).Row, "b") = c.Value & i + 1
Else
i = 0
End If
Next

End Sub



However, I have now found a flaw in the whole thing because i didnt realise that the data is in another sheet (sheet 1). This means that if we sort the sheet i have given you (sheet 2), the index match functions in the macro dont work because they pull data from sheet 1 from the exact same row where the text string was found in sheet 2. For example, where 1021#INCV is found in Sheet 2 (say A1), it will then find the amount from Sheet 1 - from row 1 (the column is always F:F). Similarly, if the text string was in row 22000, it will match on row 22000 in sheet 1.

So basically, can we get this code to work without sorting it?

A short piece of the unsorted data on sheet 2 is given below

Note that the data in column A sheet 2 will always be in reference order first but the elements are not in any particular order except that the latest elements will be shown first (and not all elements are applicable to every member).

Note also that the amounts and dates in column C and D are actually from Sheet 1!, I copied them over rather than sending you another sheet, sorry my mistake.

Many Thanks
vba test.xls
ABCD
11569#INCV156018/09/2005
21569#CA0520018/09/2005
31569#ADP61518/09/2005
41569#INCV150020/09/2004
51569#CA0515020/09/2004
61569#ADP59520/09/2004
71569#INCV140017/09/2003
81569#CA059017/09/2003
91569#ADP50017/09/2003
101569#INCV133312/08/2002
111021#INCV160018/09/2005
121021#CA042018/09/2005
131021#ADP62018/09/2005
141021#INCV145220/09/2004
151021#CA041520/09/2004
161021#ADP8820/09/2004
171021#CA049017/09/2003
181021#ADP3317/09/2003
191021#INCV120017/09/2003
201021#INCV116616/08/2002
211021#INCV102015/08/2001
221021#INCV95016/01/2000
2356233#INCV23318/09/2005
2456233#ADP5018/09/2005
2556233#INCV20017/09/2004
2656233#INCV16316/09/2003
Sheet2
 
Upvote 0
Quote
the reference number (in column A) is not always 4 characters so when i tested it on one with 5 characters it didn’t work correctly (see below).
Unquote
There is no problem about this What you should do is to find the number of characters when # occurs and use this in mid function.

I am thinking on certain lines and shall try on them. I am just now ging out and be back after 3 hours and then I shall try. Don’t worry the whole day is remaining.
Greetings
venkat
 
Upvote 0

Forum statistics

Threads
1,214,641
Messages
6,120,691
Members
448,978
Latest member
rrauni

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top