remove shuffled duplicates

vs140580

New Member
Joined
Dec 11, 2015
Messages
10
Suppose a1,b1,c1,d1,e1 are elements of a row and a1,c1,e1,d1,b1 is some permutation of the elements of that row how to retain only one row among all such permutations in several rows and delete remaining I dont know VBA prograoming kind help.
 

Excel Facts

Excel motto
Not everything I do at work revolves around Excel. Only the fun parts.
Excel 2007 and later has remove duplicates in the data tab, for earlier versions the advanced filter will do this
 
Upvote 0
Excel 2007 and later has remove duplicates in the data tab, for earlier versions the advanced filter will do this
I'd be interested to see how you envisage using either option for this problem.
.. or perhaps I have misunderstood it.
 
Upvote 0
My interpretation is that for the following data, you would want to keep the green rows (unique combinations)
and delete the yellow rows (row 4 is a permutation of row 2 & row 6 is a permutation of row 3)

Excel Workbook
ABCDE
1Hdr 1Hdr 2Hdr 3Hdr 4Hdr 5
2a1b1c1d1e1
3a1b1x1j1e1
4a1c1e1d1b1
5z1z2z3z4z5
6x1j1e1a1b1
Remove Permutations



If so, then try ..

Rich (BB code):
Sub DeletePerputations()
  Dim al As Object, d As Object
  Dim Data As Variant, tmp As Variant
  Dim i As Long, j As Long
  Dim Comb As String
  
  Set d = CreateObject("Scripting.Dictionary")
  d.CompareMode = 1
  Set al = CreateObject("System.Collections.ArrayList")
  With Range("A1").CurrentRegion
    Data = .Value
    ReDim tmp(1 To UBound(Data), 1 To 1)
    For i = 1 To UBound(Data)
      For j = 1 To UBound(Data, 2)
        al.Add Data(i, j)
      Next j
      al.Sort
      Comb = Join(al.toarray(), "|")
      If Not d.exists(Comb) Then
        tmp(i, 1) = i
        d.Add Comb, 1
      End If
      al.Clear
    Next i
    Application.ScreenUpdating = False
    With .Resize(, .Columns.Count + 1)
      .Columns(.Columns.Count).Value = tmp
      .Sort Key1:=.Columns(.Columns.Count), Order1:=xlAscending, Header:=xlNo, _
            OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, DataOption1:=xlSortNormal
      With .Columns(.Columns.Count)
        On Error Resume Next
        .SpecialCells(xlBlanks).EntireRow.Delete
        On Error GoTo 0
        .ClearContents
      End With
    End With
    Application.ScreenUpdating = True
  End With
End Sub


After the code:

Excel Workbook
ABCDE
1Hdr 1Hdr 2Hdr 3Hdr 4Hdr 5
2a1b1c1d1e1
3a1b1x1j1e1
4z1z2z3z4z5
5
6
Remove Permutations
 
Upvote 0
I'd be interested to see how you envisage using either option for this problem.

Well your code is better, but after a formula:


Excel 2010
ABCDEFGHIJK
112345
2a1b1c1d1e1a1b1c1d1e1
3q1w1e1r1t1e1q1r1t1w1
4b1a1d1c1e1a1b1c1d1e1
5zxc1vbbc1vxz
6a1sd1fga1d1fgs
7e1a1b1d1c1a1b1c1d1e1
8poiuyiopuy
9mnb1vc1b1c1mnv
Sheet7 (5)
Cell Formulas
RangeFormula
G2{=INDEX($A2:$E2,MATCH(COLUMN(A2),COUNTIF($A2:$E2,"<="&$A2:$E2),0))}
Press CTRL+SHIFT+ENTER to enter array formulas.


the advanced filter can hide the sorted group duplicate rows, and the originals can be copied with F5-special-visible. My above suggestion did imply that no sorting would be required, and it seems that remove duplicates won't work because there's no way to remember which rows were deleted. So I was 1/3 right :cool:
 
Upvote 0
Well your code is better, but after a formula:


the advanced filter can hide the sorted group duplicate rows, and the originals can be copied with F5-special-visible. My above suggestion did imply that no sorting would be required, and it seems that remove duplicates won't work because there's no way to remember which rows were deleted. So I was 1/3 right :cool:
Well that does at least give the OP a chance - I thought that you original response was a bit light-on with help. :)

I have 3 comments.

1. As you have noted, Remove Duplicates is no use because you have created a different range. However, I would warn about ever recommending Remove Duplicates as it is known to fail (without indication) in some circumstances. As far as I know it has not yet been fixed. An example can be found here.

2. I don't know if it is possible with the OPs data, but your sorting formula fails if there are repeated entries in a row. eg Change cell D2 in your layout above to "b1"

3. I would recommend using a COLUMNS() function instead of COLUMN() in your formula. To see why, with your sheet and formulas as shown above, insert a new column A to the left of the data.
 
Last edited:
Upvote 0
THe question now is if they are spread across workbooks due to large data files
That isn't terribly informative but in any case would make me think it is time to use Access or perhaps one of the Power BI tools.
 
Upvote 0
1. I would warn about ever recommending Remove Duplicates as it is known to fail (without indication) in some circumstances. As far as I know it has not yet been fixed.
Ok that's not good

your sorting formula fails if there are repeated entries in a row. eg Change cell D2 in your layout above to "b1"

3. I would recommend using a COLUMNS() function instead of COLUMN() in your formula. To see why, with your sheet and formulas as shown above, insert a new column A to the left of the data.

I remember there was a reason why so many used columns instead of column, and I should have gone with the first option here: http://www.mrexcel.com/forum/excel-questions/706066-simple-formula-sort-text.html which seems to be better with repeats: =INDEX($A2:$E2,MATCH(SMALL(COUNTIF($A2:$E2,"<="&$A2:$E2),COLUMNS($A2:A2)),COUNTIF($A2:$E2,"<="&$A2:$E2),0))


THe question now is if they are spread across workbooks due to large data files

How difficult would it be to consolidate them into one with perhaps a reference number for each line?
 
Upvote 0

Forum statistics

Threads
1,215,473
Messages
6,125,018
Members
449,203
Latest member
tungnmqn90

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top