remove duplicate lines in text file

yinkajewole

Active Member
Joined
Nov 23, 2018
Messages
257
How can I use a macro to remove duplicate lines in my text file so that only the last occurrence of each line remains? For instance:
the boy is good
great thinking
great thinking
he lives
the boy is good
what's up
great thinking

It should now look like this:
he lives
the boy is good
what's up
great thinking
 

Domenic

MrExcel MVP
Joined
Mar 10, 2004
Messages
19,092
Since duplicate lines are going to be deleted, it shouldn't matter which instance should remain. Or am I missing something?
 

MARK858

MrExcel MVP
Joined
Nov 12, 2010
Messages
12,030
Office Version
365, 2010
Platform
Windows, Mobile
Maybe....
Code assumes that your data is in column A and you have a header in A1.

Code:
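Sub DelDupsKeepLast()

    Dim myRng As Range, myCell As Range
    Dim RngDel As Range, LstRw As Long

    ' Data is assumed to start in A2 (header in A1) and run to the last used row in column A
    Set myRng = Range("A2:A" & Range("A" & Rows.Count).End(xlUp).Row)
    LstRw = Range("A" & Rows.Count).End(xlUp).Row

    For Each myCell In myRng

        If myCell.Row < LstRw Then
            ' If the same value appears anywhere below this cell, this cell is an earlier duplicate
            If Not myCell.Offset(1, 0).Resize(LstRw - myCell.Row).Find(What:=myCell.Value, Lookat:=xlWhole) Is Nothing Then
                ' Collect the cell for deletion rather than deleting inside the loop
                If RngDel Is Nothing Then
                    Set RngDel = myCell
                Else
                    Set RngDel = Application.Union(RngDel, myCell)
                End If
            End If
        End If

    Next myCell

    ' Delete all earlier duplicates in one go, leaving the last occurrence of each value
    If Not RngDel Is Nothing Then RngDel.EntireRow.Delete

End Sub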
 

yinkajewole

Active Member
Joined
Nov 23, 2018
Messages
257
The data is not in Excel; it's in a text file (.txt).
 

MARK858

MrExcel MVP
Joined
Nov 12, 2010
Messages
12,030
Office Version
365, 2010
Platform
Windows, Mobile
You do realize this is an Excel site? Import the data into Excel, run the code, and save the result as a text file.
 

MARK858

MrExcel MVP
Joined
Nov 12, 2010
Messages
12,030
Office Version
365, 2010
Platform
Windows, Mobile
No, you can't remove duplicates while the data is still in a text file. What is so difficult about importing it into Excel and then saving it?
 

Eric W

MrExcel MVP
Joined
Aug 18, 2015
Messages
9,293
Try this on a COPY of your text file. It asks for the file name, opens it, reads it, processes the data, and writes back the results to the same file.

Code:
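Sub RemoveDups()
    Dim FileName As Variant, MyData As String, MyResult As String
    Dim MyLines As Variant, i As Long
    Dim Kept As Long, Lost As Long

    ' GetOpenFilename returns the Boolean False if the dialog is cancelled
    FileName = Application.GetOpenFilename
    If VarType(FileName) = vbBoolean Then Exit Sub

    ' Read the whole file into memory
    Open FileName For Input As #1
    MyData = Input(LOF(1), 1)
    Close #1

    ' Ignore the empty element Split would produce when the file ends with a line break
    If Right$(MyData, 2) = vbCrLf Then MyData = Left$(MyData, Len(MyData) - 2)

    ' Walk the lines from last to first so the last occurrence of each line is the one kept.
    ' MyResult is wrapped in vbCrLf so InStr only matches whole lines (case-insensitively).
    MyResult = vbCrLf
    MyLines = Split(MyData, vbCrLf)
    For i = UBound(MyLines) To 0 Step -1
        If InStr(1, MyResult, vbCrLf & MyLines(i) & vbCrLf, vbTextCompare) = 0 Then
            MyResult = vbCrLf & MyLines(i) & MyResult
            Kept = Kept + 1
        Else
            Lost = Lost + 1
        End If
    Next i

    ' Skip the 2-character leading vbCrLf; the trailing semicolon stops Print adding another line break
    Open FileName For Output As #1
    Print #1, Mid(MyResult, 3);
    Close #1

    MsgBox Kept & " lines were kept" & vbCrLf & Lost & " lines were removed"

End Sub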
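
If the file is large, searching a growing string with InStr for every line can get slow. As a rough sketch only (not tested on your data), the same keep-the-last-occurrence logic could be done with a late-bound Scripting.Dictionary, assuming you are on Windows where that object is available. The function name DedupKeepLast and its parameter are made up for illustration; it works on a string already read from the file rather than on the file itself.

```vba
' Hypothetical helper: returns Source with duplicate lines removed,
' keeping the last occurrence of each line (lines separated by vbCrLf).
Function DedupKeepLast(ByVal Source As String) As String
    Dim Seen As Object, Parts() As String, Keep() As Boolean
    Dim Out() As String, i As Long, n As Long

    If Len(Source) = 0 Then Exit Function

    Set Seen = CreateObject("Scripting.Dictionary")
    Seen.CompareMode = vbTextCompare   ' case-insensitive, like the InStr version

    Parts = Split(Source, vbCrLf)
    ReDim Keep(0 To UBound(Parts))

    ' Scan from the bottom: the first time a line is seen is its last occurrence
    For i = UBound(Parts) To 0 Step -1
        If Not Seen.Exists(Parts(i)) Then
            Seen.Add Parts(i), True
            Keep(i) = True
        End If
    Next i

    ' Collect the kept lines in their original order
    ReDim Out(0 To Seen.Count - 1)
    For i = 0 To UBound(Parts)
        If Keep(i) Then
            Out(n) = Parts(i)
            n = n + 1
        End If
    Next i

    DedupKeepLast = Join(Out, vbCrLf)
End Function
```

In RemoveDups you would call it on MyData in place of the InStr loop and write the returned string back to the file.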
 

yinkajewole

Active Member
Joined
Nov 23, 2018
Messages
257
This is too good to be true... Exactly what I was looking for. Many thanks to you all.
 
