Duplicate Text Removal

euanh

New Member
Joined
Jan 7, 2004
Messages
20
I have a list of addresses that I get sent from an external source in Excel and the list has some duplicates in it which I need to remove.

For example and entry might be : J Bloggs 1 any street Any Town London Town London Town (the duplicate being London Town)

I have a macro that can remove single word duplicates, ie: London London, but I need one that will remove duplicates of more than one word as in the example above.

The duplicates always occur next to each other so the cell won't start London Town with text inbetween the duplicate entries.

Does anyone know whether there is a macro/UDF that I could use to remove these entries ?

Ordinarily I would go back to the source and ask them to stop duplicating the entry however this will take weeks as it's an external company.

Any help greatly appreciated.
Euan :unsure:
 

Excel Facts

Create a Pivot Table on a Map
If your data has zip codes, postal codes, or city names, select the data and use Insert, 3D Map. (Found to right of chart icons).
Hi Euan

Would a macro that removes any duplicate words in a text do?

Won't that be a problem for

Jack London 1 any street Any Town London Town London Town

results in

Jack London 1 any street Any Town

Since London already appears 1 time in the name, it won't show in the address.

Kind regards
PGC
 
Upvote 0
Hi,

The function I am using at the moment is as follows.


Function RemoveDup(str1)

Dim regEx

Set regEx = CreateObject("vbscript.regexp")

regEx.Global = False

regEx.Pattern = "\b(\w+)\b\s+\1"

regEx.IgnoreCase = True

RemoveDup = regEx.Replace(str1, "$1")

End Function



=RemoveDup(A1)

As for removing words that appear twice in say the address but also in the name, this could cause a problem so if I can get a macro to do this by looking for duplicated words that apear is sequence this would be better.

Thanks
Euan
 
Upvote 0
Hi Euan

I replaced the pattern.

Please try

Code:
Function RemoveDup(sS)

With CreateObject("vbscript.regexp")
  .Pattern = "\b(\w+)\b\s+\b(\w+)\b\s+\1\s+\2"
  .IgnoreCase = True
  .Global = False
   RemoveDup = .Replace(sS, "$1 $2")
End With

End Function

Hope this helps
PGC

EDIT: simplified the pattern
Book2
ABCDE
1JBloggs1anystreetAnyTownLondonTownLondonTownJBloggs1anystreetAnyTownLondonTown
2MyohmyohmyMyohmy
3repeat1repeat2repeat2repeat1repeat2
4arepeatarepeatarepeat
5 
6
Sheet3
 
Upvote 0

Forum statistics

Threads
1,213,560
Messages
6,114,309
Members
448,564
Latest member
ED38

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top