address Cleaning / Standardization

Welsh Mark3

Board Regular
Joined
Apr 7, 2014
Messages
115
I have a file of about 2500 companies, however, the addresses are very badly formatted. There are no real defined column names such as address line 1, city state zip. In many cases the data is spread over many columns and if some cases repeated.

I would like to create an output where these addresses are cleaned and Standardized.

Does anyone have any recommendations? I don't think I'm going to find a perfect solution as the data is very bad, but it If can improve the structure it will be a great help. I also understand it's difficult to fully understand the data without seeing it. Is it possible to import a file to this forum?
 

Some videos you may like

Excel Facts

Convert text numbers to real numbers
Select a column containing text numbers. Press Alt+D E F to quickly convert text to numbers. Faster than "Convert to Number"

Joe4

MrExcel MVP, Junior Admin
Joined
Aug 1, 2002
Messages
54,486
Office Version
  1. 365
Platform
  1. Windows
Does anyone have any recommendations? I don't think I'm going to find a perfect solution as the data is very bad, but it If can improve the structure it will be a great help. I also understand it's difficult to fully understand the data without seeing it. Is it possible to import a file to this forum?
You are correct. It will be very difficult to come up with a fool-proof solution. There are just too many variations and "exceptions to the rule" (the same is true for parsing name into first and last name, given that people can have a multiple-word first name or last name).

Typically, what I try to do is come up with "rules" that handle most of the cases (I typically like to be at least 90%), and handle the rest manually. So the key here is to come up with those rules (in plain English). It is largely dependent on what the data looks like and how it is entered.

If you can come up with those rules, post them here and we will help you write what you need. If you aren't sure what the rules should be, please post a sampling of data that covers as many as the common variations as possible. You cannot upload files to this site. But there are tools you can use to post screen images. They are listed in Section B of this link here: http://www.mrexcel.com/forum/board-a...forum-use.html. Also, there is a Test Here forum on this board that you can use to test out these tools to make sure they are working correctly before using them in your question.
 

Welsh Mark3

Board Regular
Joined
Apr 7, 2014
Messages
115
Thank you Joe, I will work on creating the rules and posing a common sample
 

Watch MrExcel Video

Forum statistics

Threads
1,109,276
Messages
5,527,732
Members
409,786
Latest member
AbdulMoix

This Week's Hot Topics

Top