Extracting a UK postcode from a string

YorkshireMidge

New Member
Joined
Jun 3, 2017
Messages
5
Have some addresses in comma delimited form within a cell, but the format is all over the place so if I do text to column, the postcode can end up in as many as four different columns. The postcode isn't always at the end either - it is sometimes followed by UK.

If I use the SEARCH function, I can pick out a postcode in a particular format and return the numeric value of where it starts - and could then use that to extract I guess, but I don't know how to cater for the fact there are three variations in length on a UK postcode (and 6 formats).

In essence, what I want to do is search in a cell for ",?? ???," and ",??? ???," and ",???? ???," and whichever format of the three is found, extract postcode minus commas to the adjoining cell. There is always a postcode to be found in the cell being searched.

So you could have permutations like this (and more) in the data.

House number,street,town,LLN NLL,
House number,street,district,city,LNN NLL,UK
Flat number,building number,street,district,city,county,LLNN NLL,UK
Building name,building purpose,street,city,county,LN NLL,

Any help appreciated!

Thanks.
 

Excel Facts

Pivot Table Drill Down
Double-click any number in a pivot table to create a new report showing all detail rows that make up that number
i am only replying with a possible solution as i can't see any unique identifier to enable string parsing.

firstly as they all seem to be UK orientated as per your thread title. if you were to use find and replace ( select column with strings press CTRL + H keys to bring up dialogue box)

in find put UK and replace with lower case uk ( you can reverse this later if wanted) now we have unique way of identifying your post code

here is a function thread that will extract upper case words in a string

https://www.mrexcel.com/forum/excel...ula-extract-upper-case-words-text-string.html

please read all the replies to thread

hope this offers some hope to a solution
 
Upvote 0
Try this convoluted formula with a helper cell

=SUBSTITUTE(C2,", UK","") in O2, your address in C2

=IF(ISNUMBER(--MID(O2,LEN(O2)-2,1)),TRIM(RIGHT(SUBSTITUTE(O2," ",REPT(" ",100)),200)),"") in P2

its built on the logic that the third digit from the right in a UK postcode is always a number

so it should substitute , UK to an empty value in O2

then where the post code should reside it pickups up the third from right, and ensures that a text number is shown as a number

if its a number then it extracts the whole of the postcode

without the helper column

=IF(ISNUMBER(--MID(SUBSTITUTE(C2,", UK",""),LEN(SUBSTITUTE(C2,", UK",""))-2,1)),TRIM(RIGHT(SUBSTITUTE(SUBSTITUTE(C2,", UK","")," ",REPT(" ",100)),200)),"")


I'm sure someone can make it better
 
Last edited:
Upvote 0
Yorkshire, if, per your examples, there is always a comma before and after the postal code, you should be able to use this formula to extract the postal codes in all cases:

=TRIM(MID(A1,FIND(",",A1,(FIND(" ",A1,LEN(A1)-7)-5))+1,(FIND(",",A1,(FIND(" ",A1,LEN(A1)-7)))-FIND(",",A1,(FIND(" ",A1,LEN(A1)-7)-5)))-1))

If there are some cases where the postal code is last with no comma directly after it, you could create a helper column that is simply =A1&"," which would add a comma to the end of every entry. Even if some were superfluous or created back-to-back final commas, the main formula I provided should still work. The important thing is that every postal code have a comma before and after it.

The "TRIM" function will take care of cases where there may be a single space preceding or following the postal code — but it will result in an error if there is space both before and​ after it. That was just an extra.
 
Upvote 0
Yorkshire, if, per your examples, there is always a comma before and after the postal code, you should be able to use this formula to extract the postal codes in all cases:

=TRIM(MID(A1,FIND(",",A1,(FIND(" ",A1,LEN(A1)-7)-5))+1,(FIND(",",A1,(FIND(" ",A1,LEN(A1)-7)))-FIND(",",A1,(FIND(" ",A1,LEN(A1)-7)-5)))-1))

If there are some cases where the postal code is last with no comma directly after it, you could create a helper column that is simply =A1&"," which would add a comma to the end of every entry. Even if some were superfluous or created back-to-back final commas, the main formula I provided should still work. The important thing is that every postal code have a comma before and after it.

The "TRIM" function will take care of cases where there may be a single space preceding or following the postal code — but it will result in an error if there is space both before and​ after it. That was just an extra.

Erik, I tried your code to see what it did. UK post Codes have the last three as Number, Letter, Letter that is consistant everwhere, what is not obvious is the first part which can be

E1 9NN
E14 9NN
SE1 9NN
SE11 9NN
 
Last edited:
Upvote 0
Yorkshire, I'm not understanding you.

You said you tried my code "to see what it did." But you didn't say whether or not it worked. It should have taken care of all cases you listed, if there is a comma before and after the postal code. It works on all cases you provided as examples in your initial post.

If it did not work for you, could you provide a few examples where it DID NOT work, including the original entry and the incorrect result of the formula?
 
Last edited:
Upvote 0
@Erik (I'm not yorkshire)

this produces
SHAW ROAD, DAGENHAM, RM10 7QJ, UK #VALUE

the OP also said that UK is not at the end of all, and i appreciate there was a suggestion to add an end comma via a helper cell

<colgroup><col width="424"></colgroup><tbody>
</tbody>
 
Upvote 0
Thanks for the example, Mole. It's the presence of spaces after every comma in your example that doesn't match the format presented by the original poster. If this is the case in the actual data, I'd just select all of Column A (or the column with the original entries) and Find/Replace [comma space] with [comma], in order to quickly resolve that issue. The formula I suggest will take over from there.

Alternatively, you could just replace the presence of '7' in my formula with '8':

=TRIM(MID(A1&CHAR(44),FIND(",",A1&CHAR(44),(FIND(" ",A1&CHAR(44),LEN(A1&CHAR(44))-8)-5))+1,(FIND(",",A1&CHAR(44),(FIND(" ",A1&CHAR(44),LEN(A1&CHAR(44))-7)))-FIND(",",A1&CHAR(44),(FIND(" ",A1&CHAR(44),LEN(A1&CHAR(44))-8)-5)))-1))

That should work even if you don't do the FIND/REPLACE and WITHOUT running the add-comma helper column.

Thanks again for pointing out the exception.
 
Last edited:
Upvote 0
Ah, found an exception again.

This should work without FIND/REPLACE or adding helper commas to the end:

=TRIM(MID(TRIM(A1),FIND(",",TRIM(A1),(FIND(" ",TRIM(A1),LEN(TRIM(A1))-8)-6))+1,(FIND(",",TRIM(A1),(FIND(" ",TRIM(A1),LEN(TRIM(A1))-8)))-FIND(",",TRIM(A1),(FIND(" ",TRIM(A1),LEN(TRIM(A1))-8)-6)))-1))
 
Last edited:
Upvote 0
Guys - this is fantastic and did the trick! It just threw up a handful of exceptions out of many hundreds - and these were where the comma was missing after the postcode, but I can live with that and correct manually.

@Mole - just for info, there are the two London variations, bringing the total to six different formats at the start of a UK postcode (we don't make life easy for ourselves do we?). I've tested the extra two with Erik's formula and they work OK too.

Many thanks!


FormatExampleCoverage
A9 9AAS1 1AAB, E, G, L, M, N, S, W postcode areas
A99 9AAM60 1NW
AA9 9AACR2 6XHAll postcode areas except B, E, G, L, M, N, S, W, WC
AA99 9AADN55 1PT
A9A 9AAW1A 1HQE1W, N1C, N1P, W1 postcode districts
AA9A 9AAEC1M 1BBWC postcode area; EC1-EC4, NW1W, SE1P, SW1 postcode districts

<tbody>
</tbody>
 
Upvote 0

Forum statistics

Threads
1,214,618
Messages
6,120,544
Members
448,970
Latest member
kennimack

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top