# Array formula to get longest common prefix

#### Fractalis

##### Active Member
I have a list of numbers and I want to get the common prefix between them.

I've been able to make this array formula that gives me correct output but comparing only 2 numbers (stored as text).

Code:
``````A1=[COLOR=#0000ff][B]23702[/B][/COLOR]000000
B1=[B][COLOR=#0000ff]23702[/COLOR][/B]999999

C1="=LEFT(A1,SUMPRODUCT((LEFT(\$A1,COLUMN(\$A\$1:\$R\$1))=LEFT(\$B1,COLUMN(\$A\$1:\$R\$1)))*1))"

Output: [COLOR=#0000ff][B]23702[/B][/COLOR]``````

Is there a way to get the longest common prefix between more than 2 strings with an array formula? or for this task only VBA macro could help?

If I have the following 5 values.
Code:
``````[B][COLOR=#0000ff]7551[/COLOR][/B]1813
[B][COLOR=#0000ff]7551[/COLOR][/B]18
[B][COLOR=#0000ff]7551[/COLOR][/B]184
[COLOR=#0000ff][B]7551[/B][/COLOR]51852
[COLOR=#0000ff][B]7551[/B][/COLOR]581``````

The longest common prefix here is 7551 but I don´t know how to generalized my formula for more than 2 values.

Thanks for any help

Last edited:

### Excel Facts

Save Often
If you start asking yourself if now is a good time to save your Excel workbook, the answer is Yes

#### shaowu459

##### Active Member
Please try this, Ctrl shift enter for the array formula:
Code:
``=LOOKUP(1,0/(COUNTIF(A1:A5,LEFT(A1,COLUMN(A:Z))&"*")=COUNTA(A1:A5)),LEFT(A1,COLUMN(A:Z)))``

#### Fractalis

##### Active Member
Hi shaowu459,

Excellent. It seems to work how expected.

My last issue is that sometimes I have set of values where there is no common prefix between all of them, but there longest common prefixes between subgroups within the set.

Example.

Code:
``````[TABLE="width: 86"]
<tbody>[TR]
[TD]14385804999[/TD]
[/TR]
[TR]
[TD]14385884999[/TD]
[/TR]
[TR]
[TD]15875804999[/TD]
[/TR]
[TR]
[TD]15875884999[/TD]
[/TR]
[TR]
[TD]16472139[/TD]
[/TR]
[TR]
[TD]1647580[/TD]
[/TR]
[TR]
[TD]164758056[/TD]
[/TR]
</tbody>[/TABLE]``````
Here the longest common prefix is 1, but the condition is that LCP be no lower than 4 digits length.

So in this case there are 3 longest common prefixes

Code:
``````143858    --> between 14385804999 and 14385884999
158758    --> between 15875804999 and 15875884999
1647        --> between 16472139, 1647580 and 164758056``````
This would be the more general solution for the kind of inputs I have. Maybe I could use your formula in those subgroups, but that way should be manually, not automatically.

Thanks so much in advance for any help

Last edited:

#### shaowu459

##### Active Member
This is more complicated than the original question, let me see... #### Fractalis

##### Active Member
This is more complicated than the original question, let me see... Yes, I know and can be more complicated if values that have some common prefix are not contiguos I think  #### shaowu459

##### Active Member
Copy formula down until it returns "". values that have some common prefix are not contiguos is considered in this formula.

Code:
``=IFERROR(LEFT(INDEX(\$A\$1:\$A\$7,SMALL(IF(MATCH(LEFT(\$A\$1:\$A\$7,4),LEFT(\$A\$1:\$A\$7,4),)=ROW(\$A\$1:\$A\$7),ROW(\$A\$1:\$A\$7)),ROW(A1))),MOD(MAX(COUNTIF(\$A\$1:\$A\$7,LEFT(INDEX(\$A\$1:\$A\$7,SMALL(IF(MATCH(LEFT(\$A\$1:\$A\$7,4),LEFT(\$A\$1:\$A\$7,4),)=ROW(\$A\$1:\$A\$7),ROW(\$A\$1:\$A\$7)),ROW(A1))),COLUMN(D:Z))&"*")*10^3+COLUMN(D:Z)),10^3)),"")``

Last edited:
• Fractalis

#### Fractalis

##### Active Member
Hi shaowu459,

Many thanks!

It seems to work pretty fine. I was trying to understand the logic of your formula in order to for example change the limit of Longest Common prefix. I said to you not lower than 4 digits length. I changed from 4 to 5 but didn't work, so that means I still don't understand and it works.

#### shaowu459

##### Active Member
It is hard for me to explain the logic very clearly, but i can tell you how to make changes to the formula. If you want to change 4 to 5, see below example:

=IFERROR(LEFT(INDEX(\$A\$1:\$A\$7,SMALL(IF(MATCH(LEFT(\$A\$1:\$A\$7,5),LEFT(\$A\$1:\$A\$7,5),)=ROW(\$A\$1:\$A\$7),ROW(\$A\$1:\$A\$7)),ROW(A1))),MOD(MAX(COUNTIF(\$A\$1:\$A\$7,LEFT(INDEX(\$A\$1:\$A\$7,SMALL(IF(MATCH(LEFT(\$A\$1:\$A\$7,5),LEFT(\$A\$1:\$A\$7,5),)=ROW(\$A\$1:\$A\$7),ROW(\$A\$1:\$A\$7)),ROW(A1))),COLUMN(E:Z))&"*")*10^3+COLUMN(E:Z)),10^3)),"")

Column(D:Z) need change to Column(E:Z), because E is the fifth column and column(E:E) returns 5.

Last edited:

#### Fractalis

##### Active Member
Thanks so much shaowu459. It´s brilliant. It works just fine.

With your explanation I was able to generalized a little bit your formula, changing the limit based on value in D1. You could put in D1 values like 3, 4, 5 etc and the result changes accordingly.

This is the formula modified for such behaviour. Changes in red.

Code:
``````=IFERROR( LEFT(
INDEX(\$A\$1:\$A\$7,
SMALL(
IF(
MATCH(LEFT(\$A\$1:\$A\$7,[COLOR=#ff0000]\$D\$1[/COLOR]),LEFT(\$A\$1:\$A\$7,[COLOR=#ff0000]\$D\$1[/COLOR]),)=ROW(\$A\$1:\$A\$7),
ROW(\$A\$1:\$A\$7)
),
ROW(A1)
)
),
MOD(MAX(COUNTIF(\$A\$1:\$A\$7,
LEFT(INDEX(\$A\$1:\$A\$7,
SMALL(
IF(
MATCH(LEFT(\$A\$1:\$A\$7,[COLOR=#ff0000]\$D\$1[/COLOR]),LEFT(\$A\$1:\$A\$7[COLOR=#ff0000],\$D\$1[/COLOR]),)=ROW(\$A\$1:\$A\$7),
ROW(\$A\$1:\$A\$7)
),
ROW(A1)
)
),
[COLOR=#ff0000]ROW(INDEX(\$A:\$A,\$D\$1):\$A\$30)[/COLOR])&"*")*10^3+[COLOR=#ff0000]ROW(INDEX(\$A:\$A,\$D\$1):\$A\$30)[/COLOR]),
10^3
)
),
"")``````

Last edited:

#### shaowu459

##### Active Member
Haha, I've forgotten how I get the formula You are the one who is brilliant. If it works fine after you made such changes, it is OK. If you encounter other problems, please post in this thread, we can discuss it later.