Array formula to get longest common prefix

Fractalis

Active Member
Joined
Oct 11, 2011
Messages
252
I have a list of numbers and I want to get the common prefix between them.


I've been able to make this array formula that gives me correct output but comparing only 2 numbers (stored as text).


Code:
A1=[COLOR=#0000ff][B]23702[/B][/COLOR]000000
B1=[B][COLOR=#0000ff]23702[/COLOR][/B]999999

C1="=LEFT(A1,SUMPRODUCT((LEFT($A1,COLUMN($A$1:$R$1))=LEFT($B1,COLUMN($A$1:$R$1)))*1))"

Output: [COLOR=#0000ff][B]23702[/B][/COLOR]

Is there a way to get the longest common prefix between more than 2 strings with an array formula? or for this task only VBA macro could help?


If I have the following 5 values.
Code:
[B][COLOR=#0000ff]7551[/COLOR][/B]1813
[B][COLOR=#0000ff]7551[/COLOR][/B]18
[B][COLOR=#0000ff]7551[/COLOR][/B]184
[COLOR=#0000ff][B]7551[/B][/COLOR]51852
[COLOR=#0000ff][B]7551[/B][/COLOR]581

The longest common prefix here is 7551 but I don´t know how to generalized my formula for more than 2 values.


Thanks for any help
 
Last edited:

Some videos you may like

Excel Facts

Save Often
If you start asking yourself if now is a good time to save your Excel workbook, the answer is Yes

shaowu459

Active Member
Joined
Apr 26, 2018
Messages
468
Office Version
365
Platform
Windows
Please try this, Ctrl shift enter for the array formula:
Code:
=LOOKUP(1,0/(COUNTIF(A1:A5,LEFT(A1,COLUMN(A:Z))&"*")=COUNTA(A1:A5)),LEFT(A1,COLUMN(A:Z)))
 

Fractalis

Active Member
Joined
Oct 11, 2011
Messages
252
Hi shaowu459,

Excellent. It seems to work how expected.

My last issue is that sometimes I have set of values where there is no common prefix between all of them, but there longest common prefixes between subgroups within the set.

Example.

Code:
[TABLE="width: 86"]
<tbody>[TR]
[TD]14385804999[/TD]
[/TR]
[TR]
[TD]14385884999[/TD]
[/TR]
[TR]
[TD]15875804999[/TD]
[/TR]
[TR]
[TD]15875884999[/TD]
[/TR]
[TR]
[TD]16472139[/TD]
[/TR]
[TR]
[TD]1647580[/TD]
[/TR]
[TR]
[TD]164758056[/TD]
[/TR]
</tbody>[/TABLE]
Here the longest common prefix is 1, but the condition is that LCP be no lower than 4 digits length.

So in this case there are 3 longest common prefixes

Code:
143858    --> between 14385804999 and 14385884999
158758    --> between 15875804999 and 15875884999
1647        --> between 16472139, 1647580 and 164758056
This would be the more general solution for the kind of inputs I have. Maybe I could use your formula in those subgroups, but that way should be manually, not automatically.

Thanks so much in advance for any help
 
Last edited:

shaowu459

Active Member
Joined
Apr 26, 2018
Messages
468
Office Version
365
Platform
Windows
This is more complicated than the original question, let me see...:)
 

Fractalis

Active Member
Joined
Oct 11, 2011
Messages
252
This is more complicated than the original question, let me see...:)
Yes, I know and can be more complicated if values that have some common prefix are not contiguos I think:confused::confused:
 

shaowu459

Active Member
Joined
Apr 26, 2018
Messages
468
Office Version
365
Platform
Windows
Copy formula down until it returns "". values that have some common prefix are not contiguos is considered in this formula.

Code:
=IFERROR(LEFT(INDEX($A$1:$A$7,SMALL(IF(MATCH(LEFT($A$1:$A$7,4),LEFT($A$1:$A$7,4),)=ROW($A$1:$A$7),ROW($A$1:$A$7)),ROW(A1))),MOD(MAX(COUNTIF($A$1:$A$7,LEFT(INDEX($A$1:$A$7,SMALL(IF(MATCH(LEFT($A$1:$A$7,4),LEFT($A$1:$A$7,4),)=ROW($A$1:$A$7),ROW($A$1:$A$7)),ROW(A1))),COLUMN(D:Z))&"*")*10^3+COLUMN(D:Z)),10^3)),"")
 
Last edited:

Fractalis

Active Member
Joined
Oct 11, 2011
Messages
252
Hi shaowu459,

Many thanks!

It seems to work pretty fine. I was trying to understand the logic of your formula in order to for example change the limit of Longest Common prefix. I said to you not lower than 4 digits length. I changed from 4 to 5 but didn't work, so that means I still don't understand and it works.
 

shaowu459

Active Member
Joined
Apr 26, 2018
Messages
468
Office Version
365
Platform
Windows
It is hard for me to explain the logic very clearly, but i can tell you how to make changes to the formula. If you want to change 4 to 5, see below example:

=IFERROR(LEFT(INDEX($A$1:$A$7,SMALL(IF(MATCH(LEFT($A$1:$A$7,5),LEFT($A$1:$A$7,5),)=ROW($A$1:$A$7),ROW($A$1:$A$7)),ROW(A1))),MOD(MAX(COUNTIF($A$1:$A$7,LEFT(INDEX($A$1:$A$7,SMALL(IF(MATCH(LEFT($A$1:$A$7,5),LEFT($A$1:$A$7,5),)=ROW($A$1:$A$7),ROW($A$1:$A$7)),ROW(A1))),COLUMN(E:Z))&"*")*10^3+COLUMN(E:Z)),10^3)),"")

Column(D:Z) need change to Column(E:Z), because E is the fifth column and column(E:E) returns 5.
 
Last edited:

Fractalis

Active Member
Joined
Oct 11, 2011
Messages
252
Thanks so much shaowu459. It´s brilliant. It works just fine.

With your explanation I was able to generalized a little bit your formula, changing the limit based on value in D1. You could put in D1 values like 3, 4, 5 etc and the result changes accordingly.

This is the formula modified for such behaviour. Changes in red.

Code:
=IFERROR( LEFT(
  INDEX($A$1:$A$7,
   SMALL(
    IF(
     MATCH(LEFT($A$1:$A$7,[COLOR=#ff0000]$D$1[/COLOR]),LEFT($A$1:$A$7,[COLOR=#ff0000]$D$1[/COLOR]),)=ROW($A$1:$A$7),
     ROW($A$1:$A$7)
    ),
     ROW(A1)
   )
  ),
  MOD(MAX(COUNTIF($A$1:$A$7,
    LEFT(INDEX($A$1:$A$7,
    SMALL(
     IF(
      MATCH(LEFT($A$1:$A$7,[COLOR=#ff0000]$D$1[/COLOR]),LEFT($A$1:$A$7[COLOR=#ff0000],$D$1[/COLOR]),)=ROW($A$1:$A$7),
      ROW($A$1:$A$7)
     ),
      ROW(A1)
    )
      ),
     [COLOR=#ff0000]ROW(INDEX($A:$A,$D$1):$A$30)[/COLOR])&"*")*10^3+[COLOR=#ff0000]ROW(INDEX($A:$A,$D$1):$A$30)[/COLOR]),
  10^3
  )
 ),
"")
 
Last edited:

shaowu459

Active Member
Joined
Apr 26, 2018
Messages
468
Office Version
365
Platform
Windows
Haha, I've forgotten how I get the formula:LOL: You are the one who is brilliant. If it works fine after you made such changes, it is OK. If you encounter other problems, please post in this thread, we can discuss it later.
 

Watch MrExcel Video

Forum statistics

Threads
1,099,505
Messages
5,469,017
Members
406,628
Latest member
jared92

This Week's Hot Topics

Top