How to Count Number of Unique Occurrences within an Alphanumeric String

dwgnome

Active Member
Joined
Dec 18, 2005
Messages
441
Looking for a way to count the number of unique values. Note that some cells have multiple entries. Need to ignore the text and dates and just count the number of “numbers” even though they themselves are in a text field. So for the example below, the number of unique values is 12. Note: need to ignore letters after the “numbers” as that is referring to the same value just amended - in this case the "A".</SPAN>


FOT A52281
POL A46168
KUY A46159 (3/2&3/3/04)</SPAN>
WEV A46159 (3/2&3/3/04) </SPAN>
IOV A46170</SPAN> (1/31/2005)</SPAN>
WEV A46159
IOV A46170 (1/31/2005)</SPAN>
NOV A46159
TIM A46175H
SAM A46175
DTO A46174
UAL A48207
PIR A46070
RTE A46070
TX A46072
Y A52283
NV A52284
PQW A52283
O A52284
ZRR A52283
ZRR A52284
FOT A52281A
FOT A52281A
TIM A52283
TIM A52284

<TBODY>
</TBODY>

I tried using:</SPAN>
=COUNTIF(D10:D28,"*A*") but results in 19</SPAN>
=SUM(IF(COUNTIF(D9:D28,D9:D28)=0,"",1/COUNTIF(D9:D28,D9:D28))) results in 18</SPAN>
=SUM(IF(FREQUENCY(MATCH(D10:D28,D10:D28,0),MATCH(D10:D28,D10:D28,0))>0,1)) Also results in 18</SPAN>
and</SPAN>
=SUMPRODUCT((D10:D28<>"")*(1/COUNTIF(D10:D28,D10:D28&""))) results in 18</SPAN>

While there is always an “A” before the number, it may not show up. Also, there are blank rows in my real data, so I need to ignore those. If this helps, the number of digits are always 5 (i.e., 52283).</SPAN>


Excel 2003.</SPAN>
 

Excel Facts

Why are there 1,048,576 rows in Excel?
The Excel team increased the size of the grid in 2007. There are 2^20 rows and 2^14 columns for a total of 17 billion cells.
Bump:

The formulas again hopefully easier to read:</SPAN>
Code:
=COUNTIF(D10:D28,"*A*")
but results in 19</SPAN>
Code:
=SUM(IF(COUNTIF(D9:D28,D9:D28)=0,"",1/COUNTIF(D9:D28,D9:D28)))
results in 18</SPAN>
Code:
=SUM(IF(FREQUENCY(MATCH(D10:D28,D10:D28,0),MATCH(D10:D28,D10:D28,0))>0,1))
Also results in 18</SPAN>
and</SPAN>
Code:
=SUMPRODUCT((D10:D28<>"")*(1/COUNTIF(D10:D28,D10:D28&"")))
results in 18</SPAN>

While there is an “A” before the number, it may not always show up. Also, there are blank rows in my real data, so I need to ignore those. If this helps, the number of digits are always 5 (i.e., 52283).</SPAN>

Any help is appreciated.


Excel 2003</SPAN>
 
Upvote 0
If you can use a helper column then see if this will work for you.
It assumes that the 5 numbers will always be the first 5 numbers in the text string.

Also, I only see 11 unique items not 12.

The formula in B1 must be entered with CTRL-SHIFT-ENTER (copy this formula down to your last row of data).
Excel Workbook
ABC
1FOT A522815228111
2POL A4616846168
3KUY A46159 (3/2&3/3/04)46159
4WEV A46159 (3/2&3/3/04)46159
5IOV A46170 (1/31/2005)46170
6WEV A4615946159
7IOV A46170 (1/31/2005)46170
8NOV A4615946159
9TIM A46175H46175
10SAM A4617546175
11DTO A4617446174
12UAL A4820748207
13PIR A4607046070
14RTE A4607046070
15TX A4607246072
16Y A5228352283
17NV A5228452284
18PQW A5228352283
19O A5228452284
20ZRR A5228352283
21ZRR A5228452284
22FOT A52281A52281
23FOT A52281A52281
24TIM A5228352283
25TIM A5228452284
Sheet
 
Upvote 0
Thanks AhoyNC. Very clever. I think you are right there are only 11 unique numbers. My original data had 12 and I think I missed one when I set up this example.

I think you are on the right track, but when I tried your suggestion, it only counted 9. This may be that there are actually two sets of 5 digits in one cell and the helper formula only extracts the first one, not both. Secondly if there is blank row, the result is #NA and then the SUMPRODUCT portion also results in #NA. However, when I adjusted the SUMPRODUCT(($B$1:$B$25<>"") range to the left one column that worked even though it still gives the incorrect result.
 
Upvote 0
I think the formula below will take care of the blank cells, but I'm not sure how to handle two sets of 5 in the same cell.
Excel Workbook
ABC
1FOT A522815228111
2POL A4616846168
Sheet
 
Upvote 0
Give these formulas a try, enter them with Shift-Ctrl-Enter:

Into B1:

=IFERROR(MID(A1,FIND(" A",A1)+2,5),"")

Into C1:

=IFERROR(MID(A1,FIND(" A",A1,7)+2,5),"")

Into D1:

=SUMPRODUCT(($B$1:$C$25<>"")*(1/COUNTIF($B$1:$C$25,$B$1:$C$25&"")))
 
Upvote 0
Thanks Ahoy. Yes you are right, your solution still does not address the two or more sets per cell. Your revised formula may work, but I get an error preventing me from entering the formula. The error says cannot enter formula due to too many nested levels allowed by the current file format.

Thanks Istvan, your solution worked. :) It is a bit more work than I would normally would want as I would have to anticipate extra helper columns if more than two sets of 5 digits were present. I also used ISERROR as I have Excel 2003.

Thanks everyone for their help.
 
Upvote 0

Forum statistics

Threads
1,215,463
Messages
6,124,965
Members
449,201
Latest member
Jamil ahmed

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top