Measuring the amount of single and double byte characters in a cell

Translationguy

New Member
Joined
Nov 4, 2016
Messages
26
Hi All

I have (what is probably) quite an unusual problem where I frequently need to measure the amount of single and double byte characters in an Excel cell.

The reason for this is that Google allows a fixed number of characters for PPC such as ads but for double byte languages, it counts any single byte characters such as spaces and English text as 0.5. Therefore whilst the character limit they have might be 12, I could use 9 double byte characters and 6 single byte characters and still make it fit.

If it is over this limit, the piece will be rejected and can;t be used whatsoever.

The current method I have is to count the number of single byte characters in the text, put the number into the cell next to it and put in a formula such as "=LEN(a1)-(a3/2)".

"A1" is the text I am measuring, "A3" is where I put the manually counted number of double byte characters and the result will tell me how Google would view it. I tend to put in conditional formatting to highlight the ones that are too long however this is quite a manual process.

I understand that Excel has the ability to count bytes with "=lenb" however it hasn't worked when trying to measure 2 different byte lengths in the same cell.

The cells will never follow the same pattern however changing the cells the formula is referencing isn't too much of an issue for us.

If anybody knows of a solution I'd be really grateful as it will save a lot of time for myself and my colleagues.

With many thanks!
 

Excel Facts

Ambidextrous Undo
Undo last command with Ctrl+Z or Alt+Backspace. If you use the Undo icon in the QAT, open the drop-down arrow to undo up to 100 steps.
Hi

Excel uses Unicode in strings. All the characters in an excel string are coded with 2 byte words. Even if it's as ascii character, for ex. "A" that has the code hex 41, is stored as 2 bytes, 0041.

This means that if you want to have a string with both 1 and 2 byte codes in excel it will not be an excel string, it will be a sequence of bytes that you have manipulated for ex. to remove 00 bytes in case of your single byte characters.

Please comment.
 
Upvote 0
Hi pgc01

Thanks for coming back to me on this and with an explanation.

From this, it looks like Excel doesn't have the ability to distinguish between the 2 types of characters?

Just for more clarification, there may be times when looking at text going into a Search Engine, I need to distinguish how many single byte characters and how many double byte characters there are. Google for example will have character limits for any character restricted content but if you are using Single byte characters within a Double byte Ad, it will treat each Single byte character as 0.5.

Currently I list the limit for the Ad, do a length formula for the translation, put the amount of Single byte characters /2 and then minus this amount from the length. This will then show when it is unsuitable if over.

Let me know if this helps explain it further.

Best regards
 
Upvote 0
From this, it looks like Excel doesn't have the ability to distinguish between the 2 types of characters?

Excel uses Unicode and so these are the same type of characters just with different code numbers. All the characters are coded with 2 byte words

It's not that excel cannot distiguish between both types of characters, it's that there is just 1 type of characters in excel.

Just for more clarification, there may be times when looking at text going into a Search Engine, I need to distinguish how many single byte characters and how many double byte characters there are. Google for example will have character limits for any character restricted content but if you are using Single byte characters within a Double byte Ad, it will treat each Single byte character as 0.5.

Maybe you can come out with a way to distinguish them.

How do you split a chain of bytes into 1 and 2 byte characters?
An example, let's say you have the sequence of bytes

59 27 30 4D 30 44

You can say these are 3 2 byte characters, (5927, 304D and 3044), and in that case it would be

大きい

which is dog in Japanese (I hope :) )

or you could say these are 6 1 byte characters, that would be the ascii string:

Y'0M0D

So, my question is, you look at that chain of characters and how do you split it into characters?


Notice that in excel it's very simple, since all characters have 2 bytes this problem does not exist.
 
Last edited:
Upvote 0
Hi again

I'll elaborate on this further with the current method :)

Column "A"
This contains English text that needs to be translated.

Column "B"
This will contain the translation of column "A".

Column "C"
This contains an =len" formula measuring the cell next to it in column "B".

Column "D"
This will contain the number of Single byte characters used in the translation in column"B". As of right now, this is always counted manually and the reason for this post.

Column "E"
This contains a formula like this "=C-(D/2)". It allows me to see the number of characters used in the translation with 0.5 taken from it for every Single byte character.

From how it looks in your previous messages, it seems Excel encodes these with 2 bytes (not the number used with other encodings). Do you know if there is a way to distinguish between the 2? Maybe this doesn't require a measurement of the number of bytes but instead use something to measure based on the alphabet used (if this is a thing)?

Many thanks
 
Upvote 0
Well, you could assume that characters with code less than 256 are single byte characters.

In that case, you could use something like:

=(LEN(A2)+SUMPRODUCT(--(UNICODE(MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1))>255)))/2

which counts characters with codes less than 256 as 1/2 and the others as 1.

Remarks:

- you should check if this assumption makes sense it terms of what google defines

- If you have an older version of excel and still don't have the function UNICODE() use a udf to get the character code.

For ex., in A2:

sキク」sdaf

the formula will count 5.5, thats 3 for the 3 double byte characters and 2.5 for the 5 single byte characters.

Does this help?
 
Upvote 0
Hi PGC

Thank you very much for this, it's actually perfect for what we need!

I've verified the formula using a few texts and it works every time.

Many thanks once again for your help with this.

Best regards
 
Upvote 0

Forum statistics

Threads
1,223,099
Messages
6,170,111
Members
452,302
Latest member
TaMere

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top