Separating long sequences of characters by space

zevlacab · Dec 17, 2010

Hello,
I have 50 to 100 sequences of 200 letters long or more in a column of Excel. I'm trying to add a space every 10 letters such as:
QWERTYUIOPASDFGHJKLZXCVB...etc
into QWERTYUIOP ASDFGHJKLZ XCVB...etc
Formulas I tried stopped after 10 characters or self-formulas replaced letters by space. Please help. Thanks. H.

Marcelo Branco · Dec 18, 2010

Peter_SSs said:
TRIM would be fine if the separator is a blank - which is what the OP wanted. I note though that the function you have provided allows for an optional separator so if something else was used, TRIM wouldn't be sufficient.

Peter,

In fact i didnt use the separator provided by the function - its optional. I just added an space at the end of each part of the string.

But if i had to use a comma (using the function separator or simply including a comma in front of each part of the string), for example, i think it would be possible to clean up the final string using the SUBSTITUTE function.

M.

pgc01 · Dec 18, 2010

I guess I'm too late for the party but here's another one:

Code:

Public Function AddSep(s As String, lSep As Long, Optional sSep As String = " ") As String
 
    With CreateObject("VBScript.RegExp")
        .Pattern = "(.{" & lSep & "})(?!$)"
        .Global = True
        AddSep = .Replace(s, "$1" & sSep)
    End With
End Function

Peter_SSs · Dec 18, 2010

pgc01 said:
I guess I'm too late for the party ..

pgc

I'm glad you came. As usual, it has been beneficial for me.

I have not got my head very well around "lookaround" (is that the correct term?) with regular expressions and had not considered using it here.

I still like the idea of removing the CreateObject("VBScript.RegExp") from the function itself. If the function is used much within the sheet, creating the object for every calculation seems to slow the code quite a bit.

Anyway, your suggestion made me have another think about this exercise. I think this very marginally shorter pattern also does the job. Do you have any comments/insight about where this idea would fail for the job where yours wouldn't?

My code would become ..

Private RX As Object Private Sub CreateRX() Set RX = CreateObject("VBScript.RegExp") End Sub Function Separate(ByVal s As String, Pos As Long, _ Optional sep As String = " ") As String If RX Is Nothing Then CreateRX With RX .Pattern = "(.{" & Pos & "})(.)" .Global = True Separate = .Replace(s, "$1" & sep) End With End Function

pgc01 · Dec 18, 2010

Peter

Peter_SSs said:
I have not got my head very well around "lookaround" (is that the correct term?) ...

Yes, but as you know there are two, the lookahead, that we have in the vbscript, but also the lookbehind, that is available in other languages but that ms did not yet implement.

Peter_SSs said:
I still like the idea of removing the CreateObject("VBScript.RegExp") from the function itself. If the function is used much within the sheet, creating the object for every calculation seems to slow the code quite a bit.

I agree completely with you. I usually don't post it because I like to concentrate in the logic of the RegEx, but you are right, I should post it as that is how I use it when I need it for myself.

I prefer, however, to define it inside the udf, instead of in a module variable, like:

Code:

Public Function AddSep(s As String, lSep As Long, Optional sSep As String = " ") As String
Static RegEx As Object
 
If RegEx Is Nothing Then Set RegEx = CreateObject("VBScript.RegExp")
 
With RegEx
    .Pattern = "(.{" & lSep & "})(?!$)"
    .Global = True
    AddSep = .Replace(s, "$1" & sSep)
End With
End Function

Peter_SSs said:
Anyway, your suggestion made me have another think about this exercise. I think this very marginally shorter pattern also does the job. Do you have any comments/insight about where this idea would fail for the job where yours wouldn't?

Your pattern does not work, you are matching 1 extra character that you do not use in the result. This means that for each match you lose 1 character.

For ex., using a space as separator at each 10 characters, with your udf in column B and mine in column C, you see that since you did 2 matches you lost 2 characters.

<TABLE style="BORDER-BOTTOM-COLOR: #cccccc; BORDER-RIGHT-WIDTH: 2px; BORDER-TOP-COLOR: #cccccc; BORDER-COLLAPSE: collapse; FONT-FAMILY: Arial,Arial; BACKGROUND: #fff; BORDER-TOP-WIDTH: 2px; BORDER-BOTTOM-WIDTH: 2px; BORDER-RIGHT-COLOR: #cccccc; FONT-SIZE: 10pt; BORDER-LEFT-COLOR: #cccccc; BORDER-LEFT-WIDTH: 2px" border=1 cellPadding=1><TBODY><TR><TH style="BORDER-BOTTOM-COLOR: #888888; BORDER-RIGHT-WIDTH: 1px; BORDER-TOP-COLOR: #888888; BACKGROUND: #9cf; BORDER-TOP-WIDTH: 1px; BORDER-BOTTOM-WIDTH: 1px; BORDER-RIGHT-COLOR: #888888; BORDER-LEFT-COLOR: #888888; BORDER-LEFT-WIDTH: 1px"></TH><TH style="BORDER-BOTTOM-COLOR: #888888; TEXT-ALIGN: center; BORDER-RIGHT-WIDTH: 1px; BORDER-TOP-COLOR: #888888; BACKGROUND: #9cf; BORDER-TOP-WIDTH: 1px; BORDER-BOTTOM-WIDTH: 1px; BORDER-RIGHT-COLOR: #888888; BORDER-LEFT-COLOR: #888888; BORDER-LEFT-WIDTH: 1px">A</TH><TH style="BORDER-BOTTOM-COLOR: #888888; TEXT-ALIGN: center; BORDER-RIGHT-WIDTH: 1px; BORDER-TOP-COLOR: #888888; BACKGROUND: #9cf; BORDER-TOP-WIDTH: 1px; BORDER-BOTTOM-WIDTH: 1px; BORDER-RIGHT-COLOR: #888888; BORDER-LEFT-COLOR: #888888; BORDER-LEFT-WIDTH: 1px">B</TH><TH style="BORDER-BOTTOM-COLOR: #888888; TEXT-ALIGN: center; BORDER-RIGHT-WIDTH: 1px; BORDER-TOP-COLOR: #888888; BACKGROUND: #9cf; BORDER-TOP-WIDTH: 1px; BORDER-BOTTOM-WIDTH: 1px; BORDER-RIGHT-COLOR: #888888; BORDER-LEFT-COLOR: #888888; BORDER-LEFT-WIDTH: 1px">C</TH></TR><TR><TD style="BORDER-BOTTOM-COLOR: #000000; TEXT-ALIGN: center; PADDING-BOTTOM: 0.25em; BORDER-RIGHT-WIDTH: 1px; BORDER-TOP-COLOR: #000000; PADDING-LEFT: 0.5em; PADDING-RIGHT: 0.5em; BACKGROUND: #9cf; BORDER-TOP-WIDTH: 1px; BORDER-BOTTOM-WIDTH: 1px; BORDER-RIGHT-COLOR: #000000; BORDER-LEFT-COLOR: #000000; BORDER-LEFT-WIDTH: 1px; PADDING-TOP: 0.4em">1</TD><TD style="BORDER-BOTTOM-COLOR: #888888; TEXT-ALIGN: left; PADDING-BOTTOM: 0.25em; BORDER-RIGHT-WIDTH: 1px; BORDER-TOP-COLOR: #888888; PADDING-LEFT: 0.5em; PADDING-RIGHT: 0.5em; BORDER-TOP-WIDTH: 1px; BORDER-BOTTOM-WIDTH: 1px; BORDER-RIGHT-COLOR: #888888; BORDER-LEFT-COLOR: #888888; BORDER-LEFT-WIDTH: 1px; PADDING-TOP: 0.4em">01234567890123456789012</TD><TD style="BORDER-BOTTOM-COLOR: #888888; TEXT-ALIGN: left; PADDING-BOTTOM: 0.25em; BORDER-RIGHT-WIDTH: 1px; BORDER-TOP-COLOR: #888888; PADDING-LEFT: 0.5em; PADDING-RIGHT: 0.5em; BORDER-TOP-WIDTH: 1px; BORDER-BOTTOM-WIDTH: 1px; BORDER-RIGHT-COLOR: #888888; BORDER-LEFT-COLOR: #888888; BORDER-LEFT-WIDTH: 1px; PADDING-TOP: 0.4em">0123456789 1234567890 2</TD><TD style="BORDER-BOTTOM-COLOR: #888888; TEXT-ALIGN: left; PADDING-BOTTOM: 0.25em; BORDER-RIGHT-WIDTH: 1px; BORDER-TOP-COLOR: #888888; PADDING-LEFT: 0.5em; PADDING-RIGHT: 0.5em; BORDER-TOP-WIDTH: 1px; BORDER-BOTTOM-WIDTH: 1px; BORDER-RIGHT-COLOR: #888888; BORDER-LEFT-COLOR: #888888; BORDER-LEFT-WIDTH: 1px; PADDING-TOP: 0.4em">0123456789 0123456789 012</TD></TR><TR><TD style="PADDING-LEFT: 1em; BACKGROUND: #9cf" colSpan=4>[Book1]Sheet1</TD></TR></TBODY></TABLE>

What you can do, using your idea, is to use a lookahead.

I've used a negative logic, "match n characters if they are not followed by the end of the string".

In your case you can use positive logic, "match n characters if they are followed by another character". Using a lookahead you are not matching the extra character, you are just checking if it is present in the string and so you will not lose characters.

In this case you replace your pattern with:

Code:

        .Pattern = "(.{" & Pos & "})(?=.)"

Hope I did not mess up something as it's (much) past my bedtime

pgc01 · Dec 18, 2010

More efficient:

Code:

Public Function AddSep(s As String, lSep As Long, Optional sSep As String = " ") As String
Static RegEx As Object
 
If RegEx Is Nothing Then
    Set RegEx = CreateObject("VBScript.RegExp")
    With RegEx
        .Pattern = "(.{" & lSep & "})(?!$)"
        .Global = True
    End With
End If
 
AddSep = RegEx.Replace(s, "$1" & sSep)
End Function

Peter_SSs · Dec 18, 2010

pgc

Thanks for the additional explanations & suggestions.

pgc01 said:
I prefer, however, to define it inside the udf, instead of in a module variable ..

That seems a more compact method, I'll try to rember that for the future.

pgc01 said:

More efficient:

Code:

Public Function AddSep(s As String, lSep As Long, Optional sSep As String = " ") As String
Static RegEx As Object
 
If RegEx Is Nothing Then
    Set RegEx = CreateObject("VBScript.RegExp")
    With RegEx
        .Pattern = "(.{" & lSep & "})(?!$)"
        .Global = True
    End With
End If
 
AddSep = RegEx.Replace(s, "$1" & sSep)
End Function

More efficient for some RegExp implementations, but isn't it prone to problems with this particular exercise since the Pattern is effectively set in the sheet by the user?

For example, with ..
B9: =AddSep(A9,10)
C9: =AddSep(A9,5)
.. you don't get the desired results.

pgc01 said:
Your pattern does not work, you are matching 1 extra character that you do not use in the result.

Ah, yes, I did not study my results carefully enough, thanks for pointing that out.

pgc01 · Dec 19, 2010

Peter_SSs said:
More efficient for some RegExp implementations, but isn't it prone to problems with this particular exercise since the Pattern is effectively set in the sheet by the user?

Peter

Thank you, you are absolutely right.

If the pattern is constant, which happens many times, the version in post #15, that defines the pattern only once, is the most efficient.

If the pattern varies with the function's arguments, like in this particular exercise, the correct one is like I posted in post #14, eventually including setting the Global property ony once, but leaving out the pattern setting for each udf call.

Separating long sequences of characters by space

zevlacab

New Member

Marcelo Branco

MrExcel MVP

Excel Facts

pgc01

MrExcel MVP

Peter_SSs

MrExcel MVP, Moderator

pgc01

MrExcel MVP

pgc01

MrExcel MVP

Peter_SSs

MrExcel MVP, Moderator

pgc01

MrExcel MVP

Similar threads

Forum statistics

Share this page

Separating long sequences of characters by space

New Member

MrExcel MVP

Excel Facts

MrExcel MVP

MrExcel MVP, Moderator

MrExcel MVP

MrExcel MVP

MrExcel MVP, Moderator

MrExcel MVP

Similar threads

Forum statistics

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock