Separating long sequences of characters by space

zevlacab

New Member
Joined
Dec 17, 2010
Messages
2
Hello,
I have 50 to 100 sequences of 200 letters long or more in a column of Excel. I'm trying to add a space every 10 letters such as:
QWERTYUIOPASDFGHJKLZXCVB...etc
into QWERTYUIOP ASDFGHJKLZ XCVB...etc
Formulas I tried stopped after 10 characters or self-formulas replaced letters by space. Please help. Thanks. H.
 
TRIM would be fine if the separator is a blank - which is what the OP wanted. I note though that the function you have provided allows for an optional separator so if something else was used, TRIM wouldn't be sufficient.

Peter,

In fact i didnt use the separator provided by the function - its optional. I just added an space at the end of each part of the string.

But if i had to use a comma (using the function separator or simply including a comma in front of each part of the string), for example, i think it would be possible to clean up the final string using the SUBSTITUTE function.

M.
 
Upvote 0

Excel Facts

Last used cell?
Press Ctrl+End to move to what Excel thinks is the last used cell.
I guess I'm too late for the party but here's another one:

Code:
Public Function AddSep(s As String, lSep As Long, Optional sSep As String = " ") As String
 
    With CreateObject("VBScript.RegExp")
        .Pattern = "(.{" & lSep & "})(?!$)"
        .Global = True
        AddSep = .Replace(s, "$1" & sSep)
    End With
End Function
 
Upvote 0
I guess I'm too late for the party ..
pgc

I'm glad you came. As usual, it has been beneficial for me. :)

I have not got my head very well around "lookaround" (is that the correct term?) with regular expressions and had not considered using it here.

I still like the idea of removing the CreateObject("VBScript.RegExp") from the function itself. If the function is used much within the sheet, creating the object for every calculation seems to slow the code quite a bit.

Anyway, your suggestion made me have another think about this exercise. I think this very marginally shorter pattern also does the job. Do you have any comments/insight about where this idea would fail for the job where yours wouldn't?

My code would become ..

<font face=Courier New><SPAN style="color:#00007F">Private</SPAN> RX <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">Object</SPAN><br><br><SPAN style="color:#00007F">Private</SPAN> <SPAN style="color:#00007F">Sub</SPAN> CreateRX()<br>    <SPAN style="color:#00007F">Set</SPAN> RX = CreateObject("VBScript.RegExp")<br><SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">Sub</SPAN><br><br><SPAN style="color:#00007F">Function</SPAN> Separate(<SPAN style="color:#00007F">ByVal</SPAN> s <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">String</SPAN>, Pos <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">Long</SPAN>, _<br>            <SPAN style="color:#00007F">Optional</SPAN> sep <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">String</SPAN> = " ") <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">String</SPAN><br>    <SPAN style="color:#00007F">If</SPAN> RX <SPAN style="color:#00007F">Is</SPAN> <SPAN style="color:#00007F">Nothing</SPAN> <SPAN style="color:#00007F">Then</SPAN> CreateRX<br>    <SPAN style="color:#00007F">With</SPAN> RX<br>        .Pattern = "(.{" & Pos & "})(.)"<br>        .Global = <SPAN style="color:#00007F">True</SPAN><br>        Separate = .Replace(s, "$1" & sep)<br>    <SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">With</SPAN><br><SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">Function</SPAN></FONT>
 
Upvote 0
Peter

I have not got my head very well around "lookaround" (is that the correct term?) ...

Yes, but as you know there are two, the lookahead, that we have in the vbscript, but also the lookbehind, that is available in other languages but that ms did not yet implement. :(

I still like the idea of removing the CreateObject("VBScript.RegExp") from the function itself. If the function is used much within the sheet, creating the object for every calculation seems to slow the code quite a bit.

I agree completely with you. I usually don't post it because I like to concentrate in the logic of the RegEx, but you are right, I should post it as that is how I use it when I need it for myself.

I prefer, however, to define it inside the udf, instead of in a module variable, like:

Code:
Public Function AddSep(s As String, lSep As Long, Optional sSep As String = " ") As String
Static RegEx As Object
 
If RegEx Is Nothing Then Set RegEx = CreateObject("VBScript.RegExp")
 
With RegEx
    .Pattern = "(.{" & lSep & "})(?!$)"
    .Global = True
    AddSep = .Replace(s, "$1" & sSep)
End With
End Function


Anyway, your suggestion made me have another think about this exercise. I think this very marginally shorter pattern also does the job. Do you have any comments/insight about where this idea would fail for the job where yours wouldn't?

Your pattern does not work, you are matching 1 extra character that you do not use in the result. This means that for each match you lose 1 character.

For ex., using a space as separator at each 10 characters, with your udf in column B and mine in column C, you see that since you did 2 matches you lost 2 characters.


<TABLE style="BORDER-BOTTOM-COLOR: #cccccc; BORDER-RIGHT-WIDTH: 2px; BORDER-TOP-COLOR: #cccccc; BORDER-COLLAPSE: collapse; FONT-FAMILY: Arial,Arial; BACKGROUND: #fff; BORDER-TOP-WIDTH: 2px; BORDER-BOTTOM-WIDTH: 2px; BORDER-RIGHT-COLOR: #cccccc; FONT-SIZE: 10pt; BORDER-LEFT-COLOR: #cccccc; BORDER-LEFT-WIDTH: 2px" border=1 cellPadding=1><TBODY><TR><TH style="BORDER-BOTTOM-COLOR: #888888; BORDER-RIGHT-WIDTH: 1px; BORDER-TOP-COLOR: #888888; BACKGROUND: #9cf; BORDER-TOP-WIDTH: 1px; BORDER-BOTTOM-WIDTH: 1px; BORDER-RIGHT-COLOR: #888888; BORDER-LEFT-COLOR: #888888; BORDER-LEFT-WIDTH: 1px"></TH><TH style="BORDER-BOTTOM-COLOR: #888888; TEXT-ALIGN: center; BORDER-RIGHT-WIDTH: 1px; BORDER-TOP-COLOR: #888888; BACKGROUND: #9cf; BORDER-TOP-WIDTH: 1px; BORDER-BOTTOM-WIDTH: 1px; BORDER-RIGHT-COLOR: #888888; BORDER-LEFT-COLOR: #888888; BORDER-LEFT-WIDTH: 1px">A</TH><TH style="BORDER-BOTTOM-COLOR: #888888; TEXT-ALIGN: center; BORDER-RIGHT-WIDTH: 1px; BORDER-TOP-COLOR: #888888; BACKGROUND: #9cf; BORDER-TOP-WIDTH: 1px; BORDER-BOTTOM-WIDTH: 1px; BORDER-RIGHT-COLOR: #888888; BORDER-LEFT-COLOR: #888888; BORDER-LEFT-WIDTH: 1px">B</TH><TH style="BORDER-BOTTOM-COLOR: #888888; TEXT-ALIGN: center; BORDER-RIGHT-WIDTH: 1px; BORDER-TOP-COLOR: #888888; BACKGROUND: #9cf; BORDER-TOP-WIDTH: 1px; BORDER-BOTTOM-WIDTH: 1px; BORDER-RIGHT-COLOR: #888888; BORDER-LEFT-COLOR: #888888; BORDER-LEFT-WIDTH: 1px">C</TH></TR><TR><TD style="BORDER-BOTTOM-COLOR: #000000; TEXT-ALIGN: center; PADDING-BOTTOM: 0.25em; BORDER-RIGHT-WIDTH: 1px; BORDER-TOP-COLOR: #000000; PADDING-LEFT: 0.5em; PADDING-RIGHT: 0.5em; BACKGROUND: #9cf; BORDER-TOP-WIDTH: 1px; BORDER-BOTTOM-WIDTH: 1px; BORDER-RIGHT-COLOR: #000000; BORDER-LEFT-COLOR: #000000; BORDER-LEFT-WIDTH: 1px; PADDING-TOP: 0.4em">1</TD><TD style="BORDER-BOTTOM-COLOR: #888888; TEXT-ALIGN: left; PADDING-BOTTOM: 0.25em; BORDER-RIGHT-WIDTH: 1px; BORDER-TOP-COLOR: #888888; PADDING-LEFT: 0.5em; PADDING-RIGHT: 0.5em; BORDER-TOP-WIDTH: 1px; BORDER-BOTTOM-WIDTH: 1px; BORDER-RIGHT-COLOR: #888888; BORDER-LEFT-COLOR: #888888; BORDER-LEFT-WIDTH: 1px; PADDING-TOP: 0.4em">01234567890123456789012</TD><TD style="BORDER-BOTTOM-COLOR: #888888; TEXT-ALIGN: left; PADDING-BOTTOM: 0.25em; BORDER-RIGHT-WIDTH: 1px; BORDER-TOP-COLOR: #888888; PADDING-LEFT: 0.5em; PADDING-RIGHT: 0.5em; BORDER-TOP-WIDTH: 1px; BORDER-BOTTOM-WIDTH: 1px; BORDER-RIGHT-COLOR: #888888; BORDER-LEFT-COLOR: #888888; BORDER-LEFT-WIDTH: 1px; PADDING-TOP: 0.4em">0123456789 1234567890 2</TD><TD style="BORDER-BOTTOM-COLOR: #888888; TEXT-ALIGN: left; PADDING-BOTTOM: 0.25em; BORDER-RIGHT-WIDTH: 1px; BORDER-TOP-COLOR: #888888; PADDING-LEFT: 0.5em; PADDING-RIGHT: 0.5em; BORDER-TOP-WIDTH: 1px; BORDER-BOTTOM-WIDTH: 1px; BORDER-RIGHT-COLOR: #888888; BORDER-LEFT-COLOR: #888888; BORDER-LEFT-WIDTH: 1px; PADDING-TOP: 0.4em">0123456789 0123456789 012</TD></TR><TR><TD style="PADDING-LEFT: 1em; BACKGROUND: #9cf" colSpan=4>[Book1]Sheet1</TD></TR></TBODY></TABLE>


What you can do, using your idea, is to use a lookahead.

I've used a negative logic, "match n characters if they are not followed by the end of the string".

In your case you can use positive logic, "match n characters if they are followed by another character". Using a lookahead you are not matching the extra character, you are just checking if it is present in the string and so you will not lose characters.

In this case you replace your pattern with:

Code:
        .Pattern = "(.{" & Pos & "})(?=.)"

Hope I did not mess up something as it's (much) past my bedtime :)
 
Last edited:
Upvote 0
More efficient:

Code:
Public Function AddSep(s As String, lSep As Long, Optional sSep As String = " ") As String
Static RegEx As Object
 
If RegEx Is Nothing Then
    Set RegEx = CreateObject("VBScript.RegExp")
    With RegEx
        .Pattern = "(.{" & lSep & "})(?!$)"
        .Global = True
    End With
End If
 
AddSep = RegEx.Replace(s, "$1" & sSep)
End Function
 
Upvote 0
pgc

Thanks for the additional explanations & suggestions.

I prefer, however, to define it inside the udf, instead of in a module variable ..
That seems a more compact method, I'll try to rember that for the future.


More efficient:

Code:
Public Function AddSep(s As String, lSep As Long, Optional sSep As String = " ") As String
Static RegEx As Object
 
If RegEx Is Nothing Then
    Set RegEx = CreateObject("VBScript.RegExp")
    With RegEx
        .Pattern = "(.{" & lSep & "})(?!$)"
        .Global = True
    End With
End If
 
AddSep = RegEx.Replace(s, "$1" & sSep)
End Function
More efficient for some RegExp implementations, but isn't it prone to problems with this particular exercise since the Pattern is effectively set in the sheet by the user?

For example, with ..
B9: =AddSep(A9,10)
C9: =AddSep(A9,5)
.. you don't get the desired results.


Your pattern does not work, you are matching 1 extra character that you do not use in the result.
Ah, yes, I did not study my results carefully enough, thanks for pointing that out. :oops:
 
Upvote 0
More efficient for some RegExp implementations, but isn't it prone to problems with this particular exercise since the Pattern is effectively set in the sheet by the user?

Peter

Thank you, you are absolutely right.

If the pattern is constant, which happens many times, the version in post #15, that defines the pattern only once, is the most efficient.

If the pattern varies with the function's arguments, like in this particular exercise, the correct one is like I posted in post #14, eventually including setting the Global property ony once, but leaving out the pattern setting for each udf call.
 
Upvote 0

Forum statistics

Threads
1,215,269
Messages
6,123,976
Members
449,138
Latest member
abdahsankhan

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top