Separate a text string when undercase meets uppercase / find where the case is changing

egotajcs

New Member
Joined
May 6, 2017
Messages
27
Hi guys,

I have a quite strange and unique problem. I make an "automatic parser excel file", which uses a big database (where everything is in a single cell, a long-long string) for input, and parsing the datas into separate columns. I could do a lot of things on my own, but I stucked at a point.

I need to separate the horses' names and the trainer's names, but there isn't any space between them. So, the strings look like this:


A
ThrockleyJohn Davies
Royal RegentLucy Normile
Spes NostraIain Jardine
KomodoJedd O'Keeffe
Lucent DreamJohn C McConnell

<tbody>
</tbody>

As you can see, there could be one and only solution to separate this into two columns:
find the first position where an undercased character followed by an uppercased character

If I could get that kind of position, I could separate the horses' names and the trainer's names into two separate columns using LEN, LEFT, RIGHT and MID formulas.

But I don't know how to determine 'where is the first undercase character which is followed by an uppercased character'.

Can anybody help with this?

Thank you
 

Excel Facts

How to fill five years of quarters?
Type 1Q-2023 in a cell. Grab the fill handle and drag down or right. After 4Q-2023, Excel will jump to 1Q-2024. Dash can be any character.
In this case, that automatic tool won't work either.

But in generally, my computer and my excel program is in hungarian language where a lot of things are different, for example the decimal is not point, but comma, etc. So, many cases the excel automatically changes a lot of datas into date and time format. I can prevent it but only if I set the columns to text, no matter the datas. That is 2-3 minutes clicking and copypasting using a TXT file to get rid of every formats and copypaste back into excel file, and I don't want to do it every day. If I can make a good parser excel file, then it will be one, single click.
 
Upvote 0
The following formula will extract the first word in the string located in A1.

Maybe it helps you,

Code:
=LEFT(A1;(FIND(" ";A1;1))-1)
 
Upvote 0
This is the result, which I'm seeking:

AB
ThrockleyJohn DaviesThrockley
Royal RegentLucy NormileRoyal Regent
Spes NostraIain JardineSpes Nostra
KomodoJedd O'KeeffeKomodo
Lucent DreamJohn C McConnellLucent Dream

<tbody>
</tbody>

So, as it can seen, sometimes, I have to separate the first word, sometimes the second word. So, the only option is to find 'where is the first undercase character which is followed by an uppercased character'
I cannot find solution using Google to this matter.
The 'first' is important, because, as it can seen the last one, there are multiple times, where an undercase character meets with an uppercase characeter (McConnell).

THX
 
Last edited:
Upvote 0
Code:
=LEFT(A1,FIND("J",A1,2)-1)

This is essentially the formula you would use "If you knew what you were looking for", so we need to somehow create an array of the alphabet to replace the "J" in the formula.
 
Upvote 0
There are people on the forum whom are geniuses with "functions", something I am not. I am sure they will be able to build you a function for your needs.

A function is essentially a personal formula and would result in the formula you would use being something like =HORSE(A1)
 
Upvote 0
Hmmm. I'm not good how to make array formulas (the shift-ctrl-enter things), and VBA is way out of my league. :(
 
Upvote 0
Here's a UDF (user-defined function) you can use like a native excel function. First install the function in your workbook using the instructions below.
To install the UDF:
1. With your workbook active press Alt and F11 keys. This will open the VBE window.
2. In the project tree on the left of the VBE window, find your project and click on it.
3. On the VBE menu: Insert>Module
4. Copy the UDF from your browser window and paste it into the white space in the VBE window.
5. Close the VBE window and Save the workbook. If you are using Excel 2007 or a later version do a SaveAs and save it as a macro-enabled workbook (.xlsm file extension).

Use the UDF like this: =HorseName(A1)
Code:
Function HorseName(S As String) As String
'horse name is everything to the left of the first lower case letter
'followed immediately by an uppercase letter
Dim i As Long
For i = 1 To Len(S)
    If Mid(S, i, 1) Like "[a-z]" And Mid(S, i + 1, 1) Like "[A-Z]" Then
        HorseName = Left(S, i)
        Exit Function
    End If
Next i
End Function
 
Upvote 0
I'm sure the OP will be very grateful Joe, but I suspect there might be one or two horses called McSomething or MacSomething. The function may have to be changed to "assume" that the horses name will be at least 4 or 5 characters in length.
 
Upvote 0

Forum statistics

Threads
1,215,055
Messages
6,122,902
Members
449,097
Latest member
dbomb1414

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top