Macro - edit and reorder data into new cells

Crawtney

New Member
Joined
Apr 10, 2018
Messages
6
This a sample of the data that I have entered into Column A after opening an output from a third party program.
Nc1nc2c(ncn2C3CC(C=C3)CO)c(n1)NC4CC4 abacavir_DB01048_B3_TGA.smi_0

<colgroup><col width="571"></colgroup><tbody>
</tbody>
CC(N(O)C(=O)N)c1cc2ccccc2s1 zileuton_DB00744_C_FDA.smi_0

<colgroup><col width="571"></colgroup><tbody>
</tbody>
Clc1c(cc2c(c1)NC(=O)C2)CCN3CCN(CC3)c4nsc5c4cccc5 ziprasidone_DB00246_C_TGA.smi_0

<colgroup><col width="571"></colgroup><tbody>
</tbody>
OC(Cn1ccnc1)(P(=O)(O)O)P(=O)(O)O zoledronic-acid_DB00399_B3_TGA.smi_0

<colgroup><col width="571"></colgroup><tbody>
</tbody>
CN(C)CCc1c[nH]c2c1cc(cc2)CC3COC(=O)N3 zolmitriptan_DB00315_B3_TGA.smi_0

<colgroup><col width="571"></colgroup><tbody>
</tbody>
CN(C)C(=O)Cc1c(nc2n1cc(cc2)C)c3ccc(cc3)C zolpidem_DB00425_B3_TGA.smi_0

<colgroup><col width="571"></colgroup><tbody>
</tbody>

<tbody>
</tbody>


<tbody>
</tbody>














I would like to be able to run a macro so that it changes to this:
abacavir

<colgroup><col width="101"></colgroup><tbody>
</tbody>
DB01048

<colgroup><col width="64"></colgroup><tbody>
</tbody>
B3

<colgroup><col width="64"></colgroup><tbody>
</tbody>
TGA

<colgroup><col width="64"></colgroup><tbody>
</tbody>
zileuton

<colgroup><col width="101"></colgroup><tbody>
</tbody>
DB00744

<colgroup><col width="64"></colgroup><tbody>
</tbody>
C

<colgroup><col width="64"></colgroup><tbody>
</tbody>
FDA

<colgroup><col width="64"></colgroup><tbody>
</tbody>
ziprasidone

<colgroup><col width="101"></colgroup><tbody>
</tbody>
DB00246

<colgroup><col width="64"></colgroup><tbody>
</tbody>
C

<colgroup><col width="64"></colgroup><tbody>
</tbody>
TGA

<colgroup><col width="64"></colgroup><tbody>
</tbody>
zoledronic acid

<colgroup><col width="101"></colgroup><tbody>
</tbody>
DB00399

<colgroup><col width="64"></colgroup><tbody>
</tbody>
B3

<colgroup><col width="64"></colgroup><tbody>
</tbody>
TGA

<colgroup><col width="64"></colgroup><tbody>
</tbody>
zolmitriptan

<colgroup><col width="101"></colgroup><tbody>
</tbody>
DB00315

<colgroup><col width="64"></colgroup><tbody>
</tbody>
B3

<colgroup><col width="64"></colgroup><tbody>
</tbody>
TGA

<colgroup><col width="64"></colgroup><tbody>
</tbody>
zolpidem

<colgroup><col width="101"></colgroup><tbody>
</tbody>
DB00425

<colgroup><col width="64"></colgroup><tbody>
</tbody>
B3

<colgroup><col width="64"></colgroup><tbody>
</tbody>
TGA

<colgroup><col width="64"></colgroup><tbody>
</tbody>

<tbody>
</tbody>

The individual changes that need to be made are:
- removing the first string and it's ending space
- removing ".smi_0"
- inserting 3 cells to the right of the first one
- using the '_' as an indication to move data to the next column over
- making sure no underscores are left
- replace any '-' with a space

Please note that the first string of characters and the following drug name do not all have an equal number of characters.

How I've been doing it so far:
- Find and replace ' ' with '_'
- Find and delete '.smi_0'
- Replace '_' with ','
- Copy and paste data to a text file
- Save as .csv
- Insert new cells into my spreadsheet
- Clear the cells with the existing data
- Open and copy the .csv data back into my spreadsheet
- Find and replace '-' with ' '

You can see why I need help!!! :)
Thanks
 

Excel Facts

Repeat Last Command
Pressing F4 adds dollar signs when editing a formula. When not editing, F4 repeats last command.
Hi & welcome to MrExcel
Give this a go
Code:
Sub Splitdata()

   Dim Ary As Variant
   Dim i As Long
      
   Ary = Application.Transpose(Range("A1", Range("A" & Rows.Count).End(xlUp)))
   For i = LBound(Ary) To UBound(Ary)
     Range("C" & i).Resize(, 4).Value = Split(Split(Left(Ary(i), Len(Ary(i)) - 6), " ")(1), "_")
   Next i
   Range("C:F").Replace "-", " ", xlPart, , , False, False
End Sub
 
Upvote 0
Glad to help & thanks for the feedback
 
Upvote 0
Hi,
Sometimes an output does not have the first string and its ending space.
I.e. the data looks like this:
abacavir_DB01048_B3_TGA.smi_0

<tbody>
</tbody>

zileuton_DB00744_C_FDA.smi_0

<tbody>
</tbody>

ziprasidone_DB00246_C_TGA.smi_0

<tbody>
</tbody>


<tbody>
</tbody>

Can you please modify the macro so that it works for this sequence also?
Thanks
 
Upvote 0
How about
Code:
Sub Splitdata()

   Dim Ary As Variant
   Dim i As Long
      
   Ary = Application.Transpose(Range("A1", Range("A" & Rows.Count).End(xlUp)))
   For i = LBound(Ary) To UBound(Ary)
      If InStr(1, Ary(i), " ") > 0 Then
         Range("C" & i).Resize(, 4).Value = Split(Split(Left(Ary(i), Len(Ary(i)) - 6), " ")(1), "_")
      Else
         Range("C" & i).Resize(, 4).Value = Split(Left(Ary(i), Len(Ary(i)) - 6), "_")
      End If
   Next i
   Range("C:F").Replace "-", " ", xlPart, , , False, False
End Sub
 
Upvote 0
I don't know if it is important to you but your description indicated inserting cells in your worksheet to receive three columns of the split data & I therefore assumed that the original data is no longer required and those cells would hold the first column of the new split data. If so, you could also consider testing this approach in a copy of your workbook (it does over-write the original data in column A)
Code:
Sub Split_Data()
  Application.ScreenUpdating = False
  With Range("A1", Range("A" & Rows.Count).End(xlUp))
    .Replace What:="* ", Replacement:="", LookAt:=xlPart
    .Replace What:=".*", Replacement:=""
    .Replace What:="-", Replacement:=" "
    .Offset(, 1).Resize(, 3).EntireColumn.Insert
    .TextToColumns DataType:=xlDelimited, Tab:=False, Semicolon:=False, Comma:=False, Space:=False, Other:=True, OtherChar:="_"
    .Resize(, 4).Columns.AutoFit
  End With
  Application.ScreenUpdating = True
End Sub
 
Upvote 0
Hello again,

Peter thanks for you suggestion.

Here is another scenario that I have and would really like help with (please):

etretinate_DB00926_X_TGA.smi_0_192

I would want this data to be sorted like previously, but with the addition of keeping the last string of numbers (which are of varying length)
E.g. it would look like this:

etretinate DB00926 X TGA 192
 
Upvote 0
Are you saying that now there would be 4 columns inserted and 5 columns of results?

If so, try
Code:
Sub Split_Data_v2()
  Application.ScreenUpdating = False
  With Range("A1", Range("A" & Rows.Count).End(xlUp))
    .Replace What:="* ", Replacement:="", LookAt:=xlPart
    .Replace What:=".smi_0", Replacement:=""
    .Replace What:="-", Replacement:=" "
    .Offset(, 1).Resize(, 4).EntireColumn.Insert
    .TextToColumns DataType:=xlDelimited, Tab:=False, Semicolon:=False, Comma:=False, Space:=False, Other:=True, OtherChar:="_"
    .Resize(, 5).Columns.AutoFit
  End With
  Application.ScreenUpdating = True
End Sub

If not, please provide more explanation and a few rows (5-10) of varied sample data and the expected results.
 
Upvote 0

Forum statistics

Threads
1,215,301
Messages
6,124,142
Members
449,144
Latest member
Rayudo125

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top