VBA text to columns

jaspncr

New Member
Joined
Jun 6, 2011
Messages
5
I'm pulling data out of an html source. I am currently taking it out line by line. It looks :

<table class="lined fontSize90" summary="This table shows the distribution of employer businesses by size of business establishment in each province and territory."><tbody><tr><td id="t3r3" headers="t3h1">Nova Scotia</td> <td headers="t3ha t3h2 t3r3">30 603</td> <td headers="t3ha t3hb t3h3 t3r3">55.1</td> <td headers="t3ha t3hb t3h4 t3r3">20.8</td> <td headers="t3ha t3hb t3h5 t3r3">11.7</td> <td headers="t3ha t3hb t3h6 t3r3">7.8</td> <td headers="t3ha t3hb t3h7 t3r3">2.5</td> <td headers="t3ha t3hb t3h8 t3r3">97.9</td> <td headers="t3ha t3hb t3h9 t3r3">1.3</td> <td headers="t3ha t3hb t3h10 t3r3">0.6</td> <td headers="t3ha t3hb t3h11 t3r3">1.9</td> <td headers="t3ha t3hb t3h12 t3r3">0.3</td></tr></tbody></table>
<table class="lined fontSize90" summary="This table shows the distribution of employer businesses by size of business establishment in each province and territory."><tbody><tr><tr><td id="t3r2" headers="t3h1">Prince Edward Island</td><td headers="t3ha t3h2 t3r2">6 148</td><td headers="t3ha t3hb t3h3 t3r2">51.4</td><td headers="t3ha t3hb t3h4 t3r2">22.8</td><td headers="t3ha t3hb t3h5 t3r2">13.2</td><td headers="t3ha t3hb t3h6 t3r2">8.3</td><td headers="t3ha t3hb t3h7 t3r2">2.6</td><td headers="t3ha t3hb t3h8 t3r2">98.3</td><td headers="t3ha t3hb t3h9 t3r2">1.0</td><td headers="t3ha t3hb t3h10 t3r2">0.5</td><td headers="t3ha t3hb t3h11 t3r2">1.6</td><td headers="t3ha t3hb t3h12 t3r2">0.2</td></tr></tr></tbody></table><table class="lined fontSize90" summary="This table shows the distribution of employer businesses by size of business establishment in each province and territory."><tbody><tr><td id="t3r1" headers="t3h1">Newfoundland and Labrador</td> <td headers="t3ha t3h2 t3r1">17 989</td> <td headers="t3ha t3hb t3h3 t3r1">57.7</td> <td headers="t3ha t3hb t3h4 t3r1">21.5</td> <td headers="t3ha t3hb t3h5 t3r1">10.7</td> <td headers="t3ha t3hb t3h6 t3r1">6.4</td> <td headers="t3ha t3hb t3h7 t3r1">1.9</td> <td headers="t3ha t3hb t3h8 t3r1">98.3</td> <td headers="t3ha t3hb t3h9 t3r1">0.9</td> <td headers="t3ha t3hb t3h10 t3r1">0.5</td> <td headers="t3ha t3hb t3h11 t3r1">1.4</td> <td headers="t3ha t3hb t3h12 t3r1">0.3</td></tr></tbody></table><table class="lined fontSize90" summary="This table shows the distribution of employer businesses by size of business establishment in each province and territory." width="440" height="145"><tbody><tr><td valign="top">The above list goes on to include the rest of the Canadian provinces. When separated correctly there is one naming column (province name) + 11 columns of numbers. I want to change this text to columns to most efficient way possible. Currently I am past line by line into excel, and converting each from text to columns using 'fixed width'. I remove the tabs where it tries to separate province names that have more than one word. I thought that there might be a quicker solution with VBA. Any advice appreciated.
</td><td valign="top">
</td><td valign="top">
</td><td valign="top">
</td><td valign="top">
</td><td valign="top">
</td><td valign="top">
</td><td valign="top">
</td><td valign="top">
</td><td valign="top">
</td><td valign="top">
</td><td valign="top">
</td><td valign="top">
</td><td valign="top">
</td><td valign="top">
</td></tr><tr><td id="t3r2" headers="t3h1">
</td> <td valign="top">
</td><td valign="top">
</td><td valign="top">
</td><td headers="t3ha t3h2 t3r2">
</td> <td headers="t3ha t3hb t3h3 t3r2">
</td> <td headers="t3ha t3hb t3h4 t3r2">
</td> <td headers="t3ha t3hb t3h5 t3r2">
</td> <td headers="t3ha t3hb t3h6 t3r2">
</td> <td headers="t3ha t3hb t3h7 t3r2">
</td> <td headers="t3ha t3hb t3h8 t3r2">
</td> <td headers="t3ha t3hb t3h9 t3r2">
</td> <td headers="t3ha t3hb t3h10 t3r2">
</td> <td headers="t3ha t3hb t3h11 t3r2">
</td> <td headers="t3ha t3hb t3h12 t3r2">
</td></tr></tbody></table>
 

Excel Facts

Can a formula spear through sheets?
Use =SUM(January:December!E7) to sum E7 on all of the sheets from January through December
Hello and Welcome,

Here is one way to do this with VBA. It assumes your data is in Column A beginning at Cell A1.
It will step through each row splitting the text into Columns B through N until it finds
the first blank cell in Column A.

Rich (BB code):
Sub Split_Text()
    Dim rng As Range
    Dim strRemainder As String
    Dim strSplit() As String
    Dim i As Long, p As Long, lngNrPos As Long
    Application.ScreenUpdating = False
 
    Set rng = Range("A1") 'edit to first cell with text
    While rng.value <> ""
        lngNrPos = 0
        For p = 1 To Len(rng.value)
            If Mid(rng.value, p, 1) Like "[0-9]" Then
                lngNrPos = p
                Exit For
            End If
        Next p
        If lngNrPos = 0 Then
            rng.Offset(0, 1) = rng.value
        Else
            rng.Offset(0, 1) = Trim(Left(rng.value, lngNrPos - 1))
            strRemainder = Mid(rng.value, lngNrPos) 
            strSplit = Split(strRemainder, " ")
            For i = LBound(strSplit) To UBound(strSplit)
                rng.Offset(0, i + 2).value = strSplit(i)
            Next i
        End If
        Set rng = rng.Offset(1)
    Wend
    Set rng = Nothing
    Erase strSplit
End Sub
 
Upvote 0
JS, is this the most efficient way to format my data? I am pulling data from "Table 3..., June 2008" This is one cross-sectional set of the data set. It is panel data (the same provinces update the data semi-annually). Each cross section of the data is stored on different html pages. Example:

June 2008 (Table 3): http://www.ic.gc.ca/eic/site/sbrp-rppe.nsf/eng/rd02343.html

December 2008 (Table 3): http://www.ic.gc.ca/eic/site/sbrp-rppe.nsf/eng/rd02400.html

June 2009 (Table 3): http://www.ic.gc.ca/eic/site/sbrp-rppe.nsf/eng/rd02445.html

etc. etc.
I have a total of 14 cross sections (13 provinces + 1 total) and 12 time series for each cross section. I've been mapping my data like this:

Name1(1), Data1(1), Data2(1), Data3(1), ...,Data11(1)
.
.
Name1(12), Data1(12), Data2(12), Data3(12),...,Data11(12)
.
.
Name14(12), Data1(12), Data2(12), Data3(12),...,Data11(12)

Where Name = the cross section name (province name)
(x) = the time series point
DataX = The numbers describing the cross section name.

Once I paste each row of table 3 into excel I separate them into columns, and then I paste the separated text into my master database, the one I described above.

Given the very helpful code you sent me, do you think that is the fastest way to process internet data, or do you think there is a quicker way to do it. Thanks again.
 
Upvote 0
Given the very helpful code you sent me, do you think that is the fastest way to process internet data, or do you think there is a quicker way to do it.

I'm happy to hear that the code worked for you.

I don't have any experience downloading table data from the web, but your question gave me an excuse to find out how this is typically done. ;)

I think you will want to look into Web Queries if you are not already using that to do your downloads. Using Web Queries, you should be able to set up your formatting once and then just refresh your data each time you open your workbook.

If I had known how easy xl2007 makes this when I first read your post, I would have suggested this method instead of providing you some VBA code as a work around.

Here are two links that could get you started.

For xl2003:
http://www.mrexcel.com/tip103.shtml

For xl2007:
http://www.dummies.com/how-to/content/how-to-import-online-data-into-excel-2007-with-a-w.html

Good luck!
 
Upvote 0

Forum statistics

Threads
1,224,505
Messages
6,179,151
Members
452,891
Latest member
JUSTOUTOFMYREACH

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top