First time scraping

dgkindy

New Member
Joined
Sep 9, 2009
Messages
9
Hello Everyone.

I am trying scraping for the first time. I came across this website that gives a great example but I am getting lost with what I would call more advanced programming.

What I am trying to do is extract the person's name, number and address from the website but getting lost in the code and unclear how it should be modified. The sample code uses a list brought in from Excel, but in this case I am trying to hard code with the following id from the HTML source code

listing__row
ContactName1
ContactPhone1
ContactAddress1

I want to output the scraping results to a spreadsheet but was planning to post the results to the sheet as the details are located so was not planning to store the values in memory.

Any help to steer me in the right direction is appreciated.



Scraping a website's HTML in VBA

Sample of HTML Code to be used
Capture.PNG



HTML:
                                                                                K Fiedler-Seres                            (905) 895-3245                                                                                                            832 Arnold Cres Newmarket ON L3Y 2E2                                                                            Get directions


Code:
Sub ImportCanada411Data()'to refer to the running copy of Internet Explorer
Dim ie As InternetExplorer
'to refer to the HTML document returned
Dim html As HTMLDocument
'open Internet Explorer in memory, and go to website
Set ie = New InternetExplorer
ie.Visible = False
ie.navigate "http://www.canada411.ca/search/ad/1/-/cZQQstZArnold%20CrescentQQciZNewmarketQQpvZONQQpcZ/?pgLen=10000"
'Wait until IE is done loading page
Do While ie.READYSTATE <> READYSTATE_COMPLETE
Application.StatusBar = "Trying to go to Canada411 ..."
DoEvents
Loop
'show text of HTML document returned
Set html = ie.document
'close down IE and reset status bar
Set ie = Nothing
Application.StatusBar = ""
Dim Contact As IHTMLElement
Dim Questions As IHTMLElementCollection
Dim Question As IHTMLElement
'Dim RowNumber As Long
'Dim QuestionId As String
'Dim QuestionFields As IHTMLElementCollection
'Dim QuestionField As IHTMLElement
Dim StreetCnt As Integer
Dim ListCnt As Integer
Dim QuestionFieldLinks As IHTMLElementCollection
Set Contact = html.getElementById("listing__row")
'Set Questions = QuestionList.Children
StreetCnt = 2
ListCnt = 2




'For Each Question In Questions
    'if this is the tag containing the question details, process it
    If Question.className = "question-summary narrow" Then
        'first get and store the question id in first column
        QuestionId = Replace(Question.ID, "question-summary-", "")
        Cells(RowNumber, 1).Value = CLng(QuestionId)
    'if this is the question's votes, store it (get rid of any surrounding text)
    If QuestionField.className = "votes" Then
        votes = Replace(QuestionField.innerText, "votes", "")
        votes = Replace(votes, "vote", "")
        Cells(RowNumber, 2).Value = Trim(votes)
    End If
    'likewise for views (getting rid of any text)
    If QuestionField.className = "views" Then
        views = QuestionField.innerText
        views = Replace(views, "views", "")
        views = Replace(views, "view", "")
        Cells(RowNumber, 3).Value = Trim(views)
    End If
    'if this is the bit where author's name is ...
    If QuestionField.className = "started" Then
        'get a list of all elements within, and store the
        'text in the second one
        Set QuestionFieldLinks = QuestionField.all
        Cells(RowNumber, 4).Value = QuestionFieldLinks(2).innerHTML
    End If
    'Next QuestionField
        'go on to next row of worksheet
        RowNumber = RowNumber + 1
        End If
    Next
Set html = Nothing
MsgBox "Done!"
End Sub
 

Excel Facts

Spell Check in Excel
Press F7 to start spell check in Excel. Be careful, by default, Excel does not check Capitalized Werds (whoops)
Last edited:
Upvote 0
Mark,

I already tried that url. This link is redirecting to a blank page with black background.
 
Upvote 0
Click your update/refresh button....
 
Last edited:
Upvote 0
I am afraid but still no success. Tried that link with 3 different browsers. Same result.
 
Upvote 0

Forum statistics

Threads
1,214,785
Messages
6,121,543
Members
449,038
Latest member
Guest1337

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top