VBA - Error handling help needed

Brickhouse

New Member
Joined
Sep 5, 2016
Messages
29
Hey, so essentially I wrote some code to do a bit of web data scraping, and although the code is not pretty, it "works" for the most part.
The major issue I am running into though is error handling. Initially I didn’t have any error handling, and because some of the html variables can be absent (for example: "li_detail_params size"), i was getting errors. I tried to temporarily solve this with the "on error/next" error handling, but this doesnt actually solve my issue which is: output data structure i.e. prices/sizes etc are getting matched to the wrong locations when printed to cells. What is needed is an error handling method that adds a blank or a dummy value into the cell(s) which have adsent html variables and then continue on with my loop.

Hope this post clearly outlines the issue i am having, and thanks in advance for the feedback!


Code:
sub DataDump()


'Declaring Variables
Dim element As IHTMLElement
Dim elements As IHTMLElementCollection
Dim ie As InternetExplorer
Dim html As HTMLDocument


'open Internet Explorer, and go to website
    Set ie = New InternetExplorer
        ie.Visible = True
            ie.navigate "https://www.Website.com" 'Scrubbed
            
'Wait until IE has loaded the web page
Do While ie.readyState <> READYSTATE_COMPLETE
    DoEvents
        Loop


Set html = ie.document
Set elements = html.getElementsByClassName("media-heading")


Dim count As Long
Dim erow As Long
count = 0


On Error Resume Next 'error handling


For Each element In elements
    If element.className = "media-heading" Then
     
'print html values to cells
erow = Blad1.Cells(Rows.count, 1).End(xlUp).Offset(1, 0).Row
Cells(erow, 1) = html.getElementsByTagName("h4")(count).innerText
Cells(erow, 2) = html.getElementsByClassName("li_detail_params first price")(count).innerText
Cells(erow, 3) = html.getElementsByClassName("li_detail_params rooms")(count).innerText
Cells(erow, 4) = html.getElementsByClassName("li_detail_params size")(count).innerText


count = count + 1
    End If
        Next element


If Err.Number <> 0 Then Debug.Print Err.Description
On Error GoTo 0




MsgBox ("Done")
 

Excel Facts

Did you know Excel offers Filter by Selection?
Add the AutoFilter icon to the Quick Access Toolbar. Select a cell containing Apple, click AutoFilter, and you will get all rows with Apple
From a quick look, the only way I can see you getting data out of sync is if this part fails: html.getElementsByTagName("h4")(count).innerText
To solve that you could simply add 1 to erow each time rather than using End(xlUp) in the loop.
 
Upvote 0
Hey Rory,

I am not entirely sure how to amend the code with the correct syntax, but the methodology sounds logical.
 
Upvote 0
Change this part:
Code:
For Each element In elements
    If element.className = "media-heading" Then
     
'print html values to cells
erow = Blad1.Cells(Rows.count, 1).End(xlUp).Offset(1, 0).Row
Cells(erow, 1) = html.getElementsByTagName("h4")(count).innerText
Cells(erow, 2) = html.getElementsByClassName("li_detail_params first price")(count).innerText
Cells(erow, 3) = html.getElementsByClassName("li_detail_params rooms")(count).innerText
Cells(erow, 4) = html.getElementsByClassName("li_detail_params size")(count).innerText


count = count + 1
    End If
        Next element
to this:
Code:
erow = Blad1.Cells(Rows.count, 1).End(xlUp).Offset(1, 0).Row
For Each element In elements
    If element.className = "media-heading" Then
     
'print html values to cells
Cells(erow, 1) = html.getElementsByTagName("h4")(count).innerText
Cells(erow, 2) = html.getElementsByClassName("li_detail_params first price")(count).innerText
Cells(erow, 3) = html.getElementsByClassName("li_detail_params rooms")(count).innerText
Cells(erow, 4) = html.getElementsByClassName("li_detail_params size")(count).innerText

erow = erow + 1
count = count + 1
    End If
        Next element

and let us know what happens.
 
Upvote 0
Change this part:
Code:
For Each element In elements
    If element.className = "media-heading" Then
     
'print html values to cells
erow = Blad1.Cells(Rows.count, 1).End(xlUp).Offset(1, 0).Row
Cells(erow, 1) = html.getElementsByTagName("h4")(count).innerText
Cells(erow, 2) = html.getElementsByClassName("li_detail_params first price")(count).innerText
Cells(erow, 3) = html.getElementsByClassName("li_detail_params rooms")(count).innerText
Cells(erow, 4) = html.getElementsByClassName("li_detail_params size")(count).innerText


count = count + 1
    End If
        Next element
to this:
Code:
erow = Blad1.Cells(Rows.count, 1).End(xlUp).Offset(1, 0).Row
For Each element In elements
    If element.className = "media-heading" Then
     
'print html values to cells
Cells(erow, 1) = html.getElementsByTagName("h4")(count).innerText
Cells(erow, 2) = html.getElementsByClassName("li_detail_params first price")(count).innerText
Cells(erow, 3) = html.getElementsByClassName("li_detail_params rooms")(count).innerText
Cells(erow, 4) = html.getElementsByClassName("li_detail_params size")(count).innerText

erow = erow + 1
count = count + 1
    End If
        Next element

and let us know what happens.

I just tested this and it seems to have the same issues i.e. not leaving blanks in the output where blanks should be. I have a feeling that this is a simple fix, but i cant seem to wrap my mind around it
 
Upvote 0
It can't misalign data - you'll either get a row of output, or not.

I suggest you step through the code and work out what is going on exactly.
 
Upvote 0

Forum statistics

Threads
1,216,151
Messages
6,129,162
Members
449,489
Latest member
spvclub

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top