Web Scraping code question, specific site

wittonlin

I can't for the life of me re-code a custom project of ours to deal with changes on the following website. I'm not 100% sure what the developer was going for in the code below, but maybe the site changed? I removed the part of the code that used to scrape whether the number is a cell or a landline, since that information no longer appears to be available. I hope I didn't break anything in the process.

The code doesn't raise an error. The problem is that the result is now ALWAYS the same: "Call". That is what it scrapes for every record.

The following function queries a page like this one from check-caller.net: 240-505-00## Silver Spring ✔ CALLER ID ✔ CALLER NAME check-caller.NET

Rich (BB code):
Public Function checkcaller(ByVal x As Integer) 
    querys = "http://www.check-caller.com/1-" & ThisWorkbook.Sheets("PhoneToCity").Cells(x, 4).Value & ThisWorkbook.Sheets("PhoneToCity").Cells(x, 5).Value & "-" & ThisWorkbook.Sheets("PhoneToCity").Cells(x, 7).Value & "-index" 
    Dim xhr As MSXML2.XMLHTTP60 
    Dim HTML As MSHTML.HTMLDocument 
    Dim table As MSHTML.HTMLTable 
    Dim tableCells As MSHTML.IHTMLElementCollection 
    Dim imgs As Object, ink As Object 
     
    DoEvents 
    Set xhr = New MSXML2.XMLHTTP60 
    DoEvents 
    With xhr 
        DoEvents 
        .Open "GET", querys, False 
        .send 
        DoEvents 
        If .ReadyState = 4 And .Status = 200 Then 
            Set HTML = New MSHTML.HTMLDocument 
            HTML.body.innerHTML = .responseText 
        Else 
        End If 
        DoEvents 
    End With 
    DoEvents 
     
            citysss = HTML.getElementsByClassName("entry-content")(0).innerText 
    On Error Goto errngct 
    citys = Mid(HTML.getElementsByClassName("entry-content")(3).getElementsByTagName("td")(7).innerText, 1, InStr(1, HTML.getElementsByClassName("entry-content")(3).getElementsByTagName("td")(7).innerText, ",") - 3) 
    If err = 1 Then 
errngct: 
        sp = InStrRev(Mid(citysss, 1, InStr(1, citysss, ",")), " ") 
        sf = InStr(1, citysss, ",") - sp 
        citys = Mid(citysss, sp, sf) 
    End If 
     
    ThisWorkbook.Sheets("PhoneToCity").Cells(x, 11).Value = citys 
    err = 0 
    If err = 1 Then 
errng: 
        ThisWorkbook.Sheets("PhoneToCity").Cells(x, 11).Value = "unknown" 
    End If 
    Set xhr = Nothing 
    Set HTML = Nothing 
     
End Function
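
One thing that stands out in the function above: when the request does not come back with status 200, the Else branch is empty, so HTML is never set and the first getElementsByClassName call runs before any On Error handler is active. A minimal sketch of a guarded fetch follows. FetchPageText is a hypothetical helper name, not part of the original project, and it uses late binding, so it does not depend on the MSXML reference.

```vba
Option Explicit

' Hypothetical helper (not in the original project): returns the page source,
' or an empty string when the request fails or the status is not 200.
Public Function FetchPageText(ByVal url As String) As String
    Dim xhr As Object

    On Error GoTo Failed
    Set xhr = CreateObject("MSXML2.XMLHTTP.6.0")   ' late-bound XMLHTTP
    xhr.Open "GET", url, False                     ' False = synchronous request
    xhr.send

    If xhr.readyState = 4 And xhr.Status = 200 Then
        FetchPageText = xhr.responseText
    End If
    Exit Function

Failed:
    ' Any network or COM error ends up here; signal it with an empty string.
    FetchPageText = vbNullString
End Function
```

The caller can then test Len(FetchPageText(querys)) = 0, write "unknown" to column 11, and leave before touching the HTML document at all.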


Looking at the page's HTML, there are two places where the data I need (the city) appears:

1) <head>
<title>240-505-00## Silver Spring ✔ CALLER ID ✔ CALLER NAME check-caller.NET</title>

2) <div class="entry-content">
You want to know who called you? You want to know if the number 240-505-00## is safe or unsafe? check-caller.NET is a Real Time Directory with Caller ID lookup: White Pages & Yellow Pages. Add your feedback and your Voting about your call, please!
<br>Call from Silver Spring, MD / Maryland <div class="center">


If anyone knows the code to scrape the city from one or both locations above, please let me know.



Also posted: Web Scraping city code question, specific site
 

The source HTML got messed up in the last two points above! Here they are again, with braces in place of the angle brackets:

1) {head}
{title}240-505-00## Silver Spring ✔ CALLER ID ✔ CALLER NAME check-caller.NET{/title}

2){div class="entry-content"}
You want to know who called you? You want to know if the number 240-505-00## is safe or unsafe? check-caller.NET is a Real Time Directory with Caller ID lookup: White Pages & Yellow Pages. Add your feedback and your Voting about your call, please!
{br}Call from Silver Spring, MD / Maryland {div class="center"}
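
A possible explanation for the constant "Call" result, judging only from the text quoted above: the fallback branch takes the word just before the first comma in entry-content, and the first comma now sits in "about your call, please!", ahead of "Silver Spring, MD". The sketch below anchors on the "Call from " text and on the title layout instead. It is not a definitive fix. CityFromBody, CityFromTitle and TitleFromHtml are hypothetical helper names, and the sketch assumes the page keeps the two shapes shown in the snippets (including the U+2714 check mark in the title).

```vba
Option Explicit

' Assumes the body contains a line shaped like "Call from <city>, <state> / ...".
Public Function CityFromBody(ByVal bodyText As String) As String
    Const MARKER As String = "Call from "
    Dim startPos As Long, commaPos As Long

    CityFromBody = "unknown"
    startPos = InStr(1, bodyText, MARKER, vbBinaryCompare)
    If startPos = 0 Then Exit Function

    startPos = startPos + Len(MARKER)              ' first character of the city
    commaPos = InStr(startPos, bodyText, ",")      ' comma that ends the city
    If commaPos > startPos Then
        CityFromBody = Trim$(Mid$(bodyText, startPos, commaPos - startPos))
    End If
End Function

' Assumes the title is shaped like "<number> <city> <check mark> CALLER ID ...".
Public Function CityFromTitle(ByVal titleText As String) As String
    Dim firstSpace As Long, markPos As Long

    CityFromTitle = "unknown"
    titleText = Trim$(titleText)
    firstSpace = InStr(1, titleText, " ")              ' end of the phone number
    markPos = InStr(1, titleText, ChrW$(&H2714))       ' U+2714 heavy check mark
    If firstSpace > 0 And markPos > firstSpace Then
        CityFromTitle = Trim$(Mid$(titleText, firstSpace + 1, markPos - firstSpace - 1))
    End If
End Function

' Cuts the title text out of the raw page source; returns "" if no title is found.
Public Function TitleFromHtml(ByVal rawHtml As String) As String
    Dim openPos As Long, closePos As Long

    openPos = InStr(1, rawHtml, "<title>", vbTextCompare)
    If openPos = 0 Then Exit Function
    openPos = openPos + Len("<title>")
    closePos = InStr(openPos, rawHtml, "</title>", vbTextCompare)
    If closePos > openPos Then TitleFromHtml = Mid$(rawHtml, openPos, closePos - openPos)
End Function

Public Sub DemoCityParsing()
    Debug.Print CityFromBody("about your call, please!" & vbCrLf & _
                             "Call from Silver Spring, MD / Maryland")   ' Silver Spring
    Debug.Print CityFromTitle("240-505-00## Silver Spring " & _
                              ChrW$(&H2714) & " CALLER ID")              ' Silver Spring
End Sub
```

In the original function this could be tried as citys = CityFromBody(HTML.getElementsByClassName("entry-content")(0).innerText). Since the response is loaded through body.innerHTML, the document's title may come back empty, so the title route is probably safer via CityFromTitle(TitleFromHtml(.responseText)). Both helpers keep multi-word cities such as "Silver Spring" whole, which the last-space logic in the fallback branch would not.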
 