Import data form website into Excel

excel_1317

Board Regular
Joined
Jun 28, 2010
Messages
212
Doing all 47 pages at once may cause their site to block your IP.
Though usually only temporarily - a few minutes to a few days.
You could do it in sessions by changing "For i = 1 To 47" to the appropriate page numbers.
After each session be sure to rename the "Stats" sheet as the code starts by deleting and re-creating it.
After all sessions combine the various "Stat" sheets.

Doing all 47 at once will take a few minutes to complete and you won't see anything happening.
The cursor will only spin occasionally.
You wil get "Query complete" when it finishes.

Code:
Sub QueryWeb()
    Dim i As Integer
    Dim firstRow As Integer
    Dim lastRow As Integer
    Dim nextRow As Integer
    Dim URLstart As String
    Dim URLend As String
    Dim shStats As Worksheet
    Dim shQuery As Worksheet
    Dim rgQuery As Range
    Dim found As Range
    Dim TimeOutWebQuery
    Dim TimeOutTime
    Dim objIE As Object
    Application.ScreenUpdating = False
    URLstart = "http://stats.espncricinfo.com/ci/engine/stats/index.html?class=2;filter=advanced;orderby=start;page="
    URLend = ";size=200;spanmax1=12+Jul+2012;spanmin1=13+Jul+2009;spanval1=span;template=results;type=batting;view=innings;wrappertype=print"
    Application.DisplayAlerts = False
    On Error Resume Next
    Sheets("Stats").Delete
    On Error GoTo 0
    Application.DisplayAlerts = True
    Sheets.Add after:=Sheets(Sheets.Count)
    ActiveSheet.Name = "Stats"
    Set shStats = Sheets("Stats")
    For i = 1 To 47
        Sheets.Add after:=Sheets(Sheets.Count)
        Set shQuery = ActiveSheet
        Set objIE = CreateObject("InternetExplorer.Application")
        With objIE
            .Visible = False
            .Navigate CStr(URLstart & i & URLend)
        End With
        TimeOutWebQuery = 10
        TimeOutTime = DateAdd("s", TimeOutWebQuery, Now)
        Do Until objIE.ReadyState = 4
            DoEvents
            If Now > TimeOutTime Then
                objIE.stop
                GoTo ErrorTimeOut
            End If
        Loop
        objIE.ExecWB 17, 2
        objIE.ExecWB 12, 2
        shQuery.Range("A1").Select
        shQuery.PasteSpecial NoHTMLFormatting:=True
        objIE.Quit
        Set objIE = Nothing
        Set found = shQuery.Columns(1).Find("Player", , , xlWhole)
        If Not found Is Nothing Then
            firstRow = found.Row
            If i > 1 Then firstRow = firstRow + 1
        Else
            GoTo FormatError
        End If
        Set found = shQuery.Columns(1).Find("Page ", found, , xlPart)
        If Not found Is Nothing Then
            lastRow = found.Row - 1
        Else
            GoTo FormatError
        End If
        Set rgQuery = shQuery.Rows(firstRow & ":" & lastRow)
        nextRow = shStats.Cells(Rows.Count, "A").End(xlUp).Row
        If nextRow > 1 Then nextRow = nextRow + 1
        rgQuery.Copy shStats.Cells(nextRow, 1)
        Application.DisplayAlerts = False
        shQuery.Delete
        Application.DisplayAlerts = True
    Next i
    shStats.Columns.AutoFit
    MsgBox "Query complete"
    Exit Sub
FormatError:
    MsgBox "Format Error"
    Exit Sub
ErrorTimeOut:
    objIE.Quit
    Set objIE = Nothing
    MsgBox "WebSite Error"
End Sub

This code works like charm. Is is possible to develop a code like this which open each each name from this page- http://www.fitzpatrickcella.com/?p=2541&FirstName=&LastName=&LPA=&I=&L=&Rank=

AND

copy details like Attorney Name, email, phone and fax nos. and paste the same into excel rows.

Thanks in advance.
 
Re: Import data from multiple pages of a website into a single Excel sheet

I have split these posts from: http://www.mrexcel.com/forum/excel-...le-pages-website-into-single-excel-sheet.html

@excel1317:
Please acquaint yourself with our rules, particularly rule #7:
All new questions should be posted to new threads on the Public Forums.
You have essentially hijacked someone else's thread which is why I have split it and created your own thread.

Further, we generally recommend that you want 24 hours before bumping your thread, otherwise you come across a little pushy and each time you bump your thread back to the top you are bumping other peoples threads down the queue.

Finally, I suggest you have a crack at understanding the code provided and that you make some attempt to solve your problem. Members here tend to be more helpful when the poster can show that he/she has made some attempt to resolve the problem. If you just want the tool but you are not interested in learning how it is done perhaps consider employing professional help, e.g. MrExcel Consulting. A web search will reveal many others that offer paid-for Excel services.
 
Upvote 0

Excel Facts

Whats the difference between CONCAT and CONCATENATE?
The newer CONCAT function can reference a range of cells. =CONCATENATE(A1,A2,A3,A4,A5) becomes =CONCAT(A1:A5)
Re: Import data from multiple pages of a website into a single Excel sheet

I have split these posts from: http://www.mrexcel.com/forum/excel-...le-pages-website-into-single-excel-sheet.html

@excel1317:
Please acquaint yourself with our rules, particularly rule #7:

You have essentially hijacked someone else's thread which is why I have split it and created your own thread.

Further, we generally recommend that you want 24 hours before bumping your thread, otherwise you come across a little pushy and each time you bump your thread back to the top you are bumping other peoples threads down the queue.

Finally, I suggest you have a crack at understanding the code provided and that you make some attempt to solve your problem. Members here tend to be more helpful when the poster can show that he/she has made some attempt to resolve the problem. If you just want the tool but you are not interested in learning how it is done perhaps consider employing professional help, e.g. MrExcel Consulting. A web search will reveal many others that offer paid-for Excel services.

I posted the question in old thread because it was related to my problem. Moreover i am keen to know how this is done but "FROM WHERE" is the issue. If you can guide, it would be great!!
 
Upvote 0
Re: Import data from multiple pages of a website into a single Excel sheet

maybe if you had done it manually, you would be done after 3 days :)

Its weekend! Hope you could spare some time. Please do it for 1 page. So i will replace the url in code 23 times to get data for all 24 pages.

Thanks
 
Upvote 0
Re: Import data from multiple pages of a website into a single Excel sheet

There you go

don't forget to add the same references and have a good weekend :D

PHP:
Option Explicit
' 15/03/2014 VBA Geek Mr Excel Post
'http://www.mrexcel.com/forum/excel-questions/763806-import-data-form-website-into-excel.html

Sub DrLawyer()
Dim X As MSXML2.XMLHTTP: Set X = New MSXML2.XMLHTTP
Dim aHTML As MSHTML.HTMLDocument: Set aHTML = New MSHTML.HTMLDocument
Dim New_HTML As MSHTML.HTMLDocument: Set New_HTML = New MSHTML.HTMLDocument
Dim Attorneys As MSHTML.IHTMLElementCollection
Dim k As Integer: k = 1
Dim mEL As MSHTML.IHTMLElement
Dim AttInfos As MSHTML.IHTMLElementCollection
Dim u As Byte
Application.ScreenUpdating = False
[A1:D1] = Array("Attorney Name", "Job Title", "Phone", "E-Mail")
For u = 0 To 23
    With X
        .Open "GET", "http://knobbe.com/search/attorneys?position=All&office=All&practice=" & _
        "All&industry=All&&&&years_range=between&years[value]=&years[min]=" & _
        "&years[max]=&nickname=&name=&page=" & u, False
        .send
        aHTML.body.innerHTML = .responseText
    End With
    Set Attorneys = aHTML.getElementsByClassName("links important")
    Dim AttorneyGeneral As String: AttorneyGeneral = vbNullString
    
    For Each mEL In Attorneys
        AttorneyGeneral = "http://knobbe.com" & Replace(mEL.Children(0).href, "about:", vbNullString)
            With X
                .Open "GET", AttorneyGeneral, False
                .send
                New_HTML.body.innerHTML = .responseText
            End With
            Set AttInfos = New_HTML.getElementsByClassName("mod-overlay")
            Range("A1").Offset(k, 0).Value = AttInfos(0).Children(0).Children(0).innerText
            Range("A1").Offset(k, 1).Value = AttInfos(0).Children(1).Children(0).innerText
            Range("A1").Offset(k, 2).Value = AttInfos(0).Children(2).Children(1).innerText
            Range("A1").Offset(k, 3).Value = AttInfos(0).Children(2).Children(2).innerText
            k = k + 1
    Next
Next
Application.ScreenUpdating = True
Set AttInfos = Nothing
Set aHTML = Nothing
Set X = Nothing
Set New_HTML = Nothing
Set Attorneys = Nothing
MsgBox "Data Downloaded"
End Sub
 
Upvote 0
Re: Import data from multiple pages of a website into a single Excel sheet

"VBA Geek", your work justifies your name! You really made my weekend good :) Otherwise I would have been ended up doing tedious task of copy/paste.

Many Thanks for your help.
 
Upvote 0
Re: Import data from multiple pages of a website into a single Excel sheet

Great, glad it worked
 
Upvote 0

Forum statistics

Threads
1,215,109
Messages
6,123,137
Members
449,098
Latest member
Doanvanhieu

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top