Web Page Navigation To Next Item In List

Sharid

Well-known Member
Joined
Apr 22, 2007
Messages
1,064
Office Version
  1. 2016
Platform
  1. Windows
I use the following code to navigate pages on the web, it works fine, however it needs updating. I can put in the the number of pages to navigate in Sheet1 C1 and 2x delays in D1 and E1. I place the item to search in Sheet1 B1.

I would now like to have a list of items in Sheet2 column A. I would like the code below to search each item in the column for the number of pages input into Sheet1 C1. Currently I can ONLY do one item, then the code stops. I then have to add a new item and run it again. I would prefer to add a list of items in sheet2 column A and the code automates the changes by itself after it has reached the max page number or there is nothing more for that item to extract.

Currently it will exit to Do if there are no more items to extract, at this point it should move to the next item in sheet 2 or when X amount of pages has been achived (Page number to navigate is in Sheet1 C1)

This part of the code is the issue, works fine just needs updating.
VBA Code:
'''Searches Number of Pages entered in Sheet1 C1
        If pageNumber >= Replace(Worksheets("Sheet1").Range("C1").Value, " ", "+") Then Exit Do
            On Error Resume Next
            Set nextPageElement = htmlDoc.getElementById("pnnext")
''' Exit DO if items pages are less than pages entered, after last page is done     
       If nextPageElement Is Nothing Then Exit Do
'''Scrolls Down the Browser
            IE.document.parentWindow.Scroll 0&, 99999
'''Random delay from Max number entered in Sheet1 D1
            Application.Wait Now + TimeSerial(0, 0, Application.RandBetween(1, Worksheets("Sheet1").Range("D1").Value))
'''next web page           
         nextPageElement.Click
        Do While IE.Busy Or IE.readyState <> 4
            DoEvents
        Loop
'''Random delay from Max number entered in Sheet1 E1
        Application.Wait Now + TimeSerial(0, 0, Application.RandBetween(1, Worksheets("Sheet1").Range("E1").Value))
            Set htmlDoc = IE.document
            pageNumber = pageNumber + 1
''' Delete duplicates
            Sheet1.Columns("A").RemoveDuplicates Columns:=Array(1), Header:=xlYes
''' Delete Row If Blank
            Sheet1.Columns("A").SpecialCells(xlCellTypeBlanks).EntireRow.Delete
    Loop
MsgBox "All Done"
 

Excel Facts

Excel Wisdom
Using a mouse in Excel is the work equivalent of wearing a lanyard when you first get to college
Hi,
Can you please share the siteURL also.

Thanks,
Saurabh
 
Upvote 0
I was looking at something like this, but struggling to incorporate it. My list will be dynamic so the second option would be the best
Link Ms
 
Upvote 0
Hi, thanks. .

What variable pageNumber contains in shared code ?
Would you like to show search result of all specified pages?
 
Last edited:
Upvote 0
Hi, Kindly share the complete code to check.
If nextPageElement Is Nothing Then Exit Do - In this line you are referring a loop which is missing.
 
Upvote 0
I would set the page number to 100 pages, and for it to extract all links from the pages. If there was under 100 pages then it moves to next item in sheet2 column A. And searches for that starting at page 1
 
Upvote 0
This is an old link to the code, but the bulk of it is here. My post mr excel

I am not logged on my pc, so can not get the updated version. This is about 95% the same, no longer works from textbix, just from sheet cell.
 
Upvote 0
Hi, To extract all the search result on a sheet from a single page-

VBA Code:
Sub showSearchResults()
    Application.ScreenUpdating = False
    'On Error Resume Next
        Dim HTMLDoc As New HTMLDocument
        Dim ieBrowser As New InternetExplorer
        
    'variable to specify the row number in excel
    Dim rowno As Integer
    Dim trow As Object, searchResult As String
    
        rowno = 1
       'To open and show Internet Explorer
       ieBrowser.Visible = True
       
       'To Open website in Internet Explorer
       ieBrowser.navigate "https://www.google.com/search?q=excel"
       
       Do
       ' Wait till the Browser is loaded
       Loop Until ieBrowser.readyState = READYSTATE_COMPLETE
       
       Set HTMLDoc = ieBrowser.document
        For Each trow In HTMLDoc.getElementsByClassName("g")
                Sheets("Sites").Cells(rowno, 1) = trow.getElementsByClassName("LC20lb DKV0Md")(0).innerText
                Sheets("Sites").Cells(rowno, 2) = trow.getElementsByClassName("aCOpRe")(0).innerText
                rowno = rowno + 1
        Next
   
        Set ieBrowser = Nothing
        Set HTMLDoc = Nothing
        Application.ScreenUpdating = True
End Sub
 
Upvote 0
Saurabhj

What you have done is good, but it is not what i am after. I can already extract all urls from google, I can also navigate the pages in google. All this is fine. My problems is that I have to input 1 search then search X amount of pages on google then change the search criteria, I want it to work from A list of search criteria that is in Sheet2. So after X amount of pages have been searched it moves to the next item in Sheet2 column A. If there are not enough pages to search for that criteria it then will ALSO move to next item in search.

Currently It is Like ThisHow I would like it to be
A1 = Results
B1 = Search Criteria
C1 = Number of pages to navigate
D1 = Delay 1 in seconds
E1 = Delay 2 in seconds
1611918639560.png
I want to have a search criteria in Sheet2 column A. After Sheet1 C1 page navigation number has been reached , it moves to the next item in the search criteria and then searches that for X amount of pages as shown in Sheet1 C1. If there are NOT enough pages to extract then it moves to next item in criteria until all are done


1611919158639.png

VBA Code:
       Private Sub CommandButton1_Click()

'**** Keyword URL SCRAPER *****
Dim ie As Object
Dim HTMLdoc As Object
Dim nextPageElement As Object
Dim div As Object
Dim link As Object
Dim url As String
Dim pageNumber As Long
Dim i As Long

' ***** Currently this takes seach Criteria from Sheet1 Cell B1 and places it into google *****
url = "https://www.google.co.uk/search?q=" & Replace(Worksheets("Sheet1").Range("B1").Value, " ", "+")

' ***** Gets internet explorer read, which is set to False so does NOT SHOW ******
Set ie = CreateObject("InternetExplorer.Application")

With ie
.Visible = False
.navigate url
Do While .Busy Or .readyState <> 4
DoEvents
Loop
End With

Set HTMLdoc = ie.document

'***** Searches for URLs and places the results into Sheet1 ROW 2 Column A *****
With Sheets("Sheet1")
pageNumber = 1
i = 2
Do
For Each div In htmlDoc.getElementsByTagName("div")
            If div.getAttribute("class") = "g" Then 'r
            Set link = div.getElementsByTagName("a")(0)
            .Cells(i, 1).Value = link.getAttribute("href")
            i = i + 1
End If
Next div

'''#####################  PAGE NAVIGATION GODE ############################

'****** Searches Number of Pages entered in Sheet Cell C1, E.G 5pages *******
If pageNumber >= Replace(Worksheets("Keywords").Range("C1").Value, " ", "+") Then Exit Do
On Error Resume Next
Set nextPageElement = HTMLdoc.getElementById("pnnext")
If nextPageElement Is Nothing Then Exit Do

'****** Scrolls Down the Browser to mimic human behaviour *****
ie.document.parentWindow.Scroll 0&, 99999

'***** 1st Random delay in seconds from Max number entered in "Sheet1" Cell D1 *****
Application.Wait Now + TimeSerial(0, 0, Application.RandBetween(1, Worksheets("Keywords").Range("D1").Value))

'***** clicks on google next page ******
nextPageElement.Click
Do While ie.Busy Or ie.readyState <> 4
DoEvents
Loop
'***** 2nr Random delay from Max number entered in "Sheet1" Cell E1 *****
Application.Wait Now + TimeSerial(0, 0, Application.RandBetween(1, Worksheets("Keywords").Range("E1").Value))
Set HTMLdoc = ie.document
pageNumber = pageNumber + 1
Loop
End With

ie.Quit
Set ie = Nothing
Set HTMLdoc = Nothing
Set nextPageElement = Nothing
Set div = Nothing
Set link = Nothing

MsgBox "All Done"
End Sub
 
Upvote 0

Forum statistics

Threads
1,215,056
Messages
6,122,907
Members
449,096
Latest member
dbomb1414

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top