Random Proxy

Sharid

Office Version: 2016
Platform: Windows
I have code that opens IE, searches for a keyword, extracts the URLs, and then clicks the NEXT page button. The problem is that after 10 pages Google detects that it is a bot. Is it possible to use random proxies, so that after 9 pages it changes to a new proxy?

E.g. on Sheet2, in column A (from A1 down), there is a list of proxies.
When the code runs, it searches 9 pages under the first proxy, then moves to the next proxy until all have been used, and then starts from the top again.
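The "9 pages per proxy, then wrap back to the top" rule comes down to a small piece of arithmetic. Here is a minimal sketch of it; `ProxyIndexForPage` and its parameter names are made up for illustration and are not part of any posted code.

```vba
' Hypothetical helper: returns the 1-based row in the proxy list
' that should be used for a given results page.
Function ProxyIndexForPage(ByVal pageNumber As Long, _
                           ByVal pagesPerProxy As Long, _
                           ByVal proxyCount As Long) As Long
    ' Reject values that would make the arithmetic meaningless.
    If pageNumber < 1 Or pagesPerProxy < 1 Or proxyCount < 1 Then
        Err.Raise 5 ' Invalid procedure call or argument
    End If

    ' Integer division (\) counts how many full blocks of pages came
    ' before this page; Mod wraps back to the top of the list once
    ' every proxy has been used.
    ProxyIndexForPage = (((pageNumber - 1) \ pagesPerProxy) Mod proxyCount) + 1
End Function
```

With 9 pages per proxy and 20 proxies, pages 1 to 9 map to proxy 1, page 10 maps to proxy 2, and page 181 wraps back to proxy 1.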
 

How many searches are you performing? Using Google's API, you get 100 a day for free and then, I think, 5 bucks per 1000 thereafter.
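For reference, here is a rough sketch of what a call to the API could look like, assuming the API meant here is Google's Custom Search JSON API. The function name `CustomSearchJson` is made up, and `apiKey` and `engineId` are placeholders for credentials you would have to create yourself. `EncodeURL` needs Excel 2013 or later.

```vba
' Hypothetical helper: requests one page of results from the
' Custom Search JSON API and returns the raw JSON text.
' Assumption: you already have an API key and a search engine ID (cx).
Function CustomSearchJson(ByVal apiKey As String, _
                          ByVal engineId As String, _
                          ByVal keyword As String, _
                          ByVal startIndex As Long) As String
    Dim req As Object
    Dim url As String

    ' startIndex is the 1-based position of the first result wanted
    ' (1, 11, 21, ... when reading 10 results at a time).
    url = "https://www.googleapis.com/customsearch/v1" & _
          "?key=" & apiKey & _
          "&cx=" & engineId & _
          "&q=" & Application.WorksheetFunction.EncodeURL(keyword) & _
          "&start=" & startIndex

    Set req = CreateObject("MSXML2.XMLHTTP.6.0")
    req.Open "GET", url, False
    req.send

    If req.Status <> 200 Then
        Err.Raise vbObjectError + 1, "CustomSearchJson", _
                  "HTTP " & req.Status & ": " & req.responseText
    End If

    CustomSearchJson = req.responseText
End Function
```

The result URLs sit in the `link` field of each entry under `items` in the returned JSON. Parsing that is left out here, since VBA has no built-in JSON parser.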
 
Hi

It would be more than 100. The file link is below; it's a small scraper that I built. I think adding a proxy list will help, so that the searches look like they come from different areas: after every 8 pages searched, it changes the proxy and loops through the proxy list.

File Download Link
 
Hi
I'm just resurrecting one of my posts to see if I can get some help with the problem below. I found proxy code on the web, but it is out of my depth to incorporate it into my code. If someone who is more experienced could help, I would appreciate it.

How the code works (see UserForm1 image below):
  • The user enters the search criteria, the number of pages to search, and two time delays into 4 textboxes on UserForm1
  • The data is transferred to a sheet called “Keywords”
  • A search is done and the URLs are extracted to a sheet called “DATA”
  • These are then shown in a list box.
WHAT I NEED IT TO DO
  • All of the above and
  • Switch proxy after every 6 pages searched, looping through a list of proxies on a sheet called “Proxy” (A1 down to A20) until the number of pages to search is complete. (So it will work through proxies 1 to 20 and then start again from 1, switching after every 6th page searched, until the number of search pages is complete or it is detected as a bot.)
  • INFO - When the proxy is switched, the search should continue at page 7 and not go back to page 1 in Google.
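One way to resume at page 7 instead of page 1 is to build the results URL directly with Google's `start` offset, rather than clicking Next from the first page. A minimal sketch, assuming the default of 10 results per page; `GoogleSearchUrl` is a made-up name.

```vba
' Hypothetical helper: builds the search URL for a given results page.
' Assumption: Google shows 10 results per page, so page N starts at
' offset (N - 1) * 10.
Function GoogleSearchUrl(ByVal keyword As String, _
                         ByVal pageNumber As Long) As String
    Const RESULTS_PER_PAGE As Long = 10
    Dim url As String

    ' Same base URL and space handling as the scraper's existing code.
    url = "https://www.google.co.uk/search?q=" & Replace(keyword, " ", "+")

    ' Page 1 needs no offset; later pages skip the earlier results.
    If pageNumber > 1 Then
        url = url & "&start=" & (pageNumber - 1) * RESULTS_PER_PAGE
    End If

    GoogleSearchUrl = url
End Function
```

For example, `GoogleSearchUrl("excel vba", 7)` gives `https://www.google.co.uk/search?q=excel+vba&start=60`.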
There would be 3 sheets
  • Sheet1 “Data”
  • Sheet 2 “Keywords”
  • Sheet 3 “Proxy”
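Reading the list from the “Proxy” sheet could look something like this. It is only a sketch: `LoadProxies` is a made-up name, and it assumes each cell in A1:A20 holds one proxy as text in `host:port` form.

```vba
' Hypothetical helper: collects the non-empty entries from
' Proxy!A1:A20 in sheet order.
' Assumption: each cell holds one proxy as "host:port" text.
Function LoadProxies() As Collection
    Dim result As New Collection
    Dim cell As Range
    Dim entry As String

    For Each cell In Worksheets("Proxy").Range("A1:A20").Cells
        entry = Trim$(CStr(cell.Value))
        ' Skip blank rows so a short list still works.
        If Len(entry) > 0 Then result.Add entry
    Next cell

    Set LoadProxies = result
End Function
```

`LoadProxies.Count` then gives the number of proxies available, and `LoadProxies(1)` is the first one (collections are 1-based).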
This is the code I found on the web. I am using IE.
I need the proxy bit from it; my own code follows it.

VBA Code:
Sub Torrent_Scraper()
    Dim http As New ServerXMLHTTP60, html As New HTMLDocument
    Dim post As HTMLHtmlElement
    Dim x As Long   ' output row counter

    With http
        .Open "GET", "https://yts.ag/browse-movies", False
        .setRequestHeader "Content-Type", "text/html; charset=utf-8"
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"
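        ' 2 = SXH_PROXY_SET_PROXY (use the named proxy). 1 means a direct connection,
        ' and the third argument is a bypass list, not a second proxy.
        .setProxy 2, "61.233.25.166:80"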
        .send
        html.body.innerHTML = .responseText
    End With

    For Each post In html.getElementsByClassName("browse-movie-bottom")
        With post.getElementsByClassName("browse-movie-title")
            x = x + 1: Cells(x, 1) = .item(0).innerText
        End With
        With post.getElementsByClassName("browse-movie-year")
            If .Length Then Cells(x, 2) = .item(0).innerText
        End With
    Next post
End Sub

My Code

VBA Code:
Private Sub CommandButton1_Click()

' **** UserForm1: writes the textbox inputs to the "Keywords" sheet *****
    Worksheets("Keywords").Range("C3") = TextBox1.Text ' keyword
    Worksheets("Keywords").Range("C4") = TextBox2.Text ' pages to search
    Worksheets("Keywords").Range("C5") = TextBox3.Text 'time delay 1
    Worksheets("Keywords").Range("C6") = TextBox4.Text 'time delay 2

'**** Keyword URL SCRAPER *****
     Dim ie As Object
     Dim HTMLdoc As Object
     Dim nextPageElement As Object
     Dim div As Object
     Dim link As Object
     Dim url As String
     Dim pageNumber As Long
     Dim i As Long
    
' ***** Takes the search criteria from the "Keywords" sheet, cell C3, and builds the Google search URL *****
    url = "https://www.google.co.uk/search?q=" & Replace(Worksheets("Keywords").Range("C3").Value, " ", "+")

' ***** Gets Internet Explorer ready; Visible is set to False so the browser does NOT SHOW ******
    Set ie = CreateObject("InternetExplorer.Application")
    
    With ie
        .Visible = False
        .navigate url
        Do While .Busy Or .readyState <> 4
            DoEvents
        Loop
    End With

    Application.Wait Now + TimeSerial(0, 0, 5)
    
    Set HTMLdoc = ie.document

'***** Searches for URLs and places the results into the "DATA" sheet, column A, starting at row 2 *****
    With Sheets("DATA")
        pageNumber = 1
        i = 2
        Do
            For Each div In HTMLdoc.getElementsByTagName("div")
                If div.getAttribute("class") = "r" Then
                    Set link = div.getElementsByTagName("a")(0)
                    .Cells(i, 1).Value = link.getAttribute("href")
                    i = i + 1
                End If
            Next div

'****** Stops once the number of pages entered in the "Keywords" sheet, cell C4 (e.g. 5 pages), has been searched *******
            If pageNumber >= CLng(Worksheets("Keywords").Range("C4").Value) Then Exit Do

            ' Look up the "Next" link; errors are suppressed for this lookup only
            Set nextPageElement = Nothing
            On Error Resume Next
            Set nextPageElement = HTMLdoc.getElementById("pnnext")
            On Error GoTo 0
            If nextPageElement Is Nothing Then Exit Do
        
'****** Scrolls Down the Browser to mimic human behaviour *****
            ie.document.parentWindow.Scroll 0&, 99999

'***** 1st random delay in seconds, up to the max number entered in the "Keywords" sheet, cell C5 *****
    Application.Wait Now + TimeSerial(0, 0, Application.RandBetween(1, Worksheets("Keywords").Range("C5").Value))

'***** clicks on google next page ******
        nextPageElement.Click
            Do While ie.Busy Or ie.readyState <> 4
                DoEvents
            Loop
'***** 2nd random delay in seconds, up to the max number entered in the "Keywords" sheet, cell C6 *****
             Application.Wait Now + TimeSerial(0, 0, Application.RandBetween(1, Worksheets("Keywords").Range("C6").Value))
            Set HTMLdoc = ie.document
            pageNumber = pageNumber + 1
        Loop
    End With
   
    ie.Quit
    Set ie = Nothing
    Set HTMLdoc = Nothing
    Set nextPageElement = Nothing
    Set div = Nothing
    Set link = Nothing

    MsgBox "All Done"
End Sub

Thanks for having a look
 