Best way to web scrape using excel as database

DianaBanana

Board Regular
Joined
Mar 10, 2014
Messages
71
Does anyone have a recommended way to scrape data off the web and get it to populate into an excel document? For example, I am going through a hotel site and I'd like to scrape all the different hotel room rates for that particular hotel location and store it in an excel document. I'd like to be able to re-run this (macro or code) weekly and see the different rates? I'm manually doing it and putting it into excel but there must be a better way.
 
only one night stay for the moment (There's just so much I want to do that I think it would be too hard to also look at multiple nights)
 
Upvote 0

Excel Facts

Spell Check in Excel
Press F7 to start spell check in Excel. Be careful, by default, Excel does not check Capitalized Werds (whoops)
I wasn't working on this, but it sounded like Norie might have been.

If you can post the work you've done so far and the troubles you've been having, then I'd be happy to help.
 
Upvote 0
Everything I've done is manual, I haven't been able to automate it. I have been reading up online and I'm just too new to this. Is there a way for you to start the code and I keep working on it? Or a way to "record" doing what I have been doing manually so I can go in and see the code? Can you help me start it?
 
Upvote 0
There's no easy way to do this that I'm aware of. I think you might have just underestimated what's involved. This isn't the best project to start learning VBA, but here's a sample that may get you started.

Code:
Option Explicit


' Requires references to: Microsoft Internet Controls
'                         Microsoft HTML Object Library
'                         Microsoft VBScript Regular Expressions #
Sub DianaBanana()
    Dim ie As InternetExplorer
    Dim ie2 As InternetExplorer
    Dim lnk As HTMLLinkElement
    Dim mch As Match
    Dim regex As RegExp
    
    Set ie = New InternetExplorer
    Set ie2 = New InternetExplorer
    Set regex = New RegExp
    regex.Global = True
    
    ie.Navigate "http://www.lq.com/lq/about/ourhotels/hotel_listings/"
    
    While ie.readyState <> READYSTATE_COMPLETE
        DoEvents
    Wend
    
    For Each lnk In ie.Document.Links
        If lnk.innerText Like "La Quinta*" Then
            ie2.Navigate lnk
            While ie2.readyState <> READYSTATE_COMPLETE
                DoEvents
            Wend
            regex.Pattern = "(Floors|Rooms): \d+"
            For Each mch In regex.Execute(ie2.Document.DocumentElement.innerHTML)
                If MsgBox(ie2.Document.Title & vbCr & mch & vbCr & vbCr & "Continue?", vbYesNo) <> vbYes Then
                    GoTo Quit
                End If
            Next
        End If
    Next
    
Quit:
    ie.Quit
    ie2.Quit
End Sub
 
Upvote 0
I was going to have a look but didn't find the time, especially as I realised how much was involved.

Even on the first page we are dealing with 100s of locations.

Each of those will lead to multiple hotels and then we need to get data for each of those hotels.

That would involve getting the data for multiple dates/room categories.
 
Upvote 0
There's no easy way to do this that I'm aware of. I think you might have just underestimated what's involved. This isn't the best project to start learning VBA, but here's a sample that may get you started.

Code:
Option Explicit


' Requires references to: Microsoft Internet Controls
'                         Microsoft HTML Object Library
'                         Microsoft VBScript Regular Expressions #
Sub DianaBanana()
    Dim ie As InternetExplorer
    Dim ie2 As InternetExplorer
    Dim lnk As HTMLLinkElement
    Dim mch As Match
    Dim regex As RegExp
    
    Set ie = New InternetExplorer
    Set ie2 = New InternetExplorer
    Set regex = New RegExp
    regex.Global = True
    
    ie.Navigate "http://www.lq.com/lq/about/ourhotels/hotel_listings/"
    
    While ie.readyState <> READYSTATE_COMPLETE
        DoEvents
    Wend
    
    For Each lnk In ie.Document.Links
        If lnk.innerText Like "La Quinta*" Then
            ie2.Navigate lnk
            While ie2.readyState <> READYSTATE_COMPLETE
                DoEvents
            Wend
            regex.Pattern = "(Floors|Rooms): \d+"
            For Each mch In regex.Execute(ie2.Document.DocumentElement.innerHTML)
                If MsgBox(ie2.Document.Title & vbCr & mch & vbCr & vbCr & "Continue?", vbYesNo) <> vbYes Then
                    GoTo Quit
                End If
            Next
        End If
    Next
    
Quit:
    ie.Quit
    ie2.Quit
End Sub
I put this in my workbook - in a new macro - and I'm getting an error message. "Compile error: User defined type not defined" -- this is highlighted in yellow: Dim ie As InternetExplorer . How do I fix this?
 
Upvote 0
I was going to have a look but didn't find the time, especially as I realised how much was involved.

Even on the first page we are dealing with 100s of locations.

Each of those will lead to multiple hotels and then we need to get data for each of those hotels.

That would involve getting the data for multiple dates/room categories.
Norie, If I simplified the project and just chose the "best available price/lowest price" which would involve just clicking on the one page into each hotel - does it seem more doable then? (Without having to get all the pricing by room type? Would I be able to record a macro for each hotel (800 times) and then put them all into one macro? Is that possible?
 
Upvote 0
I put this in my workbook - in a new macro - and I'm getting an error message. "Compile error: User defined type not defined" -- this is highlighted in yellow: Dim ie As InternetExplorer . How do I fix this?

Follow these instructions from my earlier post to set a reference to each of the libraries listed at the top of the code:
In order to run this you must set a reference to 'Microsoft Internet Controls' by checking the box next to it in the References dialog (Tools > References...):
 
Upvote 0

Forum statistics

Threads
1,215,220
Messages
6,123,695
Members
449,117
Latest member
Aaagu

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top