Scrape Price From Website URLs

DisguisedOwl

New Member
Joined
Feb 9, 2021
Messages
2
Office Version
  1. 2019
Platform
  1. Windows
Hello all,

I hope you and your family are all well during these tough times.

I am currently using Excel 2019.

I am trying to create a mini database of ingredients with prices from various supermarkets which can auto update. This will enable my family to shop at the cheapest supermarket every week to save money.

Is it possible to pull price and purchase quantity from these URLs and have them auto update? So we can also work out normalised pricing.

I am trying to get any URLs added to column D to pull price and enter it into column A and if possible purchase quantity into column B for the supermarket Tesco. Likewise, I would also like to do this for other supermarkets.

I have some VBA code here which can pull the data from the Tesco URL correctly but only appears in a text box on screen.

VBA Code:
Sub Get_Web_Data()

Dim request As Object
Dim response As String
Dim html As New HTMLDocument
Dim website As String
Dim price As Variant

' Website to go to.
website = "https://www.tesco.com/groceries/en-GB/products/259061829"

' Create the object that will make the webpage request.
Set request = CreateObject("MSXML2.XMLHTTP")

' Where to go and how to go there - probably don't need to change this.
request.Open "GET", website, False

' Get fresh data.
request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"

' Send the request for the webpage.
request.send

' Get the webpage response data into a variable.
response = StrConv(request.responseBody, vbUnicode)

' Put the webpage into an html object to make data references easier.
html.body.innerHTML = response

' Get the price from the specified element on the page.
price = html.getElementsByClassName("value").Item(0).innerText

' Output the price into a message box.
MsgBox price

End Sub

I tried to pull data from the other supermarkets trying to use various classes, but they seem to fail using the code above. Since I am new to all this I cannot figure out the issue.

Here are some of the classes I tested which relate to price but give an error running the code:
Rich (BB code):
Sainsbury's
pd__cost-wrapper
pd__cost
pd__cost__total undefined
pd-retail-price

ASDA
pdp-main-details__price-and-uom
pdp-main-details__price-container
co-product__price pdp-main-details__price

ALDI
product-price
product-price__main
product-price__value

Thank you for taking the time to read this post!
 

Attachments

  • Annotation 2021-02-09 115911.png
    Annotation 2021-02-09 115911.png
    137.7 KB · Views: 18

Excel Facts

Using Function Arguments with nested formulas
If writing INDEX in Func. Arguments, type MATCH(. Use the mouse to click inside MATCH in the formula bar. Dialog switches to MATCH.

Worf

Well-known Member
Joined
Oct 30, 2011
Messages
4,131
Welcome


I did it with Chrome, you will need to follow the instructions on post #4 of the thread below.

Once you have this first one running, we can proceed with the other supermarkets. Would you like the pictures along with the items?

import data from Web

sbury.PNG


VBA Code:
Public driver As New ChromeDriver

Sub Sains()
Dim pt, i%
For i = 1 To Range("e" & Rows.Count).End(xlUp).row
    driver.get Cells(i, 5)
    Application.Wait Now + TimeValue("0:00:04")
    If i = 1 Then
        Set pt = driver.FindElementById("onetrust-accept-btn-handler")
        pt.Click
    End If
    Set pt = driver.FindElementsByClass("pd__cost__total")
    Cells(i, 6) = pt.item(1).Text
Next
End Sub
 

DisguisedOwl

New Member
Joined
Feb 9, 2021
Messages
2
Office Version
  1. 2019
Platform
  1. Windows
Hello Wolf,

Thank you for the reply.

I didn’t get any replies so I started learning python and various web scraping tools to try and get the data and then input it into excel. So far I have been successful with Tesco but it seems that the other major supermarkets make it really difficult.

I know this is an excel forum but do you know how to script Selenium scripts in Python?

Selenium is the method I am currently trying to figure out because without a browser other headless methods are getting blocked by the websites.

Kind regards,
 

Worf

Well-known Member
Joined
Oct 30, 2011
Messages
4,131
The bad news: I am not a Python programmer.

The good news: my example at the second post uses Selenium within Excel; you can see it works for Sainsbury.

Tell me if you want to proceed with this method.
 

Watch MrExcel Video

Forum statistics

Threads
1,130,321
Messages
5,641,523
Members
417,213
Latest member
wikk

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Top