Pull Specific li Data From Webpage ul to Excel

MJH2014

New Member
Joined
Apr 24, 2014
Messages
9
I am trying to create a macro that pulls (and will update) property auction data from auction.com. I am new to data scraping using vba and for each auction, I need to pull the Property Address (and maintain the hyperlink to the href address in Excel, ideally), Item #, Starting Bid, and auction Start Date. My problem is that I am unsure how to call data from a specific <li> nested in a <ul> but every time I try, it pulls all entries for the property that share an <li> class name. Below is an example of the HTML:


HTML:
<div class="contentDetail searchResult" property-id="193014410">    <div class="layoutSingle left ultra-wide">        <div class="searchResultGalleryData">            <p>Item #:</p>            <p class="highlightedData">B133-101</p>            <p>Arlington, TX</p>        </div>        <div class="searchResultImageBox">            <div class="searchResultImage">                <div class="secondaryAlertBanner hidden"></div>                <a class="searchResultImageLink searchTrackPDP" href="http://auction.com/Texas/commercial-auction-asset/193014410-7786-2001-and-2005-NE-Green-Oaks-Blvd-Arlington-TX-76006-B133" property-id="193014410" row-index="1"><img class="searchResultImage" src="http://cdn.mlhdocs.com/rcp_files/auctions/B-133/photos/thumbnails/ForestHill-1_bigThumb.jpg" alt=""></a>            </div>            <div class="primaryAlertBanner listOnly red">Broker co-op available</div>        </div>        <ul class="searchResultInfo major"><li class="searchResultAddress"><a class="searchTrackPDP" href="http://auction.com/Texas/commercial-auction-asset/193014410-7786-2001-and-2005-NE-Green-Oaks-Blvd-Arlington-TX-76006-B133" row-index="1">2001 And 2005 Ne Green Oaks Blvd<br>Arlington, TX 76006</a></li><li>        <span class="searchResultLabel">Asset Type:</span>        Commercial    </li><li>        <span class="searchResultLabel">Property Type:</span>        Office    </li>                        <li class="highlight">        <span class="searchResultLabel">Item #:</span>        B133-101    </li>        </ul>        <div class="clear"></div>        <div class="primaryAlertBanner galleryOnly red">Broker co-op available</div>    </div>    <div class="layoutSingle right">        <ul class="searchResultInfo minor">                                        </ul>        <ul class="searchResultInfo major"><li class="highlightLarge">        <span class="searchResultLabel">Starting Bid:</span>        $1,250,000    </li>                        <li class="highlight">        <span class="searchResultLabel">Start Date:</span>        <span>07/14/2014</span>    </li>            <li class="highlight">        <span class="searchResultLabel">Auction Type:</span>        Online    </li>                    </ul>    </div>    <div class="clear"></div>    <div class="buttonRow right">        <a class="button small transparent searchPropertySaved" property-id="193014410">Property Saved</a>            <a class="button primary small searchSaveProperty" property-id="193014410">Save Now</a>            <a class="button blue small searchTrackPDP" href="http://auction.com/Texas/commercial-auction-asset/193014410-7786-2001-and-2005-NE-Green-Oaks-Blvd-Arlington-TX-76006-B133" row-index="1">More Details</a>            </div>    <div class="galleryDataSection">    <div class="dataItem">            Starting Bid:            <span class="highlightedData dataValue">$1,250,000</span>        </div>        </div>    <div class="clear"></div>    ******** type="text/html" id="tooltip-193014410">        <div class="propertyHover">            <div class="header">                    Item #                <span class="itemNo">                    B133-101                </span>                |                <span class="address">                    2001 And 2005 Ne Green Oaks Blvd Arlington, TX 76006                </span>            </div>            <ul class="propertyFields">                <li><span class="label">County</span><span class="value">Tarrant</span></li>                <li><span class="label">Property Type</span><span class="value">Office</span></li>                <li><span class="label">Bedrooms</span><span class="value"></span></li>                <li><span class="label">Bathrooms</span><span class="value"></span></li>                <li><span class="label">Occupancy Status</span><span class="value"></span></li>                <li><span class="label">Lot Size (acres)</span><span class="value"></span></li>                <li><span class="label">Previously Valued To</span><span class="value">N/A</span></li>            </ul>            <div class="clear"></div>            <div class="propertyImageBox">                <img class="propertyImage" src="http://cdn.mlhdocs.com/rcp_files/auctions/B-133/photos/thumbnails/ForestHill-1_bigThumb.jpg" />            </div>                    </div>    *********></div>

Here is the code that I have written so far; the Property Address (without hyperlink though), Item #, and Starting Bid all pull correctly.

Code:
Sub Commercial_Filter()

ScreenUpdating = False


Set IE = CreateObject("InternetExplorer.Application")
Dim ele As Object




'Dim links As Variant, lnk As Variant
Set sht = Sheets("Daily Log")
RowCount = 1
sht.Range("A" & RowCount) = "Property"
sht.Range("B" & RowCount) = "Item #:"
sht.Range("C" & RowCount) = "Start Date"
sht.Range("D" & RowCount) = "Property Type"
sht.Range("E" & RowCount) = "Sq Feet"
sht.Range("F" & RowCount) = "Start Date"
sht.Range("G" & RowCount) = "Starting Bid"


'<-----------Jump to Commercial Search Results for Retail & Office----------->
With IE
.Visible = False
.Navigate "http://www.auction.com/search?property_type=Retail+Commercial%2COffice+Commercial&limit=48&auction_type=commercial&sort=auction_end_date+asc"
Do While .Busy Or .readyState <> 4
      DoEvents
    Loop




'<-----------Pull Property Address for Each Entry----------->
For Each ele In .document.all
      Select Case ele.className
        Case "contentDetail searchResult"
          RowCount = RowCount + 1
        Case "searchResultAddress"
          sht.Range("A" & RowCount) = ele.innertext
        End Select
      Next
'<-----------Pull Item # for Each Entry----------->
RowCount = 1
For Each ele In .document.all
    Select Case ele.className
        Case "results-section"
        Case "layoutSingle left ultra-wide"
        Case "searchResultGalleryData"
        Case "highlightedData"
          RowCount = RowCount + 1
          sht.Range("B" & RowCount) = ele.innertext
      End Select
        Next


'<-----------Pull Starting Price for Each Entry----------->
RowCount = 1


For Each ele In .document.all
    Select Case ele.className
        Case "results-section"
        Case "contentDetail searchResult"
        Case "layoutSingle right"
        Case "highlightLarge"
          RowCount = RowCount + 1
          sht.Range("H" & RowCount) = ele.innertext
          sht.Range("G" & RowCount).FormulaR1C1 = "=REPLACE(RC[1],FIND(,RC[1]),14,"""")"
          'This is to crudely attempt to remove the text "Starting Bid: " that embeds in each cell
      End Select
        Next
    
    Columns("G:G").Select
    Selection.Copy
    Columns("G:G").Select
    Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
        :=False, Transpose:=False
    Columns("H:H").Select
    Application.CutCopyMode = False
    Selection.Delete Shift:=xlToLeft




End With



ScreenUpdating = True


End Sub

Does anyone know a more efficient way to scrape this data? I keep seeing answers involving jQuery, which I have no experience with. I would really appreciate some guidance. Thanks!
 

Excel Facts

Fastest way to copy a worksheet?
Hold down the Ctrl key while dragging tab for Sheet1 to the right. Excel will make a copy of the worksheet.

Forum statistics

Threads
1,214,866
Messages
6,121,996
Members
449,060
Latest member
mtsheetz

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top