getting links from InternetExplorer.Application

leeple

New Member
Joined
Feb 16, 2011
Messages
5
Hi, I have a have been using VBA for under a year, never posted anything, but I am having a lot of trouble with my function being non-deterministic. I need to check web pages, look through their links (put them into an array), and check if another webpage is one of those links. Here is the important part of my code as it stands

Private Sub TestLink(ColumnNumSite As Integer, RowNum As Integer, ColumnNum As Integer)
Dim lnk
Dim MyNames() As String ' declares a dynamic array variable
Dim iCount As Integer
Dim Max As Integer
Dim ie As InternetExplorer
Dim lngLoc As Long

Set ie = CreateObject("InternetExplorer.Application")
With ie
'.Visible = True
.Navigate Cells(RowNum, ColumnNumSite)
Do Until .ReadyState = 4: DoEvents: Loop

Set doc = ie.Document
Max = doc.Links.length
ReDim MyNames(1 To Max + 2)
iCount = 1
For Each lnk In doc.Links
' finds the maximum array size
'For iCount = 1 To Max
MyNames(iCount) = lnk
'ActiveSheet.Cells(RowNum + iCount - 1, 6 + iCount) = lnk
iCount = iCount + 1
'MyNames(iCount) = ThisWorkbook.Names(iCount)
Next
lngLoc = -9
On Error Resume Next
lngLoc = Application.WorksheetFunction.Match(ActiveSheet.Cells(RowNum, ColumnNum).Formula(), MyNames(), 0)

ActiveSheet.Cells(RowNum, ColumnNumSite + 5).Value() = lngLoc
If (lngLoc = -9) Then
ActiveSheet.Cells(RowNum, ColumnNumSite + 4).Value() = "Link Does not Exist"
Else: ActiveSheet.Cells(RowNum, ColumnNumSite + 4).Value() = "Link Exists"
End If
.Quit
End With
End Sub

Right now whats happening is max gets set to a very small value ocassionally (around 3 or 28), but if I pause it and rerun that line it will recieve its full normal value. This is not the entire problem because I tried to
ReDim MyNames(1 To 1000)
instead of to Max, and it broke anyways, the breaking occurs because
For Each lnk In doc.Links
does not go through all the links sometimes, which I know because when I pause it afterwards and rerun that area I get the correct solution. Does anyone know whats going on? All I need is to set an array to the links of a webpage
 

Excel Facts

Bring active cell back into view
Start at A1 and select to A9999 while writing a formula, you can't see A1 anymore. Press Ctrl+Backspace to bring active cell into view.
Hello leeple,

Object collections in Internet Explorer are not the same as in VBA and as you found out the For Each Next construct doesn't work. Two other things to keep in mind are these collections are zero based and the Length property is always +1 greater than the objects the collection holds.

Try this version of your macro out and let me know how it works.
Code:
Private Sub TestLink(ColumnNumSite As Integer, RowNum As Integer, ColumnNum As Integer)

  Dim docLinks As Object
  Dim lnk As Object
  Dim MyNames() As String ' declares a dynamic array variable
  Dim I As Integer
  Dim iCount As Integer
  Dim Max As Integer
  Dim ie As InternetExplorer
  Dim lngLoc As Long

    Set ie = CreateObject("InternetExplorer.Application")
    
    With ie
    
     '.Visible = True
      .Navigate Cells(RowNum, ColumnNumSite)
      While .Busy And .ReadyState <> 4: DoEvents: Wend

      Set doc = ie.Document
      
    'Get the Link Objects
      Set docLinks = doc.Anchors
      
      'Check document has Links
        If Not docLinks Is Nothing Then
         'Save each URL to MyNames array
           For I = 0 To docLinks.Length - 1
             'Resize the array keeping the previous values and store the URL
               ReDim Preserve MyNames(I)
               MyNames(I) = docLinks(I).href
           Next I
        End If

      lngLoc = -9
        
      On Error Resume Next
      lngLoc = Application.WorksheetFunction.Match(ActiveSheet.Cells(RowNum, ColumnNum).Formula(), MyNames(), 0)

      ActiveSheet.Cells(RowNum, ColumnNumSite + 5).Value() = lngLoc
        
      If (lngLoc = -9) Then
         ActiveSheet.Cells(RowNum, ColumnNumSite + 4).Value() = "Link Does not Exist"
      Else
         ActiveSheet.Cells(RowNum, ColumnNumSite + 4).Value() = "Link Exists"
      End If
        
      .Quit
      
    End With
    
End Sub
Sincerely,
Leith Ross
 
Last edited:
Upvote 0
Hi,

Thanks for the quick response, sorry it has taken me so long (I'm an intern at work, still at school, and had a midterm this morning).

I am very grateful for your suggestion but currently the code is not functional. By this I mean websites that link to other websites are evaluating to say that they do not contain a link to the letter site.

Links.length - 1 evaluates to 40-50 a lot of times when there are about 250-400 links on the web page

Can I ask a couple questions: what are anchors and what is an anchor.href supposed to yield? Is there a library or api for the internet explorer functions like this within internet explorer? Also, is there a way to force excel to flush memory such that previous ie objects cannot possibly interfere with future ones?

Would it help if I posted the methods calling this function so you could test if youre not exactly sure about functionality?
 
Upvote 0
Hello leeple,

Midterms, that brings back flashbacks, I mean memories. Hope you did well.

If you could post the code with the site you want to link to, it would be helpful.

Sincerely,
Leith Ross
 
Upvote 0
'''sidenote: I took up this project started by someone else in my office and have not messed with worksheetchange or other functions that I do not understand. I feel that a lot of the code written is not pretty but the progression towards the link checking is clear and easy to understand. I apologize for the entire thing not looking pretty but I am pretty sure that it will not hurt my code.
BREAKDOWN:
checklinks is run, it opens CheckLinksForm,
Select 2 Columns to show that you are interested in relating two columns and checking for links between. This will call RelateColumns
Relate Columns will ask you for the range of interest(top left and bottom left values), and will output information on them.
Most of my data strings are too long to want to post, so heres a small one that it fails on from one of our competitors
Column A: http://shop.cafepress.com/twilight
Column B: http://www.cafepress.com/Eclipse.455073430
This should return that the first link works, the 2nd link yields a 404 error, and there is a link from the first link to the second link on their website, but it returns that no such link exists. If you check the source code of the website the link is definitely there, so this is a problem.

Sorry for the long post, but this is what I think it takes to help you run the program to test on your end.

Thanks a bunch for helping

'''''''THIS IS ALL WITHIN A MODULE TITLED RelateColumns

EDIT: removed code per request - Moderator
 
Last edited by a moderator:
Upvote 0
Hello leeple,

Would it be possible for you to either post the workbook or send me a copy? There is simply too much code here for me to spend time trying to make a working test copy.

Sincerely,
Leith Ross
 
Upvote 0
yes of course. I did not know that posting a workbook was an option, I thought it was somehow taboo. Could you send me your email address? I cannot find the "attach file" button.
 
Upvote 0
Hello leeple,

I will send you my email address via private message.

Sincerely,
Leith Ross
 
Upvote 0

Forum statistics

Threads
1,213,497
Messages
6,113,998
Members
448,541
Latest member
iparraguirre89

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top