How to get all links and sublinks from a website

Tegenfeldt

New Member
Joined
Aug 6, 2013
Messages
13
I want to step through all links and sublinks in a website to get information from some of the pages. The pages are in 7 levels, so I use nestled For.
I am able to get the first link on the top level page. I also can navigate to that link and sublinks. But when a For loop is done I don't get back to the previous level.

This is the code I use:

Code:
    Set Browser = New InternetExplorer
    Application.Wait (Now + TimeValue("0:00:01"))
    Browser.Visible = True
    Browser.navigate "http://www.purus.se/sv/Produkter/"
    While Browser.Busy Or Browser.readyState <> READYSTATE_COMPLETE
        DoEvents
        Information.status = "Väntar på Internet Explorer..."
    Wend
    Set Document = Browser.Document
    Set htmlColl = Document.getElementsByTagName("a")
    Do While Document.readyState <> "complete": DoEvents: Loop
    
    For Each htmlInput In htmlColl
        Debug.Print htmlInput
        Browser.navigate htmlInput.href
        DoEvents
        Do While Document.readyState <> "complete": DoEvents: Loop
        Set Document2 = Browser.Document
        Set htmlColl2 = Browser.Document.getElementsByTagName("a")
        
        For Each htmlInput2 In htmlColl2
            If InStr(htmlInput2.innerHTML, "Till produktdatabasen") > 0 Then
                Debug.Print htmlInput2
                Browser.navigate htmlInput2.href
                DoEvents
                Do While Document.readyState <> "complete": DoEvents: Loop
                Set Document3 = Browser.Document
                Set htmlColl3 = Document3.getElementsByTagName("iframe")
                
                For Each htmlInput3 In htmlColl3
                    If htmlInput3.nodeName = "IFRAME" Then
                        Debug.Print htmlInput3.src
                        Browser.navigate htmlInput3.src ' Till Golvbrunnar <-------
                        DoEvents
                        Do While Document.readyState <> "complete": DoEvents: Loop
                        Application.Wait (Now + TimeValue("0:00:01"))
                        Set Document4 = Browser.Document.frames("Main").Document
                        Set htmlColl4 = Document4.getElementsByTagName("a")
                        For Each htmlInput4 In htmlColl4
                            Debug.Print htmlInput4
                            htmlInput4.Focus
                            htmlInput4.Click ' Till Golvbrunnar - Plast <-------
                            DoEvents
                            Do While Document.readyState <> "complete": DoEvents: Loop
                            Application.Wait (Now + TimeValue("0:00:01"))
                            Set Document5 = Browser.Document.frames("Main").Document
                            Set htmlColl5 = Document5.getElementsByTagName("a")
                            For Each htmlInput5 In htmlColl5
                                Debug.Print htmlInput5
                                htmlInput5.Focus
                                htmlInput5.Click ' Till Golvbrunnar - Plast <-------
                                DoEvents
                                Do While Document.readyState <> "complete": DoEvents: Loop
                                Application.Wait (Now + TimeValue("0:00:01"))
                                Set Document6 = Browser.Document.frames("Main").Document
                                Set htmlColl6 = Document6.getElementsByTagName("a")
                                For Each htmlInput6 In htmlColl6
                                    Debug.Print htmlInput6
                                    If InStr(LCase(htmlInput6.href), "dbdocid") > 0 Then
                                        Browser.navigate htmlInput6.href ' Till Golvbrunnar
                                        DoEvents
                                        Do While Document.readyState <> "complete": DoEvents: Loop
                                        Application.Wait (Now + TimeValue("0:00:01"))
                                        Set Document7 = Browser.Document
                                        Set htmlColl7 = Document7.getElementsByTagName("a")
                                        Set td = Document7.getElementsByTagName("td")
                                        i = 0
                                        Debug.Print Document7.title
                                        TheTD = "" ' Nothing
                                        'Produktnamn --------   Mini Max 40 Spygatt
                                        Do While i < td.Length
                                            If td(i).className = "DocumentHeader" Then
                                                '.Visible = True
                                                Produktnamn = td(i).innerText
                                                Debug.Print Produktnamn
                                                Exit Do
                                            End If
                                            i = i + 1
                                        Loop
                                        'RSKnr och Produktinfo ----                                                                         i = 0                                        Do While i < td.Length    
                                            If td(i).className = "Documenttext" And td(i).innerText <> "" Then
                                                TheTD = td(i).innerText
                                                RskNr = ""
                                                If InStr(TheTD, "RSK") <> 0 Then
                                                    RskNr = Mid(TheTD, InStr(TheTD, "RSK") + 7)
                                                    RskNr = Left(RskNr, 10)
                                                    RskNr = Replace(RskNr, " ", "")
                                                End If
                                                Debug.Print RskNr
                                                Produktinfo = Mid(TheTD, InStr(TheTD, "Material"))
                                                Produktinfo = Trim(Produktinfo)
                                                Debug.Print Produktinfo
                                                Exit Do
                                            End If
                                            i = i + 1
                                        Loop
                                        ' Generera PDF ---------------
                                        i = 0
                                        Debug.Print Document7.title
                                        Set TheLink = Nothing
                                        'And htmlColl7(i).className = "ddmenu"
                                        Do While i < htmlColl7.Length
                                            If htmlColl7(i).className = "PDFLink" Then
                                                Set TheLink = htmlColl7(i)
                                                Debug.Print TheLink.href
                                                TheLink.Focus
                                                TheLink.Click
                                                DoEvents
                                                VäljFönster
                                                DoEvents
                                                Debug.Print DocID
                                                '.Visible = True
                                                DoEvents
                                                strURL = "http://core.purus.se/PurusWeb/Navigate/PdfQueue.aspx?&FORCE=9720&Add=Product&Catalog=Product&DocId=" & _
                                                    DocID & "&Style=SimplePdf&Hires=&Level=&Language="
                                                strFileName = "Purus" & DocID & ".pdf"
                                                strSaveFile = strSavePath & strFileName
                                                LaddaNerEnFil strURL, strSaveFile
                                                Exit Do
                                            End If
                                            i = i + 1
                                        Loop
                                    End If
                                Next htmlInput6
                            Next htmlInput5
                        Next htmlInput4
                    End If
                Next htmlInput3
                Exit For
            End If
        Next htmlInput2
    Next htmlInput

Maybe someone has code that can do this
 

Some videos you may like

Excel Facts

Convert text numbers to real numbers
Select a column containing text numbers. Press Alt+D E F to quickly convert text to numbers. Faster than "Convert to Number"

Watch MrExcel Video

Forum statistics

Threads
1,127,550
Messages
5,625,450
Members
416,107
Latest member
AVaes

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Top