Check links to see if PDF docs exist on Internet

sheepdemon

New Member
Joined
Nov 30, 2022
Messages
17
Office Version
  1. 365
Platform
  1. Windows
Hi,

I have an Excel sheet with a list of .pdf doc links online in it - these are placed into an online location by a third party supplier. I'd like a macro which can check these links and advise if the PDF documents exist online. Complicating this a bit is you don't just get a 404 if the doc isn't there, there is a custom 404 page which resolves and gives an error message about the file not existing.

I think using the HTML Head tag is likely what I want to do but I'm pretty new to this, any guidance or pointers on this would be appreciated :)
 

Excel Facts

Copy formula down without changing references
If you have =SUM(F2:F49) in F50; type Alt+' in F51 to copy =SUM(F2:F49) to F51, leaving the formula in edit mode. Change SUM to COUNT.
Welcome to MrExcel forums.

Here is a macro you can try. It expects the URL to be in cell A1 on the active sheet and returns the result in B1. Once proved to be working it can be adapted to loop through multiple cells on the sheet.

VBA Code:
Public Sub Test_File_Exists()
    Dim fileExists As Boolean
    fileExists = URL_File_Exists(Range("A1").Value)
    If fileExists Then
        Range("B1").Value = "Found"
    Else
        Range("B1").Value = "Not found"
    End If
End Sub


Public Function URL_File_Exists(URL As String) As Boolean
    With CreateObject("MSXML2.XMLHTTP")
        .Open "HEAD", URL, False
        .send
        Debug.Print .Status, .statusText
        Debug.Print .getAllResponseHeaders
        Debug.Print .responseText
        If .Status = 200 Then
            URL_File_Exists = True
        Else
            URL_File_Exists = (InStr(1, .responseText, "Error the file doesn't exist", vbTextCompare) = 0)
        End If
    End With
End Function
I have assumed that the XMLhttp request doesn't return a 404 Status if the URL doesn't exist, and instead look for a string in the 404 page. You will have to change the "Error the file doesn't exist" string to (part of) the actual custom error message on the 404 page.
 
Upvote 0
Welcome to MrExcel forums.

Here is a macro you can try. It expects the URL to be in cell A1 on the active sheet and returns the result in B1. Once proved to be working it can be adapted to loop through multiple cells on the sheet.

VBA Code:
Public Sub Test_File_Exists()
    Dim fileExists As Boolean
    fileExists = URL_File_Exists(Range("A1").Value)
    If fileExists Then
        Range("B1").Value = "Found"
    Else
        Range("B1").Value = "Not found"
    End If
End Sub


Public Function URL_File_Exists(URL As String) As Boolean
    With CreateObject("MSXML2.XMLHTTP")
        .Open "HEAD", URL, False
        .send
        Debug.Print .Status, .statusText
        Debug.Print .getAllResponseHeaders
        Debug.Print .responseText
        If .Status = 200 Then
            URL_File_Exists = True
        Else
            URL_File_Exists = (InStr(1, .responseText, "Error the file doesn't exist", vbTextCompare) = 0)
        End If
    End With
End Function
I have assumed that the XMLhttp request doesn't return a 404 Status if the URL doesn't exist, and instead look for a string in the 404 page. You will have to change the "Error the file doesn't exist" string to (part of) the actual custom error message on the 404 page.
Hi,

Sorry to dig this back up - this project went on the back burner for a while. Let me know etiquitte here, happy to start a new thread if that is preferred. I can't get that code to work ,turns out the status returned from the webpage is just a 404 though from checking the developer console :

1673955135800.png


I assume this changes how the code would work, what with it being the direct opposite of what I originally claimed.
 
Upvote 0
I think you can continue in this thread. Is the URL a direct URL to the file, i.e. if you paste the URL in the address bar of a browser does it try to open/download the file, without redirecting to another URL?
 
Upvote 0
Hi John,

Yes, exactly. The URL is to a PDF/XLS document, and if the URL resolves the file downloads. If it does not you get presented with a happy joy error message:

1673975536728.png


However checking the developer console as above shows that this still returns a status of 404.
 
Upvote 0
Change the function to send a GET request instead of HEAD (though this will be slower because GET downloads the file instead of just the headers) and look for "Document not found!" in the response.
VBA Code:
Public Function URL_File_Exists(URL As String) As Boolean
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", URL, False
        .send
        Debug.Print .Status, .statusText
        Debug.Print .getAllResponseHeaders
        Debug.Print .responseText
        URL_File_Exists = (InStr(1, .responseText, "Document not found!", vbTextCompare) = 0)
    End With
End Function
 
Upvote 0

Forum statistics

Threads
1,214,979
Messages
6,122,552
Members
449,088
Latest member
davidcom

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top