Save the source code of a webpage as a text file

rutger

Hello all,

I have a question: is it possible to save the source code of a webpage as a text file using VBA?
Basically, I want to get the same result as when I right-click in a website and choose to view the source code.
I am able to save a webpage as an HTML file, but when I save it as text it messes up the whole format.

I can't seem to figure this out.

Thanks in advance,

Greetz,
Rutger
 

Hi Rutger

Text and HTML files are basically the same: an HTML file is just a plain text file with a different extension, there's nothing special about it. So you could save the file with a .txt extension in the first place, or rename the existing HTML file so that it ends in .txt. I don't really understand why you'd want to save it as .txt, though?
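As a rough illustration of that point, here is a minimal sketch that copies an already saved HTML file to a .txt file. Both paths are hypothetical and only there for the example; FileCopy copies the contents unchanged, so only the extension differs.

```vba
Sub SaveHtmlCopyAsText()
    ' Hypothetical paths, used only for illustration.
    Const SOURCE_FILE As String = "C:\SavedPage.html"
    Const TARGET_FILE As String = "C:\SavedPage.txt"

    ' Only copy if the source file actually exists.
    If Dir(SOURCE_FILE) <> "" Then
        ' FileCopy copies the bytes unchanged; only the extension differs.
        FileCopy SOURCE_FILE, TARGET_FILE
    End If
End Sub
```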

Richard
 
Hey there,
What I'm trying to achieve is this:
I want to automatically check whether a specific website has changed. So basically I need to compare two versions of the source code line by line to see if there is a difference.

If there is an easier way to do this, that would be great.
Is there a way to read the Last Modified date from a website's source code?

Greetz,

Rutger
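On the Last Modified question, one possible approach is to ask the server rather than look in the source code. This is only a sketch under assumptions: the function name GetLastModified is made up for the example, and it assumes the server sends a Last-Modified HTTP header at all. Many dynamically generated pages do not, in which case the result is expected to come back empty.

```vba
Function GetLastModified(ByVal url As String) As String
    Dim http As Object

    ' MSXML2.XMLHTTP ships with Windows; late binding avoids a reference.
    Set http = CreateObject("MSXML2.XMLHTTP")

    ' HEAD asks for the response headers only, not the page body.
    http.Open "HEAD", url, False
    http.send

    ' The header may be missing if the server does not send it.
    GetLastModified = http.getResponseHeader("Last-Modified")
End Function
```

It could be called from the Immediate window with something like `Debug.Print GetLastModified("http://www.nu.nl/")`.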
 
Rutger

You can automate IE from VBA.

When you do that you can directly access the HTML object model.

That model has various (lots actually!) elements/methods/properties/collections etc., one of which is body, a property of the document object.

body also has a property called innerHTML, which contains the HTML for the body of the page.

If you can post either the URL you are interested in or a similar one, I can post specific code.
 
Rutger

This code will save the HTML of the body of that webpage to a text file.
Code:
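Sub Test()
    Dim ie As Object, doc As Object
    Dim fileNum As Integer

    Set ie = CreateObject("InternetExplorer.Application")
    With ie
        .Visible = True
        .navigate "http://www.nu.nl/"
        ' Wait until the page has finished loading (4 = READYSTATE_COMPLETE)
        Do While .Busy Or .readyState <> 4: DoEvents: Loop

        Set doc = .document

        ' Write the HTML of the <body> element to a text file
        fileNum = FreeFile
        Open "C:\TestHTML.txt" For Output As #fileNum
            Print #fileNum, doc.body.innerHTML
        Close #fileNum
    End With

End Sub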
By the way I'm not sure how/if you can do what you want.
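For the comparison part, one possible starting point is to read two saved copies side by side and stop at the first line that differs. This is a minimal sketch, not a tested solution: the function name FilesDiffer and the idea of keeping an older copy on disk are assumptions for the example. Bear in mind that pages with rotating ads or timestamps will probably show up as changed on every run.

```vba
Function FilesDiffer(ByVal pathA As String, ByVal pathB As String) As Boolean
    Dim fa As Integer, fb As Integer
    Dim lineA As String, lineB As String

    fa = FreeFile
    Open pathA For Input As #fa
    fb = FreeFile
    Open pathB For Input As #fb

    ' Walk through both files one line at a time.
    Do Until EOF(fa) Or EOF(fb)
        Line Input #fa, lineA
        Line Input #fb, lineB
        If lineA <> lineB Then
            FilesDiffer = True
            Exit Do
        End If
    Loop

    ' One file ending before the other also counts as a difference.
    If Not FilesDiffer Then FilesDiffer = (EOF(fa) <> EOF(fb))

    Close #fa
    Close #fb
End Function
```

For example, `FilesDiffer("C:\TestHTML_old.txt", "C:\TestHTML.txt")` would return True as soon as one line is not identical (the first file name is hypothetical).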
 
Thanks, this does exactly what I was looking for. Now it's up to me to figure out the rest!
I'll post here if I manage to get this working!

Thanks again,

Greetz,

Rutger
 
Rutger

Good luck.:)

I think you'll need it.:)

Parsing HTML code, which might be what you are considering, has always looked too daunting a task to contemplate.

Perhaps if you explained exactly what you want to do we could give you some other ideas.

Like I said, by automating IE via VBA you have access to the complete HTML object model.

That includes forms, tables, buttons etc.
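To give an idea of what that looks like, here is a small sketch that counts a few of those elements on the page used earlier. The procedure name ListPageElements is made up for the example, and the counts depend entirely on the page at the time it is loaded.

```vba
Sub ListPageElements()
    Dim ie As Object, doc As Object

    Set ie = CreateObject("InternetExplorer.Application")
    ie.navigate "http://www.nu.nl/"

    ' Wait until the page has finished loading (4 = READYSTATE_COMPLETE)
    Do While ie.Busy Or ie.readyState <> 4: DoEvents: Loop

    Set doc = ie.document

    ' Collections exposed by the HTML object model
    Debug.Print "Forms:  "; doc.forms.Length
    Debug.Print "Tables: "; doc.getElementsByTagName("table").Length
    Debug.Print "Links:  "; doc.links.Length

    ie.Quit
End Sub
```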
 
Hey guys,

Thanks for all the help so far. I have another question, though.
The script that Norie wrote does write part of the page's HTML to a text file, but not all of it. When I right-click in the page and choose to view the source code, I see far more HTML than in my text file.

Is there a simple way to fix this?

Thanks in advance,

Rutger
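A likely reason for the difference: body.innerHTML only returns what is inside the <body> element, so the doctype and the <head> section are left out, and IE hands back its own parsed version of the markup rather than the raw text. One possible way around this is to request the page directly over HTTP instead of going through IE. This is a sketch under assumptions: the procedure name and the output path C:\TestSource.txt are made up for the example, and it only shows what the server sends, not anything that scripts add to the page afterwards.

```vba
Sub SaveFullSource()
    Dim http As Object
    Dim fileNum As Integer

    ' MSXML2.XMLHTTP ships with Windows; late binding avoids a reference.
    Set http = CreateObject("MSXML2.XMLHTTP")

    ' Synchronous GET request (False = wait for the response)
    http.Open "GET", "http://www.nu.nl/", False
    http.send

    ' Only write the file if the request succeeded.
    If http.Status = 200 Then
        fileNum = FreeFile
        Open "C:\TestSource.txt" For Output As #fileNum
            Print #fileNum, http.responseText
        Close #fileNum
    End If
End Sub
```

If staying with IE automation is preferred, doc.documentElement.outerHTML in place of doc.body.innerHTML should return the whole <html> element, including <head>, although still in IE's parsed form.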
 
