Importing web page text as the browser displays it (not the source HTML) into a sheet

Oliver Dewar

Board Regular
Joined
Apr 17, 2011
Messages
201
Hi All.

For an Excel app I'm building I need to (effectively) open a URL in Internet Explorer, select all, and paste onto a sheet... or the VBA equivalent. However, I'm not after the source code... I'm after the text that the website visitor would see in their browser. If images etc. end up coming as well that's fine, I'll strip them out after.

I have the code to navigate to the page and that's working fine.

I can't find a way to copy the text on a web page as the browser displays it and then paste that onto a sheet. I've found many ways to grab the source code... but that's not what I need in this specific instance.

Can anyone help?

For the record, I can easily paste the HTML using the .document.body.innerHTML property. So I tried copying the source HTML and pasting as Unicode text, but sometimes it still comes across as HTML.

I tried SendKeys to select all, copy, then paste... but it didn't seem to work at all. It just doesn't seem to get onto the clipboard.

It doesn't matter if the text ends up in random columns and rows when it pastes as I have a macro to handle that... I just need a helping hand to get the rendered version of the URL text as opposed to the source HTML. Alternatively, I guess a way to paste the source code, or handle the source code so that you end up with the final, readable text only, would work too.

(I hope that makes sense... just swear at me and call me names if not and I'll rephrase).

Thanks everyone.
 


Rando31

New Member
Joined
Dec 14, 2014
Messages
5
Hello, good question!
I would like to do the same, but to paste the tables and data copied from a web page opened in Chrome (to retrieve the data and order it properly).
So I would be interested in some tips, although there might be some differences when piloting Chrome from VBA?
Thanks
 

Oliver Dewar

Yeah, Rando, I think what you're looking for is going to be an entirely different topic mate. Piloting a browser other than Microsoft's built-in IE is not straightforward, if it's even doable at all. But certainly a different thread.

Anyone have any suggestions about my original question?
 

ukmikeb

Well-known Member
Joined
Jul 10, 2009
Messages
2,757
Hi Oliver
These few lines of code :-
Code:
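        ' 17 = OLECMDID_SELECTALL: select everything on the rendered page
        IE.ExecWB 17, 0
        Do Until IE.ReadyState = 4: DoEvents: Loop
        ' 12 = OLECMDID_COPY, 2 = OLECMDEXECOPT_DONTPROMPTUSER: copy the selection
        IE.ExecWB 12, 2
        ' Paste the clipboard onto the active sheet without the HTML formatting
        ActiveSheet.PasteSpecial Format:="HTML", link:=False, DisplayAsIcon:=False, NoHTMLFormatting:=True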
will copy a web page into a blank sheet, IE being the Internet Explorer object.
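
For anyone who wants the snippet above as a complete macro, here is a rough sketch of how it might be wrapped up. The Sub name PasteRenderedPage, the pageUrl parameter and the choice to paste into a newly added sheet are assumptions for illustration only; the constants are just named versions of the numbers passed to ExecWB.

```vba
Option Explicit

' Named equivalents of the numeric arguments passed to ExecWB
Private Const OLECMDID_COPY As Long = 12
Private Const OLECMDID_SELECTALL As Long = 17
Private Const OLECMDEXECOPT_DODEFAULT As Long = 0
Private Const OLECMDEXECOPT_DONTPROMPTUSER As Long = 2
Private Const READYSTATE_COMPLETE As Long = 4

' Hypothetical wrapper: the Sub name and pageUrl parameter are illustrative
Public Sub PasteRenderedPage(ByVal pageUrl As String)
    Dim IE As Object
    ' Late binding, so no reference to Microsoft Internet Controls is needed
    Set IE = CreateObject("InternetExplorer.Application")

    IE.Visible = True
    IE.Navigate pageUrl

    ' Wait until the page has finished loading
    Do While IE.Busy Or IE.ReadyState <> READYSTATE_COMPLETE
        DoEvents
    Loop

    ' Select all, then copy the rendered page to the clipboard
    IE.ExecWB OLECMDID_SELECTALL, OLECMDEXECOPT_DODEFAULT
    IE.ExecWB OLECMDID_COPY, OLECMDEXECOPT_DONTPROMPTUSER

    ' Assumption: paste into a fresh sheet so nothing existing is overwritten
    Worksheets.Add
    ActiveSheet.PasteSpecial Format:="HTML", Link:=False, _
                             DisplayAsIcon:=False, NoHTMLFormatting:=True

    IE.Quit
    Set IE = Nothing
End Sub
```

It could then be called with something like PasteRenderedPage "http://www.example.com" (a placeholder address). Bear in mind it overwrites whatever is on the clipboard at the time.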

If you are after specific data within the page it might be easier to extract just that data rather than going through the process of searching for it and removing the data you are not interested in.

If you would like to share the URL and specify the area of the web page you are interested in it may be possible to show you alternative methods of extracting the data.

hth
 

Norie

Well-known Member
Joined
Apr 28, 2004
Messages
75,564
Office Version
365
Platform
Windows
If you just want the data from the page you could always try innerText/outerText.
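
As a rough sketch of the innerText route, which avoids the clipboard altogether: the helper below assumes IE is an Internet Explorer object that has already finished loading the page. The name WriteInnerText and the one-line-per-row layout in column A are assumptions for illustration, not something from the posts above.

```vba
Option Explicit

' Hypothetical helper: writes the visible text of a loaded page to column A
Public Sub WriteInnerText(ByVal IE As Object, ByVal target As Worksheet)
    Dim pageText As String
    Dim textLines() As String
    Dim i As Long

    ' innerText returns the text of the element without the markup
    pageText = IE.document.body.innerText

    ' Normalise line endings, then split into one entry per line
    pageText = Replace(pageText, vbCrLf, vbLf)
    pageText = Replace(pageText, vbCr, vbLf)
    textLines = Split(pageText, vbLf)

    For i = LBound(textLines) To UBound(textLines)
        ' The leading apostrophe makes Excel store the entry as text,
        ' so a line starting with "=" is not evaluated as a formula
        target.Cells(i + 1, 1).Value = "'" & textLines(i)
    Next i
End Sub
```

Usage would be along the lines of WriteInnerText IE, ActiveSheet once the page has loaded. The result is plain text only, so any table layout that a paste would preserve is lost.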
 

Oliver Dewar

Thanks Guys.

UKMikeB... you nailed it mate. That's fantastic. Working like a charm.

This is for a crawler so it needs to work for any kind of website and your solution achieves this. Of course, it's best to avoid the clipboard where possible... but in this case I think it might be the best 'one-size-fits-all' solution.
 
