import UTF-8 HTML complete source codes of a HTTP webpage via vba

tourist

New Member
Joined
Oct 26, 2016
Messages
1
I have to import several HTTP webpage's source codes into separate Excel worksheets. I found this great code by searching the forum, which does exactly what I need:

Code:
Sub ImportHTMLSource()
Dim FileName As String
Dim FileNum As Long
Dim Sh As Worksheet
FileName = "D:\Source.txt"
FileNum = FreeFile
Open FileName For Output As FileNum
Print #FileNum, GetSource(Range("A1"))
Close FileNum
Set Sh = Worksheets.Add
With Sh.QueryTables.Add(Connection:="TEXT;D:\Source.txt", Destination:=Range("A2"))
.Name = "Source"
.AdjustColumnWidth = True
.TextFileParseType = xlFixedWidth
.TextFileTextQualifier = xlTextQualifierDoubleQuote
.TextFileColumnDataTypes = Array(2)
.Refresh BackgroundQuery:=False
End With
End Sub
Function GetSource(sURL As String) As String
Dim oXHTTP As Object
Set oXHTTP = CreateObject("MSXML2.XMLHTTP")
oXHTTP.Open "GET", sURL, False
oXHTTP.send
GetSource = oXHTTP.responsetext
Set oXHTTP = Nothing
End Function

The code is great for English websites but when it comes to other languages like Arabic, it fails and shows Arabic characters as some question marks like "????" while I do have language pack and fonts installed. I tried changing "windows regional and language" to Arabic but it didn't help. also changed the "Source.txt" to "Source.html" (the file that is saved and imported via the code) didn't change the results. It seems a SIMPLE task but is it possible to import full UTF8 html source of a non-English webpage into an Excel worksheet (given that the )? what am I doing wrong?
I have windows 7 and Excel 2013. Thanks in advance.
 

Excel Facts

Lock one reference in a formula
Need 1 part of a formula to always point to the same range? use $ signs: $V$2:$Z$99 will always point to V2:Z99, even after copying
Welcome to MrExcel.

Try replacing the GetSource = line with:
Code:
    GetSource = StrConv(oXHTTP.responseBody, vbUnicode)
 
Upvote 0
I have to import several HTTP webpage's source codes into separate Excel worksheets. I found this great code by searching the forum, which does exactly what I need:

Code:
Sub ImportHTMLSource()
Dim FileName As String
Dim FileNum As Long
Dim Sh As Worksheet
FileName = "D:\Source.txt"
FileNum = FreeFile
Open FileName For Output As FileNum
Print #FileNum, GetSource(Range("A1"))
Close FileNum
Set Sh = Worksheets.Add
With Sh.QueryTables.Add(Connection:="TEXT;D:\Source.txt", Destination:=Range("A2"))
.Name = "Source"
.AdjustColumnWidth = True
.TextFileParseType = xlFixedWidth
.TextFileTextQualifier = xlTextQualifierDoubleQuote
.TextFileColumnDataTypes = Array(2)
.Refresh BackgroundQuery:=False
End With
End Sub
Function GetSource(sURL As String) As String
Dim oXHTTP As Object
Set oXHTTP = CreateObject("MSXML2.XMLHTTP")
oXHTTP.Open "GET", sURL, False
oXHTTP.send
GetSource = oXHTTP.responsetext
Set oXHTTP = Nothing
End Function

The code is great for English websites but when it comes to other languages like Arabic, it fails and shows Arabic characters as some question marks like "????" while I do have language pack and fonts installed. I tried changing "windows regional and language" to Arabic but it didn't help. also changed the "Source.txt" to "Source.html" (the file that is saved and imported via the code) didn't change the results. It seems a SIMPLE task but is it possible to import full UTF8 html source of a non-English webpage into an Excel worksheet (given that the )? what am I doing wrong?
I have windows 7 and Excel 2013. Thanks in advance.
I have just completed such a task. All it is is transposing utf 8 into regular text. UTF8 has 65000+ CHARACTERS in blocks. Basically you need to translate that code page range to Windows Ansi or whatever. Look for my post "Translate UTF-8". Ye VBA code is very simple and effective at doing this. Whatever the language used, it will translate into legible characters.
 
Upvote 0

Forum statistics

Threads
1,215,633
Messages
6,125,922
Members
449,274
Latest member
mrcsbenson

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top