Workbooks.OpenText doesn't work with UTF-8, ActiveSheet.QueryTables.Add does?!

kriddy

New Member
Joined
May 11, 2011
Messages
5
I have this csv file which is UTF-8 encoded. It contains special characters like ë, ï, etc. To import it into Excel, I use the following code, where Origin:=65001 should ensure that Excel reads it as an UTF-8 encoded file:

Workbooks.OpenText Comma:=True, DataType:=xlDelimited, Filename:=ActiveWorkbook.Path & "\test.csv", Origin:=65001

It doesn't work however and keeps showing the wrong characters (ë instead of ë for example). I tried other encodings and Excel is influenced by it (with Origin:=1 ë becomes √´), so no typing error.

What goes wrong here?
 

Excel Facts

Quick Sum
Select a range of cells. The total appears in bottom right of Excel screen. Right-click total to add Max, Min, Count, Average.
I also tried a completely different approach, using the Text Import Wizard and changing the recorded code:

With ActiveSheet.QueryTables.Add(Connection:= _
"TEXT;" & ActiveWorkbook.Path & "\test.csv" _
, Destination:=Range("$A$1"))
.Name = "test"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = False
.RefreshOnFileOpen = False
.RefreshStyle = xlOverwriteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = False
.RefreshPeriod = 0
.TextFilePromptOnRefresh = False
.TextFilePlatform = 65001
.TextFileStartRow = 1
.TextFileParseType = xlDelimited
.TextFileTextQualifier = xlTextQualifierDoubleQuote
.TextFileConsecutiveDelimiter = False
.TextFileTabDelimiter = False
.TextFileSemicolonDelimiter = False
.TextFileCommaDelimiter = True
.TextFileSpaceDelimiter = False
.TextFileColumnDataTypes = Array(2, 2, 2)
.TextFileTrailingMinusNumbers = True
.Refresh BackgroundQuery:=False
End With
Cells.QueryTable.Delete

Encoding works (TextFilePlatform = 65001). Multiline texts don't however:
"Line 1
Line 2"
shows up in two lines. With OpenText it stays in one cell.
 
Upvote 0
It seems that Excel doesn't support UTF-8. See http://www.myintervals.com/forum/di...v-data-exports-with-utf8-more-excel-friendly/, which says:

MS Excel has trouble opening CSV files encoded in UTF8. Characters outside the range of ASCII characters get garbled into nonsense. Excel does have support for Unicode, it's spotty. Here are some of the issues we encountered when trying to make CSV output more friendly to MS Excel:

1. To get Unicode to display properly in Office (including Excel), UTF-8 has to be converted to UTF-16LE (little endian).

2. Once the output is converted to UTF-16LE, Excel doesn't pay attention to the column definitions. To get around this, you have to use the tab ("\t") as the value separator rather than a comma, effectively changing the CSV file into a TSV file.

3. Using TSV format, the columns are lined up correctly, but multi-line content within a single cell isn't regarded, and the content flows across multiple rows in Excel.
The above is exactly what I found with Excel 2003, with the exception that the TSV file has to be in UTF-16LE BOM (Byte Order Mark) format.
 
Upvote 0
Thanks for your reply! Excel somehow does support UTF-8: if I use the Text Import Wizard in Excel 2007 or Excel 2010, everything goes fine.

The site http://www.the-art-of-web.com/php/dataexport/#section_6 also mentiones that that Excel doesn't handle UTF-8 characters very well.

I'll have to look into it and find out how I can export in UTF-16LE using phpexcel...
 
Upvote 0

Forum statistics

Threads
1,214,979
Messages
6,122,552
Members
449,088
Latest member
davidcom

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top