Bug 88701

Summary: FILEOPEN: Problem with import of HTML/ MHTML files with extension XLS on Windows
Product: LibreOffice
Reporter: Ajay Pal Singh Atwal <ajaypal>
Component: Calc
Assignee: Not Assigned <libreoffice-bugs>
Status: RESOLVED DUPLICATE    
Severity: normal
CC: frederic.parrenin, ilmari.lauhakangas, philipz85
Priority: medium    
Version: Inherited From OOo   
Hardware: All   
OS: All   
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=73682
https://bugs.freedesktop.org/show_bug.cgi?id=49639
https://bugs.documentfoundation.org/show_bug.cgi?id=94887
Whiteboard:
Crash report or crash signature:
Regression By:
Bug Depends on:    
Bug Blocks: 111951    
Attachments: HTML file with MIME header with XLS extension
File in LibreOffice 3.3.0 on Mac OSX
File in LibreOffice 5.2.1.4 on Mac OSX
Printscreen from MS Office 2016

Description Ajay Pal Singh Atwal 2015-01-22 12:23:56 UTC
Created attachment 112661 [details]
HTML file with MIME header with XLS extension

A particular brand of ERP software is used for generating reports in our organizations. Its web-based interface has an "Export to Excel" button, which downloads the report as an XLS spreadsheet.
The XLS file is actually an HTML file with an XLS extension.

The file contains a MIME header followed by HTML tags for tables etc. An example file is attached.

When opened on Ubuntu GNU/Linux with LibreOffice 4.2.7.2, this file can be imported as a Calc table. A minor annoyance is the MIME header at the top of the file.

On Windows (64-bit Windows 7; I have tried both 4.3 and 4.4 RC1), the import dialog displays the file contents as HTML text and prompts to import it as a tab- or space-delimited text file, rendering the imported data unusable.

After renaming the file to .html, it can be imported as a Calc table, with the same minor annoyance of the MIME header on top.

There were other bug reports about broken HTML saved as XLS; this one has MIME headers before the start of the HTML.
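
To illustrate the mismatch between extension and content, here is a minimal Python sketch of content-based detection. It is not how LibreOffice's filter detection is implemented; the function name and the returned labels are hypothetical, and the heuristics are assumptions based only on the file layout described in this report (a MIME header followed by HTML). The only hard fact used is the 8-byte signature that starts a real binary .xls (OLE2 compound) file.

```python
# Signature at the start of a genuine binary .xls (OLE2 compound file).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_spreadsheet_kind(head: bytes) -> str:
    """Classify a file from its first bytes, ignoring the extension.

    Hypothetical helper: returns "xls-binary", "mhtml", "html" or "unknown".
    """
    if head.startswith(OLE2_MAGIC):
        return "xls-binary"
    # Drop a UTF-8 byte order mark if present.
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]
    # latin-1 never fails to decode, which is enough for keyword sniffing.
    text = head.decode("latin-1").lstrip().lower()
    # Assumption: an MHTML export starts with MIME headers.
    if text.startswith("mime-version:") or "content-type: multipart/related" in text:
        return "mhtml"
    if text.startswith(("<!doctype html", "<html")) or "<table" in text:
        return "html"
    return "unknown"
```

Called as `sniff_spreadsheet_kind(open(path, "rb").read(1024))`, a check like this would let a caller route the file to an HTML import path even though its name ends in .xls.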
Comment 1 Buovjaga 2015-03-06 12:28:49 UTC
Tries to import as CSV both in Linux & Windows.

Win 7 Pro 64-bit, LibO Version: 4.4.1.2
Build ID: 45e2de17089c24a1fa810c8f975a7171ba4cd432
Locale: fi_FI

Ubuntu 14.10 64-bit 
Version: 4.4.1.2
Build ID: 40m0(Build:2)
Locale: en_US
Comment 2 Yousuf Philips (jay) (retired) 2015-03-06 12:56:26 UTC
Hello Ajay,

As Excel 2003 and 2013 are able to import it correctly by stripping away the MIME information, it would be good for Calc to do the same.
Comment 3 Ajay Pal Singh Atwal 2015-03-08 21:39:10 UTC
In case it helps, the ERP software from which such files can be exported is SAP EP/BI.
Comment 4 Ajay Pal Singh Atwal 2015-05-15 06:16:26 UTC
The file is in MHTML format, not HTML.
See: http://en.wikipedia.org/wiki/MHTML

Someone from our organisation asked the ERP vendor's support, and this is their terse response:
-----------
Libre Office is not supported by SAP
-----------
The support person was able to find ****only one note**** about LibreOffice, and so assumed that it is not being used and hence not supported. (Rolling Eyes)

In another related communication, SAP notes 1517552 and 1178858 were referenced.

Relevant section of note 1517552 reproduced below:
-------------------------------------------------
During the 'Export to Excel' and 'Export to Excel 2000' functions, the file generated is internally an MHTML file, while the file extension is 'set' to .xls during the export


Relevant section of note 1178858 reproduced below:
-------------------------------------------------
The export to Excel function is supported as of Excel 2003. It generates an XHTML file in the Multi Mime format. This means that Mimes (for example, icons and screens) are stored in the file.


Also note that from MS Excel 2007 onwards, a warning is displayed for such files saying that the file is not in the correct format, but Excel is still able to import MHTML files.
See: https://support.microsoft.com/en-us/kb/948615#top
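
As a possible workaround sketch, the HTML table part can be pulled out of the MHTML container with Python's standard `email` package, since MHTML is a MIME multipart message. This assumes the exported file is well-formed MIME that the parser accepts, which has not been verified against the attached sample; the function name `extract_html_part` is hypothetical.

```python
from email import message_from_bytes, policy
from typing import Optional


def extract_html_part(raw: bytes) -> Optional[str]:
    """Return the first text/html body found in a MIME/MHTML message.

    Hypothetical helper: returns None when no text/html part exists.
    """
    msg = message_from_bytes(raw, policy=policy.default)
    # walk() yields the message itself and then every nested part.
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            # get_content() decodes the transfer encoding and charset.
            return part.get_content()
    return None
```

Writing the returned string to a file with an .html extension should correspond to the rename workaround from the description, minus the MIME header at the top.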
Comment 5 Ajay Pal Singh Atwal 2015-05-15 06:46:04 UTC
This also seems relevant
https://bz.apache.org/ooo/show_bug.cgi?id=101436
Comment 6 QA Administrators 2016-09-20 09:42:11 UTC Comment hidden (obsolete)
Comment 7 Ajay Pal Singh Atwal 2016-09-20 11:47:50 UTC
Created attachment 127459 [details]
File in LibreOffice 3.3.0 on Mac OSX

This is how the file looks when opened in LibreOffice 3.3.0 on Mac OSX 10.11.6
Comment 8 Ajay Pal Singh Atwal 2016-09-20 11:56:05 UTC
Created attachment 127460 [details]
File in LibreOffice 5.2.1.4 on Mac OSX

File when opened in LibreOffice 5.2.1.4 on Mac OSX
Comment 9 Buovjaga 2017-11-10 13:41:14 UTC
*** Bug 106856 has been marked as a duplicate of this bug. ***
Comment 10 QA Administrators 2018-11-11 03:46:57 UTC Comment hidden (obsolete)
Comment 11 Svatopluk Vít 2021-03-01 09:29:23 UTC
Created attachment 170144 [details]
Printscreen from MS Office 2016

This is a printscreen of the file imported into MSO 2016
Comment 12 Svatopluk Vít 2021-03-01 09:30:00 UTC
The bug is still present.

Version: 7.1.1.1 (x64) / LibreOffice Community
Build ID: 575c5867c4cc13d7ae78f9ce39a54a52ed38c769
CPU threads: 4; OS: Windows 10.0 Build 19042; UI render: Skia/Vulkan; VCL: win
Locale: cs-CZ (cs_CZ); UI: cs-CZ
Calc: threaded
Comment 13 Mike Kaganski 2022-03-14 10:58:56 UTC

*** This bug has been marked as a duplicate of bug 83601 ***