Bug 94887 - Add import support for "Webarchive" files
Summary: Add import support for "Webarchive" files
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): unspecified
Hardware: Other All
Importance: lowest enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-10-08 14:42 UTC by Bastien Nocera
Modified: 2022-03-14 10:57 UTC

See Also:
Crash report or crash signature:


Attachments
webarchive example (36.04 KB, application/octet-stream)
2015-10-08 16:03 UTC, Bastien Nocera
converted XML file (51.68 KB, text/plain)
2015-10-08 16:03 UTC, Bastien Nocera

Description Bastien Nocera 2015-10-08 14:42:16 UTC
Webarchives[1] are a way to save HTML and auxiliary data, such as images, in a single bundle, similar to the open MHTML format[2].

Adding support to LibreOffice would mean that users could open and print those documents.

[1]: https://en.wikipedia.org/wiki/Webarchive
[2]: https://en.wikipedia.org/wiki/MHTML
Comment 1 Michael Stahl (allotropia) 2015-10-08 15:48:48 UTC
wikipedia says, "The webarchive format is a concatenation of source files with filenames saved in the binary plist format using NSKeyedArchiver."

that sounds pretty crazily platform-specific, and it's only supported by a niche web browser (Safari).

back in the day, Konqueror had these ".war" files, which is the same idea but as a sane plain standard PK-ZIP file.
Comment 2 Bastien Nocera 2015-10-08 15:52:50 UTC
(In reply to Michael Stahl from comment #1)
> wikipedia says, "The webarchive format is a concatenation of source files
> with filenames saved in the binary plist format using NSKeyedArchiver."
> 
> that sounds pretty crazily platform-specific, and it's only supported by a
> niche web browser (Safari).

It's not niche on OS X. It's certainly less niche than Numbers, Pages or Keynote support (which has a lesser percentage of OSX users).

> back in the day, Konqueror had these ".war" files, which is the same idea
> but as a sane plain standard PK-ZIP file.

Right. Except that the problem isn't creating but consuming those files.
Comment 3 Bastien Nocera 2015-10-08 16:03:21 UTC
(In reply to Bastien Nocera from comment #2)
> (In reply to Michael Stahl from comment #1)
> > wikipedia says, "The webarchive format is a concatenation of source files
> > with filenames saved in the binary plist format using NSKeyedArchiver."
> > 
> > that sounds pretty crazily platform-specific, and it's only supported by a
> > niche web browser (Safari).
> 
> It's not niche on OS X. It's certainly less niche than Numbers, Pages or
> Keynote support (which has a lesser percentage of OSX users).

I'm pretty sure we already have binary plist support in LO for those formats as well. So, given the example .webarchive file I'm about to attach, using the plistutil application available on Linux:
$ plistutil -i _iCloud.webarchive -o _iCloud.xml
$ file _iCloud.xml 
_iCloud.xml: XML document text

The XML file looks like it has the various parts in base64 format.
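
As a rough illustration of the same inspection without plistutil, here is a minimal Python sketch using the standard plistlib module, which reads binary and XML plists alike and returns data fields as raw bytes (the base64 only appears in the XML rendering). The key names WebMainResource, WebSubresources, WebResourceURL, WebResourceMIMEType and WebResourceData are an assumption about the layout of Safari's files and have not been checked against the attachment; the sample archive is synthetic.

```python
import plistlib


def list_webarchive_resources(data: bytes):
    """Return (url, mime_type, payload) for each resource in a webarchive blob."""
    # plistlib.loads auto-detects binary vs. XML plists.
    archive = plistlib.loads(data)
    # Assumed layout: one main resource plus an optional list of subresources.
    resources = [archive["WebMainResource"]] + list(archive.get("WebSubresources", []))
    return [
        (
            res.get("WebResourceURL", ""),
            res.get("WebResourceMIMEType", "application/octet-stream"),
            res.get("WebResourceData", b""),
        )
        for res in resources
    ]


# Synthetic stand-in for a real .webarchive file (hypothetical content).
sample = plistlib.dumps(
    {
        "WebMainResource": {
            "WebResourceURL": "http://example.com/",
            "WebResourceMIMEType": "text/html",
            "WebResourceData": b"<p>hello</p>",
        },
        "WebSubresources": [
            {
                "WebResourceURL": "http://example.com/logo.png",
                "WebResourceMIMEType": "image/png",
                "WebResourceData": b"\x89PNG",
            }
        ],
    },
    fmt=plistlib.FMT_BINARY,
)

for url, mime, payload in list_webarchive_resources(sample):
    print(url, mime, len(payload), "bytes")
```

For a real file, the bytes would come from open("_iCloud.webarchive", "rb").read() instead of the synthetic sample.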

If MHTML or .war is supported, support for webarchive isn't far off.
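
To make the "isn't far off" point concrete, here is a hedged Python sketch that repacks a list of already-extracted resources into an MHTML-style multipart/related message using the standard email package. The resources_to_mhtml helper and the sample resources are hypothetical, and the output is a bare-bones approximation of MHTML (no From/Subject headers, no type parameter); whether any given importer accepts it is untested.

```python
import email
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart


def resources_to_mhtml(resources):
    """Pack (url, mime_type, payload) tuples into a multipart/related message.

    The first tuple is assumed to be the main HTML document.
    """
    root = MIMEMultipart("related")
    for url, mime_type, payload in resources:
        maintype, _, subtype = mime_type.partition("/")
        part = MIMEBase(maintype or "application", subtype or "octet-stream")
        part.set_payload(payload)
        # Base64-encode the body and set Content-Transfer-Encoding.
        encoders.encode_base64(part)
        # Content-Location ties each part back to its original URL.
        part["Content-Location"] = url
        root.attach(part)
    return root.as_bytes()


# Hypothetical resources, e.g. as extracted from a webarchive plist.
resources = [
    ("http://example.com/", "text/html", b"<p>hello</p>"),
    ("http://example.com/logo.png", "image/png", b"\x89PNG"),
]
mhtml = resources_to_mhtml(resources)

# Round-trip check: parse the message back and list its parts.
parsed = email.message_from_bytes(mhtml)
for part in parsed.get_payload():
    print(part["Content-Location"], part.get_content_type())
```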
Comment 4 Bastien Nocera 2015-10-08 16:03:39 UTC
Created attachment 119426
webarchive example
Comment 5 Bastien Nocera 2015-10-08 16:03:55 UTC
Created attachment 119427
converted XML file
Comment 6 Michael Stahl (allotropia) 2015-10-09 09:56:23 UTC
well it looks like a potential DLP feature, maybe David can tell us if this would fit into libetonyek...
Comment 7 Robinson Tryon (qubit) 2016-01-19 04:39:19 UTC
(In reply to Bastien Nocera from comment #2)
> > back in the day, Konqueror had these ".war" files, which is the same idea
> > but as a sane plain standard PK-ZIP file.
> 
> Right. Except that the problem isn't creating but consuming those files.

Updating Summary to reflect that this is a request for /import/ support.

(In reply to Michael Stahl from comment #6)
> well it looks like a potential DLP feature, maybe David can tell us if this
> would fit into libetonyek...

Importance is Lowest/Enhancement and this has been in UNCONFIRMED for a couple of months, so I'll toss it into NEW for now. We can move the bug to the DLP project or otherwise re-evaluate in the future.