Bug 142356

Summary: [accessibility] filter save as HTML places caption inside image, export to XHTML drops the caption; neither is good AT support
Product: LibreOffice
Reporter: Stéphane Guillou (stragu) <stephane.guillou>
Component: Writer
Assignee: Not Assigned <libreoffice-bugs>
Status: NEW
Severity: normal
CC: c_strobbe-fdo, nagharshita16, stephane.guillou, vsfoote
Priority: medium
Keywords: accessibility
Version: 6.2.5.2 release   
Hardware: All   
OS: All   
Whiteboard:
Crash report or crash signature: Regression By:
Bug Depends on:    
Bug Blocks: 101912, 108799    
Attachments: ODT document with image and caption
example document saved as HTML
image with caption saved alongside HTML file

Description Stéphane Guillou (stragu) 2021-05-18 14:11:12 UTC
Description:
In this document, the caption of a picture is exported as part of the picture itself, and is therefore not readable by a screen reader for sight-impaired users.
This is a very concerning accessibility shortcoming.

Steps to Reproduce:
1. Open attached ODT (same document as in bug 109334)
2. File > Save as > HTML

Actual Results:
The resulting HTML file contains a picture with the caption integrated into it. No screen reader can pick this up; an OCR tool would be needed instead.

Expected Results:
The caption is exported as a proper caption, using for example the <figcaption> tag, as described here: https://www.w3schools.com/TAGS/tag_figcaption.asp


Reproducible: Always


User Profile Reset: No



Additional Info:
Interestingly, this does not happen with a freshly created document. Also, an XHTML export has the caption as text.

Version: 7.2.0.0.alpha0+ / LibreOffice Community
Build ID: 6b09276d157abada74e1a4989700139167207778
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
TinderBox: Linux-rpm_deb-x86_64@86-TDF, Branch:master, Time: 2021-05-14_04:32:30
Calc: threaded
Comment 1 Harshita Nag 2021-05-19 03:03:18 UTC
Can't reproduce this. 
Version: 7.0.4.2
Build ID: dcf040e67528d9187c66b2379df5ea4407429775
CPU threads: 8; OS: Linux 5.3; UI render: default; VCL: gtk3
Locale: en-IN (en_IN); UI: en-US
Calc: threaded
Comment 2 Stéphane Guillou (stragu) 2021-05-19 05:09:57 UTC
Created attachment 172158 [details]
ODT document with image and caption

Forgot to attach the problematic document.

Open, save as HTML, and see the GIF file created along with the HTML document.

Confirmed with 7.0.4.2, 7.1.3 and 7.2 alpha0+
Comment 3 Stéphane Guillou (stragu) 2021-05-19 05:11:45 UTC
Created attachment 172159 [details]
example document saved as HTML
Comment 4 Stéphane Guillou (stragu) 2021-05-19 05:12:21 UTC
Created attachment 172160 [details]
image with caption saved alongside HTML file
Comment 5 V Stuart Foote 2021-05-25 12:49:27 UTC
Does the XSL based XHTML filter do better with an Export to XHTML? Or is that a problem as well?
Comment 6 Stéphane Guillou (stragu) 2021-05-25 13:17:25 UTC
Hi Stuart

By "XSL-based XHTML filter", do you mean the one used when exporting as XHTML with "File > Export as... > XHTML"? If so, I said in the description that XHTML export saves the caption as text.

Or do you mean a different filter?

Also, do you confirm the behaviour when saving as HTML?
Comment 7 V Stuart Foote 2021-05-25 14:53:53 UTC
(In reply to stragu from comment #6)
> Hi Stuart
> 
> By "XSL-based XHTML filter", do you mean the one used when exporting as
> XHTML with "File > Export as... > XHTML"? If so, I said in the description
> that XHTML export saves the caption as text.
> 

Yes sorry, I missed your 
"Additional Info:
Interestingly, this does not happen with a freshly created document. Also, an XHTML export has the caption as text." as I skimmed the ticket.

> 
> Also, do you confirm the behaviour when saving as HTML?

Yes, the ancient HTML filter simply converts the image frame and its caption to a GIF. So the caption "Illustration 1: my caption text" is embedded into the bitmap and not available to AT.

For XSL based 'Export' to XHTML, just the image is embedded as PNG in base64. The frame caption is completely dropped.  And the alternative text is picked up as normal text (not linked to the image). So an AT fail there as well.
Comment 8 Stéphane Guillou (stragu) 2021-07-08 06:11:03 UTC
Reproduced in 6.2.5 as well.

Version: 6.2.5.2
Build ID: 1ec314fa52f458adc18c4f025c545a4e8b22c159
CPU threads: 4; OS: Linux 5.4; UI render: default; VCL: gtk3; 
Locale: en-AU (en_AU.UTF-8); UI-Language: en-US
Calc: threaded
Comment 9 Christophe Strobbe 2022-05-16 17:27:04 UTC
I confirm that this bug applies to LibreOffice 7.1.4.2:

Version: 7.1.4.2 / LibreOffice Community
Build ID: 10(Build:2)
CPU threads: 8; OS: Linux 5.5; UI render: default; VCL: kf5
Locale: en-GB (en_GB.utf8); UI: en-GB
Calc: threaded

The difficult part is deciding how to deal with this, i.e. what HTML code should be exported.

For ODF images that have a caption, it makes sense to generate the following code with the "Save as HTML" function (some code exported by LibreOffice omitted for the sake of simplicity):

<figure>
  <img src="..." alt="..." />
  <figcaption>Illustration 1: my Caption Text</figcaption>
</figure>

If the caption is above the image in the ODF file, the figcaption element should end up above the img element in the HTML code.
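
A minimal sketch of that caption-above case, using the same placeholder attribute values and caption text as the example above (HTML allows figcaption as either the first or the last child of figure, so this is only one way the export could look, not what any filter currently produces):

```html
<!-- Caption placed above the image in the ODF file:
     figcaption becomes the first child of figure. -->
<figure>
  <figcaption>Illustration 1: my Caption Text</figcaption>
  <img src="..." alt="..." />
</figure>
```

In both orders the caption stays real text and is tied to the image by the enclosing figure element, so AT can announce it.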

It is less obvious what to do with the XHTML export, beyond simply adding a span or a p element below the img element. The figure and figcaption elements were first introduced in HTML5, whereas LibreOffice's "Export to XHTML" function exports to XHTML 1.1 + MathML 2.0, which is an outdated specification. XHTML 1.1 was superseded in 2018: https://www.w3.org/TR/xhtml11/ .

The XHTML code might end up looking as follows:

<p class="Illustration"><img alt="..." src=" ..." /></p>
<p class="caption">Illustration ... </p>

Note:
1. The content of the alt attribute (the contents of ODF's <svg:title> element) should not be repeated after the img element (which is a different bug).
2. I have replaced <div class="Illustration"> with a p element, which is more appropriate. The value of the class attribute may depend on the caption category in the ODF file.
Comment 10 QA Administrators 2024-05-16 03:16:25 UTC
Dear stragu,

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.
 
If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)


If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from https://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword


Feel free to come ask questions or to say hello in our QA chat: https://web.libera.chat/?settings=#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug