Bug 151826 - PAC Checker fails with PDF containing TOC
Summary: PAC Checker fails with PDF containing TOC
Status: RESOLVED DUPLICATE of bug 155228
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Printing and PDF export
Version (earliest affected): 7.5.0.0 alpha0+
Hardware: All All
Importance: medium normal
Assignee: Gabor Kelemen (allotropia)
URL:
Whiteboard:
Keywords: accessibility, bibisected, regression
Depends on:
Blocks: PDF-Accessibility
Reported: 2022-10-30 12:23 UTC by Gabor Kelemen (allotropia)
Modified: 2023-07-05 09:06 UTC

See Also:
Crash report or crash signature:


Attachments
Example file from Writer (13.45 KB, application/vnd.oasis.opendocument.text)
2022-10-30 12:23 UTC, Gabor Kelemen (allotropia)
The example file converted to accessible PDF (42.92 KB, application/pdf)
2022-10-30 12:24 UTC, Gabor Kelemen (allotropia)
The exception in PAC tool and the original doc in Writer (77.57 KB, image/png)
2022-10-30 12:24 UTC, Gabor Kelemen (allotropia)
Screenshot at acept to export (69.40 KB, image/png)
2022-10-30 15:12 UTC, m_a_riosv
Another example file, with simple hyperlinks (16.42 KB, application/vnd.oasis.opendocument.text)
2022-11-21 23:06 UTC, Gabor Kelemen (allotropia)
The example file with simple hyperlinks converted to PDF with PDF/UA (51.98 KB, application/pdf)
2022-11-21 23:13 UTC, Gabor Kelemen (allotropia)
Contacted the PAC project (65.06 KB, image/png)
2022-12-12 12:32 UTC, Gabor Kelemen (allotropia)

Description Gabor Kelemen (allotropia) 2022-10-30 12:23:40 UTC
Created attachment 183340 [details]
Example file from Writer

The attached simple document contains a few paragraphs formatted as headings and a TOC with default settings.

When this is exported to PDF with PDF/UA enabled, the resulting file causes an exception in the PAC validator tool. It also stops checking the document at the page where the TOC is.

1. Open the attached document in Writer.
2. Export to PDF with PDF/UA enabled.
3. Open the resulting file in PAC checker: https://pdfua.foundation/en/pdf-accessibility-checker-pac

It will stop with an unhandled exception: "Object cannot be converted from type xxxx to type yyyy".
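
The export step can probably also be scripted instead of going through the dialog. What follows is a minimal sketch, not something used in this report: it assumes a LibreOffice version that accepts JSON filter options on the command line (7.4 or later), that the `PDFUACompliance` export option is available, and that `soffice` is on the PATH. The helper names and the file names in the usage note are hypothetical.

```python
import json
import subprocess

# Assumption: the Writer PDF export filter accepts the boolean option
# "PDFUACompliance" in the JSON form used by --convert-to.
PDFUA_OPTIONS = {"PDFUACompliance": {"type": "boolean", "value": "true"}}


def build_export_command(document, outdir, soffice="soffice"):
    """Build the argument list for a headless PDF/UA export of `document`."""
    filter_spec = "pdf:writer_pdf_Export:" + json.dumps(
        PDFUA_OPTIONS, separators=(",", ":")
    )
    return [
        soffice,
        "--headless",
        "--convert-to",
        filter_spec,
        "--outdir",
        outdir,
        document,
    ]


def export_pdfua(document, outdir):
    """Run the export; raises CalledProcessError if soffice exits non-zero."""
    subprocess.run(build_export_command(document, outdir), check=True)
```

Calling `export_pdfua("example.odt", "out")` should then write `out/example.pdf`, which can be opened in PAC as in step 3. Building the argument list separately from running it keeps the command easy to inspect when the export does not behave as expected.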

Version: 7.5.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: bdb76c9ff1832041fa7a9bda30e8d4d7d937ff94
CPU threads: 14; OS: Windows 10.0 Build 19044; UI render: Skia/Raster; VCL: win
Locale: de-DE (hu_HU); UI: en-US
Calc: threaded

An 11-day-old bibisect-7.5 build does not produce this error:

Version: 7.5.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: f3a82a8ba51195cf31b0f78164735acc7ebbcd2f
CPU threads: 14; OS: Windows 10.0 Build 19044; UI render: Skia/Raster; VCL: win
Locale: hu-HU (hu_HU); UI: en-US
Calc: threaded

Likely a side effect from bug 148934.
Comment 1 Gabor Kelemen (allotropia) 2022-10-30 12:24:12 UTC
Created attachment 183341 [details]
The example file converted to accessible PDF
Comment 2 Gabor Kelemen (allotropia) 2022-10-30 12:24:45 UTC
Created attachment 183342 [details]
The exception in PAC tool and the original doc in Writer
Comment 3 m_a_riosv 2022-10-30 15:12:07 UTC
Created attachment 183343 [details]
Screenshot at acept to export

I get some accessibility messages after clicking Export; see the attached screenshot.
So if you export anyway, it seems expected that the checker reports issues.
Tested with:
Version: 7.5.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 9cd0f4c2d25462feba0ffcbd906c199273821243
CPU threads: 4; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: es-ES (es_ES); UI: en-US
Calc: CL threaded Jumbo
Comment 5 Michael Stahl (allotropia) 2022-11-15 19:25:03 UTC
uh... i can't reproduce this, if i open the attachment https://bugs.documentfoundation.org/attachment.cgi?id=183341 i don't get any exception

i have PAC 3 version 3.0.7.0

it reports a validation error about the title missing but no exception or crash dialog.
Comment 6 Gabor Kelemen (allotropia) 2022-11-21 23:05:19 UTC
Bibisected with the linux-7.5 repo; this is indeed a side effect (on the external tool) of bug 148934.
Comment 7 Gabor Kelemen (allotropia) 2022-11-21 23:06:55 UTC
Created attachment 183710 [details]
Another example file, with simple hyperlinks

Bug 140617 is about the case of simple hyperlinks; with PAC v2021 I now cannot tell whether those are fixed or not.
Comment 8 Gabor Kelemen (allotropia) 2022-11-21 23:13:38 UTC
Created attachment 183711 [details]
The example file with simple hyperlinks converted to PDF with PDF/UA

PAC fails on this, but https://demo.verapdf.org/ lets it pass with the result:

Validation Profile:	PDF/UA-1 validation profile
Compliance: 	Passed
Statistics
Version:	
Parser:	GreenField
Build Date:	
Processing time:	00:00:00.010
Total rules in Profile:	91
Passed Checks:	393
Failed Checks:	0
Comment 9 bugzilla 2022-11-29 15:59:22 UTC
I'm using LibreOffice 7.4.2.3(x64) - as of today the latest release.

If you create a document with a TOC and export it as PDF/UA, two things fail with PAC 21.0.0 (the latest as of today).

1.) TOC: for every line it fails with "Anmerkung ohne alternative Beschreibung" (annotation without alternative description)

2.) On the TOC page: "Tab-Reihenfolge einer Seite mit Anmerkungen ist nicht auf "Struktur" gesetzt" (tab order of a page with annotations is not set to "Structure")


I do agree that repeating the heading text as the link description seems wrong; some ARIA checking tools even consider this wrong for HTML documents.
Nevertheless, I do have real-life examples where setting the alternative description does make sense and would actually be helpful for users. In my case these documents are generated via the API, so the possibility to set the description without the LibreOffice UI supporting it would be helpful, e.g.:
..
 xEntry.setPropertyValue ( "AlternativeText", "Big dogs! Falling on my head!" );
 xEntry.setPropertyValue ( "Description", "Attention: following this link could hurt" );


VeraPDF fails with the same document like this:

Specification: ISO 19005-3:2012, Clause: 6.6.2.3, Test number: 7	
All properties specified in XMP form shall use either the predefined schemas defined in the XMP Specification, ISO 19005-1 or this part of ISO 19005, or any extension schemas that comply with 6.6.2.3.2.	Failed
1 occurrences
Comment 10 Gabor Kelemen (allotropia) 2022-12-12 12:32:43 UTC
Created attachment 184108 [details]
Contacted the PAC project

So since this is perhaps not our bug (verapdf does not complain about new files), I contacted the upstream PAC project via their contact form, as seen on this image.
I'll come back when they answer.
Comment 11 bugzilla 2022-12-13 10:14:04 UTC
Thank you for contacting PAC - I think this should carry more "weight" than it would have if I had contacted them about this (to be honest, I did not come up with that idea).

Anyway, there was a second part in my comment 9: although treating missing titles as an error may be a bug on the PAC side, there are circumstances where titles do make sense, and adding this feature would be beneficial.

And I'm really curious about the PAC team's answer.
Comment 12 Gabor Kelemen (allotropia) 2023-07-04 20:30:10 UTC
This issue seems to be solved since the following commit:

https://git.libreoffice.org/core/+/0025190152a35e18c7847e91ad171df339657910

author	Michael Stahl <michael.stahl@allotropia.de>	Fri May 12 11:47:24 2023 +0200
committer	Michael Stahl <michael.stahl@allotropia.de>	Fri May 12 13:45:38 2023 +0200

tdf#155228 vcl: PDF export: /Tabs needs PDF name, not string

*** This bug has been marked as a duplicate of bug 155228 ***
Comment 13 Michael Stahl (allotropia) 2023-07-05 09:06:40 UTC
oh interesting; so there *was* actually a problem with the PDF produced by LO, it was just impossible to see from the gibberish error message what the problem was.
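
Going by the commit title above, the difference is in how the page dictionary's /Tabs entry is written: as a PDF name (`/Tabs /S`) or as a string (something like `/Tabs (S)`). A minimal sketch for checking an exported file could look like this. It is not part of the fix; the function name is hypothetical, and it assumes the page dictionaries are stored uncompressed, since it only scans the raw bytes and will miss entries inside object streams.

```python
import re

# A /Tabs key followed by a PDF name (/S), a literal string ((S)),
# or a hex string (<53>). Only the first form is a PDF name.
TABS_ENTRY = re.compile(rb"/Tabs\s*(/[A-Za-z]+|\([^)]*\)|<[0-9A-Fa-f\s]*>)")


def classify_tabs_entries(pdf_bytes):
    """Return a (kind, raw_value) pair for every /Tabs entry in the raw bytes.

    kind is "name" when the value is a PDF name, otherwise "string".
    """
    results = []
    for match in TABS_ENTRY.finditer(pdf_bytes):
        raw = match.group(1).decode("latin-1")
        kind = "name" if raw.startswith("/") else "string"
        results.append((kind, raw))
    return results
```

To use it, read the exported PDF in binary mode and pass the bytes to `classify_tabs_entries`; any entry reported as "string" would be a candidate for what tripped up PAC. A proper PDF parser would be more robust, but a byte scan keeps the sketch free of dependencies.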