Bug 131321 - FILEOPEN: ms word docx with styles that include bulleting / numbering loses bullet / number part of style in Writer
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.4.0.3 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Justin L
URL:
Whiteboard: target:7.0.0
Keywords: bibisected, bisected, filter:docx, filter:rtf, regression
Depends on:
Blocks: DOCX-Styles DOCX-Bullet-Number-Outline-Lists
Reported: 2020-03-13 06:01 UTC by Michelle
Modified: 2020-12-08 08:26 UTC

See Also:
Crash report or crash signature:


Attachments
Simple example file which causes the bug (13.19 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2020-03-13 06:02 UTC, Michelle
printscreen from LO 7 (54.02 KB, image/png)
2020-03-13 20:59 UTC, raal
Comparison of styles between original MSO docx and LO docx (78.83 KB, image/png)
2020-03-16 02:53 UTC, Michelle
tdf131321_paraStyleNumbering.odt: export to DOCX is fine - just missing import (12.06 KB, application/vnd.oasis.opendocument.text)
2020-04-10 16:12 UTC, Justin L
Example compared MSO LO (122.58 KB, image/png)
2020-05-08 10:57 UTC, Timur

Description Michelle 2020-03-13 06:01:38 UTC
Description:
DOCX files created in MS Word (2019) that include styles with bulleting/numbering as part of the style lose the bullet/number part of the style when opened in Writer.

Steps to Reproduce:
1. Create new Docx in MS Word
2. Define a style which includes a Bullet/Number
3. Open in Writer
4. Open styles pane and apply style


Actual Results:
No bullet/number is applied as part of the style. Style definition is missing the bullet/number part.
If you then save as a new Docx file from Writer and open in MS Word the bullet/number is also missing from the style.

Expected Results:
Bullet/number should be retained as part of the style.


Reproducible: Always


User Profile Reset: Yes



Additional Info:
Version: 6.2.8.2 (x64)
Build ID: f82ddfca21ebc1e222a662a32b25c0c9d20169ee
CPU threads: 16; OS: Windows 10.0; UI render: GL; VCL: win; 
Locale: en-AU (en_AU); UI-Language: en-GB
Calc: CL

Have also tried latest release with same results:
Version: 6.4.1.2 (x64)
Build ID: 4d224e95b98b138af42a64d84056446d09082932
CPU threads: 16; OS: Windows 10.0 Build 18363; UI render: GL; VCL: win; 
Locale: en-AU (en_AU); UI-Language: en-GB
Calc: CL
Comment 1 Michelle 2020-03-13 06:02:41 UTC
Created attachment 158661 [details]
Simple example file which causes the bug
Comment 2 Michelle 2020-03-13 06:09:59 UTC
RTF files are similarly affected. However, doc files are not.
Comment 3 raal 2020-03-13 20:59:19 UTC
Created attachment 158672 [details]
printscreen from LO 7

Looks good to me with Version: 7.0.0.0.alpha0+
Build ID: a11c10a83f6fceae6cfb519725d06f8eaf1013fb
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; 

Please compare with Word; if it's still wrong, attach a screenshot from Word.
Comment 4 Michelle 2020-03-16 02:53:12 UTC
Created attachment 158705 [details]
Comparison of styles between original MSO docx and LO docx

See attached image showing the comparison between the original .docx file created in MS Word and the output .docx after converting from docx -> odt -> docx (Using latest build of LO7).

The content is unaffected. However, in the Styles section you can see that MyBulletStyle and MyNumberStyle have lost the bullet/number part of the style during the conversion.
Comment 6 Dieter 2020-03-26 10:15:06 UTC
(In reply to Michelle from comment #0)
> Actual Results:
> No bullet/number is applied as part of the style. Style definition is
> missing the bullet/number part.

I can confirm that with

Version: 7.0.0.0.alpha0+ (x64)
Build ID: 5dcbd1bb557450a2d658a710c163b310c0cee157
CPU threads: 4; OS: Windows 10.0 Build 18363; UI render: Skia/Raster; VCL: win; 
Locale: de-DE (de_DE); UI-Language: en-GB
Calc: CL

But if I open the format dialog to change MyBulletStyle in MS Word (I use 2016), it also has no numbering. I can set a numbering format for that style, but when I press OK and open the style format again, it seems that Word doesn't save that setting.

So in sum I would say NOTOURBUG.
Comment 7 Timur 2020-03-26 17:04:33 UTC
I confirm the bug as FILEOPEN (tested 7.0+ on Windows).
What may confuse in LO: while it looks OK, it's not; the styles really don't have numbering, which can be seen when a style is applied to an empty paragraph.
What may confuse in MSO: Modify Style - Format - Numbering doesn't mark the applied bullet/number, but it is applied, as seen in the Modify Style description, which says "Bulleted" or "Numbered".

Interesting that bullet and numbering styles worked in LO 4.3, so regression from 4.4. 
So many regressions...
Comment 8 Justin L 2020-04-08 11:58:58 UTC
The regression in LO 4.4 was from commit e49d2b31fb2020d065b4ad940d1031d07b10f32b
Author:  Vinaya Mandke CommitDate: Tue Jun 10 09:57:45 2014 +0200
     fdo#78939 [DOCX] Hang while opening due to incorrect modification of Style
    
http://opengrok.libreoffice.org/xref/core/sw/source/core/unocore/unosett.cxx#1884
modifies the referenced style of the numbering rule to use the current numbering rule. Actually the referenced style is not supposed to be modified,
as the numbering level format only uses the properties of that particular style, which may or may not be a numbering style.
    For this particular document the numbering format refers to the "Default Style" (Normal). Almost all of the styles in style.xml are based on it. Normal was modified, and as a result the whole document was bulletized, which caused the hang while opening.

Removed the addition of style as a PARA_STYLE, as the properties of the referenced style are already added in ListLevel::AddParaProperties
   Reviewed on: https://gerrit.libreoffice.org/9668
Comment 9 Justin L 2020-04-09 18:58:55 UTC
It is fairly easy to revert this. https://gerrit.libreoffice.org/c/core/+/91996

However, that won't do much except to allow a one-time import. Exporting has never round-tripped the values, and it is irrelevant for viewing - only for editing - and so round-tripping support is essential.

Everything about numbering is crazy - since we have to emulate what Word does which isn't exactly how LO does things - and grabbagging is probably useless for numbering. At this point I can't even understand Microsoft's documentation on how this is SUPPOSED to work. But I'll try and see if I can work out what to do for exporting...
Comment 10 Justin L 2020-04-10 09:52:54 UTC
A little history lesson, using ooxmlexport's tdf95376.docx, looking at the Paragraph style Bullet (parent style Plain Text).
Back in LO 3.6, numbering was not imported. Then it switched to "Outline Numbering", likely due to fixing an exception related to the string of patches starting with commit fb68711fc3fbab99e47cc94f5abd27b1425bc468 by Author: Luboš Luňák on Thu Apr 5 13:57:05 2012 +0200

That was quickly fixed to set the numbering style to WWNum1 by commit 042da092165eea856596db5ba5f18ea1273b88eb Author: Luboš Luňák on Wed May 2 17:40:10 2012 +0200
    finish handling of w:pStyle in numbering (bnc#751028)

But I think the premise is ALL WRONG here.  w:pStyle should have nothing to do with assigning this numbering style to a paragraph style. It just means that you get certain properties from that style. (see ListLevel::AddParaProperties)

This ought to come from NumPr_numId - where the style assigns itself to a nonAbstract numbering list. So I think the real question is why isn't this working?
Comment 11 Justin L 2020-04-10 15:48:58 UTC
(In reply to Justin L from comment #10)
> So I think the real question is why isn't this working?
A big part of that is because Styles are finished before lists are processed.
        resolveFastSubStream(rStream, OOXMLStream::FONTTABLE);
        resolveFastSubStream(rStream, OOXMLStream::STYLES);
        resolveFastSubStream(rStream, OOXMLStream::NUMBERING);

Numbering needs style information from char styles, and para styles, so we can't simply switch those two around (and doing so caused a unit test failure in ooxmlexport10 - good).
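
To illustrate the ordering problem described above, here is a minimal C++ sketch of a deferred fix-up: styles are read first and can only remember their numId, and the list style name is filled in after numbering has been read. This is not the writerfilter implementation. `ParaStyle`, `NumberingList`, and `applyNumberingToParaStyles` are hypothetical names, and the model assumes each paragraph style carries at most one numId taken from numPr in styles.xml.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified model; these are not the real writerfilter classes.
struct ParaStyle
{
    std::string name;
    std::optional<int> numId;       // from <w:numPr><w:numId> in styles.xml, if any
    std::string numberingStyleName; // unknown until numbering has been parsed
};

struct NumberingList
{
    int numId;                 // <w:num w:numId="...">
    std::string listStyleName; // list style created on import, e.g. "WWNum1"
    std::string levelPStyle;   // <w:pStyle> of a level: a source of properties only
};

// Runs after BOTH streams have been read. Styles come first, so at that
// point only numId can be stored; this pass resolves it to a list style.
inline void applyNumberingToParaStyles(std::vector<ParaStyle>& styles,
                                       const std::vector<NumberingList>& lists)
{
    std::map<int, std::string> nameById;
    for (const auto& list : lists)
        nameById[list.numId] = list.listStyleName;

    for (auto& style : styles)
    {
        if (!style.numId)
            continue; // this style does not reference a numbering list
        const auto it = nameById.find(*style.numId);
        if (it != nameById.end())
            style.numberingStyleName = it->second;
        // levelPStyle is deliberately not consulted: the style named there
        // must not be modified just because a list level refers to it.
    }
}
```

With a style "Normal" that has no numId, a style "MyBulletStyle" with numId 1, and one list `{1, "WWNum1", "Normal"}`, only "MyBulletStyle" ends up with "WWNum1". "Normal" stays untouched even though the list level names it in pStyle, which is the behaviour the fdo#78939 commit was protecting.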
Comment 12 Justin L 2020-04-10 16:12:58 UTC
Created attachment 159472 [details]
tdf131321_paraStyleNumbering.odt: export to DOCX is fine - just missing import
Comment 13 Justin L 2020-04-10 16:24:21 UTC
(In reply to Justin L from comment #9)
> Exporting has never round-tripped the values...
Please recognize the conversation has changed here.  pStyle - the numbering.xml property - has never been exported.

numPr's numId in styles.xml IS being exported.
Comment 14 Justin L 2020-04-11 11:30:35 UTC
proposed patch https://gerrit.libreoffice.org/c/core/+/92058

This fixes DOCX import (and since export already worked, it is completely fixed).
RTF export is not working (it explicitly only tries to export for paragraphs, not styles) and so I will ignore that case. A separate bug report should be made if that case is important.
Comment 15 Commit Notification 2020-04-14 07:48:16 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/35fc5ef0a759884b24ed8b83cd05702a0fab64cc

tdf#131321 writerfilter: ApplyNumberingStyleNameToParaStyles()

It will be available in 7.0.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 16 Xisco Faulí 2020-04-15 11:44:11 UTC
Hi Justin,
thanks for fixing this issue. Do you think it would make sense to backport it to libreoffice-6-4, considering it's a regression?
Comment 17 Justin L 2020-04-15 14:58:38 UTC
(In reply to Xisco Faulí from comment #16)
> considering it's a regression ?
Ultimately, this was not a regression. It just happened to work in one specific case, but the original implementation was completely wrong.

Because there could be strange interactions with chapter numbering, I wasn't planning on backporting. This doesn't affect layout at all, and few people would use numbering and styles, so importance is very low.
Comment 18 Timur 2020-05-08 10:57:11 UTC
Created attachment 160530 [details]
Example compared MSO LO

Looks OK now, I'll set Verified.

I see that bullets are wrong on filesave and reopen, I hope that's not related.
I may bibisect later.
Comment 19 Timur 2020-05-08 12:02:34 UTC
Bullet is bug 132766.
Comment 20 Justin L 2020-05-15 08:30:10 UTC
Good thing I didn't backport this. It exposed nasty import weaknesses, as reported in bug 133000. Numbering code is so fragile...