Bug 140235

Summary: Plenty of list-related character styles are created on DOCX export which are clearly not needed
Product: LibreOffice
Reporter: Telesto <telesto>
Component: Writer
Assignee: Not Assigned <libreoffice-bugs>
Status: NEW
Severity: normal
CC: eyalroz1, jluth, libreoffice, michelle, nemeth, telesto, vmiklos
Priority: medium
Version: unspecified
Hardware: All
OS: All
See Also: https://bugs.documentfoundation.org/show_bug.cgi?id=133410
https://bugs.documentfoundation.org/show_bug.cgi?id=147673
Whiteboard:
Crash report or crash signature:
Regression By:
Bug Depends on:
Bug Blocks: 108769
Attachments: Example file
Simple reproducer file
The reproducer document saved to docx
The docx version of the reproducer saved back to odt
The original reproducer and its docx version
The docx version and the odt saved from it

Description Telesto 2021-02-07 11:11:14 UTC
Description:
Plenty of styles are created on DOCX export which are clearly not needed

Steps to Reproduce:
1. Open the attached file
2. Save as DOCX
3. File reload
4. Go to character styles.. notice a whole list of additional styles
5. CTRL+A
6. CTRL+C
7. CTRL+V
8. Save
9. File reload -> Most of the crap gone
10. Save to ODT -> Still no crap


1. Open the attached file
2. Save as DOCX
3. File reload
4. Go to character styles.. notice a whole list of additional styles
5. Save to ODT -> ODT filled with junk

Actual Results:
Plenty of junk styles, and those styles make opening/saving files pretty slow

Expected Results:
Fewer junk styles, which makes managing styles easier and speeds up opening/saving


Reproducible: Always


User Profile Reset: No



Additional Info:
Version: 7.2.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 3ed9bba283a6a67864c0928186e277240be0d9ba
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: Skia/Raster; VCL: win
Locale: nl-NL (nl_NL); UI: en-US
Calc: CL
Comment 1 Telesto 2021-02-07 11:11:34 UTC
Created attachment 169550 [details]
Example file
Comment 2 Telesto 2021-02-07 11:14:08 UTC
@Justin
We had a bug about this somewhere. If I recall correctly, the styles code was too horribly broken (and too complex). But this might give an approach to get rid of most of the junk styles without a big refactoring.
Comment 3 Justin L 2021-03-04 07:37:56 UTC
Specifically, these are character styles ListLabel1 - ListLabel171. A second round-trip brings it to ListLabel225.

A similar report for DOC is bug 133410 which says,
see LO 5.0.6's tdf#95213 DOCX import: don't reuse list label styles.

which was mollified somewhat by LO 6.3's tdf#92335 DOCX: fix multiplying of "ListLabel" styles.

Especially important is "However, making a change here would be fraught with danger."
Comment 4 Telesto 2021-03-04 08:58:01 UTC Comment hidden (no-value)
Comment 5 Miklos Vajna 2021-03-04 09:01:38 UTC Comment hidden (no-value)
Comment 6 NISZ LibreOffice Team 2021-03-26 10:11:52 UTC
I think the problem here is at import time. Doing this:

Steps to Reproduce:
1. Open the attached file
2. Save as DOCX
3. File reload
4. Go to character styles.. notice a whole list of additional styles

does not result in extra character styles being saved to the docx file's styles.xml (since bug #92335 was fixed).

These extra styles appear only at docx import time, and can then be saved to odt (unfortunately).
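
To check this claim without opening the files in the UI, the styles stored in each package can be listed directly. The following is a minimal sketch, not part of LibreOffice: the function name `list_label_styles` is hypothetical, and it assumes the generated styles are named "ListLabel N" (with or without a space before the number).

```python
import re
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"

# Assumption: the generated styles are named "ListLabel N", with or without a space.
LIST_LABEL = re.compile(r"^ListLabel\s*\d+$")


def list_label_styles(path_or_file):
    """Return the ListLabel character style names stored in a .docx or .odt package."""
    names = []
    with zipfile.ZipFile(path_or_file) as package:
        members = set(package.namelist())
        if "word/styles.xml" in members:  # DOCX package
            root = ET.fromstring(package.read("word/styles.xml"))
            for style in root.iter(f"{{{W_NS}}}style"):
                if style.get(f"{{{W_NS}}}type") != "character":
                    continue
                name = style.find(f"{{{W_NS}}}name")
                if name is not None:
                    names.append(name.get(f"{{{W_NS}}}val", ""))
        elif "styles.xml" in members:  # ODT package
            root = ET.fromstring(package.read("styles.xml"))
            for style in root.iter(f"{{{STYLE_NS}}}style"):
                if style.get(f"{{{STYLE_NS}}}family") != "text":
                    continue
                name = (style.get(f"{{{STYLE_NS}}}display-name")
                        or style.get(f"{{{STYLE_NS}}}name", ""))
                # ODF encodes a space in an internal style name as "_20_".
                names.append(name.replace("_20_", " "))
    return [n for n in names if LIST_LABEL.match(n)]
```

Running it on the exported docx and on the odt saved from it should show whether the extra styles are present in the file itself or only appear after import.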
Comment 7 NISZ LibreOffice Team 2021-03-26 10:12:36 UTC
Created attachment 170749 [details]
Simple reproducer file
Comment 8 NISZ LibreOffice Team 2021-03-26 10:14:16 UTC
Created attachment 170750 [details]
The reproducer document saved to docx
Comment 9 NISZ LibreOffice Team 2021-03-26 10:16:46 UTC
Created attachment 170751 [details]
The docx version of the reproducer saved back to odt

This has the extra character styles saved to styles.xml.
Comment 10 NISZ LibreOffice Team 2021-03-26 10:19:10 UTC
Created attachment 170752 [details]
The original reproducer and its docx version

The problem starts at the docx import.
Comment 11 NISZ LibreOffice Team 2021-03-26 10:20:43 UTC
Created attachment 170753 [details]
The docx version and the odt saved from it

Once these fake ListLabel character styles are created they are saved to odt.

Version: 7.2.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 2fb274950e5207ca55f4f52325fb522bd44024e1
CPU threads: 4; OS: Windows 10.0 Build 18363; UI render: Skia/Raster; VCL: win
Locale: en-US (hu_HU); UI: en-US
Calc: CL
Comment 12 Justin L 2021-03-31 18:07:18 UTC
(In reply to NISZ LibreOffice Team from comment #11)
> Once these fake ListLabel character styles are created they are saved to odt.

I don't think we would want to do anything to try and stop this though.
However, one possibility would be to create a "clean up document" tool that goes through and removes excess styles, obsolete direct formatting, etc.  (This, of course, is a HUGE enhancement idea, and so far we can't even get someone to write a compress-all-pictures tool...)
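
As a rough illustration of the "removes excess styles" part of that idea, the sketch below only reports candidates and changes nothing. It is not an existing LibreOffice tool: the function name `unreferenced_character_styles` is hypothetical, and it assumes a character style is "used" when its name appears in a `text:style-name` or `style:parent-style-name` attribute. ODF has other ways to reference a style, so the result is a list of candidates to review, not a safe-to-delete list.

```python
import xml.etree.ElementTree as ET

STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Assumption: a style counts as "used" if its name appears in one of these attributes.
REFERENCE_ATTRS = (
    f"{{{TEXT_NS}}}style-name",
    f"{{{STYLE_NS}}}parent-style-name",
)


def unreferenced_character_styles(styles_xml, content_xml):
    """Report character styles defined in styles.xml that nothing appears to reference."""
    styles_root = ET.fromstring(styles_xml)
    content_root = ET.fromstring(content_xml)

    # Character styles have style:family="text" in ODF.
    defined = [
        style.get(f"{{{STYLE_NS}}}name")
        for style in styles_root.iter(f"{{{STYLE_NS}}}style")
        if style.get(f"{{{STYLE_NS}}}family") == "text"
    ]

    # Collect every style name referenced from either XML part.
    referenced = set()
    for root in (styles_root, content_root):
        for element in root.iter():
            for attr in REFERENCE_ATTRS:
                value = element.get(attr)
                if value:
                    referenced.add(value)

    return [name for name in defined if name and name not in referenced]
```

The two arguments are the contents of `styles.xml` and `content.xml` read from the odt package. Name collisions between style families make the check conservative: a character style that shares a name with a referenced paragraph style is treated as used.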
Comment 13 NISZ LibreOffice Team 2021-07-13 07:05:50 UTC
*** Bug 143237 has been marked as a duplicate of this bug. ***
Comment 14 Eyal Rozenberg 2023-01-26 20:31:42 UTC
(In reply to Justin L from comment #12)
> (In reply to NISZ LibreOffice Team from comment #11)
> > Once these fake ListLabel character styles are created they are saved to odt.
> 
> I don't think we would want to do anything to try and stop this though.

Why? I mean, if styles are generated artificially, why not prevent that from happening? Or at least, use more stringent criteria to decide when a style is generated?

> However, one possibility would be to create a "clean up document" tool that
> goes through and removes excess styles, obsolete direct formatting, etc. 
> (This, of course, is a HUGE enhancement idea, and so far we can't even get
> someone to write a compress-all-pictures tool...)

and it would also be a separate bug IMHO.