Bug 118972

Summary: Writer should recognize ToC entries structure and formatting from MSO DOC/X (test with Update index)
Product: LibreOffice
Reporter: Timur <timur>
Component: Writer
Assignee: Not Assigned <libreoffice-bugs>
Status: NEW
Severity: enhancement
CC: aron.budea, himajin100000, hmslima1992, kelemeng, libreoffice, paul.jowett, xiscofauli
Priority: medium    
Version: Inherited From OOo   
Hardware: All   
OS: All   
See Also: https://bugs.documentfoundation.org/show_bug.cgi?id=65100
https://bugs.documentfoundation.org/show_bug.cgi?id=121561
https://bugs.documentfoundation.org/show_bug.cgi?id=121523
https://bugs.documentfoundation.org/show_bug.cgi?id=123429
https://bugs.documentfoundation.org/show_bug.cgi?id=137988
https://bugs.documentfoundation.org/show_bug.cgi?id=115798
https://bugs.documentfoundation.org/show_bug.cgi?id=143722
Whiteboard:
Crash report or crash signature:
Regression By:
Bug Depends on:    
Bug Blocks: 104524, 112862    
Attachments: Test MSO DOCX with ToC
Test MSO DOC with ToC
Sample DOCX with different fonts
Comparison of MSWord vs Writer rendering

Description Timur 2018-07-27 14:57:11 UTC
Created attachment 143804 [details]
Test MSO DOCX with ToC

Writer does not seem to recognize the structure of Table of Contents (ToC) entries when importing Word documents, and the same seems to hold in the other direction.
While the ToC looks fine on import, Update index shows that the ToC structure was not really recognized and imported.

To test, open the attached Test DOCX with ToC (created in MSO) in LO, right-click the ToC and choose Update index. The structure changes, which is obvious because the default tab stops are set differently. That doesn't happen with a custom ToC saved in ODT.

I'll set as Enhancement. 
This is Fileopen. Filesave to DOC and DOCX is another issue. Update index on fileopen of RT file loses structure.
Comment 1 Timur 2018-07-27 14:58:09 UTC
Created attachment 143805 [details]
Test MSO DOC with ToC
Comment 2 Xisco Faulí 2018-07-31 09:50:46 UTC
Confirmed in

Version: 6.2.0.0.alpha0+
Build ID: 72b099d279e7096d41a04fe8c0dd493a5fc18a33
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); Calc: group threaded
Comment 3 Timur 2018-08-01 16:28:45 UTC
*** Bug 118259 has been marked as a duplicate of this bug. ***
Comment 4 Aron Budea 2018-09-18 12:07:31 UTC
Created attachment 144984 [details]
Sample DOCX with different fonts

And here's a sample with different fonts in the ToC that get lost upon update. While it's not strictly structure, I'd say it belongs in this ticket (if not, I can open a separate one).
Comment 5 Timur 2019-08-28 07:45:17 UTC
*** Bug 126998 has been marked as a duplicate of this bug. ***
Comment 6 Alvaro Segura 2020-05-09 23:29:49 UTC
Created attachment 160582 [details]
Comparison of MSWord vs Writer rendering

These screenshots show the issue. They also highlight the importance of keeping those tabs in documents like this one, which try to keep a proper alignment of numbers and titles (a style done so nicely by LaTeX, for example). This is IMHO more important than it seems.

The style is kept upon loading the file, but is lost when updating the TOC.


Here is a possible hint:

My guess is that Writer's and Word's TOC systems are different (Writer's is more sophisticated, I'd say). In Writer one can explicitly define that a TAB must exist between number and title. However, Word does not specify anything about the separation of number and title. What Word does, I think, is use the same format used in the numbering of headings in the document. If the document has TABS after section numbers then the TOC will have TABS, too.

That is defined in the format of the multilevel numbering scheme: open "Define new multilevel list", hit "More >>" to expand the dialog to more options, and select an option for "Number followed by:". It can be TAB, SPACE or NOTHING, and this affects both the headings throughout the document and the Table of Contents.

Then, to achieve the same results, Writer could look at the existing format of the multilevel heading numbering (which is read correctly) and add this TAB to TOC elements if the multilevel numbering uses TABS.
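
To illustrate the hint above, here is a minimal Python sketch, not Writer's actual import code. It assumes that the "Number followed by:" setting is stored in the DOCX file's numbering.xml as a w:suff element on each w:lvl (values "tab", "space" or "nothing", with "tab" assumed when the element is absent). The function names level_suffixes and toc_needs_tab and the SAMPLE fragment are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by numbering.xml in a DOCX package.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"


def level_suffixes(numbering_xml, abstract_num_id):
    """Map each list level to its 'Number followed by' value.

    Assumption: a missing w:suff element means "tab".
    """
    root = ET.fromstring(numbering_xml)
    suffixes = {}
    for abstract in root.iter(f"{W}abstractNum"):
        if abstract.get(f"{W}abstractNumId") != abstract_num_id:
            continue
        for lvl in abstract.iter(f"{W}lvl"):
            ilvl = int(lvl.get(f"{W}ilvl"))
            suff = lvl.find(f"{W}suff")
            suffixes[ilvl] = suff.get(f"{W}val") if suff is not None else "tab"
    return suffixes


def toc_needs_tab(suffixes, level):
    """Hypothetical rule from the comment: put a tab between number and
    title in the ToC entry only if the heading numbering uses a tab."""
    return suffixes.get(level, "tab") == "tab"


# Hand-written fragment for illustration, not taken from the attachments.
SAMPLE = """<w:numbering xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:abstractNum w:abstractNumId="0">
    <w:lvl w:ilvl="0"><w:numFmt w:val="decimal"/><w:lvlText w:val="%1"/></w:lvl>
    <w:lvl w:ilvl="1"><w:suff w:val="space"/><w:lvlText w:val="%1.%2"/></w:lvl>
  </w:abstractNum>
</w:numbering>"""

if __name__ == "__main__":
    found = level_suffixes(SAMPLE, "0")
    for level in sorted(found):
        print(level, found[level], toc_needs_tab(found, level))
```

In this sketch level 0 has no w:suff and is treated as tab-separated, while level 1 explicitly uses a space, so only level 0 would get a tab in the regenerated ToC entry.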