Bug 158175 - FILEOPEN does not initialize some formulas/cells correctly
Summary: FILEOPEN does not initialize some formulas/cells correctly
Status: UNCONFIRMED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
7.6.2.1 release
Hardware: All All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-11-11 12:07 UTC by nmatravolgyi
Modified: 2023-11-28 11:38 UTC (History)
3 users (show)

See Also:
Crash report or crash signature:


Attachments
Erroneous formula values after load. (362.54 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2023-11-11 12:07 UTC, nmatravolgyi
Details

Note You need to log in before you can comment on or make changes to this bug.
Description nmatravolgyi 2023-11-11 12:07:53 UTC
Created attachment 190787 [details]
Erroneous formula values after load.

Hi!

There are some very finicky circumstances where some formula are not evaluated correctly when opening specific spreadsheets. I'm fairly sure it is a bug, because simply reordering any of the sheets in the workbook will result in the formulas in question to update and display the correct values.


History and debugging of the issue

1. First I've received an XLSX file with the issue of some "VLOOKUP" formulas not working. Upon analysis I could discern that the file opened in MS Office works as expected, but not in LibreOffice. Furthermore I noticed that saving the file as ODS and opening that file would work as expected. This lead me to believe there is an MS Office compatibility issue, but was proven wrong later.
2. With the original XLSX is opened, the formula evaluations that were wrong had search values with trailing spaces. Removing the spaces seemed to fix the issue, but after saving and opening the file, the fix would not always "stick". I have not investigated this further, because trailing-spaces or not the formula was valid and should work regardless.
3. The LibreOffice version that originally manifested the issue was Collabora 23.05 master (f64dcce7) while the same issue seemed to not be present in desktop LibreOffice 7.5.1-7.5.7 or 7.6.2.1. So I bisected the sourcetree and found that reverting bb9dcec7 on the master would "fix" the issue on the Collabora branch. All this lead me to believe I've found the issue, so I started to strip the privacy-sensitive information from the spreadsheet to produce a submitable reproduction example.
4. By complete chance I've made just the "right" changes for the issue to manifest even with the commit reverted and also in all of the mainline LibreOffice versions I've mentioned. What is even more interesting, the issue now is present regardless of the file being saved as XLSX or ODS.
5. Further investigation lead me to believe that the erroneous values are wrongly assumed to be correct at FILEOPEN and they are not re-calculated to be correct. The commit bb9dcec7 on the Collabora branch deals with a flag `mbGeneratorKnownGood` set in `WorkbookGlobals::initialize()`. If this flag is not set, the original XLSX is correctly loaded. So finally it seems that at some point the formula results got stored in a way that gets persisted across document saving and loading.


Reproduction

There are 5 sheets in the document, with the "RH" containing the issue described. The faulty evaluations are in $RH.E45:E74. The cells E45, E47, E52, E55, E61 and E69 of RH fail to evaluate and display "#N/A" while the other cells display "0" instead of being empty.

To trigger some evaluation update and see what it should look like, just grab the RH sheet and move it to be the last one for example.


Reproduction stability

The issue is not CPU/hardware dependent, I've reproduced on two different desktops (AMD64) and an ARM single-board-device as well.


Original cause

After falling down the rabbit hole, it seems that there could have been a specific operation (maybe the first XLSX opening by LibreOffice) that broke the "stored" value of cells with the formula. I'm not sure if I can find that out.


I guess it would be nice if LibreOffice would fix the obvious inconsistency at least when opening the document, even if we can't find out the original source of the issue. I'll try to get a hold of the original XLSX file, before it was ever opened in LibreOffice and see if there is any more information I can gather.

Let me know what more information would be useful, if any.


Version: 7.6.2.1 (X86_64) / LibreOffice Community
Build ID: 60(Build:1)
CPU threads: 6; OS: Linux 6.1; UI render: default; VCL: kf5 (cairo+xcb)
Locale: en-US (en_US.UTF-8); UI: en-US
7.6.2-3
Calc: threaded
Comment 1 nmatravolgyi 2023-11-11 22:27:32 UTC
Quick update: I've managed to acquire the original MS Office XLSX and I've determined that the original cause of the issue is fixed by "tdf#142905 XLSX import: don't trim leading spaces in formula cell" (7ec4c6df) in the 7.6 branch.

The attached file was imported with LibreOffice 7.5, so the XLSX import broke the document in a way that is not fixed automatically at open even in 7.6.

This issue can be seen as separate from tdf#142905 to handle such cases, or be closed as a duplicate of tdf#142905 if there is no chance of having a document being fixed at open after it got broken.
Comment 2 Stéphane Guillou (stragu) 2023-11-28 11:38:55 UTC
Thank you for the report and thorough description, nmatravolgyi!

Aron and Czeber, copying you in because your commits bb9dcec77c26df3ae7bfe940591a1a1d04002820 and 15e8278144a7326e9587a665e9e1068562309295 were referenced, in case you are interested or have an opinion on:

(In reply to nmatravolgyi from comment #1)
> This issue can be seen as separate from tdf#142905 to handle such cases, or
> be closed as a duplicate of tdf#142905 if there is no chance of having a
> document being fixed at open after it got broken.

My take is to mark as duplicate of bug 142905, with the workaround of using "recalculate hard", but is there anything extra that could be done, Czeber?