Bug 141416

Summary: [FILEOPEN] Excel file very long to open (more than one hour)
Product: LibreOffice
Reporter: laurent combe <laurent.combe>
Component: Calc
Assignee: Xisco Faulí <xiscofauli>
Status: RESOLVED FIXED
Severity: normal
CC: himajin100000, markus.mohrhard, xiscofauli
Priority: medium
Keywords: bibisected, bisected, perf, regression
Version: 5.0.0.5 release
Hardware: All
OS: All
See Also: https://bugs.documentfoundation.org/show_bug.cgi?id=81396
Whiteboard: target:7.2.0 target:7.1.3
Crash report or crash signature:
Regression By:
Attachments:
 - very long time needed to open this file
 - bisect log
 - same as GA2 but with less data so open quickly
 - ODS file with data reduced to 270 lines in fourth tab (prevision)
 - XSLX file with data reduced to 270 lines in fourth tab (prevision)
 - flamegraph (opening of suiviGA_270.xlsx file)

Description laurent combe 2021-04-01 07:27:32 UTC
Created attachment 170886 [details]
very long time needed to open this file

I have an example of an Excel file that does not open normally in LibreOffice.

It is an Excel file in .xlsm format. I am aware that Excel macros are not handled by LibreOffice; I just need to open this file to read it in LibreOffice (I am not using the macros).

When I try to open this file with LibreOffice 7.1 (Ubuntu 18.04),
the soffice process uses one CPU thread at 100%.
If I leave it running, the file opens 1 h to 1 h 30 later.

This is not a capacity problem with my PC.

I have removed personal data from the file,
and I can submit an example file showing this issue.
Comment 1 Xisco Faulí 2021-04-01 08:15:42 UTC
Reproduced in

Version: 7.2.0.0.alpha0+ / LibreOffice Community
Build ID: 7da7f6ca37c92ab33e34a76fd25efab526b7c80a
CPU threads: 4; OS: Linux 5.7; UI render: default; VCL: x11
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

and

Version: 5.2.0.0.alpha1+
Build ID: 5b168b3fa568e48e795234dc5fa454bf24c9805e
CPU Threads: 4; OS Version: Linux 4.8; UI Render: default; 
Locale: ca-ES (ca_ES.UTF-8)

but not in

Version: 5.0.0.0.alpha1+
Build ID: 0db96caf0fcce09b87621c11b584a6d81cc7df86
Locale: ca-ES (ca_ES.UTF-8)

Thus, it's a regression, and it has to be bisected with the 5.1 bibisect repository on Windows.
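
A minimal sketch of the bisection idea, assuming an ordered list of builds in which the oldest is known good (here 5.0) and the newest is known bad (here 5.2), and that every build after the first bad one is also bad. The names `first_bad`, `builds` and `is_bad` are hypothetical; in practice `git bisect` does this bookkeeping, and `is_bad` corresponds to opening the file in a given build and checking whether it is slow.

```python
def first_bad(builds, is_bad):
    """Return the index of the first bad build.

    Assumes builds[0] is good, builds[-1] is bad, and that every build
    after the first bad one is also bad (a single regression point).
    """
    good, bad = 0, len(builds) - 1
    # Invariant: builds[good] is good and builds[bad] is bad.
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(builds[mid]):
            bad = mid
        else:
            good = mid
    return bad
```

Each test halves the remaining range, so the number of builds to try grows only logarithmically with the size of the repository.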
Comment 2 laurent combe 2021-04-02 14:29:47 UTC
Is there no bibisect repository for Linux 5.1? I have only found 5.2.
Comment 3 laurent combe 2021-04-03 13:48:03 UTC Comment hidden (obsolete)
Comment 4 laurent combe 2021-04-03 13:49:55 UTC
Created attachment 170933 [details]
bisect log
Comment 5 laurent combe 2021-04-04 15:36:14 UTC
Created attachment 170948 [details]
same as GA2 but with less data so open quickly
Comment 6 laurent combe 2021-04-04 15:56:59 UTC
Created attachment 170949 [details]
ODS file with data reduced to 270 lines in fourth tab (prevision)
Comment 7 laurent combe 2021-04-04 15:57:47 UTC
Created attachment 170950 [details]
XSLX file with data reduced to 270 lines in fourth tab (prevision)
Comment 8 laurent combe 2021-04-04 16:05:10 UTC
The performance issue is caused by the data in the fourth tab (named 'Prevision').

I reduced the number of lines in this tab (keeping 270 lines) and
saved it with LibreOffice 7.1 in:
 - ODS format (cf. attached file): suiviGA_270.ods
 - XLSX format (cf. attached file): suiviGA_270.xlsx

LibreOffice opens the ODS file in a few seconds.
LibreOffice opens the XLSX file in ~40 seconds on my PC.

So it appears to depend on the file format. I hope this helps to find the performance issue.
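
A minimal sketch for reproducing this comparison from the command line, assuming `soffice` is on the PATH and that a headless conversion (`--headless --convert-to pdf --outdir`) goes through the same import code as opening the file in the UI. The measured time also includes program start-up and the PDF export, so it is only a rough proxy. The helper names are hypothetical.

```python
import subprocess
import tempfile
import time


def time_command(cmd):
    """Run cmd to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def load_command(path, outdir, soffice="soffice"):
    """Build a headless conversion command that makes LibreOffice import path."""
    return [soffice, "--headless", "--convert-to", "pdf",
            "--outdir", outdir, path]


def compare(paths, soffice="soffice"):
    """Return {path: seconds} for each file, converted one after another."""
    timings = {}
    with tempfile.TemporaryDirectory() as outdir:
        for path in paths:
            timings[path] = time_command(load_command(path, outdir, soffice))
    return timings
```

For example, `compare(["suiviGA_270.ods", "suiviGA_270.xlsx"])` would return one timing per file, which makes the format dependence visible without a stopwatch.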
Comment 9 laurent combe 2021-04-05 09:47:36 UTC
Created attachment 170962 [details]
flamegraph (opening of suiviGA_270.xlsx file)

I've generated the flamegraph for the opening of the suiviGA_270.xlsx file.

See the attached file.

I also ran callgrind on the loading of both files (ODS and XLSX). The traces seem a little too big to post here, but I have kept them in case someone needs them.
Comment 10 Xisco Faulí 2021-04-06 08:53:36 UTC
The bisection done in comment 3 doesn't seem to be correct: if the patch is reverted locally, the performance issue is still present.
Comment 11 Xisco Faulí 2021-04-08 10:41:34 UTC
Regression introduced by:

https://cgit.freedesktop.org/libreoffice/core/commit/?id=d4743045a0b320449d07a957463a76bb8b13f939

author	Markus Mohrhard <markus.mohrhard@googlemail.com>	2015-10-24 09:45:58 +0200
committer	Eike Rathke <erack@redhat.com>	2015-10-25 16:30:00 +0000
commit d4743045a0b320449d07a957463a76bb8b13f939
tree 4501ab47ec45b2a47e29fd5dcead4b69d5b53590
parent 5ce68783148aa77d77086aac220fabdfa211429d
the cells need to be imported before we handle charts, tdf#81396

Bisected with: win32-5.1

Adding Cc: to Markus Mohrhard
Comment 12 Commit Notification 2021-04-09 09:15:50 UTC
Xisco Fauli committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/30f222c91fa816a7863bf4bfc4a36e503e0bf2d3

tdf#141416: partial revert of the fix for tdf#81396

It will be available in 7.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 13 Xisco Faulí 2021-04-09 09:16:38 UTC
Import time is around 1 minute after my patch.
@Laurent, please verify with a daily build.
Comment 14 Commit Notification 2021-04-10 08:38:37 UTC
Xisco Fauli committed a patch related to this issue.
It has been pushed to "libreoffice-7-1":

https://git.libreoffice.org/core/commit/77e95e208c9d22eb1350d75135e09426c16a6726

tdf#141416: partial revert of the fix for tdf#81396

It will be available in 7.1.3.

Comment 15 laurent combe 2021-04-10 09:34:59 UTC
With master~2021-04-09_15.49.45_LibreOfficeDev_7.2.0.0.alpha0_Linux_x86-64_deb.tar.gz, I can confirm that the original file opens in ~40 s.

Great job!