Bug 157994 - "Error in writing sub-document content.xml", with large files
Status: UNCONFIRMED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 7.6.2.1 release
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-10-31 00:49 UTC by JoshYang
Modified: 2024-02-28 15:48 UTC

See Also:
Crash report or crash signature:


Attachments
main error message (427.24 KB, image/png)
2023-10-31 00:49 UTC, JoshYang
Screenshot 1 of the bug on LibreOffice 24_2_0 on an Intel Mac (294.95 KB, image/png)
2024-02-28 15:32 UTC, rvmanza
Screenshot 2 of the bug on LibreOffice 24_2_0 on an Intel Mac (298.05 KB, image/png)
2024-02-28 15:32 UTC, rvmanza

Description JoshYang 2023-10-31 00:49:01 UTC
Created attachment 190524
main error message

There seem to be many reports of this error here and elsewhere on the internet; people have apparently been getting it for over a decade. At least some of those reports seem to have had a different cause, since their authors say the problem was solved for them. Other reports have remained unsolved for years.

I think this is the first time someone who has encountered this error has been able to provide files and steps to actually reproduce it.

My system info: https://pastebin.com/raw/gNFjGpAQ
LibreOffice version:
Version: 7.6.2.1 (X86_64) / LibreOffice Community
Build ID: 56f7684011345957bbf33a7ee678afaf4d2ba333
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Flatpak
Calc: threaded

I can't publicly share the original files causing this error (I have more than one), and it took hours to create files that reproduce the issue reliably, so please be nice.

The example files are exaggerated (too large, images too high-res) to ensure the error is reproducible. Real files are much more sane and are described at the end of this description.

Here are two example ODT files: https://e.pcloud.link/publink/show?code=kZhmesZdfD1h7n0KmVIjk2czH7XK5ueeGVy
(a few hundred MB, so I couldn't attach them here)

The files contain placeholder text and many images from Pixabay.
To reproduce the issue:

1) Open both files. This is critical. If only one file is opened at a time, the bug doesn't happen.

2) Copy the contents of "test 2.odt" into the last blank page of "test 1.odt". 

3) Save "test 1.odt". There is a 50% chance in my case that you will get the error "error.png" attached here.

4) If the error does not happen, just paste again and save "test 1.odt" again. In my case there's a 100% chance you'll get the error now. If you don't get it on your system, please try pasting and saving a few more times.
I've made sure I'm not running out of RAM when this happens.

5) After getting the error, if you close the error window and try saving again, both LibreOffice ODT file windows will silently crash and close.

6) If you try opening LibreOffice again, it will either silently crash at the banner (but not the second time), or will launch the recovery wizard like so: https://i.ibb.co/LPj0mbP/error-2.png

7) If you attempt a recovery of "test 1.odt", it will very likely (but not always) fail like so: https://i.ibb.co/sFhGkVp/error-3.png

8) The same goes for "test 2.odt", which it will try to recover next. It will very likely (but not always) fail like so: https://i.ibb.co/bdcfqcJ/error-4.png

Here's a real life example of this happening:

I have many research documents with many medium-resolution (~1024x1024) images in them and a moderate amount of text. Their sizes range from 100 to 400MB and each is at least 50 pages. I often need two such documents open at once, since I need to fill in data in both as I work. Sometimes, after adding some text and a single medium-resolution image to one of them, saving it will cause this error.

I can close the second ODT file before saving to avoid the error, but closing and reopening the second window seriously slows down my workflow.

I've been experiencing this bug for around 6 months, and it happens randomly. I've lost many hours of work because of it, by forgetting to save every few minutes.

Sometimes after the crash, if I open the unsaved file, fill it with the same text and add the same image, it won't crash the second time (even with the second ODT file open). This is why it's so extremely hard to report this bug and provide an example file. It's not only that people are unable to share their original files, but also that the error requires large files and has no clear steps that always guarantee it will happen.
Comment 1 JoshYang 2023-10-31 01:02:10 UTC
I'd like to clarify that this can happen with only one file open at a time. I do have one confidential file (470MB) which often throws this error when saved with nothing more added to it. I forgot about it when writing the bug report.

Sadly, I'm unable to create a test file which can reliably reproduce this behavior by itself, with no second file needed to be opened.
Comment 2 Buovjaga 2023-11-03 16:29:13 UTC
There are many reports found with the message:
https://bugs.documentfoundation.org/buglist.cgi?short_desc=Error%20in%20writing%20sub-document%20content.xml&short_desc_type=allwordssubstr

They have mainly been closed as worksforme at some point, though.
Comment 3 JoshYang 2023-11-03 16:37:22 UTC
I have a slight suspicion it may still be an error due to running out of RAM. The Linux Mint system monitor GUI shows over 1GB free (8GB total with this laptop), but it may be not updating or not providing the remainder to LibreOffice.

If there's any way to verify this, let me know.

If this is indeed the case, I think LibreOffice can do a better job at checking the system RAM and telling the user the RAM is the issue, rather than this non-descriptive error message.
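One way to check this on Linux, offered as a minimal sketch rather than an official diagnostic: `/proc/meminfo` reports `MemFree` (completely unused memory) separately from `MemAvailable` (the kernel's estimate of what new allocations can get without swapping), along with `SwapFree`. The helper names `parse_meminfo` and `summarize` are hypothetical.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field: value} (values are in kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def summarize(fields: dict[str, int]) -> str:
    """One-line summary in MiB of the fields relevant to this report."""
    keys = ("MemTotal", "MemFree", "MemAvailable", "SwapTotal", "SwapFree")
    return ", ".join(f"{k}={fields[k] // 1024} MiB" for k in keys if k in fields)


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")  # Linux only
    if meminfo.exists():
        print(summarize(parse_meminfo(meminfo.read_text())))
```

Running it once before saving and again while the save is in progress would show whether `MemAvailable` or `SwapFree` drops sharply during the save.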
Comment 4 JoshYang 2023-11-04 01:03:05 UTC
Happened again today while doing actual real work. Observe the System Monitor data: https://i.imgur.com/7kDaZbd.png

Same OS and LO version as original post.
Comment 5 JoshYang 2023-11-04 01:12:35 UTC
Sorry for the double post, but I was able to repeat the error with a background program closed, which reduced RAM usage by another 400MB (which is the size of the entire ODT file). As you can see, the System Monitor claims less than 80% RAM usage when the error happened this time: https://i.imgur.com/ABsKseP.png
Comment 6 JoshYang 2023-11-04 02:18:19 UTC
Being even more generous, almost nothing else running, and still the same crash. There's around 40% memory free and it's not even touching the increased swap memory which is barely utilized by anything else: https://i.imgur.com/HYy3NRT.png

I'm out of ideas.

I do find it odd that LibreOffice loads the whole document into RAM and then, when you save the file, loads the whole data into RAM again until saving is complete (so the temp file, as it is written, also goes fully into RAM). Why not process the temporary save data in 100MB chunks and keep the temp file on disk instead?
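To illustrate the chunked approach suggested here, a generic sketch follows. It is not a description of how LibreOffice saves files. It streams data to a temp file on disk one fixed-size chunk at a time, then swaps the temp file in for the target. The names `read_chunks`, `save_in_chunks` and `CHUNK_SIZE` are hypothetical.

```python
import os
import tempfile
from typing import BinaryIO, Iterator

CHUNK_SIZE = 100 * 1024 * 1024  # 100 MiB, the figure proposed above


def read_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield successive chunks so that only one chunk is held at a time."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk


def save_in_chunks(source: BinaryIO, target_path: str,
                   chunk_size: int = CHUNK_SIZE) -> int:
    """Write `source` to a temp file next to the target, then swap it in."""
    target_dir = os.path.dirname(os.path.abspath(target_path))
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, suffix=".tmp")
    written = 0
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in read_chunks(source, chunk_size):
                tmp.write(chunk)
                written += len(chunk)
        # The previous file stays intact until this replace succeeds.
        os.replace(tmp_path, target_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return written
```

With this shape, peak memory use for the copy is bounded by the chunk size rather than by the size of the document.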
Comment 7 JoshYang 2023-11-07 13:04:33 UTC
I've upgraded the RAM of my laptop to 32GB. I can confirm the problem has gone away with all the files that produced this issue.
But hold your horses, this doesn't mean there isn't an issue with the LibreOffice code:

1) LibreOffice shouldn't copy all of the file to RAM when saving. It makes no sense; work in chunks. You're putting an artificial practical limit of 256GB on file sizes. And let's be real, it will be more like 8-32GB for most users.
And being more realistic: on a modern OS with something running in the background, and with the OS not allowing a program to use up the last remaining gigabyte of RAM (as in my case), ~6-7GB is already occupied, which leaves only 500-1000MB available for LO. That's not a lot, and it is not at all hard to exceed with a large (50+ pages) document with a lot of uncompressed embedded image files.

2) The error message is useless, as we've seen in this discussion and many similar ones over the years. It simply tells you where the program couldn't proceed, not why. LO should check the RAM data and be able to provide an actual warning/error. At least put an "Out of RAM?" in the popup window.

3) LibreOffice should be able to use the swap memory for this when the RAM isn't available. I set up 32GB of swap memory and LO refused to use any of it when saving the file, and threw the error.
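As a rough sketch of the check proposed in point 2, the function below compares an estimated save footprint with the available RAM and returns a descriptive message. Everything here is hypothetical: the name `describe_save_risk`, the message wording, and the factor of two, which only mirrors the behaviour described in this report (the document plus a full in-memory copy while saving) and is not a documented LibreOffice figure.

```python
from typing import Optional

MIB = 1024 * 1024


def describe_save_risk(doc_bytes: int, available_bytes: int,
                       copies_in_ram: int = 2) -> Optional[str]:
    """Return a warning if the estimated save footprint exceeds available RAM.

    copies_in_ram=2 is an assumption taken from this report, not a
    measured or documented value.
    """
    needed = doc_bytes * copies_in_ram
    if needed <= available_bytes:
        return None
    return (
        f"Out of RAM? Saving may need about {needed // MIB} MiB, "
        f"but only {available_bytes // MIB} MiB appear to be available."
    )
```

For a 400MB document with 700MB available, this sketch would warn that roughly 800 MiB may be needed, which is the kind of hint the current message lacks.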
Comment 8 JoshYang 2023-11-07 13:05:59 UTC
By the way, this "They have mainly been closed as worksforme at some point, though." is not acceptable. Bugs are not bugs simply because you can't reproduce them.
I can't believe this is your logic and makes me question if it makes sense to keep donating to a project when this is the level of thinking devs use.
Comment 9 Buovjaga 2023-11-07 13:10:44 UTC
(In reply to JoshYang from comment #8)
> By the way, this "They have mainly been closed as worksforme at some point,
> though." is not acceptable. Bugs are not bugs simply because you can't
> reproduce them.
> I can't believe this is your logic and makes me question if it makes sense
> to keep donating to a project when this is the level of thinking devs use.

Why would it be unacceptable for the reporters to close their own reports, if the problems have disappeared?
Comment 10 JoshYang 2023-11-07 13:18:07 UTC
I misread your message as you changing the label to worksforme, ignore it.
Comment 11 rvmanza 2024-02-28 15:32:03 UTC
Created attachment 192847
Screenshot 1 of the bug on LibreOffice 24_2_0 on an Intel Mac
Comment 12 rvmanza 2024-02-28 15:32:18 UTC
Created attachment 192848
Screenshot 2 of the bug on LibreOffice 24_2_0 on an Intel Mac
Comment 13 rvmanza 2024-02-28 15:39:33 UTC
After reading the comments, I think I am affected by exactly the same bug described in this report.

I am using LibreOffice 24.2.0 on an Intel Mac.

The main differences from the comments posted so far are:
-I am able to reproduce the issue every time. I just open the document, add a bit of text or an image, and bam! The bug appears.
-I am using lots of images in my document, but the document is not huge (170 pages or so).
-The document has many images but not thousands, maybe around 150.
-The images are not very big, usually around 300 KB and sometimes even less.
-The final document is not huge. It is a 20 MB document, which is a lot, but it is not 200 MB.

-One thing is in common with the comments so far: when I check, LibreOffice is using a huge amount of memory, more than 1 GB, sometimes even 2 GB. This doesn't make sense to me, as the images are small. Even all the images combined cannot come to more than 50 MB or so, so where do those 2 GB come from?
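One possible explanation, given only as a hedged sketch: compressed formats such as JPEG or PNG are much smaller on disk than the decoded bitmap, which takes roughly width x height x bytes per pixel in memory. The pixel dimensions below are assumed for illustration (the report does not give them), and whether LibreOffice keeps every image decoded at once is not established in this thread.

```python
def decoded_size_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Approximate in-memory size of one decoded bitmap (e.g. 32-bit RGBA)."""
    return width * height * bytes_per_pixel


# Hypothetical figures: 150 images of 2000x1500 pixels each.
per_image = decoded_size_bytes(2000, 1500)
total = 150 * per_image
print(f"{per_image / 1e6:.0f} MB per image, {total / 1e9:.1f} GB for 150 images")
```

Under these assumed dimensions, images of a few hundred KB each on disk would add up to well over 1 GB once decoded.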

I think unfortunately I will have to stop using LibreOffice and go back to LyX. I wrote the same document with LyX in the past and had no issues at all. I just wanted to use a more WYSIWYG tool this time for the second, revised edition, but it seems that is not possible :(

(If a developer wants the document for testing, I have no problem providing it. It is a work that will be published for free on the internet when finished, so feel free to message me for a copy to test the bug.)
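Since an ODT file is a ZIP package (content.xml and the embedded images are entries in it), the image count and sizes mentioned above can be checked without opening Writer. A minimal sketch using Python's standard `zipfile` module follows. The function name `list_package_entries` is hypothetical.

```python
import zipfile


def list_package_entries(path: str) -> list[tuple[str, int, int, bool]]:
    """Return (name, stored size, uncompressed size, is_deflated), largest first."""
    with zipfile.ZipFile(path) as package:
        rows = [
            (
                info.filename,
                info.compress_size,
                info.file_size,
                info.compress_type == zipfile.ZIP_DEFLATED,
            )
            for info in package.infolist()
        ]
    # Sort by uncompressed size so the heaviest entries come first.
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Example use (the path is a placeholder):
#     for row in list_package_entries("document.odt"):
#         print(row)
```

The output shows which entries dominate the package and whether each one is deflate-compressed inside the ZIP.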
Comment 14 JoshYang 2024-02-28 15:48:26 UTC
I still think this is a RAM issue.
Still, LibreOffice must address this:
1) Provide an actually descriptive error message.
2) Don't copy the entire file content to RAM when saving; update the output file in chunks. Otherwise you're limiting the maximum file size to the maximum available RAM.

The reason I think this is still a RAM issue:
1) You did not mention what format the image files you imported into LO were. It's very possible they are stored uncompressed in the file or in RAM.
2) You did not provide your system specs. If you have a small amount of RAM and little of it free, even 50 images could cause you to exceed it. Check your available RAM with (A) LO closed, (B) LO open with an empty file, (C) LO open with the file that causes the issue, (D) while saving the file that causes the issue.
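The four readings (A) to (D) are easiest to interpret as differences between consecutive phases. A small sketch of that arithmetic follows. The function name `phase_deltas` is hypothetical and the sample numbers are made up for illustration; they are not measurements from this report.

```python
PHASES = ("A", "B", "C", "D")  # LO closed, empty file, problem file open, saving


def phase_deltas(available_mib: list[int]) -> list[tuple[str, int]]:
    """Given available RAM (MiB) at phases A-D, return the drop at each step."""
    if len(available_mib) != len(PHASES):
        raise ValueError("expected one reading per phase A-D")
    return [
        (f"{PHASES[i]} -> {PHASES[i + 1]}", available_mib[i] - available_mib[i + 1])
        for i in range(len(PHASES) - 1)
    ]


# Made-up readings for illustration only.
for step, drop in phase_deltas([3000, 2700, 1200, 150]):
    print(f"{step}: {drop} MiB less available")
```

A large drop at C -> D compared with B -> C would point at the save step itself rather than at the loaded document.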