Bug 141209 - URP bridge disposed by invalid UTF-16 (lone low surrogate) in Writer document
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: sdk
Version: 7.0.5.2 release (earliest affected)
Hardware: x86-64 (AMD64) Windows (All)
Importance: medium normal
Assignee: Not Assigned
Reported: 2021-03-23 18:38 UTC by Thomas K
Modified: 2023-03-30 03:25 UTC
Attachments
Java project which reproduces the problem (2.93 KB, application/x-zip-compressed)
2021-03-23 18:38 UTC, Thomas K

Description Thomas K 2021-03-23 18:38:54 UTC
Created attachment 170669
Java project which reproduces the problem

1. Start the attached Java project. It is based on the "DocumentLoader.java" example from the 7.1 SDK, so if Maven is not to your taste you can just drop Main.java into your preferred environment. The example code starts a new LibreOffice Writer process, waits for 30 seconds, and then prints the content of the document to System.out via com.sun.star.text.XTextDocument.getText().
2. To reproduce the problem, enter a lone low surrogate into the Writer document. For example, type "df09" into the document and then press Alt+X (or Alt+C, depending on your locale).
3. Wait until the 30 s timeout has passed.
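
For reference, a minimal plain-Java sketch of why "df09" counts as a lone low surrogate. It does not use the UNO API; the class and method names (AltXInput, codeUnitFromHex, classify) are made up for illustration. It only assumes that Alt+X turns the typed hex digits into the UTF-16 code unit they name.

```java
final class AltXInput {
    /** Parses hex digits such as "df09" into the single UTF-16 code unit they name. */
    static char codeUnitFromHex(String hex) {
        int value = Integer.parseInt(hex, 16);
        if (value < 0 || value > 0xFFFF) {
            throw new IllegalArgumentException("not a single UTF-16 code unit: " + hex);
        }
        return (char) value;
    }

    /** Classifies one UTF-16 code unit taken on its own, without a neighbour. */
    static String classify(char c) {
        if (Character.isHighSurrogate(c)) {
            return "high surrogate"; // U+D800..U+DBFF
        }
        if (Character.isLowSurrogate(c)) {
            return "low surrogate";  // U+DC00..U+DFFF
        }
        return "scalar value";
    }

    public static void main(String[] args) {
        // The input from step 2 and, for comparison, a plain "x".
        System.out.println("df09: " + classify(codeUnitFromHex("df09")));
        System.out.println("0078: " + classify(codeUnitFromHex("0078")));
    }
}
```

A low surrogate is only meaningful directly after a high surrogate, so a document that contains U+DF09 by itself holds a UTF-16 sequence that is not valid Unicode text.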

Observed result:
----------------
The example program will exit with the following exception:
com.sun.star.lang.DisposedException
    at com.sun.star.lib.uno.environments.remote.JobQueue.removeJob(JobQueue.java:201)
    at com.sun.star.lib.uno.environments.remote.JobQueue.enter(JobQueue.java:308)
    at com.sun.star.lib.uno.environments.remote.JobQueue.enter(JobQueue.java:281)
    at com.sun.star.lib.uno.environments.remote.JavaThreadPool.enter(JavaThreadPool.java:81)
    at com.sun.star.lib.uno.bridges.java_remote.java_remote_bridge.sendRequest(java_remote_bridge.java:619)
    at com.sun.star.lib.uno.bridges.java_remote.ProxyFactory$Handler.request(ProxyFactory.java:145)
    at com.sun.star.lib.uno.bridges.java_remote.ProxyFactory$Handler.invoke(ProxyFactory.java:129)
    at com.sun.proxy.$Proxy6.getString(Unknown Source)
    at com.vector.Main.main(Main.java:28)
Caused by: java.io.EOFException
    at java.base/java.io.DataInputStream.readInt(DataInputStream.java:396)
    at com.sun.star.lib.uno.protocols.urp.urp.readBlock(urp.java:364)
    at com.sun.star.lib.uno.protocols.urp.urp.readMessage(urp.java:96)
    at com.sun.star.lib.uno.bridges.java_remote.java_remote_bridge$MessageDispatcher.run(java_remote_bridge.java:92)

The LibreOffice process is still running afterwards, but the socket connection is closed. (There is no issue when the surrogate pair is complete, e.g. for the input 🌉 (\ud83c\udf09).)
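
The difference between the lone surrogate and the complete pair can be shown in plain Java with a strict UTF-8 encoder. This is only an illustration of how a strict encoder treats the two inputs; it is not the URP implementation, and whether the bridge fails at the same point is an assumption. The class and method names (StrictUtf8, encodesStrictly) are hypothetical.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictUtf8 {
    /** Returns true if the string can be encoded as UTF-8 without any replacement. */
    static boolean encodesStrictly(String s) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)       // fail instead of writing '?'
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            encoder.encode(CharBuffer.wrap(s));
            return true;
        } catch (CharacterCodingException e) {
            // A lone surrogate is reported as malformed input.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("lone low surrogate: " + encodesStrictly("x\udf09x"));
        System.out.println("complete pair:      " + encodesStrictly("x\ud83c\udf09x"));
    }
}
```

Note that String.getBytes(StandardCharsets.UTF_8) would not fail here; it silently substitutes a replacement byte, which hides the problem instead of reporting it.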

Expected result
---------------
The URP bridge is still usable for further requests by the "client" after catching the exception for getText().
Comment 1 Miklos Vajna 2021-03-24 08:22:32 UTC
Patch to insert the problematic string on F12 into Writer in debug mode:

diff --git a/sw/source/uibase/docvw/edtwin.cxx b/sw/source/uibase/docvw/edtwin.cxx
index 93560ec91f3b..6dee14bc7668 100644
--- a/sw/source/uibase/docvw/edtwin.cxx
+++ b/sw/source/uibase/docvw/edtwin.cxx
@@ -1458,6 +1458,13 @@ void SwEditWin::KeyInput(const KeyEvent &rKEvt)
 
     if ( getenv("SW_DEBUG") && rKEvt.GetKeyCode().GetCode() == KEY_F12 )
     {
+        // xéx
+        // std::vector<sal_Unicode> aChars = {0x0078, 0x00e9, 0x0078, 0x0000};
+        // x, low surrogate, x
+        std::vector<sal_Unicode> aChars = {0x0078, 0xdf09, 0x0078, 0x0000};
+        rSh.Insert2(OUString(aChars.data()));
+        return;
+
         if( rKEvt.GetKeyCode().IsShift())
         {
             GetView().GetDocShell()->GetDoc()->dumpAsXml();

That being said, I can reproduce this. CC Stephan (just FYI)
Comment 2 Stephan Bergmann 2021-03-29 06:37:44 UTC
IMO it does not make sense for Writer to allow invalid Unicode (i.e., lone surrogates) in its content model.

Without having looked at the example code: the relevant UNO interfaces apparently use the UNO string type to communicate some Writer document content.  The UNO string type models "arbitrary-length sequences of Unicode scalar values" (<http://www.openoffice.org/udk/common/man/typesystem.html>), and the UNO bridges generally do not allow passing data that violates the UNO type system.
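
As a rough sketch of that constraint, the following plain-Java check reports whether a string is a sequence of Unicode scalar values, i.e. contains no unpaired surrogate. The names (UnoStringCheck, firstLoneSurrogate) are hypothetical and this is not code from the bridge; it only illustrates the rule quoted above.

```java
final class UnoStringCheck {
    /**
     * Returns the index of the first unpaired surrogate in s,
     * or -1 if s consists only of Unicode scalar values.
     */
    static int firstLoneSurrogate(String s) {
        for (int i = 0; i < s.length(); ) {
            // codePointAt combines a valid high/low pair into one code point;
            // for an unpaired surrogate it returns the surrogate code unit itself.
            int cp = s.codePointAt(i);
            if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        // The two strings from the debug patch in comment 1 (without the terminating 0).
        System.out.println(firstLoneSurrogate("x\u00e9x"));
        System.out.println(firstLoneSurrogate("x\udf09x"));
    }
}
```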
Comment 3 QA Administrators 2023-03-30 03:25:36 UTC
Dear Thomas K,

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.
 
If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)


If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from https://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword


Feel free to come ask questions or to say hello in our QA chat: https://web.libera.chat/?settings=#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug