Bug 75263

Summary: FILESAVE XLS Cyrillic (Russian) characters inserted by macro appear as question marks
Product: LibreOffice
Reporter: sergejs.kozlovics
Component: BASIC
Assignee: Andreas Heinisch <andreas.heinisch>
Status: RESOLVED FIXED
Severity: normal
CC: alexkaltsas, himajin100000, igor.sheludko, ilmari.lauhakangas, kelemeng, post.box
Priority: medium
Keywords: bibisected, filter:xls, regression
Version: 5.1.0.3 release   
Hardware: Other   
OS: All   
See Also: https://bugs.documentfoundation.org/show_bug.cgi?id=107882
Whiteboard: target:7.4.0
Crash report or crash signature:
Regression By:
Bug Depends on:    
Bug Blocks: 108908    
Attachments: workbook with macro, described in the testcase
Tests of different configurations for the bug

Description sergejs.kozlovics 2014-02-20 15:48:55 UTC
I opened an xls file containing sheets with Russian names. There was a VBA macro that accessed those sheets. When I tried to launch the macro, LibreOffice Calc crashed.


When I looked at the macro, I saw that Russian strings (sheet names, etc.) were mangled, i.e., incorrectly converted from Excel (in Excel all Russian strings are OK). When I manually typed normal Russian strings in VBA editor in Calc (instead of the mangled ones), LibreOffice Calc stopped crashing.

It seems that also other functions, like MsgBox do not work with such mangled Russian arguments.
Comment 1 Buovjaga 2014-11-08 14:16:37 UTC
Please attach a document we can test with and set to UNCONFIRMED.
Comment 2 Urmas 2014-11-08 23:04:17 UTC
Is your macro password-protected?
Comment 3 QA Administrators 2015-06-08 14:28:45 UTC Comment hidden (obsolete)
Comment 4 QA Administrators 2015-07-18 17:27:21 UTC Comment hidden (obsolete)
Comment 5 Igor 2020-06-26 22:29:56 UTC
Created attachment 162439
workbook with macro, described in the testcase

The workbook contains a macro that demonstrates the bug: Cyrillic letters in the macro body are corrupted after the file is saved as .xls in LO.
The macro writes the word "проба" to cell A1. After the file has been saved by LO, the letters are corrupted and the macro writes "?????" instead of "проба".
The content of the macro:

Sub Macro1()
    ActiveSheet.Cells(1, 1).Value = "проба"
End Sub
Comment 6 Buovjaga 2020-06-27 10:06:17 UTC
Igor: I can reproduce with your file, but only on Linux. It works fine on Windows. Which operating system were you using?
Was the file created in Microsoft Office? Which version?

Arch Linux 64-bit
Version: 7.1.0.0.alpha0+
Build ID: 076c95b27bf0e9be1fa1c077674cf974b22210fd
CPU threads: 8; OS: Linux 5.7; UI render: default; VCL: kf5
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
Calc: threaded
Built on 27 June 2020
Comment 7 Buovjaga 2020-06-27 10:18:30 UTC
I tested further and noted that the problem is not yet present in the Linux 50max bibisect repo, but is already present in the oldest commit of the 6.3 repo. So this is not the same issue as the original 2014 report. Let's keep hijacking this report anyway.
Comment 8 Igor 2020-06-29 13:23:07 UTC
Buovjaga: Windows 10 LTSB 64bit English on Intel x86 CPU. 
I have this issue on LO 6.3, 6.4 and now on 7.0.
The xls was created in Excel 2016 EN; I can try other versions as well.

Steps to reproduce (I forgot to include them):
1. Create an .xls file in MS Excel with a macro (or use the file attached on 2020-06-26):
Sub Macro1()
    ActiveSheet.Cells(1, 1).Value = "проба"
End Sub
2. Save and close the xls file.
3. Open in Calc - everything works fine. You can see "проба" inside macro body.
4. Save the xls file using Calc. Close Calc.
5. Open the xls file in Calc again. Open the macro for editing. The word "проба" is corrupted and shows as "?????".


In 2016-17 we had the same issue in another place (Win 7 Ultimate EN, Excel 2003 EN, LO 5.1 fresh branch). We downgraded to LO 5.0.6.3 still branch - no bug there. That configuration still works fine there. If it is helpful, I can test LO 5.0.6.3 and LO 7.0 with Win 10 EN, Win 8 EN and with Excel 2003/2013/2016.
Comment 9 Buovjaga 2020-06-29 13:31:18 UTC
Hmm, weird that I was unable to reproduce on Win 10 with any version. Maybe there is some dependency that we have no clue on.
Comment 10 Igor 2020-06-29 13:44:13 UTC
Buovjaga: I am sorry, I have launched the old configuration (Win 7, Excel 2003, LO 5.0.6.3) and it has another bug: it removes the macro module after saving. So it does not work fine, contrary to what I said earlier. I will test it with different configurations and prepare a pivot report of the results.
Comment 11 Igor 2020-06-29 22:21:41 UTC
Created attachment 162528
Tests of different configurations for the bug

Here are the test results. So far I have run 13 tests on 3 computers.
Version 5.0.6.3 and some versions earlier work fine on all computers. Next version (5.1.0.1) corrupts the file.
There are video-links attached as well, so you can see what exactly I see.
Thanks, Igor
Comment 12 Buovjaga 2020-06-30 19:39:19 UTC
Hmm, weird, now I get the problem with fresh master on Windows.

I bibisected with Win 5.1 repo and there were some skipped commits and also a chunk containing a range of commits. I think this is roughly the range where the problem appeared:
https://git.libreoffice.org/core/+log/c5aeca430288057a721688975173ed764860d8b8..f19402658dce6944e82a9058a6888e488b37b336

It certainly contains commits to vba export code.
Comment 13 Mike Kaganski 2020-10-05 16:26:04 UTC
LibreOffice reads PROJECTCODEPAGE record [1] (in this file it's Windows-1251) in VbaProject::readVbaModules (oox/source/ole/vbaproject.cxx). It uses the codepage to decode the strings in the project, but then forgets the value, and when writing, uses hard-coded Win-1252, naturally turning everything not representable in that codepage into question marks.

Code pointer: writePROJECTCODEPAGE in oox/source/ole/vbaexport.cxx.

[1] [MS-OVBA] sect. 2.3.4.2.1.4 https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ovba/575462ba-bf67-4190-9fac-c275523c75fc
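
The effect described above can be reproduced outside LibreOffice with any codec library. The following is a minimal Python sketch, not LibreOffice code: it assumes Python's `cp1251` and `cp1252` codecs stand in for Windows-1251 and Windows-1252, and the helper name `export_literal` is hypothetical.

```python
# Illustrative sketch only; this is not LibreOffice code.
# Python's "cp1251"/"cp1252" codecs stand in for Windows-1251/Windows-1252.


def export_literal(text: str, codepage: str) -> bytes:
    """Encode a macro string literal, replacing unencodable characters with '?'."""
    return text.encode(codepage, errors="replace")


literal = "проба"

# A hard-coded Windows-1252 has no Cyrillic letters, so every character is lost.
lossy = export_literal(literal, "cp1252")

# Reusing the project's own code page (Windows-1251 here) keeps the text intact.
faithful = export_literal(literal, "cp1251")

print(lossy)                                 # b'?????'
print(faithful.decode("cp1251") == literal)  # True
```

Five Cyrillic letters become five question marks, which matches the "?????" seen in the macro body after saving.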
Comment 14 Andreas Heinisch 2022-04-17 15:26:00 UTC
*** Bug 107882 has been marked as a duplicate of this bug. ***
Comment 15 Andreas Heinisch 2022-04-17 15:29:09 UTC
*** Bug 118179 has been marked as a duplicate of this bug. ***
Comment 16 Commit Notification 2022-04-25 20:11:10 UTC
Andreas Heinisch committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/e4bdf75260169898f3204cae7071d91da5946e09

tdf#75263 - Export VBA-Project using detected charset on import

It will be available in 7.4.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
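
The approach named in the commit title, exporting with the charset detected on import, might look roughly like the following Python sketch. It is not the actual patch, which lives in LibreOffice core and is not reproduced in this report. The record layout (Id 0x0003, Size 2, a 2-byte CodePage, little-endian) follows [MS-OVBA] 2.3.4.2.1.4. The class `VbaProjectSketch` and the functions `read_codepage` and `write_codepage` are hypothetical names, and mapping code page number N to Python's `cpN` codec is an assumption that holds for 1251 and 1252 but not for every code page.

```python
import struct

# PROJECTCODEPAGE record per [MS-OVBA] 2.3.4.2.1.4:
# Id (2 bytes, 0x0003), Size (4 bytes, always 2), CodePage (2 bytes), little-endian.
PROJECTCODEPAGE_ID = 0x0003


def read_codepage(record: bytes) -> int:
    """Parse a PROJECTCODEPAGE record and return the code page number."""
    rec_id, size, codepage = struct.unpack("<HIH", record[:8])
    if rec_id != PROJECTCODEPAGE_ID or size != 2:
        raise ValueError("not a PROJECTCODEPAGE record")
    return codepage


def write_codepage(codepage: int) -> bytes:
    """Serialize a PROJECTCODEPAGE record for the given code page number."""
    return struct.pack("<HIH", PROJECTCODEPAGE_ID, 2, codepage)


class VbaProjectSketch:
    """Hypothetical stand-in for a VBA project that remembers its import code page."""

    def __init__(self, record: bytes, raw_source: bytes) -> None:
        self.codepage = read_codepage(record)  # remembered, not discarded
        # Assumption: Python names the matching codec "cp<number>".
        self.source = raw_source.decode(f"cp{self.codepage}")

    def export(self) -> tuple[bytes, bytes]:
        """Write the record and the source using the code page seen on import."""
        encoded = self.source.encode(f"cp{self.codepage}", errors="replace")
        return write_codepage(self.codepage), encoded


record = write_codepage(1251)
raw = 'ActiveSheet.Cells(1, 1).Value = "проба"'.encode("cp1251")

project = VbaProjectSketch(record, raw)
out_record, out_raw = project.export()

print(out_record == record, out_raw == raw)  # True True
```

Because the code page read on import is kept on the object and reused on export, both the record and the module bytes round-trip unchanged.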
Comment 17 Mike Kaganski 2022-07-06 19:57:10 UTC
*** Bug 149882 has been marked as a duplicate of this bug. ***