Bug 160970

Summary: Problem in command line file conversion (XLSX to DBF) with special character
Product: LibreOffice
Reporter: joerg.goerner
Component: filters and storage
Assignee: Not Assigned <libreoffice-bugs>
Status: NEEDINFO ---    
Severity: normal
CC: aron.budea, joerg.goerner, stephane.guillou
Priority: medium    
Version: 7.6.6.3 release   
Hardware: All   
OS: All   
URL: https://ask.libreoffice.org/t/change-codepage-in-dbf-file/62278
Whiteboard:
Crash report or crash signature:
Regression By:
Bug Depends on:    
Bug Blocks: 103266    
Attachments: Address list as sample

Description joerg.goerner 2024-05-07 08:02:05 UTC
Description:
I use the command-line file conversion method like this:
"C:\Program Files\LibreOffice\program\scalc.exe" --convert-to dbf Testlist.xlsx

If a cell contains a string with the Czech character 'š' (code 154 in Windows-1252, which is outside the ASCII range), the conversion stops before that row. I have also tried it with different character sets.

Steps to Reproduce:
1. Create a simple address list in Excel, like this:
   PLZ	ORT	STRASSE
   14169	Berlin	Teltower Damm 1
   140 00	Praha	Antala Staška 2
   42781	Haan	Schallbruch 3
2. Save the Excel file
3. Try to convert the Excel file from the command line

Actual Results:
The DBF file ends after the first row of data.

Expected Results:
The complete address list with all records.
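
To check the result without opening the file, the record count can be read from the DBF header. This is a minimal sketch, assuming the standard dBase layout in which bytes 4 to 7 of the header hold the number of records as a little-endian 32-bit integer. The function names are illustrative and not part of LibreOffice:

```python
import struct


def dbf_record_count(header: bytes) -> int:
    """Return the record count stored in a dBase header (bytes 4-7, little-endian)."""
    if len(header) < 8:
        raise ValueError("need at least the first 8 bytes of the DBF header")
    (count,) = struct.unpack_from("<I", header, 4)
    return count


def read_dbf_record_count(path: str) -> int:
    """Read the first 8 bytes of the file at `path` and return its record count."""
    with open(path, "rb") as fh:
        return dbf_record_count(fh.read(8))
```

A complete conversion of the sample list would be expected to report 3 records; a smaller number suggests the export was cut short.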


Reproducible: Always


User Profile Reset: No

Additional Info:
Version: 7.6.6.3 (X86_64) / LibreOffice Community
Build ID: d97b2716a9a4a2ce1391dee1765565ea469b0ae7
CPU threads: 12; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: de-DE (de_DE); UI: de-DE
Calc: CL threaded
Comment 1 joerg.goerner 2024-05-07 08:06:19 UTC
Created attachment 194012 [details]
Address list as sample
Comment 2 Stéphane Guillou (stragu) 2024-05-23 05:17:55 UTC
If using the GUI, the default character set used is "Western Europe (DOS/OS2-850/International)", which results in this error message:

Error saving the document Testlist:
Write Error.
Cell SfxBaseModel::impl_store <file:///home/stragu/Downloads/Testlist.dbf>
failed: 0x40c03(Error Area:Sc Class:Write Code:3) arg1=C3 arg2=Western
Europe (DOS/OS2-850/International) at /home/tdf/lode/jenkins/workspace/
lo_gerrit/tb/src_master/sfx2/source/doc/sfxbasemodel.cxx:3304 contains
characters that are not representable in the selected target character set
"$(ARG2)".

The resulting file has only one address.

Using the command line, I get this in the console:

warn:connectivity.drivers:151848:151848:connectivity/source/drivers/dbase/DTable.cxx:521: Parsing warning: 0 records claimed, recovering
warn:sc:151848:151848:sc/source/ui/docshell/docsh8.cxx:986: ScDocShell::DBaseExport com.sun.star.sdbc.SQLException message: "The string “Antala Staška 2” cannot be converted using the encoding “ibm850”. at /home/tdf/lode/jenkins/workspace/lo_gerrit/tb/src_master/connectivity/source/commontools/dbtools2.cxx:910" SQLState: 22018 ErrorCode: 22018
    wrapped: 
warn:sc:151848:151848:sc/source/ui/docshell/docsh8.cxx:1045: ScDocShell::DBaseExport encoding error, string with default replacements: ``Antala Staška 2''
Error: Please verify input parameters... (SfxBaseModel::impl_store <file:///home/stragu/Downloads/Testlist.dbf> failed: 0x40c03(Error Area:Sc Class:Write Code:3) arg1=C3 arg2=Western Europe (DOS/OS2-850/International) at /home/tdf/lode/jenkins/workspace/lo_gerrit/tb/src_master/sfx2/source/doc/sfxbasemodel.cxx:3304 at /home/tdf/lode/jenkins/workspace/lo_gerrit/tb/src_master/sfx2/source/doc/sfxbasemodel.cxx:1822)

Same result.

A suitable character set needs to be picked for the export, see: https://help.libreoffice.org/latest/en-US/text/shared/guide/lotusdbasediff.html
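
Whether a given string fits a target character set can be checked up front. This is a minimal sketch using Python's built-in codecs. The codec names (`cp850`, `cp1250`, and so on) are Python's names for these code pages and are assumed to match the encodings LibreOffice uses; `encodable_in` is an illustrative helper:

```python
def encodable_in(text: str, codecs: list[str]) -> dict[str, bool]:
    """Map each codec name to whether `text` can be encoded with it without loss."""
    result = {}
    for name in codecs:
        try:
            text.encode(name)
            result[name] = True
        except UnicodeEncodeError:
            result[name] = False
    return result


if __name__ == "__main__":
    street = "Antala Staška 2"  # the value from the sample list
    checks = encodable_in(street, ["cp850", "cp1250", "cp1252", "utf-8"])
    for name, ok in checks.items():
        print(f"{name}: {'ok' if ok else 'cannot encode'}")
```

The 'š' is missing from code page 850, which is consistent with the "ibm850" conversion error in the log above.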

For example this works for me, using the encoding "Windows-1250/WinLatin 2 (Central European)":

soffice --headless --convert-to dbf:dBase:33 ./Testlist.xlsx

Does an equivalent command work for you?
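
For scripted conversions, the same invocation can be assembled programmatically. This is a sketch, assuming `soffice` is on the PATH (on Windows, the full path to `soffice.exe` in the LibreOffice `program` directory can be passed instead) and that the third filter token is the character set number as in the command above (33 for Windows-1250). `build_convert_command` and `convert` are illustrative helpers:

```python
import subprocess


def build_convert_command(source: str, charset_id: int = 33,
                          soffice: str = "soffice") -> list[str]:
    """Build the argument list for a headless XLSX-to-DBF conversion."""
    return [
        soffice,
        "--headless",
        "--convert-to",
        f"dbf:dBase:{charset_id}",  # charset_id 33 is taken from the command above
        source,
    ]


def convert(source: str, charset_id: int = 33, soffice: str = "soffice") -> None:
    """Run the conversion; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_convert_command(source, charset_id, soffice), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces, such as `C:\Program Files\LibreOffice\program\soffice.exe`.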