160221 – Precision error display on CSV import differs from display of manually created spreadsheet

Bug 160221 - Precision error display on CSV import differs from display of manually created spreadsheet

Summary: Precision error display on CSV import differs from display of manually create...

Status:	UNCONFIRMED

Alias:	None

Product:	LibreOffice
Classification:	Unclassified
Component:	Calc (show other bugs)
Version: (earliest affected)	7.6.5.2 release
Hardware:	All All

Importance:	medium enhancement
Assignee:	Not Assigned

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2024-03-15 17:41 UTC by Bruce H
Modified:	2024-03-15 20:00 UTC (History)
CC List:	0 users

See Also:
Crash report or crash signature:

Attachments
screenshot of the described issue (34.30 KB, image/jpeg) 2024-03-15 17:41 UTC, Bruce H	Details
Additional examples of this issue (192 bytes, text/csv) 2024-03-15 17:48 UTC, Bruce H	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Bruce H 2024-03-15 17:41:48 UTC

Created attachment 193131 [details]
screenshot of the described issue

This is an enhancement request.  I think there should be consistency in the display of the result of a subtraction operation, regardless of the method of creating the spreadsheet (manually vs. CSV import).

Some very simple math operations may have a result that is "wrong" due to floating point inaccuracies in extremely low order digits (i.e., in the 15th or greater significant digit of a decimal number).  This is to be expected, due to the inherent precision capability of floating point math.

However, the behavior may be different depending on whether a spreadsheet is created manually by typing data into a new (empty) spreadsheet, or if it is imported from a .CSV file.

Here is my example that demonstrates this issue.  The following 3 lines represent the entire file:

11.92
11.29
=A1-A2

If the lines above are entered into a new spreadsheet, then the result that appears in cell A3 is 0.63 (as one would expect).

But if those same 3 lines are in a .CSV file, and you import that file into Calc using the Text Import wizard (be sure to check the "Evaluate Formulas" checkbox), then the field width is expanded and a full 15 digits of precision are displayed, with the result in cell A3 being "0.630000000000001".

I would like for the CVS import method to follow the same rules as the manually created spreadsheet case, and only display the meaningful digits (just 0.63).

Note that in both cases, no cell formatting has been done, and no changes have been made to the column width.

Also note that this is a common occurrence when subtracting numbers that have 2 digits after the decimal point.  A .csv file containing financial data is a VERY common application of a spreadsheet;  numbers representing dollars & cents always have 2 digits after the decimal.  So when the user's spreadsheet consists mostly of simple math on tables of dollars.cents values, getting the anomalous math results is distracting and annoying, and it forces you to have to manually change the formatting.

I would also add that if the user has not applied any formatting change to a cell (presumably it has some kind of default formatting), then why should Calc's default formatting cause it to display 15+ significant digits, when it is a well-known fact that numeric values are not reliable beyond 14 or so digits?  

In my opinion, the default format of a cell containing a math formula should not display low-order digits of a decimal number beyond 14 significant digits.  Unless the user has explicitly changed the format to display more digits, there's no reason to show 15 or more digits when we KNOW that the lowest order digit(s) are likely to be garbage.

At a minimum though, I'd like to see consistency between the CSV import case and the manually created spreadsheet case.

Comment 1 Bruce H 2024-03-15 17:48:48 UTC

Created attachment 193132 [details]
Additional examples of this issue

Comment 2 ady 2024-03-15 18:31:23 UTC

(In reply to Bruce H from comment #0)

> 11.92
> 11.29
> =A1-A2
> 
> If the lines above are entered into a new spreadsheet, then the result that
> appears in cell A3 is 0.63 (as one would expect).

Just expand the width of column A and you should see the same as the import-from-CSV value.

Comment 3 Bruce H 2024-03-15 20:00:13 UTC

(In reply to ady from comment #2)
> (In reply to Bruce H from comment #0)
> 
> > 11.92
> > 11.29
> > =A1-A2
> > 
> > If the lines above are entered into a new spreadsheet, then the result that
> > appears in cell A3 is 0.63 (as one would expect).
> 
> Just expand the width of column A and you should see the same as the
> import-from-CSV value.


I was just describing what I see on the screen when no cell formatting has been done, and no manual changing of column size has been done.  If you don't change those things, then the behavior (i.e. what's displayed on screen) appears different.

OK, then let me try a different approach:

Cells have some default formatting (or so I assume).  If the user has not modified the formatting of a cell, then it appears that at least 15 digits of precision are displayed by default.  That seems ill-advised, considering that we KNOW that the accuracy of any floating point math operation is generally limited to 14 digits.  Given that low-order digits in the 15th (or more) position are not reliable, why not change the "default precision" to 14 significant digits?

Another way to look at it is, once you've gone beyond the max count of reliable digits, then any choice for a precision length seems arbitrary.  Why not choose to display 20 digits?  or 50?