Bug 160835

Summary: Unclear meaning of "a basic total population"
Product: LibreOffice
Reporter: Tuomas Hietala <tuomas.hietala>
Component: Calc
Assignee: Rafael Lima <rafael.palma.lima>
Status: RESOLVED FIXED
Severity: minor
CC: czeslaw.wolanski, rafael.palma.lima, stephane.guillou
Priority: low    
Version: Inherited From OOo   
Hardware: All   
OS: All   
Whiteboard: target:24.8.0
Crash report or crash signature:
Regression By:
Bug Depends on:    
Bug Blocks: 105582    
Attachments: Calc functions in UI and Help
Compatibility issue with AVERAGE in Calc vs Excel

Description Tuomas Hietala 2024-04-26 18:22:36 UTC
Description:
In Calc, there's the following UI string: "Value 1; value 2; ... are arguments representing a sample taken from a basic total population."

I have no idea what the expression "a basic total population" means. Nothing seems to come up on a web or Wikipedia search. I assume this is a mistake of some kind, perhaps a mistranslation from another language?

Steps to Reproduce:
1. The string on Weblate: https://translations.documentfoundation.org/translate/libo_ui-master/scmessages/en/?checksum=56d94ade7c12a99c


Actual Results:
The string is confusing.

Expected Results:
The string is using generally accepted English statistics terminology.


Reproducible: Always


User Profile Reset: No

Additional Info:
n/a
Comment 1 Tuomas Hietala 2024-04-26 18:34:54 UTC
Correction: the same string appears in two other places, too:
https://translations.documentfoundation.org/translate/libo_ui-master/scmessages/en/?q=a+basic+total+population&sort_by=-priority%2Cposition&checksum=
Comment 3 Tuomas Hietala 2024-04-26 19:38:21 UTC
(In reply to ady from comment #2)
> <https://en.wikipedia.org/wiki/Statistical_population>

Yes, I know what "population" means in statistics. What I don't know is what "basic total population" is supposed to mean.
Comment 4 ady 2024-04-26 21:50:44 UTC
IMHO, it would help if you could link to the actual Help content (i.e. function) we are talking about. I mean that "Value 1; value 2; ... are arguments representing a sample taken from a basic total population" sounds like the description of a "ValueN" argument of some specific Calc function, probably one in the Statistical category of functions.
Comment 5 nutka 2024-04-27 16:35:57 UTC
Calc functions: AVERAGEA, STDEVA and VARA - cf. the attached juxtaposition (UI vs Help).
Comment 6 nutka 2024-04-27 16:37:14 UTC
Created attachment 193879 [details]
Calc functions in UI and Help
Comment 7 Stéphane Guillou (stragu) 2024-05-13 02:23:40 UTC
Rafael, what do you think? I feel like this could be simplified to just "population".

The string exists in OOo 3.3, but doesn't mean anything to me either, and an online search seems to confirm that it does not have any special meaning in stats.

Elsewhere:
- MS Excel uses "a sample of a population"[1][2].
- Google Docs only says it's a sample, without mentioning "from a population". (It only uses "population" to refer to other functions that consider the whole population instead of a sample.)[3]

[1]: https://support.microsoft.com/en-us/office/stdeva-function-5ff38888-7ea5-48de-9a6d-11ed73b29e9d
[2]: https://support.microsoft.com/en-us/office/vara-function-3de77469-fa3a-47b4-85fd-81758a1e1d07
[3]: https://support.google.com/docs/answer/3094055
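
To make the sample vs. population distinction concrete, here is a minimal sketch in Python using the standard `statistics` module. The data values are made up for illustration, and this is not how Calc implements VARA or STDEVA; it only shows why the wording matters: a sample estimate divides by n - 1, a whole-population figure divides by n.

```python
import statistics

# Hypothetical data, used only to illustrate the two definitions.
values = [2.0, 4.0, 4.0, 6.0]

# Sample variance: the values are a sample drawn from a larger population,
# so the sum of squared deviations is divided by n - 1.
sample_var = statistics.variance(values)

# Population variance: the values are the entire population,
# so the sum of squared deviations is divided by n.
population_var = statistics.pvariance(values)

print("sample variance:", sample_var)
print("population variance:", population_var)
```

The sample figure is always the larger of the two for the same data, which is why the help text needs to say clearly which case a function assumes.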
Comment 8 Rafael Lima 2024-05-17 17:06:14 UTC
Created attachment 194180 [details]
Compatibility issue with AVERAGE in Calc vs Excel

In short, this is indeed simply a "population". The main difference between AVERAGE and AVERAGEA is how logical values (TRUE / FALSE) and empty text are treated. See below:

https://support.microsoft.com/en-us/office/averagea-function-a5ae0aea-11ad-4bba-856a-031e08567df0

The problem is that I've just detected a compatibility issue in LO Calc, where the function AVERAGE does not work as in Excel. See attached image for an example.

Apparently, Calc's AVERAGE works like AVERAGEA in Excel (beware that the Excel file in the screenshot is using pt-BR, hence the formulas are translated).

Can anyone else confirm? Should we treat this compatibility issue on a separate ticket?
Comment 9 Rafael Lima 2024-05-17 17:13:45 UTC
This patch fixes the affected strings:

https://gerrit.libreoffice.org/c/core/+/167688

I guess we'd better treat this compatibility issue in a separate ticket. Anyways, if anyone can confirm it, please let me know.
Comment 10 Commit Notification 2024-05-17 19:57:28 UTC
Rafael Lima committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/8d16b7a9e6dad1c3590c74af3eee952bd8fc3284

tdf#160835 Fix the use of "population" in Calc functions

It will be available in 24.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 11 ady 2024-05-18 00:42:45 UTC
(In reply to Rafael Lima from comment #8)

> Apparently, Calc's AVERAGE works like AVERAGEA in Excel (beware that the
> Excel file in the screenshot is using pt-BR, hence the formulas are
> translated).
> 
> Can anyone else confirm? Should we treat this compatibility issue on a
> separate ticket?

I think that they work as expected. The difference is, most probably, that Excel deals with the boolean category in a different way than numeric values, whereas in Calc the boolean category is treated/considered as numeric values – just as percentage, scientific and other categories are treated as numeric values too.
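
A rough way to picture the two behaviours described in this thread, as a Python sketch. The function names and the cell values are hypothetical, and this is only a model of what is described above, not code from Calc or Excel.

```python
def average_ignoring_booleans(cells):
    """Mean of the numeric cells only; booleans are skipped.

    Models an average that does not count TRUE/FALSE found in a range.
    """
    # bool is a subclass of int in Python, so it must be excluded explicitly.
    nums = [c for c in cells
            if isinstance(c, (int, float)) and not isinstance(c, bool)]
    return sum(nums) / len(nums)


def average_booleans_as_numbers(cells):
    """Mean where TRUE counts as 1 and FALSE counts as 0.

    Models an average that treats the boolean category as numeric.
    """
    nums = [float(c) for c in cells if isinstance(c, (bool, int, float))]
    return sum(nums) / len(nums)


# Hypothetical range contents: two numbers and two logical values.
cells = [10, 20, True, False]

print(average_ignoring_booleans(cells))    # 15.0
print(average_booleans_as_numbers(cells))  # 7.75
```

With the same four cells, the first function averages only 10 and 20, while the second averages 10, 20, 1 and 0, which is the kind of difference visible in the attached screenshot.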