Bug 158726 - word count bloat: all symbols counted as separate words
Summary: word count bloat: all symbols counted as separate words
Status: NEEDINFO
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 7.6.2.1 release
Hardware: All; OS: Linux (All)
Importance: medium normal
Assignee: Not Assigned
Reported: 2023-12-16 00:22 UTC by Carlos Lange
Modified: 2023-12-29 10:02 UTC

Description Carlos Lange 2023-12-16 00:22:56 UTC
Description:
All symbols (colon, semicolon, period, quotes, marks, accents) are counted as separate words, whether or not they are joined to other characters.

Steps to Reproduce:
1. Type any document with symbols.
2. Watch the word count at the bottom.
3. Both the total count and the selection count include the symbol as a word.

Actual Results:
The string:
Lc-ms sd:jk asdf.”
is counted as 9 words, with each group counted as 3 words.

Expected Results:
The same string in version 7.2:
Lc-ms sd:jk asdf.”
is counted as 3 words, with each group counted as a single word, which is the expected result.
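
The two counts above can be reproduced with a minimal sketch. It assumes the expected rule is "one word per whitespace-delimited group" and the observed rule is "every symbol is its own word". The function names and the regular expression are illustrative only; they are not taken from the LibreOffice source.

```python
import re

# The sample string from this report (ends with a right double quotation mark).
SAMPLE = "Lc-ms sd:jk asdf.”"


def count_whitespace_words(text: str) -> int:
    """Expected rule: each whitespace-delimited group is one word."""
    return len(text.split())


def count_symbol_split_words(text: str) -> int:
    """Observed rule (assumed model): each run of word characters is a word,
    and each non-space symbol is counted as a separate word."""
    return len(re.findall(r"\w+|[^\w\s]", text))


if __name__ == "__main__":
    print(count_whitespace_words(SAMPLE))    # expected behaviour
    print(count_symbol_split_words(SAMPLE))  # behaviour described in this report
```

Under these assumptions the first function gives 3 and the second gives 9, matching the expected and actual results reported here.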


Reproducible: Always


User Profile Reset: No

Additional Info:
A limited workaround (dash still breaks a word in two) can be obtained by adding all the symbols to the "Tools - Options - LibreOffice Writer - General - Additional separators" field:
 ~ ` ! @ # $ % ^ * ( ) - _ = + { } [ ] : ; " ' , . / < > ? “ ” ‘ ’ … 

My exact version is:
Version: 7.6.2.1 (X86_64) / LibreOffice Community
Build ID: 60(Build:1)
CPU threads: 8; OS: Linux 5.14; UI render: default; VCL: kf5 (cairo+xcb)
Locale: en-CA (en_CA.UTF-8); UI: en-US
Calc: threaded
Comment 1 Dieter 2023-12-29 10:02:18 UTC
I can't confirm this with:

Version: 24.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 40617d867346956588ac023511f31210107217f4
CPU threads: 4; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: de-DE (de_DE); UI: en-GB
Calc: CL threaded

Result: 3 words and 18 characters (expected)
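
As a quick cross-check of this result, the sketch below counts the sample string from the description, assuming whitespace-delimited words and a character count that includes spaces. The variable names are illustrative.

```python
# Sample string copied from the description.
SAMPLE = "Lc-ms sd:jk asdf.”"

words = len(SAMPLE.split())                           # whitespace-delimited groups
chars_with_spaces = len(SAMPLE)                       # every character, spaces included
chars_without_spaces = len("".join(SAMPLE.split()))   # spaces removed

print(words, chars_with_spaces, chars_without_spaces)
```

The 18-character figure matches the count that includes the two spaces; without spaces the string has 16 characters.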

Perhaps this is a Linux-specific issue or a problem with your user profile. Does it also happen in Safe Mode (Help -> Restart in Safe Mode)?
=> NEEDINFO