Bug 139319

Summary: Hunspell produces inapplicable spelling suggestions (German)
Product: LibreOffice
Reporter: Telesto <telesto>
Component: Writer
Assignee: Not Assigned <libreoffice-bugs>
Status: RESOLVED NOTOURBUG    
Severity: normal
CC: aron.budea, caolan.mcnamara
Priority: medium    
Version: 6.0.0.3 release   
Hardware: All   
OS: All   
See Also: https://bugs.documentfoundation.org/show_bug.cgi?id=136306
Whiteboard:
Crash report or crash signature:
Regression By:
Bug Depends on:    
Bug Blocks: 108728    
Attachments: Example file
Example file

Description Telesto 2020-12-30 13:48:18 UTC
Description:
Hunspell produces inapplicable spelling suggestions

Steps to Reproduce:
1. Open the attached file
2. Right-click the word: gebärdensprache


Actual Results:
4 results, including "sprachgebundene"

Expected Results:
"sprachgebundene" shouldn't be present


Reproducible: Always


User Profile Reset: No



Additional Info:
Found in
Version: 7.2.0.0.alpha0+ (x64)
Build ID: 4e3ce9dd6ace0b22f7b3f45cf2338b201f4dc305
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: Skia/Raster; VCL: win
Locale: nl-NL (nl_NL); UI: en-US
Calc: CL

and in
Version: 6.0.4.1
Build ID: a63363f6506b8bdc5222481ce79ef33b2d13c741
CPU threads: 4; OS: Windows 6.3; UI render: GL;
Locale: nl-NL (nl_NL); Calc: CL

still as expected in
5.4
Comment 1 Telesto 2020-12-30 13:48:32 UTC
Created attachment 168578 [details]
Example file
Comment 2 Telesto 2020-12-30 13:59:30 UTC
Created attachment 168579 [details]
Example file

Another example, a bit different though.
Comment 3 Telesto 2020-12-30 14:19:36 UTC
(In reply to Telesto from comment #2)
> Created attachment 168579 [details]
> Example file
> 
> Another example, a bit different though.

Bindegewebereiche / Bindegewebsgeschwulst don't really belong. The results from 3.5 make more sense to me.
Comment 4 Telesto 2020-12-30 14:26:50 UTC
Another hunspell suggestion topic.
Comment 5 Telesto 2020-12-30 14:29:26 UTC
@Caolan
A number of questions:
Does it make sense to bibisect this against LibreOffice?
Does Hunspell have some kind of bibisect repo?
What is the appropriate bug tracker: here or Hunspell's?
Comment 6 Telesto 2021-01-03 12:11:33 UTC
This possibly doesn't belong here, but 'Unterwasse' (expected 'Unterwasser') also lists lots of non-obvious results (for my taste). That was already the case in 3.3.0. And no, MSO (2003) doesn't do a better job.

How compounding works is a bit of a mystery to me (a black box). Is it still rule-based? It looks like a nice area for machine learning (OK, I assume Google is doing this already for suggestions and such).
Comment 7 Caolán McNamara 2021-01-08 20:55:02 UTC
$ cd core/dictionaries/de
$ echo gebärdensprache | hunspell -d de_DE_frami
Hunspell 1.7.0
& gebärdensprache 4 0: Gebärdensprache, -gebärdensprache, sprachgebundene, sprachgebunden
So it can be reproduced on the command line without LibreOffice involved, which means it's not an issue within the LibreOffice code. It is presumably due to the o rule of
gebärdensprache/ozm in de_DE_frami.dic/aff, i.e. specific to the spelling dictionary itself.
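
For anyone who wants to compare suggestion lists across dictionary versions, the `&` line above can be picked apart in a script. This is only a minimal Python sketch, assuming the ispell-style pipe layout `& <word> <count> <offset>: <suggestion>, <suggestion>, ...` seen in the output above. The names `Miss` and `parse_miss` are made up for illustration and are not part of Hunspell.

```python
import re
from typing import List, NamedTuple, Optional


class Miss(NamedTuple):
    """One misspelling line from the pipe output (hypothetical helper type)."""
    word: str
    offset: int
    suggestions: List[str]


# Assumed layout: "& <word> <count> <offset>: <sug>, <sug>, ..."
_MISS_LINE = re.compile(r"^& (\S+) (\d+) (\d+): (.*)$")


def parse_miss(line: str) -> Optional[Miss]:
    """Return the parsed misspelling, or None if the line is not a '&' line."""
    match = _MISS_LINE.match(line.strip())
    if match is None:
        return None
    word, count, offset, rest = match.groups()
    suggestions = rest.split(", ")
    # The count field should agree with the number of suggestions listed.
    if len(suggestions) != int(count):
        raise ValueError("suggestion count does not match the list")
    return Miss(word, int(offset), suggestions)


# The line reported in this comment.
line = ("& gebärdensprache 4 0: Gebärdensprache, -gebärdensprache, "
        "sprachgebundene, sprachgebunden")
miss = parse_miss(line)
print(miss.word, miss.suggestions)
```

Feeding the output of two dictionary versions through such a parser would make it easy to diff which suggestions appeared or disappeared.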

The last update to those copies in LibreOffice came from https://bugs.documentfoundation.org/show_bug.cgi?id=105396
Comment 8 Aron Budea 2021-02-08 05:41:29 UTC
Telesto, please raise this with the dictionary maintainer:
https://wiki.documentfoundation.org/Development/Dictionaries
(Further details can be found in the readmes in share\extensions\dict-de in your LibreOffice installation.)