Bug 145251 - Hunspell produces non-valid compounding results (for Dutch)
Status: UNCONFIRMED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): 7.3.0.0.alpha1+
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
Reported: 2021-10-21 08:36 UTC by Telesto
Modified: 2024-01-25 09:47 UTC (History)

Attachments
Example file (9.04 KB, application/vnd.oasis.opendocument.text)
2021-10-21 08:36 UTC, Telesto

Description Telesto 2021-10-21 08:36:26 UTC
Description:
Hunspell produces non-valid compounding results (for Dutch)

Steps to Reproduce:
1. Open the attached file (with the Dutch dictionary installed).
2. Right-click 'Opstappelen' (misspelled). A long list of suggestions appears.
3. The list includes many word combinations that do not exist (they are not in the Dutch dictionary), such as:
Opstapdelen
Opstapspelen
Opstapcel
Opstappalen
Opstappolen

'Oogstappelen' is also unexpected. It is a real word, but it feels out of context.
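
One way to put a number on "not in the dictionary" is to compare the suggestion list against a reference word list. The sketch below is only an illustration: `split_by_reference` is a hypothetical helper, and `reference_words` is a stand-in list that contains just the one suggestion this report calls a real word. It is not the Hunspell .dic file, and it does not model Hunspell's compounding rules.

```python
def split_by_reference(suggestions, reference_words):
    """Return (listed, unlisted) suggestions, compared case-insensitively."""
    reference = {w.casefold() for w in reference_words}
    listed = [s for s in suggestions if s.casefold() in reference]
    unlisted = [s for s in suggestions if s.casefold() not in reference]
    return listed, unlisted


# Suggestions quoted in this report for the misspelling 'Opstappelen'.
suggestions = [
    "Opstapdelen",
    "Opstapspelen",
    "Opstapcel",
    "Opstappalen",
    "Opstappolen",
    "Oogstappelen",
]

# Assumption: a placeholder reference list; a real check would load a full word list.
reference_words = ["oogstappelen"]

listed, unlisted = split_by_reference(suggestions, reference_words)
print("listed:", listed)
print("unlisted:", unlisted)
```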

---

Word produces only 3 suggestions.
See also: bug 139319 and https://github.com/OpenTaal/opentaal-hunspell/issues/3

Actual Results:
A long suggestion list containing many compounds that are not valid words

Expected Results:
A short suggestion list without non-existent compounds


Reproducible: Always


User Profile Reset: No



Additional Info:
Version: 7.3.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 93115d2c54d645bcf2f80fde325e3ede39dee4d5
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: Skia/Raster; VCL: win
Locale: nl-NL (nl_NL); UI: en-US
Calc: CL
Comment 1 Telesto 2021-10-21 08:36:43 UTC
Created attachment 175859 [details]
Example file
Comment 2 Telesto 2021-10-21 08:37:38 UTC
Quoted from bug 139319, comment 7:

$ cd core/dictionaries/de
$ echo gebärdensprache | hunspell -d de_DE_frami
Hunspell 1.7.0
& gebärdensprache 4 0: Gebärdensprache, -gebärdensprache, sprachgebundene, sprachgebunden
So it can be reproduced on the command line without LibreOffice involved, which means it is not an issue within the LibreOffice code. It is presumably due to the o rule of gebärdensprache/ozm in de_DE_frami.dic/aff, i.e. specific to the spelling dictionary itself.
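
For comparing suggestion lists across dictionaries, the `&` line in the output above can be parsed mechanically. The following is a minimal sketch, assuming the ispell-style pipe format `& <word> <count> <offset>: <suggestion>, ...` that the quoted output shows. `Miss` and `parse_miss_line` are hypothetical names introduced here, not part of Hunspell.

```python
from typing import List, NamedTuple


class Miss(NamedTuple):
    word: str               # the misspelled input word
    offset: int             # position of the word in the input line
    suggestions: List[str]  # suggestions in the order Hunspell printed them


def parse_miss_line(line: str) -> Miss:
    """Parse one '& word count offset: s1, s2, ...' line."""
    if not line.startswith("& "):
        raise ValueError("not a misspelling-with-suggestions line")
    head, _, tail = line[2:].partition(": ")
    word, count, offset = head.rsplit(" ", 2)
    suggestions = [s.strip() for s in tail.split(",") if s.strip()]
    if len(suggestions) != int(count):
        raise ValueError("suggestion count does not match header")
    return Miss(word, int(offset), suggestions)


# The line quoted in the comment above.
line = ("& gebärdensprache 4 0: Gebärdensprache, -gebärdensprache, "
        "sprachgebundene, sprachgebunden")
miss = parse_miss_line(line)
print(miss.word, len(miss.suggestions), miss.suggestions)
```

The same parser could be applied to the output for 'Opstappelen' with the Dutch dictionary to count how many suggestions are returned.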

The last update to those copies in LibreOffice came from https://bugs.documentfoundation.org/show_bug.cgi?id=105396
Comment 3 Telesto 2021-10-21 08:43:25 UTC
@Laslo
Technically I'm posting this on the wrong bug tracker, but I encounter it in LibreOffice when using spelling correction, and I have no clue whether the problem is in Hunspell or in the dictionary.

However, if it is the dictionary, then multiple dictionaries are broken :-( (I only tested Dutch and German.) Both languages are very similar with regard to compounding.