Summary: | Word count incorrect if language is set to Finnish | ||
---|---|---|---|
Product: | LibreOffice | Reporter: | Simo Kaupinmäki <isokumma> |
Component: | Writer | Assignee: | Caolán McNamara <caolan.mcnamara> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | caolan.mcnamara, isokumma, serval2412 |
Priority: | medium | ||
Version: | 3.5.4 release | ||
Hardware: | Other | ||
OS: | All | ||
See Also: | https://bugs.freedesktop.org/show_bug.cgi?id=51818 | ||
Whiteboard: | target:4.3.0 | ||
Crash report or crash signature: | Regression By: | ||
Bug Depends on: | |||
Bug Blocks: | 103479 | ||
Attachments: |
Different word counts for corresponding Finnish and English texts. |
Description
Simo Kaupinmäki
2012-10-06 22:29:55 UTC
Created attachment 68182
Different word counts for corresponding Finnish and English texts.
Confirmed on Version 3.6.2.2 (Build ID: 360m1(Build:2)) on Kubuntu 12.10 (Linux).

Any update with a recent LO version (4.1.5 or 4.2.3)? Indeed, I'm quite sure there have been some fixes to word counting since 3.6, though I can't say whether they'll solve your problem.

The bug is still present in 4.2.3.3.

Simo: thank you for your feedback.
Caolán: I thought you might be interested in this one (see http://cgit.freedesktop.org/libreoffice/core/commit/?id=eae2e87ba4de1ae59779cbfc56109aa6c27fbc17 for example).

Simo: I just realized that this commit isn't in the 4.2 branch, only on master (the future 4.3.0), but it could help (take a look at fdo#51818, listed under See Also). Since it's a development version, could you test with a daily build from master sources (see http://dev-builds.libreoffice.org/daily/master/)?

OK, thanks for the tip. Unfortunately, the bug still seems to be present in the development version too. Tested on Windows:
Version: 4.3.0.0.alpha1+
Build ID: 0b03f7ed575838f90e6b1ebec3538a3a214f81fb
TinderBox: Win-x86@42, Branch:master, Time: 2014-04-30_02:30:23

The difference is (I'm guessing) probably because we have (for some reason) customized rules in http://cgit.freedesktop.org/libreoffice/core/tree/i18npool/source/breakiterator/data for Finnish (the _fi files), and presumably they are wrong or out of date. Ideally we would have no such custom rules and would just use the built-in icu rules. It's likely that the custom rules were created long ago when icu had some poor Finnish rules, and it's possible that the normal rules are no better than our custom ones. The customizations are due to these two old reports:
https://issues.apache.org/ooo/show_bug.cgi?id=58513
https://issues.apache.org/ooo/show_bug.cgi?id=85411
The other possibility is that the problem lies in icu and the custom rules here are a red herring.

Caolan McNamara committed a patch related to this issue. It has been pushed to "master":
http://cgit.freedesktop.org/libreoffice/core/commit/?id=6e225b41f1ab3e6cac395b0c0c6db73414658625
Resolves: fdo#55707 Word count incorrect if language is set to Finnish
The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.

With the above changes I get 10 words for the Finnish text in both LibreOffice and MSOffice, and ctrl+right/left gives equal boundaries. Both apps believe "10 %" and "10 €" consist of 2 words each. In English they definitely do form two different words, as the practice is 10% and 10€ in that language. Though I know the practice is "10 %" in other languages, I don't know whether it counts as a single word or not. Any issues around that would have to be raised in icu itself. Give the dailies a go tomorrow or so and see if there are side-effects of the change.

Thank you for the fix! The word count now works as expected for Finnish too. Indeed, I'd expect "10 %" etc. to count as 2 words, since basically it stands for "ten percent" (or "kymmenen prosenttia" in Finnish).
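
For readers who want to see the counting convention agreed on above, here is a minimal Python sketch. It is an illustration only, not LibreOffice's implementation (which relies on icu break iterator rules): it assumes a word is any whitespace-delimited token, so a stand-alone symbol such as "%" or "€" counts as a word. The function name `count_words` and the sample sentences are hypothetical and are not taken from the attachment.

```python
def count_words(text: str) -> int:
    """Count whitespace-delimited tokens.

    Assumption for illustration: a stand-alone symbol such as '%'
    or '€' counts as a word, matching the behaviour reported above.
    """
    return len(text.split())


# Hypothetical sample sentences (not the text from attachment 68182).
samples = {
    "fi": "Hinta nousi 10 % eli 10 €",   # spaced symbols: 7 tokens
    "en": "The price rose 10% or 10€",   # attached symbols: 6 tokens
}

for lang, text in samples.items():
    print(lang, count_words(text))
```

Under this convention "10 %" contributes two words and "10%" contributes one, which is why corresponding Finnish and English sentences can legitimately differ in word count even when the counter is working correctly.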