| Summary: | InStr finds match in string containing diacritics | | |
|---|---|---|---|
| Product: | LibreOffice | Reporter: | Jordi <bugs.df.org> |
| Component: | BASIC | Assignee: | Not Assigned <libreoffice-bugs> |
| Status: | UNCONFIRMED --- | | |
| Severity: | normal | CC: | andreas.heinisch, erack, himajin100000, sokol |
| Priority: | medium | | |
| Version: | 7.3.2.2 release | | |
| Hardware: | All | | |
| OS: | All | | |
| See Also: | https://bugs.documentfoundation.org/show_bug.cgi?id=139840 | | |
| Whiteboard: | | | |
| Crash report or crash signature: | | Regression By: | |

Description
Jordi
2022-04-18 13:23:45 UTC
Returns 8 in:

Version: 7.3.2.2 (x64) / LibreOffice Community
Build ID: 49f2b1bff42cfccbd8f788c8dc32c1c309559be0
CPU threads: 6; OS: Windows 10.0 Build 19042; UI render: default; VCL: win
Locale: ru-RU (ru_RU); UI: ru-RU
Calc: threaded

The transliteration (with only TransliterationFlags::IGNORE_CASE set!) used to make the case-insensitive match converts the 16-character string into these 20 characters:

'ά' - 'ά'
'έ' - 'έ'
'ί' - 'ί'
'ό' - 'ό'
'ύ' - 'ύ'
'ώ' - 'ώ'
'ή' - 'ή'
'ΐ' - 'ι' '̈' '́'
'ΰ' - 'υ' '̈' '́'
'Ά' - 'ά'
'Έ' - 'έ'
'Ί' - 'ί'
'Ό' - 'ό'
'Ύ' - 'ύ'
'Ώ' - 'ώ'
'Ή' - 'ή'

Indeed, the searched character is found there.
Of course, it *seems* that the original code should use a case-sensitive ("binary") comparison, and replacing 'instr(grkchrs, "ι")' with 'instr(1, grkchrs, "ι", 0)' gives the expected 0. But is the transliteration correct in this case?

Eike: do you know if it's correct?

FTR: Calc's '=SEARCH("ι";A1)' also returns 8.

(In reply to Mike Kaganski from comment #2)
Great, thanks for the heads-up on the mode option. It solves my immediate problem. Alas, 7.3.2.2 is unstable for me, so back to my old version I go.

Should we close this as NOTABUG, since Calc's '=SEARCH("ι";A1)' returns the same result as the macro?
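The reported 8 can be reproduced outside LibreOffice with Unicode full case folding. The sketch below is only an analogy in Python, not LibreOffice's transliteration code: `str.casefold()` stands in for the IGNORE_CASE transliteration, and `grkchrs` is a reconstruction from the character list in this thread (assuming the precomposed tonos/dialytika code points), since the original macro is not shown here.

```python
# Hypothetical reconstruction of the 16-character test string from the
# character list in this thread (precomposed Greek code points assumed).
grkchrs = (
    "\u03ac\u03ad\u03af\u03cc\u03cd\u03ce\u03ae"  # lowercase vowels with tonos
    "\u0390\u03b0"                                # iota/upsilon with dialytika and tonos
    "\u0386\u0388\u038a\u038c\u038e\u038f\u0389"  # uppercase vowels with tonos
)
needle = "\u03b9"  # GREEK SMALL LETTER IOTA, no diacritics

# Full case folding, used here as a stand-in for the case-insensitive path.
folded = grkchrs.casefold()

# 1-based position, 0 when not found, mirroring InStr's convention.
pos_folded = folded.find(needle.casefold()) + 1  # case-insensitive analogue
pos_binary = grkchrs.find(needle) + 1            # binary (code point) comparison

print(len(grkchrs), len(folded))
print(pos_folded, pos_binary)
```

Under these assumptions the folded string has 20 characters and the bare iota is found at position 8, while the binary comparison reports 0, which matches the numbers quoted in the comments.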
(In reply to Andreas Heinisch from comment #5)
The problem here is the inconsistency IMO. Every character with a diacritic could be represented as a base character plus combining characters. But only two were decomposed like that.
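One possible explanation for why only two characters expand, offered as an assumption rather than a statement about LibreOffice internals: in Unicode's full case folding data, U+0390 and U+03B0 are the only letters in this set that fold to a multi-character sequence, because they have no precomposed uppercase counterpart. The Python sketch below uses `str.casefold()` and `unicodedata.normalize` to contrast case folding (two expansions) with canonical decomposition (NFD, which splits all sixteen). The string `grkchrs` is again a hypothetical reconstruction from the list in this thread.

```python
import unicodedata

# Hypothetical reconstruction of the test string (precomposed code points assumed).
grkchrs = (
    "\u03ac\u03ad\u03af\u03cc\u03cd\u03ce\u03ae"
    "\u0390\u03b0"
    "\u0386\u0388\u038a\u038c\u038e\u038f\u0389"
)

# Characters whose full case folding is longer than one code point.
expanded = [ch for ch in grkchrs if len(ch.casefold()) > 1]

# Characters that canonical decomposition (NFD) splits into base + combining marks.
decomposable = [ch for ch in grkchrs if len(unicodedata.normalize("NFD", ch)) > 1]

for ch in expanded:
    parts = [unicodedata.name(c) for c in ch.casefold()]
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {parts}")

print(len(expanded), len(decomposable))
```

If this reading is right, the asymmetry comes from the case-mapping data itself rather than from a partial normalization step, though whether the transliteration should expose it to InStr is the open question in this report.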