Bug 71956

Summary: Other: The ability to set a different color for diacritics is missing for Arabic
Product: LibreOffice
Reporter: Munzir Taha <munzirtaha>
Component: LibreOffice
Assignee: Jonathan Clark <jonathan>
Status: REOPENED
Severity: enhancement
CC: heiko.tietze, hossein, jonathan, khaled, mikekaganski, rasheed12824
Priority: medium    
Version: Inherited From OOo   
Hardware: All   
OS: All   
See Also: https://bugs.documentfoundation.org/show_bug.cgi?id=129330
https://bugs.documentfoundation.org/show_bug.cgi?id=150724
https://bugs.documentfoundation.org/show_bug.cgi?id=150726
Whiteboard: BSA target:24.8.0
Crash report or crash signature:
Regression By:
Bug Depends on: 61444    
Bug Blocks: 71732, 112810, 161236    
Attachments: a picture with Allah's name and a red colored fatha
a file with colored fatha in arabic that shows that the problem is solved
Screenshot illustrating rendering fix

Description Munzir Taha 2013-11-24 05:40:02 UTC
Problem description:
Currently, I see no way to set the color of diacritics. In Arabic, diacritics are separate glyphs and should be treated as such: one should be able to select them and apply any formatting to them.

Steps to reproduce:
Type two Arabic glyphs, e.g.:
ضَ
Current behavior:
Cannot change the color of the second glyph

Expected behavior:
Should be able to set the color for each glyph separately.
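
For reference, the sample in the steps above is presumably stored as two separate Unicode code points: a base letter followed by a combining mark. That is why per-mark formatting is conceivable at the text level. A minimal Python sketch using only the standard `unicodedata` module (the variable name `sample` is illustrative, and the code points are an assumption about what was typed):

```python
import unicodedata

# Assumed contents of the sample: DAD followed by FATHA.
sample = "\u0636\u064E"

for ch in sample:
    # combining() returns a non-zero canonical combining class for combining marks.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} combining={unicodedata.combining(ch)}")
```

The base letter reports a combining class of zero, while the fatha reports a non-zero class, which marks it as a character that attaches to the preceding base.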

Operating System: Linux (Other)
Version: unspecified
Comment 1 Munzir Taha 2013-11-24 20:23:46 UTC
I want to be able to type this word in LibreOffice:
https://en.wikipedia.org/wiki/File:Arabic_components_(letters)_in_the_word_Allah.svg
Comment 2 Safeer Pasha 2020-01-22 06:17:52 UTC
Created attachment 157311
a picture with Allah's name and a red colored fatha
Comment 3 ⁨خالد حسني⁩ 2020-01-23 05:36:53 UTC
How was that image created? Please also attach the ODT file. This issue is not fixed.
Comment 4 Safeer Pasha 2020-01-25 14:36:54 UTC
Created attachment 157422
a file with colored fatha in arabic that shows that the problem is solved
Comment 5 ⁨خالد حسني⁩ 2020-01-30 05:58:41 UTC
This is not really fixed: the mark position is incorrect and, depending on the font, can be greatly misplaced. For this to be really fixed, the mark position should be the same with and without colors in any font.
Comment 6 Munzir Taha 2020-05-08 03:20:45 UTC
I am not sure how you selected the diacritic and changed its color. Why didn't you reproduce the whole word I referred to? If you typed the diacritic separately and then moved it to that location, that doesn't count, of course. There should be an easy way to select the diacritic.

For the implementation to be complete, there should also be a way to find Arabic diacritics and change their color. An example of an app that already implements this, which may help you understand the issue better, is:

https://helpx.adobe.com/indesign/using/arabic-hebrew.html#id_26234
Comment 7 Heiko Tietze 2020-05-08 05:56:00 UTC
*** Bug 129330 has been marked as a duplicate of this bug. ***
Comment 8 Jonathan Clark 2024-05-22 17:32:37 UTC
Created attachment 194281
Screenshot illustrating rendering fix
Comment 9 Jonathan Clark 2024-05-22 17:36:57 UTC
As of the fixes for bug 61444 and bug 124116, Writer should now correctly render diacritics with different colors (subject to those diacritics having separate glyphs in the font).

However, it is not yet possible to select the diacritics and apply colors to them in the user interface.

I've attached a screenshot illustrating this rendering change, using a hand-crafted fodt file.
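
The actual hand-crafted file is not reproduced in this report. As a rough sketch only, a flat ODF text document that colors a single diacritic might look like the one built below. The style name `RedMark`, the helper `write_fodt`, and the choice of sample text are hypothetical, and the markup has not been checked against any particular LibreOffice build.

```python
import xml.etree.ElementTree as ET

# Minimal flat ODF text document: DAD in the default color, FATHA in a red span.
# "RedMark" is a made-up automatic style name used only for this illustration.
FODT = """<?xml version="1.0" encoding="UTF-8"?>
<office:document
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"
    office:version="1.2"
    office:mimetype="application/vnd.oasis.opendocument.text">
  <office:automatic-styles>
    <style:style style:name="RedMark" style:family="text">
      <style:text-properties fo:color="#ff0000"/>
    </style:style>
  </office:automatic-styles>
  <office:body>
    <office:text>
      <text:p>\u0636<text:span text:style-name="RedMark">\u064E</text:span></text:p>
    </office:text>
  </office:body>
</office:document>
"""


def write_fodt(path):
    """Write the sample document to `path` (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(FODT)


# Parse the markup to confirm it is at least well-formed XML.
root = ET.fromstring(FODT.encode("utf-8"))
print(root.tag)
```

Calling `write_fodt("colored_fatha.fodt")` would save the document so it can be opened in Writer. The point of the sketch is that the formatting change falls between a base letter and its mark, which is the case the rendering fix addresses.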
Comment 10 Heiko Tietze 2024-05-22 17:53:22 UTC
(In reply to Jonathan Clark from comment #9)
> However, it is not yet possible to select the diacritics and apply colors to
> them in the user interface.
Don't think that's needed. If there would be a use case, I could imagine some decomposition mode similar to field Field Names where the individual glyphs are separated could do the trick UI-wise. My take: FIXED.

Awesome work!
Comment 11 Jonathan Clark 2024-05-22 18:33:06 UTC
(In reply to Heiko Tietze from comment #10)
> Don't think that's needed. If there would be a use case, I could imagine
> some decomposition mode similar to field Field Names where the individual
> glyphs are separated could do the trick UI-wise. My take: FIXED.

This sounds reasonable to me, so I will mark it fixed.


This bug was fixed as part of the fix for bug 124116, in the following commit:

https://git.libreoffice.org/core/commit/ab0a4543cab77ae0c7c0a79feb8aebab71163dd7

tdf#124116 Correct Writer text shaping across formatting changes

It will be available in 24.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 12 Hossein 2024-05-22 21:24:22 UTC
(In reply to Heiko Tietze from comment #10)
> (In reply to Jonathan Clark from comment #9)
> > However, it is not yet possible to select the diacritics and apply colors to
> > them in the user interface.
> Don't think that's needed. If there would be a use case, I could imagine
> some decomposition mode similar to field Field Names where the individual
> glyphs are separated could do the trick UI-wise. My take: FIXED.
> 
> Awesome work!
@Heiko:
As an RTL-CTL user who uses Arabic script, let me disagree here.

Being able to select diacritics is IMO essential, for editing, styling, reviewing and also for accessibility purposes, such as the one described by a blind user in tdf#100854, where screen readers should read the diacritics alongside the characters.

I should also add that the lack of this feature is an incompatible behavior with other platforms, like MS Office and other Windows applications, from Word to even Wordpad.

I can open another bug report for this specific issue, if needed. Here, only styling is discussed, but I can provide numerous example use cases for "editing, styling, reviewing and also for accessibility purposes", as I've stated above.

Anyway, thanks Jonathan for your great work!
Comment 13 Heiko Tietze 2024-05-23 07:25:13 UTC
(In reply to Hossein from comment #12)
> Being able to select diacritics is IMO essential...
Sure, but not the parts. Sticking to the Latin alphabet we have ^ and o that together constitute ô. Why would you need to change attributes for parts of the character, e.g. the circumflex in red? I wonder if parts of a symbol can be seen as separate characters at all.
Comment 14 Hossein 2024-05-23 07:57:58 UTC
(In reply to Heiko Tietze from comment #13)
> (In reply to Hossein from comment #12)
> > Being able to select diacritics is IMO essential...
> Sure, but not the parts. Sticking to the Latin alphabet we have ^ and o that
> together constitute ô. Why would you need to change attributes for parts of
> the character, e.g. the circumflex in red? I wonder if parts of a symbol can
> be seen as separate characters at all.
I think that use case comes from the differences between the languages and the scripts. Is it possible (and necessary) to put ^ on each and every letter of the Latin alphabet? Probably not. But in Arabic script, it IS possible (and in fact necessary) to put diacritics on each and every letter. Considering how characters join and how diacritics are placed on some of them, it is totally different from Latin script.

And for the use case that you have mentioned, I should say yes, they exist. There are books with diacritics printed in a different color, and red is one of the usual choices. Such books exist among traditional handwritten manuscripts and also among modern printed ones.

See, for example: (old manuscript)
https://www.meisterdrucke.ie/fine-art-prints/Unknown/637012/Page-from-a-Quran-with-Kufic-script-and-diacritics-in-red%2C-9th-10th-century--%28see-also-131332%29.html

And also this one: (modern typesetting)
https://www.researchgate.net/figure/The-same-word-ktb-with-different-diacritic-signs-colored-in-red-For-interpretation-of_fig5_264382424
It is from the research article:
https://www.researchgate.net/publication/264382424_A_calligraphic_based_scheme_to_justify_Arabic_text_improving_readability_and_comprehension

Even in MS Office, you can set it to show diacritics in a different color. Use "Options > Advanced > Diacritics > Use this colour for diacritics".

I have also tested the Windows screen reader reading character by character in MS Word, and now I better understand the validity of the use case brought up by tdf#100854. It reads letters and diacritics one by one, just as a real person might read them when teaching, reviewing or proofreading.

I think being able to select diacritics is needed for Arabic script.
Comment 15 Heiko Tietze 2024-05-23 09:43:57 UTC
(In reply to Hossein from comment #14)
> And for the use case that you have mentioned, I should say yes, they exist.
Convincing examples.

> "Options > Advanced > Diacritics > Use this colour for diacritics".
Would be okay in case of a one-for-all option. If you want to modify the color per character I've made my proposal.
Comment 16 Jonathan Clark 2024-05-23 13:13:06 UTC
I have no strong opinions about whether we close this bug or open a new one, but I do think this bug needs to be clarified before someone can work on it.

We have a few suggestions already: Wordpad-style cursor advance, MSO-style diacritic color, Heiko's grapheme decomp suggestion. I don't think we need all of them in order to implement Arabic diacritic colors, and if we do want all of them, they should be separate bugs.

It also probably goes without saying that the implementation should work for scripts other than Arabic, too.

(In reply to Hossein from comment #12)
> I should also add that the lack of this feature is an incompatible behavior
> with other platforms, like MS Office and other Windows applications, from
> Word to even Wordpad.

I spent a few minutes playing around with Wordpad. From what I can tell, it has a special-case cursor advance for Arabic script, where it divides grapheme clusters containing vowel marks, tanwin, shadda, and sukun into evenly spaced advances. It didn't do this at all for samples of random Thai, Vietnamese, or Hebrew text (I did get an odd zero-width advance for a segol on a shekel sign, but that's probably a bug).

I would characterize this as a visually-nicer version of the cursor movement we get while holding down alt and pressing the arrow keys, but only for Arabic text. I'm not sure what would be involved to make this the default behavior for Arabic script, or to make this work for selections, but conceptually it should be possible to 'feather out' the total advance for a grapheme cluster across its components, if this is what we want.
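
To make the 'feather out' idea concrete, here is a minimal Python sketch. It is not how Wordpad or LibreOffice actually work. It assumes a simplified cluster rule (a base character plus any following nonspacing or enclosing marks, rather than full UAX #29 segmentation) and a made-up total advance of 12.0 units per cluster. The function names are hypothetical.

```python
import unicodedata


def split_clusters(text):
    """Group each base character with the combining marks that follow it.

    Simplified stand-in for real grapheme cluster segmentation.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch) in ("Mn", "Me"):
            clusters[-1] += ch  # attach the mark to the preceding base
        else:
            clusters.append(ch)  # start a new cluster
    return clusters


def feathered_caret_stops(cluster, advance):
    """Return caret offsets that divide `advance` evenly across the cluster's code points."""
    count = len(cluster)
    return [advance * i / count for i in range(count + 1)]


# DAD + FATHA, then BEH + SHADDA + FATHA
word = "\u0636\u064E\u0628\u0651\u064E"
for cluster in split_clusters(word):
    codes = [f"U+{ord(c):04X}" for c in cluster]
    print(codes, feathered_caret_stops(cluster, 12.0))
```

The offsets are in logical order within each cluster. A real implementation would take the advance from the shaped glyphs and map the offsets to the visual direction of the run.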
Comment 17 Hossein 2024-05-23 13:41:58 UTC
I agree with Jonathan that the scope of this issue is somewhat large, and it should be broken into specific issues with proper descriptions. I'll try to make it happen.

One thing to mention right away is that you can (to some extent) move between characters and diacritics using alt+right and alt+left in LibreOffice, but this is completely different from how movement and selection work in MS Word. In LibreOffice, right now you cannot select anything this way, and when a diacritical mark is on a character, you do not notice the movement. In contrast, the movement in Word is useful because the cursor moves halfway through the width of the character, and you can select diacritics.

I also suggest testing Word cursor movement and selection with an Arabic screen reader enabled to hear how the specific letters and diacritics are announced. The Windows screen reader should work if you install the Arabic language with all the provided features.
Comment 18 ⁨خالد حسني⁩ 2024-05-23 21:07:40 UTC Comment hidden (no-value)
Comment 19 ⁨خالد حسني⁩ 2024-05-23 21:17:14 UTC Comment hidden (no-value)
Comment 20 ⁨خالد حسني⁩ 2024-05-23 21:33:03 UTC
OK, I should read all the comments first :) So alt+arrow key should be good enough (it is probably a separate issue that it does not work on macOS; maybe I just don’t know the right key combination), but it lacks the visual feedback that would make it usable.

Visually highlighting selected combining marks is tricky because they usually stack vertically, while selection highlighting works horizontally. I think we can go with a Word-like behavior (not by default, only when alt is pressed) and divide the grapheme cluster into equal parts (like we do with ligatures, when the font does not have GDEF ligature caret positioning data).
Comment 21 Heiko Tietze 2024-05-24 07:53:05 UTC
Hard to imagine a workflow where DF'ing of diacritics is wanted. Taking the example in comment 14, I suggest an option under Font Effects > Font Color: "Diacritics: [Automatic]" that allows to change the... is it always the first part? - anyway to change all at once with the ability to apply per CS and PS. Automatic would follow the Font Color.
Comment 22 Hossein 2024-05-24 13:00:43 UTC
(In reply to Heiko Tietze from comment #21)
> Hard to imagine a workflow where DF'ing of diacritics is wanted.
That is because the above examples in comment 14 are just a few that use the same color for each and every diacritical mark. I can provide various examples for coloring of separate diacritics.

> Taking the
> example in comment 14, I suggest an option under Font Effects > Font Color:
> "Diacritics: [Automatic]" that allows to change the... is it always the
> first part? - anyway to change all at once with the ability to apply per CS
> and PS. Automatic would follow the Font Color.

In Word, it is a per-application setting, and not a per-document (or character) setting. But, IMO it would be fine to implement it as a character setting in LibreOffice. In this way, it can be even better as it is easier to use, and probably more portable.

I will file another issue for selecting diacritics, which needs more explanation.

One important question remains: will it need any change in ODF/OOXML to load/save/edit such document?
Comment 23 Munzir Taha 2024-05-24 16:49:36 UTC
(In reply to Heiko Tietze from comment #21)
> Hard to imagine a workflow where DF'ing of diacritics is wanted. Taking the
> example in comment 14, I suggest an option under Font Effects > Font Color:
> "Diacritics: [Automatic]" that allows to change the...

Yes, it's common for Arabic script to be written in different colors for diacritics, e.g. https://en.wikipedia.org/wiki/Arabic_diacritics has this image https://en.wikipedia.org/wiki/Arabic_diacritics#/media/File:Arabic_script_evolution.svg where Shaddah is in blue or black whereas other diacritics are in red. Another old example is this 11th-Century Qur’an in Eastern Kufic at https://commons.wikimedia.org/wiki/File:11th-Century_Qur%E2%80%99an_in_Eastern_Kufic_WDL8937.pdf with different colors for diacritics.

> is it always the
> first part? - anyway to change all at once with the ability to apply per CS
> and PS. Automatic would follow the Font Color.

Diacritics are the second part: you have the letter or character, and then the diacritic comes after it. There can also be double diacritics, for example a Shaddah followed by a fatha.
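
As an illustration of the one-for-all "Diacritics: [Automatic]" idea discussed above, the following Python sketch splits a string into color runs, with base letters in the font color and the marks that follow them in a separate diacritic color. It only models the concept. The function `color_runs`, its parameters, and the use of `None` for "Automatic" are assumptions, not LibreOffice code.

```python
import unicodedata


def color_runs(text, font_color, diacritic_color=None):
    """Split text into (substring, color) runs; nonspacing marks get the diacritic color.

    diacritic_color=None models the proposed "Automatic" value, which follows the font color.
    """
    mark_color = font_color if diacritic_color is None else diacritic_color
    runs = []
    for ch in text:
        is_mark = unicodedata.category(ch) == "Mn"
        color = mark_color if is_mark else font_color
        if runs and runs[-1][1] == color:
            # Same color as the previous run: extend it.
            runs[-1] = (runs[-1][0] + ch, color)
        else:
            runs.append((ch, color))
    return runs


# BEH, then a double diacritic: SHADDA followed by FATHA
word = "\u0628\u0651\u064E"
print(color_runs(word, "black"))
print(color_runs(word, "black", "red"))
```

With "Automatic", the whole word stays in one run. With an explicit diacritic color, the shadda and fatha form a second run after the base letter. Coloring each mark differently (for example shadda in blue and vowels in red) would need a per-mark mapping instead of a single color.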