Bug 66276

Summary: MathML export: avoid using combining characters for accents and diacritical marks
Product: LibreOffice
Reporter: Frédéric Wang <fred.wang>
Component: Formula Editor
Assignee: Frédéric Wang <fred.wang>
Status: RESOLVED FIXED    
Severity: normal    
Priority: medium    
Version: 4.2.0.0.alpha0+ Master   
Hardware: All   
OS: All   
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=65765
Whiteboard: target:4.2.0
Attachments: Sample output

Description Frédéric Wang 2013-06-27 21:08:57 UTC
As indicated in the MathML spec:

"In the UCS there are many combining characters that are intended to be used for the many accents of numerous different natural languages. Some of them may seem to provide markup needed for mathematical accents. They should not be used in mathematical markup. Superscript, subscript, underscript, and overscript constructions as just discussed above should be used for this purpose. Of course, combining characters may be used in multi-character identifiers as they are needed, or in text contexts."

LibreOffice should try to use the non-combining versions when possible. Some of the work is already done in bug 66024. BTW, "U+20D7 COMBINING RIGHT ARROW ABOVE" looks really ugly in Firefox so I wonder if it should be replaced by "U+2192 RIGHTWARDS ARROW", at least for MathML export.
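To illustrate the idea, here is a minimal Python sketch of such a substitution. It is not the code from the LibreOffice patch: the table `SPACING_EQUIVALENTS`, the helpers `accent_for_export` and `mover`, and the `arrow_fallback` switch are hypothetical names chosen for this example, and the table covers only a few common accents. The U+20D7 to U+2192 replacement is left optional because the report treats it as an open question.

```python
import unicodedata
from xml.sax.saxutils import escape

# Illustrative table only (an assumption, not the mapping used by the patch):
# combining diacritical mark -> spacing (non-combining) counterpart.
SPACING_EQUIVALENTS = {
    "\u0300": "\u0060",  # COMBINING GRAVE ACCENT -> GRAVE ACCENT
    "\u0301": "\u00B4",  # COMBINING ACUTE ACCENT -> ACUTE ACCENT
    "\u0302": "\u02C6",  # COMBINING CIRCUMFLEX ACCENT -> MODIFIER LETTER CIRCUMFLEX ACCENT
    "\u0303": "\u02DC",  # COMBINING TILDE -> SMALL TILDE
    "\u0304": "\u00AF",  # COMBINING MACRON -> MACRON
    "\u0306": "\u02D8",  # COMBINING BREVE -> BREVE
    "\u0307": "\u02D9",  # COMBINING DOT ABOVE -> DOT ABOVE
    "\u0308": "\u00A8",  # COMBINING DIAERESIS -> DIAERESIS
    "\u030A": "\u02DA",  # COMBINING RING ABOVE -> RING ABOVE
    "\u030C": "\u02C7",  # COMBINING CARON -> CARON
}


def accent_for_export(ch, arrow_fallback=False):
    """Return a non-combining replacement for ch when one is known.

    Characters without an entry in the table are returned unchanged.
    arrow_fallback is a hypothetical switch for the U+20D7 question.
    """
    if arrow_fallback and ch == "\u20D7":
        return "\u2192"  # RIGHTWARDS ARROW
    return SPACING_EQUIVALENTS.get(ch, ch)


def mover(base, accent):
    """Build a MathML overscript with the accent in its own <mo> element."""
    return '<mover accent="true"><mi>{}</mi><mo>{}</mo></mover>'.format(
        escape(base), escape(accent_for_export(accent))
    )
```

With this sketch, `mover("x", "\u0302")` places U+02C6 in the `<mo>` element, so the accent no longer attaches to the preceding ">" when the markup is viewed as plain text.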
Comment 1 Frédéric Wang 2013-06-27 21:55:32 UTC
Mass changes to assign bugs to myself.
Comment 2 Frédéric Wang 2013-06-30 10:55:23 UTC
Created attachment 81733
Sample output

I've submitted a patch for review:

https://gerrit.libreoffice.org/#/c/4630/

I attach a testcase comparing the old and new output. The difference is hard to see when rendered, but in a text editor such as gedit the accents no longer combine with the preceding ">" character. In Firefox, some diacritical marks are now better centered.
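Since the difference is hard to spot by eye, the exported markup can also be checked programmatically. The following Python sketch is only an illustration and is not part of the attached testcase: the function name `combining_operators` and the two sample strings are assumptions made for this example. It lists every combining character found directly inside an `<mo>` element.

```python
import unicodedata
import xml.etree.ElementTree as ET


def combining_operators(mathml):
    """Return descriptions of combining characters found in <mo> elements."""
    root = ET.fromstring(mathml)
    found = []
    for elem in root.iter():
        # ElementTree reports namespaced tags as "{namespace}name".
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag != "mo":
            continue
        for ch in elem.text or "":
            # A non-zero canonical combining class marks a combining character.
            if unicodedata.combining(ch):
                found.append("U+%04X %s" % (ord(ch), unicodedata.name(ch)))
    return found


# Hypothetical samples: the same accent written with a combining character
# (U+0301) and with its spacing counterpart (U+00B4).
OLD = ('<math xmlns="http://www.w3.org/1998/Math/MathML">'
       '<mover accent="true"><mi>a</mi><mo>\u0301</mo></mover></math>')
NEW = ('<math xmlns="http://www.w3.org/1998/Math/MathML">'
       '<mover accent="true"><mi>a</mi><mo>\u00B4</mo></mover></math>')

print(combining_operators(OLD))  # reports the combining acute accent
print(combining_operators(NEW))  # reports nothing
```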
Comment 3 Commit Notification 2013-07-02 07:46:22 UTC
Frederic Wang committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=fbc9c18875d1e86c9b3d7d5c13e1db13af23e3f0

 fdo#66276 - MathML export: avoid using combining characters.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 4 Frédéric Wang 2013-07-02 09:03:14 UTC
I'm closing this, although "U+2192 RIGHTWARDS ARROW" is still used instead of "U+20D7 COMBINING RIGHT ARROW ABOVE" and I'm still not sure what would be the best way to deal with that. Another bug can be opened later if necessary.