Bug 131315 - Index: Implement letter by letter alphabetising
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: unspecified (earliest affected)
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
Duplicates: 129500
Depends on:
Blocks: TableofContents-Indexes Authors
Reported: 2020-03-12 17:13 UTC by R. Green
Modified: 2024-05-24 14:02 UTC

Description R. Green 2020-03-12 17:13:25 UTC
The current system wrongly indexes entries containing non-alphanumeric characters. For example, you cannot put an apostrophe at the beginning of an entry, because this causes it to be indexed under the apostrophe rather than the letter! And if a non-alphanumeric character occurs in the middle of an entry, the entry will be listed in an unexpected position.

LO needs to implement an accepted worldwide standard which allows for non-alphanumeric characters: i.e. the LETTER BY LETTER system of alphabetising indexes.

This system is detailed at https://www.press.uchicago.edu/Misc/Chicago/CHIIndexingComplete.pdf (Chicago Manual of Style).

To give an example, this (AFAIK) is a CORRECTLY indexed series of entries (letter by letter):

"Bell, Andrew, 413
Bell, book, and candle 36
Bell, Garden, the 175, 651
Bell Hotel, The 559pp.
Bellmen 116
Bell-ringers, the 205, 380, 578
Bells of the Parish Church 173"

However, this is how it is INCORRECTLY indexed by LO:

"Bell Hotel, The 559pp.
Bell-ringers, the 205, 380, 578
Bell, Andrew, 413
Bell, book, and candle 36
Bell, Garden, the 175, 651
Bellmen 116
Bells of the Parish Church 173"
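
The letter-by-letter ordering shown in the CORRECT list above can be sketched as a sort key. This is only a minimal illustration in Python, not LO's implementation: the function name letter_by_letter_key is hypothetical, the headings are taken from this report with the page numbers left off, and the rule used (compare letters and digits only, and restart the comparison after each comma or opening parenthesis) is a simplified reading of the Chicago description.

```python
import re


def letter_by_letter_key(heading):
    """Hypothetical sort key: a simplified letter-by-letter rule.

    The heading is split at every comma or opening parenthesis. Within each
    segment, everything except letters and digits (spaces, hyphens,
    apostrophes, ...) is ignored, and case is folded.
    """
    segments = re.split(r"[,(]", heading)
    return tuple(
        "".join(ch for ch in segment.casefold() if ch.isalnum())
        for segment in segments
    )


# Headings from this report, in the order LO currently produces
# (page numbers omitted for the sketch).
lo_order = [
    "Bell Hotel, The",
    "Bell-ringers, the",
    "Bell, Andrew",
    "Bell, book, and candle",
    "Bell, Garden, the",
    "Bellmen",
    "Bells of the Parish Church",
]

for heading in sorted(lo_order, key=letter_by_letter_key):
    print(heading)
```

With this key, "Bell Hotel, The" is compared as "bellhotel" and "Bell-ringers, the" as "bellringers", so both fall after the plain "Bell, ..." entries, and a leading apostrophe no longer decides where an entry is filed.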
Comment 1 Dieter 2020-03-21 17:31:00 UTC
As far as I can see, this report leads to the general and very important question of whether LO should follow the rules of a particular style guide (for example, the Chicago Manual of Style). So I would change the bug summary to "Alphabetical index should follow rules of Chicago Manual of Style". That decision is not trivial.

Perhaps the alphabetical index already follows the rules of a certain style guide. If that is the case, it should be documented in the LO help [1].

cc: Design Team for further discussion and decision

[1] https://help.libreoffice.org/7.0/en-GB/text/swriter/01/04120212.html?&DbPAR=WRITER&System=WIN
Comment 2 Heiko Tietze 2020-03-23 10:09:28 UTC
I would have guessed that the sorting algorithm is the same for Writer and Calc, but it isn't. Trying with 6.4, it returns

"Bell, Andrew, 413
Bell Hotel, The 559pp.
Bell-ringers, the 205, 380, 578
Bell, book, and candle 36
Bell, Garden, the 175, 651
Bellmen 116
Bells of the Parish Church 173"

in the case of Calc, but "Bell... at the end in the case of Writer's alphabetical ToC.
Comment 3 Heiko Tietze 2020-03-27 10:47:59 UTC
UX-wise, the sorting would go into the "key type" dropdown as an alternative to alphanumeric. So introducing another algorithm wouldn't be a problem at all and is welcome.
Comment 4 Buovjaga 2020-05-17 18:23:47 UTC
*** Bug 129500 has been marked as a duplicate of this bug. ***
Comment 5 R. Green 2020-07-10 10:15:51 UTC
Another aspect of this problem: the Index does not allow you to have entries under the SAME alphabetical delimiter which have the same letter but different diacritical marks. So, for example, A, Ā, À, Á etc. are placed under DIFFERENT alphabetical delimiters!

There's a section about alphabetization of diacritical characters on Wikipedia:

https://en.wikipedia.org/wiki/Diacritic#Alphabetization_or_collation
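
To illustrate the grouping asked for above, here is a rough Python sketch that finds the alphabetical delimiter for an entry by decomposing accented letters and ignoring the combining marks. The function name group_letter is hypothetical and this is not how LO works internally. It also assumes that diacritics should always be ignored, which does not hold for every language, and letters without a Unicode decomposition (such as Ø) are left as they are.

```python
import unicodedata


def group_letter(entry):
    """Hypothetical helper: the delimiter letter an entry is filed under.

    NFD normalisation splits a letter such as "Ā" into "A" plus a combining
    macron. Combining marks and leading punctuation are not alphabetic, so
    the first alphabetic character found is the base letter.
    """
    for ch in unicodedata.normalize("NFD", entry):
        if ch.isalpha():
            return ch.upper()
    return ""


for entry in ["A", "Ā", "À", "Á"]:
    print(entry, "is filed under", group_letter(entry))
```

Because leading punctuation is skipped as well, an entry that starts with an apostrophe would be filed under its first letter.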
Comment 6 R. Green 2021-02-05 12:31:45 UTC
See also, Bug 140186: Index needs to ignore diacritics when alphabetising.