Bug 92564

Summary: Search for italic regexp adjacent to non-italic
Product: LibreOffice
Reporter: Luke Kendall <luke.kendall>
Component: Writer
Assignee: Not Assigned <libreoffice-bugs>
Status: NEW
Severity: enhancement
CC: buzea.bogdan, ilmari.lauhakangas
Priority: medium    
Version: 4.4.3.2 release   
Hardware: Other   
OS: All   
Whiteboard:
Crash report or crash signature:
Regression By:
Bug Depends on:    
Bug Blocks: 102847    

Description Luke Kendall 2015-07-06 03:33:36 UTC
While editing my document with change tracking turned on, I noticed that occasionally, when I deleted a capital letter, replaced it with a lowercase one and then italicised the word, LO would re-capitalise the letter.  I did not always notice it doing this, so now my 160k-word file is sprinkled with words whose first letter has the opposite italic-ness to the rest of the word.

So I'd like to search for italic{[a-z]}regular{[a-z]} and for regular{[a-z]}italic{[a-z]}, but I can't see a way to set the text attributes for just part of the search string.

Hence this request.
Comment 1 Buovjaga 2015-07-24 16:40:02 UTC
Sounds somewhat reasonable, so I'll set to NEW.

For now, I guess you could unzip the .odt file, look inside the content.xml, see if you can do a find & replace in your text editor to fix the problem and finally re-zip the thing + rename to .odt.
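
A minimal sketch of the unzip / edit / re-zip steps, assuming Python's standard zipfile module is acceptable in place of a manual unzip. The function names read_content_xml and write_with_new_content are made up for illustration and are not part of LibreOffice.

```python
import zipfile


def read_content_xml(odt_path):
    """Return content.xml from an .odt file (a zip archive) as text."""
    with zipfile.ZipFile(odt_path) as odt:
        return odt.read("content.xml").decode("utf-8")


def write_with_new_content(src_path, dst_path, new_content):
    """Copy src_path to dst_path, swapping in new_content for content.xml."""
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        # infolist() keeps the original entry order, so "mimetype" stays first.
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "content.xml":
                data = new_content.encode("utf-8")
            # ODF packages store the "mimetype" entry uncompressed.
            if item.filename == "mimetype":
                method = zipfile.ZIP_STORED
            else:
                method = zipfile.ZIP_DEFLATED
            dst.writestr(item.filename, data, compress_type=method)
```

Writing to a separate output file, rather than over the original, keeps the source document intact if the edited XML turns out to be wrong.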
Comment 2 Luke Kendall 2015-12-11 07:39:04 UTC
I just had a look at the feasibility of the workaround.

Unfortunately, without a dictionary of the thousands of (unnecessary) styles generated by LO, it's really very difficult to know what style any piece of text is by looking at the XML. As mentioned in other reports, the XML saved by LO exposes the history of edit changes: at each point where text was changed, it's broken into a separate <span>, so you end up with stuff that looks like this:

 In fact, he looked </text:span><text:span text:style-name="T604">guilty!</text:span></text:p><text:p text:style-name="P949">&apos;<text:span text:style-name="T447">When</text:span> did they decide?&apos;</text:p><text:p text:style-name="P514"><text:span text:style-name="T811">&apos;</text:span><text:span text:style-name="T822">Several months ago</text:span><text:span text:style-name="T811">.&apos;</text:span></text:p><text:p text:style-name="P987"><text:span text:style-name="T311">Several months ago.</text:span><text:span text:style-name="T875">  S</text:span><text:span text:style-name="T425">he just stared at him.  &apos;</text:span><text:span text:style-name="T46">When were you going to tell me?&apos;</text:span></text:p><text:p text:style-name="P220">&apos;Don&apos;t take that tone of voice with me, young lady!&apos;</text:p><text:p text:style-name="P988"><text:span text:style-name="T47">He was really angry, she saw, though most people wouldn&apos;t </text:span><text:span text:style-name="T50">ha</text:span><text:span text:style-name="T47">ve seen the signs.  B</text:span><text:span text:style-name="T311">ut </text:sp

Since this is also caused by https://bugs.documentfoundation.org/show_bug.cgi?id=62603, it's unfortunate that no mechanism exists to correct the consequences of that bug.

Oh, well.

Back to a manual scan of my 400-page book looking for italic errors introduced by LO.
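
As a possible alternative to a manual scan, the missing "dictionary of styles" can be built from content.xml itself. The sketch below is untested against a real 160k-word document and makes several simplifying assumptions: it looks only at fo:font-style on the automatic styles in content.xml, it ignores named and parent styles from styles.xml and italic paragraph styles, and it also scans deleted text kept in tracked changes. The names font_styles, runs and mixed_italic_words are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

NS = {
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "fo": "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0",
}


def q(prefix, name):
    """Build an ElementTree qualified name such as {uri}span."""
    return "{%s}%s" % (NS[prefix], name)


SPAN = q("text", "span")
# Elements that stand for whitespace and carry no text of their own.
BREAKS = {q("text", "s"), q("text", "tab"), q("text", "line-break")}


def font_styles(root):
    """Map each style name to its fo:font-style value (None if unset)."""
    styles = {}
    for style in root.iter(q("style", "style")):
        props = style.find(q("style", "text-properties"))
        value = props.get(q("fo", "font-style")) if props is not None else None
        styles[style.get(q("style", "name"))] = value
    return styles


def runs(elem, styles, italic=False):
    """Yield (text, is_italic) pieces of elem in document order."""
    if elem.tag in BREAKS:
        yield " ", italic
        return
    if elem.tag == SPAN:
        value = styles.get(elem.get(q("text", "style-name")))
        if value is not None:  # a span that sets no font-style inherits
            italic = value == "italic"
    if elem.text:
        yield elem.text, italic
    for child in elem:
        yield from runs(child, styles, italic)
        if child.tail:
            yield child.tail, italic


def mixed_italic_words(content_xml):
    """Return words containing both italic and non-italic letters."""
    root = ET.fromstring(content_xml)
    styles = font_styles(root)
    found = []
    for para in root.iter(q("text", "p")):
        chars = []  # one (character, is_italic) pair per character
        for text, italic in runs(para, styles):
            chars.extend((c, italic) for c in text)
        plain = "".join(c for c, _ in chars)
        for m in re.finditer(r"[^\W\d_]+", plain):  # runs of letters
            flags = {chars[i][1] for i in range(m.start(), m.end())}
            if len(flags) == 2:
                found.append(m.group())
    return found
```

Fed the text of content.xml, mixed_italic_words returns every word in which an italic letter sits next to a regular one, which covers both searches from the description (italic then regular, and regular then italic). It only reports the words; fixing them would still be done in Writer.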