Bug 44710

Summary: FILEOPEN PDF: Rotated texts at wrong position and scrambled
Product: LibreOffice
Reporter: Rainer Bielefeld Retired <LibreOffice>
Component: Draw
Assignee: vvort
Status: RESOLVED FIXED    
Severity: major
CC: LibreOffice, mathog, vvort
Priority: medium    
Version: 3.3.0 release   
Hardware: Other   
OS: All   
Whiteboard: target:4.3.0
Crash report or crash signature:
Regression By:
Bug Depends on:    
Bug Blocks: 43806    
Attachments:
Screenshots
Example 3 from bug 43806
Example with different font sizes and text angles
source SVG for example. Make example with save->as pdf in Inkscape
Example opened in LODraw
variant PDF
Screen shot of variant PDF, it has only the duplication defect
Screen shot, rotated text box seen in LODraw
3.4.5 release views of two test PDFs (screen shots)

Description Rainer Bielefeld Retired 2012-01-11 22:55:05 UTC
Created attachment 55477 [details]
Screenshots

Steps to reproduce with parallel Dev-Installation of "LibreOffice 3.5.0 Beta2 - WIN7 Home Premium (64bit) German UI [Build-ID: 8589e48-760cc4d-f39cf3d-1b2857e-60db978]":

0. Download third sample from "Bug 43806 - [Task] Sample Collection"
   <https://bugs.freedesktop.org/attachment.cgi?id=54407>
1. Open new DRAW document from LibO Start Center
2. Menu 'File -> Open -> bounding_line4_no_opacity.pdf'
3. Compare view with view in Acrobat X: red texts at top right corner
   ("Arial 40 px, 32 pt" and others)
   Expected: Texts positioned inside page, first line "Arial 4 px, 3.2 pt", 
             last line "Arial 40 px, 32 pt"
   Actual: Texts positioned outside page, text lines scrambled and split into 
           character groups

You can see similar problems with the 90° rotated blue text at the left bottom position 

OOo 3.4 Beta has the same problem
Comment 1 Rainer Bielefeld Retired 2012-01-11 23:54:56 UTC
Also a problem with LibO 3.3.0
Comment 2 Rainer Bielefeld Retired 2012-01-13 08:59:09 UTC
New due to original report in Bug 43806
Comment 3 mathog 2012-01-20 11:13:46 UTC
Created attachment 55849 [details]
Example 3 from bug 43806

Same problem in 3.5.0rc1 Windows.

Note, character duplication is also evident, which I had not noticed before in this example as the text is so very messed up.  That was separately posted in bug 45001.
Comment 4 mathog 2012-01-20 11:51:45 UTC
Created attachment 55859 [details]
Example with different font sizes and text angles

Better example for debugging problem
Comment 5 mathog 2012-01-20 11:52:22 UTC
Created attachment 55860 [details]
source SVG for example.  Make example with save->as pdf in Inkscape
Comment 6 mathog 2012-01-20 12:02:22 UTC
Created attachment 55861 [details]
Example opened in LODraw

View in 3.5.0rc1.  (3.4.4 release is similar except it does not also have the character duplications).

Take home points:

1.  Character duplication occurs for even the tiniest angle (lower left hand corner example).  This was introduced between 3.4.4 and 3.5.0b2.

2.  Vertical offset is proportional to font size (top two clusters of examples)

3.  Scale of messy displacement of characters above/below average rotation line proportional to angle of rotation (bottom examples).

4.  Messy displacement of characters above/below average rotation line varies by character.  That is, all the "r" are displaced one amount, and all the "a" another, and so forth.  (Not shown: rotate text like "aiDaaiiDDaaaiiiDDD" and all of each type of letter end up in a straight line, just not the same straight line.)
Comment 7 mathog 2012-01-20 12:13:12 UTC
Created attachment 55862 [details]
variant PDF

This variant PDF file was made by printing from Inkscape through PDF Creator (ghostscript).  The PDF file looks the same in a PDF viewer, but is handled differently by LODraw.
Comment 8 mathog 2012-01-20 12:17:56 UTC
Created attachment 55867 [details]
Screen shot of variant PDF, it has only the duplication defect

Screen shot of the PDFCreator produced pdf file opened in LODraw 3.5.0rc1.  

The character duplication defect is now on steroids. It occurs on all text, rotated or not.

However, the displacement and character specific offsets from the rotated line are gone.

Also the simple horizontal text is no longer recognized as a long text string.  Rather it is a collection of (overlapping) two character text boxes.  Like:

DD
 ee
  gg
   rr
etc. (Hopefully that will retain its fixed width spacing when posted)
Comment 9 mathog 2012-01-20 12:25:18 UTC
Created attachment 55868 [details]
Screen shot, rotated text box seen in LODraw

One other clue for whoever maintains this piece of code.  On the first PDF example (the one where the characters are not all in a line), if a character is selected such that the rotated text box is visible, it can be seen that the center of that series of text boxes lies on the straight line corresponding to where the text characters should be.  The characters are always shown at the top of this text box.  The boxes are various heights, and the height varies with the type of character.
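
To make the geometry of that clue concrete, here is a minimal sketch in Python. It is only an illustrative model of the observation above, not LibreOffice code: the box heights, the constant advance width and the function names are all made-up assumptions. It places each box center on the intended rotated line and draws the glyph at the top of its box, which puts every letter of one type on its own line parallel to the intended one. It does not try to reproduce the dependence on rotation angle reported in comment 6.

```python
import math

# Hypothetical per-character box heights (arbitrary units); not measured values.
BOX_HEIGHT = {"a": 1.0, "i": 1.4, "D": 1.6}
ADVANCE = 1.0  # assumed constant advance width, for simplicity


def place(text, angle_deg):
    """Return (char, box_centre_xy, drawn_xy) for each character."""
    theta = math.radians(angle_deg)
    tangent = (math.cos(theta), math.sin(theta))   # direction along the text line
    normal = (-math.sin(theta), math.cos(theta))   # perpendicular to the text line
    placed = []
    for index, char in enumerate(text):
        along = index * ADVANCE
        # The box centre sits on the intended (correct) rotated line.
        centre = (along * tangent[0], along * tangent[1])
        # The glyph is drawn at the top of its box: half a box height off the line.
        half = BOX_HEIGHT[char] / 2.0
        drawn = (centre[0] + half * normal[0], centre[1] + half * normal[1])
        placed.append((char, centre, drawn))
    return placed


def offset_from_line(point, angle_deg):
    """Signed perpendicular distance of point from the line through the origin."""
    theta = math.radians(angle_deg)
    return -point[0] * math.sin(theta) + point[1] * math.cos(theta)


if __name__ == "__main__":
    for char, _, drawn in place("aiDaaiiDDaaaiiiDDD", 30.0):
        print(char, round(offset_from_line(drawn, 30.0), 3))
```

In this model the printed offset depends only on the letter, so all "a", all "i" and all "D" fall on three separate parallel lines, as described for the "aiDaaiiDDaaaiiiDDD" test string.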
Comment 10 mathog 2012-01-20 15:48:51 UTC
Created attachment 55879 [details]
3.4.5 release views of two test PDFs (screen shots)

Tested 3.4.5 release on Windows XP SP3.

This does not have the character duplication problem.  That defect was introduced after the 3.4.5 release and no later than 3.5.0b2.

The attachment shows screen shots of test_import_pdf_creator.pdf on the left and test_import.pdf on the right, both opened in LODraw 3.4.5.  Looking closely at the left side it can be seen that the text lines are not quite straight, the letters deviate a bit both perpendicular and tangent to the rotated line:

1.  The extent of the deviation increases with the angle rotated.

2. (not shown) at a given rotation angle, the font size does not affect the relative offset.  That is, the 12pt and 20pt variants appear to differ only by the constant scale factor.

3. The perpendicular displacement of characters with respect to the rotated line is in opposite directions on the +45 and -45 degree lines.  

4. The lateral displacements along the rotated line appear to be the same on the +45 and -45 degree lines.

5.  There are no defects in either spacing apparent on the text line which was not rotated.

The right side (showing the PDF made by "save as" from Inkscape) is horrible, hard to describe that mess.  

There are no visible defects in either PDF file when displayed with a PDF viewer (PDF X-Change Viewer).
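
The five observations above about the left-hand screenshot can be compared against a simple model. The Python sketch below is purely hypothetical: it assumes that some per-glyph offset `d` along the text line is applied without rotation, where it should be rotated with the text. This is not taken from the LibreOffice source or from any patch, and the function name and parameters are made up for illustration. It only shows that one such error would give a perpendicular deviation that changes sign between +45 and -45 degrees, a lateral deviation that does not, no deviation at 0 degrees, and simple scaling with size.

```python
import math


def placement_error(d, angle_deg):
    """Error from applying a text-space offset (d, 0) without rotating it.

    Returns (tangential, perpendicular) components relative to the rotated line.
    This is a hypothetical error model, not the actual import code.
    """
    theta = math.radians(angle_deg)
    # Correct position: the offset is rotated together with the text.
    correct = (d * math.cos(theta), d * math.sin(theta))
    # Assumed faulty position: the same offset applied unrotated.
    faulty = (d, 0.0)
    ex, ey = faulty[0] - correct[0], faulty[1] - correct[1]
    # Project the error onto the line direction and onto its normal.
    tangential = ex * math.cos(theta) + ey * math.sin(theta)
    perpendicular = -ex * math.sin(theta) + ey * math.cos(theta)
    return tangential, perpendicular


if __name__ == "__main__":
    for angle in (-45, 0, 15, 45):
        t, p = placement_error(1.0, angle)
        print(f"{angle:>4} deg  tangential={t:+.3f}  perpendicular={p:+.3f}")
```

Algebraically the two components reduce to d(cos θ - 1) and -d sin θ, so the first is even and the second odd in the angle. Whether the real defect has this origin is not established by the screenshots.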
Comment 11 Rainer Bielefeld Retired 2012-04-01 22:50:47 UTC
No longer reproducible with "LibreOffice 3.5.2.2 German UI/Locale [Build-ID: 281b639-6baa1d3-ef66a77-d866f25-f36d45f]" on German WIN7 Home Premium (64bit) and with "LibreOffice 3.4.5 German UI [Build ID: OOO340m1 (Build:502)]" parallel Server installation on German WIN7 Home Premium (64bit).

But now those documents suffer from "Bug 45848 - PDF: Text in a pdf is imported twice", in "Example 3 from bug 43806" rotated texts are affected.
Comment 12 Teo91 2013-09-29 19:06:00 UTC
I can confirm this with LO 4.1.1 on Windows 7 SP1.
Comment 13 vvort 2014-03-23 11:15:00 UTC
This problem should be fixed by this patch:
https://gerrit.libreoffice.org/8725

But it needs testing.
So if there are any regressions - please let me know.
Comment 14 Commit Notification 2014-03-23 11:43:34 UTC
Vort committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=53cbca6ee1b8e72144310147c88585a4f4b854c8

fdo#44710 PDF Import: Correction of position of rotated text

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.