Solved

Garbled text when copy/paste from a PDF that was generated via Internet Explorer 9 => Print to pdf

Posted on 2013-01-15
Last Modified: 2015-06-03
Setup used:

Windows 7 Enterprise SP1
IE9 (Ver: 9.0.8112.16421)
CutePDF (Ver: 3.0)
GPL Ghostscript (Ver: 8.15)
Adobe Reader (Ver: 9 - XI)


Hello all,

I've been racking my brain over an issue with printing from IE9 to PDF.
More precisely, the print-to-PDF part works fine, but as soon as you copy and paste content from the generated PDF, the pasted text shows up as a string of meaningless characters.
Such as:

%$!$  0  $ '

If you paste some content from this PDF (see *.pdf attachment) into Word, you can see that the fonts being used have names like TT33Bt00 (see *.docx attachment).

I believe the answer below to be a good explanation on why this occurs:
(Found here: http://forums.adobe.com/thread/427945)

"It turns out that no usable encoding information is present (neither in the PDF nor in the embedded font data) to derive the meaning of the characters/glyphs that are displayed on the pages in the document.
 
The fonts actually are all embedded, but in a way that all encoding information has been removed. This is a typical example of a PDF that is syntactically fully compliant with the PDF spec but where important information about the meaning of the text in it has been thrown away during the process of making the PDF. As far as I can tell it would be very difficult to recover the encoding info. Strange as it may sound the best option may be to convert the pages to pixels and then run OCR on them...."
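
As a rough way to check whether a given PDF is affected in the way the quote describes, the sketch below counts font dictionaries and /ToUnicode entries in the raw file bytes. It is only a heuristic and rests on assumptions not stated in this thread: that the font dictionaries are stored uncompressed, and that a missing /ToUnicode map is the reason copy/paste fails (a font using a standard encoding can be copyable without one). The function name is made up for illustration.

```python
import re


def font_encoding_report(pdf_bytes: bytes) -> dict:
    """Count font dictionaries and /ToUnicode entries in raw PDF bytes.

    Heuristic only: objects stored inside compressed object streams are
    invisible to this scan, and composite fonts are counted more than once.
    """
    # "/Type /Font" marks a font dictionary; \b keeps /FontDescriptor out.
    fonts = len(re.findall(rb"/Type\s*/Font\b", pdf_bytes))
    # "/ToUnicode" points at the CMap that maps character codes to Unicode.
    to_unicode = len(re.findall(rb"/ToUnicode\b", pdf_bytes))
    return {
        "fonts": fonts,
        "to_unicode_maps": to_unicode,
        "fonts_without_map": max(fonts - to_unicode, 0),
    }
```

It could be called as font_encoding_report(open("Garbledtext.pdf", "rb").read()); a non-zero "fonts_without_map" would be a hint, not proof, that text extraction will be unreliable.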


A possible solution for Adobe PDF printer users could be to uncheck the option 'Rely on system fonts only, do not use document fonts.' As discussed here :
http://answers.microsoft.com/en-us/ie/forum/ie9-windows_other/ie9-printing-problems-text-is-garbled-when-trying/45457b91-5472-4cf2-951d-79553fff072b
And here:
http://helpx.adobe.com/acrobat/kb/missing-or-garbled-text-printing.html

I think CutePDF provides a similar option: under Printing Preferences - Advanced, the TrueType Font item has two settings, 'Substitute with device font' or 'download as softfont'. Neither of them, however, gives the desired result.

Everything works fine when using Chrome, Firefox, or IE8, so I think the problem is specific to IE9. Any hints, clues, or things I forgot to test are all welcome.

Thanks in advance.
Robert
Garbled-text.docx
Garbledtext.pdf
Question by:BankDelen
7 Comments
 
LVL 53

Accepted Solution

by:
Joe Winograd, EE MVE earned 375 total points
ID: 38779448
Hi Robert,
We had an extensive thread on a similar issue last month. I don't know if it will help you, but there are numerous ideas in it that are worth a read:
http://www.experts-exchange.com/Web_Development/Document_Imaging/Adobe_Acrobat/Q_27960233.html

Regards, Joe
 
LVL 3

Assisted Solution

by:IKtech
IKtech earned 25 total points
ID: 38779924
What about turning on compatibility mode in IE9 for the website? Have you tried that?
 
LVL 16

Assisted Solution

by:DansDadUK
DansDadUK earned 100 total points
ID: 38782143
I don't have an answer (I use Windows 8 (not 7), IE10 and Chrome (not IE9), and don't have any 'print to PDF' capability on those browsers).

Just a few comments:

... fonts actually are all embedded, but in a way that all encoding information has been removed ...
This is referred to as font obfuscation.
When printing documents (to real printers), where it is known that the target printer does not have printer-resident equivalents of the fonts used in the document, one choice in the printer driver is to download equivalents of the document fonts as printer-format soft fonts.
With large fonts, the source document may only use a small number of the characters in the font, so it makes sense to only download a subset of the source font to the printer.
Most printer drivers, when subsetting such soft fonts, will obfuscate them, by using a dynamically generated character encoding which doesn't keep any concept of ASCII (or Unicode) character encodings, but only makes sense in the context of the obfuscated soft font subset.
The main reason for this obfuscation is to protect the property rights of the font designer/vendor by preventing the font from being easily copied to different formats, especially where the licence restrictions in the font (which is a form of software) allow only limited manipulation.
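
A toy model of the re-encoding described above may make the effect clearer. It assumes, purely for illustration, that codes are handed out in order of first use starting at 0x21; real printer drivers may number glyphs quite differently, and all function names here are hypothetical.

```python
from __future__ import annotations


def subset_encode(text: str) -> tuple[list[int], dict[int, str]]:
    """Give each distinct character a code in order of first use."""
    code_for: dict[str, int] = {}
    codes: list[int] = []
    for ch in text:
        if ch not in code_for:
            # Arbitrary starting code (an assumption of this toy model).
            code_for[ch] = 0x21 + len(code_for)
        codes.append(code_for[ch])
    # The reverse table plays the role of the encoding information
    # that a viewer would need in order to recover the original text.
    reverse = {code: ch for ch, code in code_for.items()}
    return codes, reverse


def naive_copy(codes: list[int]) -> str:
    """What a viewer pastes when it has no table: codes read as characters."""
    return "".join(chr(c) for c in codes)


def mapped_copy(codes: list[int], reverse: dict[int, str]) -> str:
    """What a viewer pastes when the reverse table is available."""
    return "".join(reverse[c] for c in codes)
```

With the input "Hello all", naive_copy returns !"##$%&## while mapped_copy returns the original text. The glyphs on the page look the same in both cases; only the reverse table decides whether copy/paste is meaningful.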


I would guess that:
Something similar is occurring when 'print to PDF' is chosen instead of printing to a real printer.
The problem only occurs when font embedding is selected, and font subsetting is also chosen.
It may be the case that the problem will not occur if font subsetting is not selected - but (not having the same environment) I don't know if this is a valid selection - and, of course, this could considerably increase the size of the generated print stream or PDF.

Author Comment

by:BankDelen
ID: 38782756
@ Joe

Thanks for pointing me to that thread, Joe. At this point I'm checking out your suggestion to use doPDF. I still have to go over any known issues, incompatibilities, or possible security flaws, but if this turns out well, I guess we'll switch to doPDF, since it got the job done right out of the box.


@ IKtech

Thank you for your suggestion, IKtech, but I had indeed already tried it.


@ DansDadUK

Thank you for the time you invested in getting me up to speed on what's really going on behind the scenes. Unfortunately, as you suspected might be the case, my environment doesn't offer a font embedding or font subsetting option.


I'll be doing a background check on doPDF; in the meantime, any suggestions naturally remain welcome.
Thanks so far everyone, I'll try to get back to you no later than tomorrow.

Kind regards,
Robert
 

Author Closing Comment

by:BankDelen
ID: 38793493
I would've loved to see a solution that allowed us to just change some parameters in IE9 and keep our setup unchanged, but I guess we'll just 'doPDF'!

Thank you for your help, time and suggestions guys.
Cheers, Robert
 

Expert Comment

by:Vincent Carrier
ID: 40810754
We are deploying Internet Explorer 11 in my company, and we experienced the same issue. I resolved it by updating Ghostscript to the latest version. The issue can be reproduced easily:

All our computers are equipped with CutePDF Writer 3.0.
Users go to a web site and print to PDF using the CutePDF virtual printer.
Users then open the PDF file in Adobe Reader (we have version XI).
They select some text in the PDF and copy it to the clipboard (Ctrl+C).
They paste the copied text anywhere (Word, Notepad, etc.).

Before upgrading from IE8 to IE11, it worked fine. The formatting was not perfect, but at least the text was there.
Once they use IE11, the paste results in garbage characters.

I found out that CutePDF relies on Ghostscript to produce the PDF. When we install CutePDF, Ghostscript 8.15 is installed along with it. The PDF file properties in Adobe Reader show the PDF Producer as "GPL Ghostscript 8.15". So I went to the Ghostscript web site and installed the newest Ghostscript package, version 9.16. As soon as I did, CutePDF started producing PDF files with this newer version of Ghostscript, and the text became copyable.

You can download Ghostscript here: http://ghostscript.com/download/gsdnld.html

Even on 64-bit systems, it's the 32-bit version of Ghostscript that must be installed.
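
To check many existing PDFs for the old producer without opening each one in Adobe Reader, something like the following sketch might help. It assumes the /Producer entry is stored as an uncompressed literal string in the file, which will not hold for every PDF, and it treats 9.16 as "known good" only because that is the version reported to work above; the exact release that changed the behaviour is not known. The helper names are hypothetical.

```python
import re

# Version reported to work in this thread (not a verified minimum).
KNOWN_GOOD = (9, 16)


def find_producer(pdf_bytes: bytes):
    """Return the first literal /Producer (...) string found, or None."""
    m = re.search(rb"/Producer\s*\(([^)]*)\)", pdf_bytes)
    return m.group(1).decode("latin-1") if m else None


def ghostscript_version(producer: str):
    """Parse 'GPL Ghostscript 8.15' into (8, 15); None if not Ghostscript."""
    m = re.search(r"Ghostscript\s+(\d+)\.(\d+)", producer)
    return (int(m.group(1)), int(m.group(2))) if m else None


def made_by_older_ghostscript(pdf_bytes: bytes) -> bool:
    """True if the PDF names a Ghostscript producer older than KNOWN_GOOD."""
    producer = find_producer(pdf_bytes)
    version = ghostscript_version(producer) if producer else None
    return version is not None and version < KNOWN_GOOD
```

A PDF whose producer is not Ghostscript at all, or whose producer string cannot be found, simply returns False here, so a False result says nothing about whether copy/paste works.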

Hope it helps.


V.
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 40810784
Yes, "stale" copies of Ghostscript can cause grief with CutePDF. Here's an EE post from a year ago that discusses it:
http://www.experts-exchange.com/Software/Office_Productivity/Q_28433399.html#a40065809

Many PDF print drivers rely on Ghostscript, including Bullzip and CutePDF, two of the best. But that's one reason I like doPDF — it does not use Ghostscript. Regards, Joe