LLaurent-59

Cannot convert this pdf to text

Hello
 
I want to convert this pdf to text but I dont understand why I get no correct characters in text mode
 
any help appreciated
 
Thanks in advance
 
LLaurent
Acronyms.pdf
ASKER CERTIFIED SOLUTION
☠ MASQ ☠

This solution is only available to members.
You can also try this online converter: http://convertonlinefree.com/PDFToTXTEN.aspx
LLaurent-59

ASKER

Thanks MASQ,

That's what I imagined, but do you know any free OCR package?

Thanks in advance
Thanks NOBUS

but when I try this online converter, I get: "Text cannot be extracted..."
Dear Sir, kindly use this software; it will solve your issue: [Nitro Pro 8]. Thanks
Thanks again NOBUS

but again, this gives me an incredible result.
If you can read it or translate it, amazing!
Acronyms-zamzar.txt
Thanks NICE-GHAZA

Nitro Pro is a very sophisticated PDF tool, but expensive too,
and I don't want to download such a big application.

I would like to find an easy OCR tool ... as suggested by MASQ
I found a solution using PDFCreator, and then Foxit PhantomPDF, which has an OCR tool
Dear Sir, use either of these small tools:
(1) ABBYY FineReader 8
(2) Able2Extract Professional v6.0.0.0

Thanks
PDF-XChange Editor comes in Free and Pro versions:
http://www.tracker-software.com/product/pdf-xchange-editor

When you install it, select Free. Even the Free version has OCR and it handles your document perfectly. I just OCRed it with Accuracy set to High, and Output Type set to Create New Searchable PDF:

User generated image
Attached is the PDF that it created. The text copy/pastes perfectly! Regards, Joe
Acronyms-via-PDF-XChange-Editor-OCR.pdf
It just occurred to me that your doc has French, so I OCRed it again and selected French as the language:

User generated image
Also, I should have mentioned that the OCR tool is on the Document menu:

User generated image
Attached is the PDF that it created with French as the OCR language choice. Once again, the text copy/pastes perfectly!

Also, be sure to pick the Free Version when you run the installer:

User generated image
Regards, Joe
Acronyms-OCR-French.pdf
Thanks JOE

I am glad to see it works with PDF-XChange

Here is part of what I get when I select, copy and paste the beginning of the resulting file:
CNRS
CNUCED
CRS
CSFI
DEA
Centre national de la recherche scientifique
Conference des Nations unies sur le commerce et le developpement
Catholic Relief Services
Centre for the Study of Financial Innovation ...

In fact, it would be OK if it were the following:
CNRS Centre national de la recherche scientifique
CNUCED Conférence des Nations unies sur le commerce et le développement
CRS Catholic Relief Services
CSFI Centre for the Study of Financial Innovation
DEA ...

I obtained it using Foxit PhantomPDF OCR

and I would appreciate finding an easy OCR tool for doing this
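
For anyone who ends up with the column-by-column paste shown above (all the acronyms first, then all the expansions), a small script can pair the two halves back up. This is only a minimal sketch in Python, assuming the pasted text holds exactly as many acronyms as expansions and nothing else; `pair_columns` is a hypothetical helper name, and the sample is trimmed to the four complete pairs quoted above.

```python
def pair_columns(lines):
    """Pair the first half of the lines (acronyms) with the second half (expansions).

    Assumption: the OCR output lists every acronym first and every expansion
    afterwards, in the same order, with no extra lines in between.
    """
    # Drop blank lines and surrounding whitespace left over from the paste.
    items = [line.strip() for line in lines if line.strip()]
    if len(items) % 2 != 0:
        raise ValueError("expected as many acronyms as expansions")
    half = len(items) // 2
    # Join the i-th acronym with the i-th expansion.
    return [f"{acronym} {expansion}"
            for acronym, expansion in zip(items[:half], items[half:])]


sample = """CNRS
CNUCED
CRS
CSFI
Centre national de la recherche scientifique
Conférence des Nations unies sur le commerce et le développement
Catholic Relief Services
Centre for the Study of Financial Innovation"""

for row in pair_columns(sample.splitlines()):
    print(row)
```

If the two columns do not have the same length (for example, because the copy stopped partway through), the function raises an error instead of pairing the wrong lines.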

...
SOLUTION
This solution is only available to members.
For a one-off OCR job I'd make a screenshot image and use the free OCR built into Google Docs (it's the engine they use for their document scanning projects):
https://support.google.com/drive/answer/176692?hl=en

As you've discovered, it's pointless simply trying PDF-to-text tools: they can't translate the Type3 fonts into Unicode, so you'll just get varying shades of gibberish!
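
If you want a quick hint as to whether a PDF uses Type3 fonts before trying a text extractor, one rough approach is to search the raw file bytes for a `/Subtype /Type3` font entry. The Python sketch below is only a heuristic under stated assumptions: `count_type3_fonts` and `check_file` are hypothetical helper names, and the search only sees font dictionaries stored uncompressed, so a count of zero does not prove there are no Type3 fonts.

```python
import re

# In a PDF font dictionary, the font type appears as "/Subtype /Type3"
# (the whitespace between the two names is optional).
TYPE3_PATTERN = re.compile(rb"/Subtype\s*/Type3\b")


def count_type3_fonts(pdf_bytes: bytes) -> int:
    """Count uncompressed Type3 font entries in raw PDF bytes (heuristic)."""
    return len(TYPE3_PATTERN.findall(pdf_bytes))


def check_file(path: str) -> int:
    """Read a PDF from disk and return the heuristic Type3 count."""
    with open(path, "rb") as handle:
        return count_type3_fonts(handle.read())


# Tiny hand-written fragments standing in for real font dictionaries.
type3_fragment = b"<< /Type /Font /Subtype /Type3 /FontBBox [0 0 10 10] >>"
type1_fragment = b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>"

print(count_type3_fonts(type3_fragment))
print(count_type3_fonts(type1_fragment))
```

A non-zero count suggests that plain text extraction is likely to produce gibberish and that OCR is the safer route.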
Hi LLaurent,
It's been 10 days since I documented a solution for you (using free tools only) in this post <http:#a40450183> and I'm wondering if you've had a chance to try it. It worked perfectly here (as the copy/paste at my last post shows), but if you're having any problems getting it to work, please let me know and I'll try to help you through it. Regards, Joe
I've requested that this question be deleted for the following reason:

The question has either no comments or not enough useful information to be called an "answer".
There is definitely enough useful information to be called an "answer". My solution presented in <http:#a40450183> works perfectly. As the asker requested, it uses all free tools — the free IrfanView, the free PlugIns for PDF support, and the free plug-in for OCR, including its free French language dictionary. I ran the proposed solution on the asker's actual file. It worked perfectly — I even did a copy/paste of the results from the asker's actual file into my post. I'd like to know if anyone can explain how this doesn't exactly and precisely answer the question. Thanks, Joe
Agreed

There were two parts to the question - "I want to convert this pdf to text" & "I dont understand why I get no correct characters in text mode"

Looks like both had been completely covered

In fact by #40449715 the asker seems to have found an OCR solution themselves.
MASQ makes an excellent point about <http:#a40449715>, where the asker seems to say that Nitro Pro's PDF Creator capability answers the question (but is too expensive and too big/sophisticated) and so does Foxit PhantomPDF (also an expensive, big, sophisticated product). That's when the asker (in <http:#a40449250>) made it clear that he wants a free product:
do you know any free OCR package
Regards, Joe
Hi eenookami,

I recommend that it be closed by #2, with these specific comment IDs:

(1) http:#a40449113

MASQ had it right when he suggested OCR.

(2) http:#a40450183

I showed a specific solution with all free tools that worked perfectly on the asker's actual document.

Thanks, Joe