Solved

what font behind pdf

Posted on 2016-09-27
32
159 Views
Last Modified: 2016-10-21
is it possible to find out what font is on a pdf, so that it can read legibly on the PC?

in PDF, it is fine, but when you copy it, the characters are gibberish..

thanks.
0
Comment
Question by:25112
32 Comments
 
LVL 83

Expert Comment

by:Dave Baldwin
ID: 41818557
Go to File -> Properties and click on the Fonts tab to see what fonts are being used.
PDF fonts
0
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 41818588
This 5-minute EE video Micro Tutorial should help:
Xpdf - PDFfonts - Command Line Utility to List Fonts Used in a PDF File

Note that Step 10 underneath the video is the same as what Dave posted. Regards, Joe
0
 
LVL 62

Expert Comment

by:☠ MASQ ☠
ID: 41818681
I'm guessing you don't actually have a font, just a graphical representation of one
See the explanation and suggested solutions here:

https://www.experts-exchange.com/questions/28564191/Cannot-convert-this-pdf-to-text.html
0
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 41818698
I'm glad MASQ remembered that thread! About a year after it, I published a 5-minute EE video Micro Tutorial on one of the free OCR tools mentioned in it (PDF-XChange Editor):
How to OCR pages in a PDF with free software

Regards, Joe
0
 
LVL 20

Expert Comment

by:viki2000
ID: 41819188
Could you upload the pdf or at least 1 page of it here?
Then we will tell you.
0
 
LVL 27

Expert Comment

by:tliotta
ID: 41829224
...find out what font is on a pdf...
"On" a .PDF? What do you mean by "on"?

If there's a referenced font in the .PDF, open the .PDF in a text editor and search for {font}.
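
As a rough illustration of the text-search idea above: font dictionaries in a PDF carry a /BaseFont entry, so a byte-level search for that key can list the font names. This is only a sketch. It assumes the font objects are stored uncompressed; many PDFs keep them in compressed streams, in which case the search finds nothing and a tool such as PDFfonts is the better choice. The sample bytes are a made-up fragment, not the contents of any real file.

```python
import re

# Matches font dictionary entries such as "/BaseFont /SSBRRF+TT266t00".
BASEFONT_RE = re.compile(rb"/BaseFont\s*/([^\s/<>\[\]()]+)")


def find_base_fonts(pdf_bytes: bytes) -> list[str]:
    """Return the distinct /BaseFont names visible in the raw bytes of a PDF."""
    names = []
    for match in BASEFONT_RE.finditer(pdf_bytes):
        name = match.group(1).decode("latin-1")
        if name not in names:
            names.append(name)
    return names


# Hypothetical fragment of an uncompressed PDF, for illustration only.
sample = (
    b"11 0 obj << /Type /Font /Subtype /TrueType /BaseFont /SSBRRF+TT266t00 >> endobj\n"
    b"13 0 obj << /Type /Font /Subtype /Type1 /BaseFont /SNDABN+Helvetica >> endobj\n"
)
print(find_base_fonts(sample))
```

For a real file, the same function can be called on the bytes read with open(path, "rb").read().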
0
 
LVL 5

Author Comment

by:25112
ID: 41831702
thanks for your patience.. I had to get permission to upload a page.. please see attached..
what is ideal is to have similar font on PC, so we can copy it and paste in actual font..
10.pdf
0
 
LVL 20

Expert Comment

by:viki2000
ID: 41831738
You did not mention that it is not an English or Western/European font...
What language is that?
Is that Tamil?
0
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 41831760
Acrobat shows this for that file:

10pdf

PDFfonts shows this:
name                type      emb sub uni object ID
------------------- --------- --- --- --- ---------
SSBRRF+TT266t00     TrueType  yes yes no      11  0
SNDABN+Helvetica    Type 1C   yes yes no      13  0
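
In PDFfonts output, the "uni" column reports whether a font carries a ToUnicode map. Both fonts here show "no", which is consistent with copied text coming out as gibberish: the viewer has glyphs to draw but no table telling it which Unicode characters they stand for. The following sketch parses rows like the ones above and flags such fonts. The parse_pdffonts helper is a hypothetical name, not part of Xpdf.

```python
def parse_pdffonts(output: str) -> list[dict]:
    """Parse pdffonts-style rows into dicts, reading each row from the right."""
    fonts = []
    for line in output.splitlines():
        parts = line.split()
        # Skip blank lines, the header row, and the dashed separator.
        if len(parts) < 7 or parts[0] == "name" or parts[0].startswith("-"):
            continue
        # The last five fields are fixed; the type may contain a space ("Type 1C").
        emb, sub, uni, obj_num, gen = parts[-5:]
        fonts.append({
            "name": parts[0],
            "type": " ".join(parts[1:-5]),
            "embedded": emb == "yes",
            "subset": sub == "yes",
            "to_unicode": uni == "yes",
        })
    return fonts


sample = """\
name                type      emb sub uni object ID
------------------- --------- --- --- --- ---------
SSBRRF+TT266t00     TrueType  yes yes no      11  0
SNDABN+Helvetica    Type 1C   yes yes no      13  0
"""

for font in parse_pdffonts(sample):
    if not font["to_unicode"]:
        print(f"{font['name']}: no ToUnicode map, copied text may be unreadable")
```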

Regards, Joe
0
 
LVL 5

Author Comment

by:25112
ID: 41831817
yes, Tamil.

is what is required to download a font called 'TT266t00'?
0
 
LVL 20

Expert Comment

by:viki2000
ID: 41831916
I am curious who is able to provide a Word version of the document.
If I understood right, actually you are not so much interested to know the font type, but rather to be able to open or copy/paste in Word document.
Is that right?

Here is a list with Tamil fonts:
http://kandupidi.com/font_help.php
http://www.tn.nic.in/tamilsw/otf.htm
0
 
LVL 20

Expert Comment

by:viki2000
ID: 41831967
I tried some tricks, but not sure if I got it right because I do not speak Tamil.
It seems a Scripture, Gospel about Jesus.
Just check it out and tell me if I am right.
10_001.jpg.docx
0
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 41832051
@25112
> is what is required to download a font called 'TT266t00'?

Yes, or some other Tamil font and do a font substitution. The good news is that it's text, not an image, so OCR is not needed.

@viki2000
> I am curious who is able to provide a Word version of the document.

Not here, because I'm unwilling to download and install a Tamil font. In my Word 2016, the font in your document shows as Vijaya when I copy/paste a word to a new doc, but as Arima Madurai when I enable editing on it. Can you explain what's going on with those fonts?
0
 
LVL 20

Expert Comment

by:viki2000
ID: 41832561
No, I can't explain now.
The fonts come from internet. They are symbols for those glyphs.

When I enable editing in Word I see Arima Madurai, but if I copy and paste in a new docx then I have Latha.

Here are more fonts:
http://indiatyping.com/index.php/download/hindi-fonts

Identifying Hindi fonts is not as easy as with Latin writing, where you right-click, choose Properties, and read the font type.
Here is some research/methods:
http://airccse.org/journal/ijci/papers/4315ijci02.pdf
http://esatjournals.net/ijret/2014v03/i03/IJRET20140303095.pdf
0
 
LVL 20

Expert Comment

by:viki2000
ID: 41845660
So 25112, what do you say?
0
 
LVL 5

Author Comment

by:25112
ID: 41848965
thanks viki..
i checked..
Control Panel\All Control Panel Items\Fonts\
and see Latha font there already..

i searched for the other one..
and found:
https://github.com/NDISCOVER/Arima-Font/blob/master/fonts/otf/Madurai/ArimaMadurai-Bold.otf
when i put it in
Control Panel\All Control Panel Items\Fonts\
and using Word2010 to open the pdf
it asks 'select the encoding that makes your document readable', with Text Encoding options: WINDOWS, MSDOS, OTHER ENCODING..

can you guide where i have missed?
0
 
LVL 27

Expert Comment

by:tliotta
ID: 41849585
"Encoding" is very different from "font". The two are almost unrelated.

Word will show that message when you try to open a file that isn't in a supported format. PDF isn't a Word DOC file, so Word (2010) doesn't know what to do with it. PDFs are opened by Adobe Acrobat Reader, not by Word.

There are plug-ins for Word (2010) that allow importing of PDFs, or you might use Adobe (or similar product) to export a PDF to a Word DOC or DOCX file.

If you already have some reliable plug-in for opening PDFs with Word 2010, it's also possible that the document actually has an "encoding" problem. If it wasn't created on a similar Windows system, it's possible that it's not even an ASCII-/Unicode-encoded file. For example, it conceivably could be a mainframe EBCDIC-encoded file. If so, then "font" isn't necessarily a critical part of the problem.
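
To make the encoding-versus-font distinction concrete, here is a small sketch using Python's built-in codecs. It assumes cp037 as a representative EBCDIC code page, picked only for illustration; nothing in this thread says the file in question is EBCDIC. The same three characters become different bytes under each encoding, and decoding with the wrong one gives wrong characters before any font is involved.

```python
text = "PDF"

ebcdic_bytes = text.encode("cp037")   # an EBCDIC code page (assumed for illustration)
ascii_bytes = text.encode("ascii")

# Same characters, different bytes, and no font is involved at any point.
print(ebcdic_bytes.hex(), ascii_bytes.hex())

# Decoding with the wrong encoding yields the wrong characters.
wrong = ebcdic_bytes.decode("latin-1")
right = ebcdic_bytes.decode("cp037")
print(repr(wrong), repr(right))
```

A font only decides how an already decoded character is drawn; it cannot repair text that was decoded with the wrong table.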
0
 
LVL 20

Expert Comment

by:viki2000
ID: 41849753
@25112
My question was if you could read the text in Tamil, if you understand the language and if it is a Scripture, Gospel about Jesus.
Do you understand Tamil? Is the text a Scripture, Gospel about Jesus?
I can only tell you how I did to obtain the PDF file in Word format, from where you can easy copy paste without garbage characters, nothing more.
0
 
LVL 5

Author Comment

by:25112
ID: 41849946
>>If you already have some reliable plug-in for opening PDFs with Word 2010, it's also possible that the document actually has an "encoding" problem.
no plug-in, atm..

i don't need to use word2010 for this.. but viki's method has worked.. so would glean from it...
0
 
LVL 5

Author Comment

by:25112
ID: 41849950
>>
if it is a Scripture, Gospel about Jesus.

yes to above (in tamil language- confirmed!)


>>I can only tell you how I did to obtain the PDF file in Word format

thanks. can you guide what steps to take to make this happen.
0
 
LVL 20

Accepted Solution

by:
viki2000 earned 500 total points
ID: 41849976
It is not easy, rather ugly long, but I could not find a better method free:
- I took your original pdf file and converted it to a high-resolution image, 400-600 dpi.
- Then I uploaded the image to Google Drive.
- Then right-click the image and open it with Google Docs.
- Then save it as a Word .docx on your PC.

Basically I bypass the existing text in the pdf file, with its own Tamil symbols and encoding, and use Google's OCR engine to get clean, recognizable fonts/symbols with a known encoding.
Try it yourself with another pdf and see if it works for you too.
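
The first step (pdf to high-resolution image) can be scripted. The sketch below assumes Poppler's pdftoppm command-line tool is installed, which is not necessarily what was used above; its -r option sets the resolution in dpi and -png selects PNG output. The file name 10.pdf comes from the upload in this thread, and the "page" output prefix is a hypothetical choice.

```python
import shutil
import subprocess


def build_pdftoppm_command(pdf_path: str, out_prefix: str, dpi: int = 400) -> list[str]:
    """Build a pdftoppm command that renders every page to PNG at the given dpi."""
    return ["pdftoppm", "-r", str(dpi), "-png", pdf_path, out_prefix]


def render_pages(pdf_path: str, out_prefix: str, dpi: int = 400) -> None:
    """Run the conversion, failing clearly if pdftoppm is not installed."""
    if shutil.which("pdftoppm") is None:
        raise RuntimeError("pdftoppm not found; install Poppler first")
    subprocess.run(build_pdftoppm_command(pdf_path, out_prefix, dpi), check=True)


# Show the command that would be run, without running it.
print(" ".join(build_pdftoppm_command("10.pdf", "page", dpi=400)))
```

Calling render_pages("10.pdf", "page") would then write one PNG per page, ready to upload to Google Drive.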
0
 
LVL 5

Author Comment

by:25112
ID: 41850167
thanks..
for Google Drive, and Google docs, all you need is a gmail account or more?
0
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 41850232
> for Google Drive, and Google docs, all you need is a gmail account or more?

No, you don't need a Gmail account. Any email account is fine. What you need is a Google Account, not Google Mail (Google Account works with a Google Mail account, but also works with any email account). You may create a Google Account (free!) here:
https://accounts.google.com/SignUp

Regards, Joe
0
 
LVL 5

Author Comment

by:25112
ID: 41853810
thanks to viki for the unconventional easy solution!

thanks to all who assisted..
0
 
LVL 5

Author Comment

by:25112
ID: 41853820
you had said:
>>It is not easy, rather ugly long, but I could not find a better method free:

what may be an alternative solution that would be simpler to use (for less tech savvy people in other developing countries.. to derive the same end result) ?
0
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 41853911
> thanks to all who assisted

You're welcome. Happy to help. Regards, Joe
0
 
LVL 20

Expert Comment

by:viki2000
ID: 41854045
I do not know right now what other software will work.
I tried ABBYY FineReader, but it does not support Tamil.
I am thinking about how you can accelerate/automate the tasks.
The main disadvantage of the proposed method is that it basically processes one page at a time.
If you have a pdf with one page, you get one picture.
If you have a pdf document with many pages, you need a program to convert all the pages into separate pictures, one picture per page. I used Nitro PDF for that and could convert all pages of the pdf document into individual pictures. Then you have to upload them. You may put the original pdf and the resulting pictures in the same dedicated folder. In the upload dialog, press CTRL+A to select all, then CTRL+click the pdf to deselect it. The pictures are then uploaded automatically, one by one.
But then you run into trouble opening each picture in Google Docs and saving it, one by one. That takes time. It is no longer a batch operation. This seems to be the bottleneck.
Once you have them back on your PC as .docx, you can merge all the .docx files into a single one.
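
One small pitfall when merging: a plain alphabetical sort puts page 10 before page 2. A minimal sketch of ordering the per-page files by page number first, assuming hypothetical file names such as page-1.docx where the last number in the name is the page:

```python
import re


def page_number(filename: str) -> int:
    """Return the last run of digits before the extension as the page number."""
    stem = filename.rsplit(".", 1)[0]
    digits = re.findall(r"\d+", stem)
    if not digits:
        raise ValueError(f"no page number in {filename!r}")
    return int(digits[-1])


def in_page_order(filenames: list[str]) -> list[str]:
    """Sort file names by their numeric page number, not alphabetically."""
    return sorted(filenames, key=page_number)


names = ["page-10.docx", "page-2.docx", "page-1.docx"]
print(in_page_order(names))
```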
0
 
LVL 5

Author Comment

by:25112
ID: 41854059
thanks for your review with FineReader..

so at the moment we have only one sure (google) solution- for 1 page or 10, right? (for the language in question)
0
 
LVL 20

Expert Comment

by:viki2000
ID: 41854084
I guess so.
If I find any other method/program I will let you know.
0
 
LVL 5

Author Comment

by:25112
ID: 41854096
thank u indeed.
0
 
LVL 62

Expert Comment

by:☠ MASQ ☠
ID: 41854103
So in summary the solution was to capture an image and then use OCR :)
0
 
LVL 5

Author Comment

by:25112
ID: 41854105
good conclusion, MASQ.. but seems like the regular OCR we may have in common PCs was NOT up to par, and google seems to have enough tools to handle a lot..!!
0

Question has a verified solution.
