  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 233

what font behind pdf

Is it possible to find out what font is used in a PDF, so that the text can be read legibly on the PC?

In the PDF it looks fine, but when you copy it, the characters are gibberish.

thanks.
0
1 Solution
 
Dave BaldwinFixer of ProblemsCommented:
Go to File -> Properties and click on the Fonts tab to see what fonts are being used.
PDF fonts
0
 
Joe Winograd, EE MVE 2015&2016DeveloperCommented:
This 5-minute EE video Micro Tutorial should help:
Xpdf - PDFfonts - Command Line Utility to List Fonts Used in a PDF File

Note that Step 10 underneath the video is the same as what Dave posted. Regards, Joe
0
 
☠ MASQ ☠Commented:
I'm guessing you don't actually have a font, just a graphical representation of one
See the explanation and suggested solutions here:

https://www.experts-exchange.com/questions/28564191/Cannot-convert-this-pdf-to-text.html
0
 
Joe Winograd, EE MVE 2015&2016DeveloperCommented:
I'm glad MASQ remembered that thread! About a year after it, I published a 5-minute EE video Micro Tutorial on one of the free OCR tools mentioned in it (PDF-XChange Editor):
How to OCR pages in a PDF with free software

Regards, Joe
0
 
viki2000Commented:
Could you upload the pdf or at least 1 page of it here?
Then we will tell you.
0
 
tliottaCommented:
...find out what font is on a pdf...
"On" a .PDF? What do you mean by "on"?

If there's a referenced font in the .PDF, open the .PDF in a text editor and search for {font}.
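
For anyone who would rather script that text-editor search, here is a minimal Python sketch. It scans the raw bytes of a PDF for /BaseFont entries, the key the PDF format uses to name a font. It only sees uncompressed font dictionaries, so fonts stored inside compressed object streams will be missed. The function name and the sample bytes are made up for illustration.

```python
import re
from typing import List

# /BaseFont is followed by a PDF name such as /SSBRRF+TT266t00.
_BASEFONT = re.compile(rb"/BaseFont\s*/([^\s/<>\[\]()%{}]+)")


def base_font_names(pdf_bytes: bytes) -> List[str]:
    """Return the distinct /BaseFont names found in uncompressed PDF data."""
    names: List[str] = []
    for match in _BASEFONT.finditer(pdf_bytes):
        name = match.group(1).decode("latin-1")
        if name not in names:
            names.append(name)
    return names


# Hypothetical fragment of a PDF body, used only for demonstration.
SAMPLE = (
    b"<< /Type /Font /Subtype /TrueType /BaseFont /SSBRRF+TT266t00 >>\n"
    b"<< /Type /Font /Subtype /Type1 /BaseFont /SNDABN+Helvetica >>\n"
)

print(base_font_names(SAMPLE))
```

On a real file you would pass the file contents instead of SAMPLE, for example base_font_names(open("10.pdf", "rb").read()). The six letters before the "+" are a subset tag, not part of the font's real name.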
0
 
25112Author Commented:
thanks for your patience.. I had to get permission to upload a page.. please see attached..
what is ideal is to have similar font on PC, so we can copy it and paste in actual font..
10.pdf
0
 
viki2000Commented:
You did not mention that it is not an English or Western/European font...
What language is that?
Is that Tamil?
0
 
Joe Winograd, EE MVE 2015&2016DeveloperCommented:
Acrobat shows this for that file:

[Screenshot: 10pdf]

PDFfonts shows this:
name                type      emb sub uni object ID
------------------- --------- --- --- --- ---------
SSBRRF+TT266t00     TrueType  yes yes no      11  0
SNDABN+Helvetica    Type 1C   yes yes no      13  0


Regards, Joe
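
A possible way to read that PDFfonts listing from a script is sketched below in Python. The "uni" column reports whether a font carries a ToUnicode map, and a "no" there is a likely (though not certain) reason why copied text comes out as gibberish. The parser assumes the column layout shown above (newer builds may print extra columns), and the class and function names are invented for this example.

```python
import subprocess
from typing import List, NamedTuple


class FontRow(NamedTuple):
    name: str
    font_type: str
    embedded: bool
    subset: bool
    has_to_unicode: bool


def parse_pdffonts(output: str) -> List[FontRow]:
    """Parse pdffonts text output in the layout: name type emb sub uni object ID."""
    rows: List[FontRow] = []
    for line in output.splitlines():
        tokens = line.split()
        # Skip blank lines, the header line and the dashed rule.
        if len(tokens) < 7 or tokens[0] == "name" or tokens[0].startswith("---"):
            continue
        emb, sub, uni = tokens[-5:-2]
        if not all(flag in ("yes", "no") for flag in (emb, sub, uni)):
            continue  # Not a font row in the assumed layout.
        # The type may contain a space ("Type 1C"), so take everything between
        # the name and the three yes/no flags.
        font_type = " ".join(tokens[1:-5])
        rows.append(FontRow(tokens[0], font_type, emb == "yes", sub == "yes", uni == "yes"))
    return rows


def run_pdffonts(pdf_path: str) -> List[FontRow]:
    """Run the pdffonts command (must be installed and on PATH) and parse its output."""
    result = subprocess.run(["pdffonts", pdf_path], capture_output=True, text=True, check=True)
    return parse_pdffonts(result.stdout)


# The listing posted above, used as sample input.
SAMPLE = """\
name                type      emb sub uni object ID
------------------- --------- --- --- --- ---------
SSBRRF+TT266t00     TrueType  yes yes no      11  0
SNDABN+Helvetica    Type 1C   yes yes no      13  0
"""

for row in parse_pdffonts(SAMPLE):
    if not row.has_to_unicode:
        print(f"{row.name}: no ToUnicode map, copied text may not be readable")
```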
0
 
25112Author Commented:
yes, Tamil.

is what is required to download a font called 'TT266t00'?
0
 
viki2000Commented:
I am curious who is able to provide a Word version of the document.
If I understood right, you are actually not so much interested in knowing the font type, but rather in being able to open the text or copy/paste it into a Word document.
Is that right?

Here is a list with Tamil fonts:
http://kandupidi.com/font_help.php
http://www.tn.nic.in/tamilsw/otf.htm
0
 
viki2000Commented:
I tried some tricks, but not sure if I got it right because I do not speak Tamil.
It seems a Scripture, Gospel about Jesus.
Just check it out and tell me if I am right.
10_001.jpg.docx
0
 
Joe Winograd, EE MVE 2015&2016DeveloperCommented:
@25112
> is what is required to download a font called 'TT266t00'?

Yes, or some other Tamil font and do a font substitution. The good news is that it's text, not an image, so OCR is not needed.

@viki2000
> I am curious who is able to provide a Word version of the document.

Not here, because I'm unwilling to download and install a Tamil font. In my Word 2016, the font in your document shows as Vijaya when I copy/paste a word to a new doc, but as Arima Madurai when I enable editing on it. Can you explain what's going on with those fonts?
0
 
viki2000Commented:
No, I can't explain it right now.
The fonts come from the internet. They are symbols for those glyphs.

When I enable editing in Word I see Arima Madurai, but if I copy and paste in a new docx then I have Latha.

Here are more fonts:
http://indiatyping.com/index.php/download/hindi-fonts

Identifying the Hindi fonts is not as easy as with Latin writing, where you right-click, choose Properties and read the font type.
Here is some research/methods:
http://airccse.org/journal/ijci/papers/4315ijci02.pdf
http://esatjournals.net/ijret/2014v03/i03/IJRET20140303095.pdf
0
 
viki2000Commented:
So 25112, what do you say?
0
 
25112Author Commented:
thanks viki..
i checked..
Control Panel\All Control Panel Items\Fonts\
and see Latha font there already..

i searched for the other one..
and found:
https://github.com/NDISCOVER/Arima-Font/blob/master/fonts/otf/Madurai/ArimaMadurai-Bold.otf
when I put it in
Control Panel\All Control Panel Items\Fonts\
and use Word 2010 to open the pdf,
it asks 'select the encoding that makes your document readable': Text Encoding.. WINDOWS, MSDOS, OTHER ENCODING..

can you tell me what I have missed?
0
 
tliottaCommented:
"Encoding" is very different from "font". The two are almost unrelated.

Word will show that message when you try to open a file that isn't in a supported format. PDF isn't a Word DOC file, so Word (2010) doesn't know what to do with it. PDFs are opened by Adobe Acrobat Reader, not by Word.

There are plug-ins for Word (2010) that allow importing of PDFs, or you might use Adobe (or similar product) to export a PDF to a Word DOC or DOCX file.

If you already have some reliable plug-in for opening PDFs with Word 2010, it's also possible that the document actually has an "encoding" problem. If it wasn't created on a similar Windows system, it's possible that it's not even an ASCII-/Unicode-encoded file. For example, it conceivably could be a mainframe EBCDIC-encoded file. If so, then "font" isn't necessarily a critical part of the problem.
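
To make the encoding-versus-font distinction concrete, here is a small Python sketch. An encoding decides which characters a sequence of bytes stands for; a font only decides how those characters are drawn. The sample strings and variable names are chosen for illustration, and cp037 is used as one example of an EBCDIC code page.

```python
# The word "Tamil" written in Tamil script, given as escapes so the
# source file itself stays plain ASCII.
TEXT = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"

# The same characters stored as UTF-8 bytes.
UTF8_BYTES = TEXT.encode("utf-8")

# Decoding with the right encoding gives the text back.
RIGHT = UTF8_BYTES.decode("utf-8")

# Decoding the same bytes with the wrong encoding gives gibberish,
# whatever font is installed.
WRONG = UTF8_BYTES.decode("latin-1")

# A mainframe EBCDIC code page maps even plain English to other byte values.
EBCDIC_BYTES = "Hello".encode("cp037")

print(repr(RIGHT))
print(repr(WRONG))
print(EBCDIC_BYTES, "vs ASCII", "Hello".encode("ascii"))
```

Installing a different font changes nothing in this sketch, which is the point: the bytes have to be interpreted with the correct encoding before any font can display them properly.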
0
 
viki2000Commented:
@25112
My question was if you could read the text in Tamil, if you understand the language and if it is a Scripture, Gospel about Jesus.
Do you understand Tamil? Is the text a Scripture, Gospel about Jesus?
I can only tell you how I did to obtain the PDF file in Word format, from where you can easily copy and paste without garbage characters, nothing more.
0
 
25112Author Commented:
>>If you already have some reliable plug-in for opening PDFs with Word 2010, it's also possible that the document actually has an "encoding" problem.
no plug-in, atm..

i don't need to use word2010 for this.. but viki's method has worked.. so I would like to learn from it...
0
 
25112Author Commented:
>>
if it is a Scripture, Gospel about Jesus.

yes to above (in tamil language- confirmed!)


>>I can only tell you how I did to obtain the PDF file in Word format

thanks. can you guide what steps to take to make this happen.
0
 
viki2000Commented:
It is not easy, rather ugly long, but I could not find a better method free:
- I took your original pdf file and converted it to a high-resolution image, 400-600 dpi.
- Then I uploaded the image to Google Drive.
- Then right-click on the image and open it with Google Docs.
- Then save it as a Word .docx on your PC.

Basically I bypass the existing pdf file, with its own Tamil symbols and encoding, and use Google's OCR engine to get clean, recognizable fonts/symbols with a known encoding.
Try it yourself with another pdf and see if it works for you too.
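
The first step (pdf to high-resolution image) can be scripted. The sketch below is one way to do it in Python, assuming the pdftoppm tool from Poppler is installed; the post above does not say which converter was used, so that choice is an assumption, as are the function names. The Google Drive and Google Docs steps are still done by hand.

```python
import subprocess
from typing import List


def render_command(pdf_path: str, out_prefix: str, dpi: int = 400) -> List[str]:
    """Build a pdftoppm command that renders every page to PNG at the given dpi."""
    # -r sets the resolution in dpi, -png selects PNG output.
    return ["pdftoppm", "-r", str(dpi), "-png", pdf_path, out_prefix]


def render_pages(pdf_path: str, out_prefix: str, dpi: int = 400) -> None:
    """Run pdftoppm; output files are named <out_prefix>-<page number>.png."""
    subprocess.run(render_command(pdf_path, out_prefix, dpi), check=True)


# Show the command that would be run for the attached file.
print(" ".join(render_command("10.pdf", "page")))
```

Calling render_pages("10.pdf", "page") writes one PNG per page next to the script, ready to upload.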
0
 
25112Author Commented:
thanks..
for Google Drive, and Google docs, all you need is a gmail account or more?
0
 
Joe Winograd, EE MVE 2015&2016DeveloperCommented:
> for Google Drive, and Google docs, all you need is a gmail account or more?

No, you don't need a Gmail account. Any email account is fine. What you need is a Google Account, not Google Mail (Google Account works with a Google Mail account, but also works with any email account). You may create a Google Account (free!) here:
https://accounts.google.com/SignUp

Regards, Joe
0
 
25112Author Commented:
thanks to viki for the unconventional easy solution!

thanks to all who assisted..
0
 
25112Author Commented:
you had said:
>>It is not easy, rather ugly long, but I could not find a better method free:

what may be an alternative solution that would be simpler to use (for less tech savvy people in other developing countries.. to derive the same end result) ?
0
 
Joe Winograd, EE MVE 2015&2016DeveloperCommented:
> thanks to all who assisted

You're welcome. Happy to help. Regards, Joe
0
 
viki2000Commented:
I do not know right now what software will work.
I tried ABBYY FineReader, but it does not support Tamil.
I am thinking about how you can accelerate/automate the tasks.
The main disadvantage of the proposed method is that it basically processes one page at a time.
You have a pdf with one page, then you get one picture.
If you have a pdf document with many pages, then you need a program to convert all the pages into separate pictures, as many pictures as there are pages in the pdf. I used Nitro PDF to convert all pages of the pdf document into individual pictures. Then you have to upload them. You may put the original pdf and the resulting pictures in the same dedicated folder. Then at upload you press CTRL+A to select all, then CTRL+click the pdf to deselect it. The pictures are then uploaded automatically, one by one.
But then you are stuck with opening each one in Google Docs and saving them one by one. That takes time. It is no longer a batch operation. This seems to be the bottleneck.
Once you have them back on your PC as .docx, you can merge all the .docx files into a single one.
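
The "select everything except the pdf" step can also be done with a few lines of Python. This sketch lists the page pictures in a folder in page order and leaves out the pdf itself, so that page 10 does not sort before page 2. The file names and the set of image extensions are assumptions for illustration.

```python
import re
from pathlib import Path
from typing import List, Union

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def natural_key(path: Path) -> List[Union[int, str]]:
    """Sort key that compares digit runs as numbers, so 2 comes before 10."""
    parts = re.split(r"(\d+)", path.name)
    return [int(part) if part.isdigit() else part.lower() for part in parts]


def page_images(folder: str) -> List[Path]:
    """Return the image files in a folder in page order, ignoring the pdf."""
    files = [
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    ]
    return sorted(files, key=natural_key)


# Demonstration on hypothetical file names, without touching the disk.
names = [Path("10_10.png"), Path("10_2.png"), Path("10_1.png")]
print([p.name for p in sorted(names, key=natural_key)])
```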
0
 
25112Author Commented:
thanks for your review with FineReader..

so at the moment we have only one sure (google) solution- for 1 page or 10, right? (for the language in question)
0
 
viki2000Commented:
I guess so.
If I find any other method/program I will let you know.
0
 
25112Author Commented:
thank u indeed.
0
 
☠ MASQ ☠Commented:
So in summary the solution was to capture an image and then use OCR :)
0
 
25112Author Commented:
good conclusion, MASQ.. but seems like the regular OCR we may have in common PCs was NOT up to par, and google seems to have enough tools to handle a lot..!!
0
