Solved

Converting a PDF table to Excel

Posted on 2014-08-30
Last Modified: 2014-09-02
Which tools can help me convert a PDF that contains information formatted as tables to Excel format?
Would such a tool be considered 100% reliable, without the risk of erasing empty columns or moving content to a row or column that does not match the original PDF?
My PDF files contain tables whose columns all have titles, but not every column has data; in some cases columns A, B and D contain information while column C contains only the column header.

best regards
smile be brave
Question by:smilebebrave
 
LVL 57

Assisted Solution

by:Joe Winograd, EE MVE 2015&2016
Joe Winograd, EE MVE 2015&2016 earned 1336 total points
ID: 40294901
For PDF to Excel, I've had excellent results with this free online tool:
http://www.pdftoexcel.org/

It does a good (but not perfect) job of maintaining the formatting, which is always the trick with any PDF to Excel (or Word) conversion. I don't know if it will work well on your particular PDFs, but it's worth a (free!) shot. If you do like it and would prefer a local install rather than the online tool, it is available for purchase and download (and it has a 7-day free trial):
http://www.investintech.com/prod_downloadsa2e.htm

Another product worth trying is A-PDF to Excel:
http://www.a-pdf.com/to-excel/index.htm

It's not free, but is reasonably priced at $39 USD, and it offers a free trial so you can see if it works on your PDFs before buying it.
"would this tool be considered 100% reliable..."

The answer is NO. I'm not aware of any PDF to Excel (or Word) conversion tool that is 100% reliable, and I've tried lots of them. Some are very good, but not perfect. After doing the conversion, you'll need to open the spreadsheet in Excel and fix the conversion errors.

Regards, Joe
 
LVL 46

Assisted Solution

by:aikimark
aikimark earned 664 total points
ID: 40294956
You might also use a utility like pdftotext to extract a formatted version of the PDF tables and then import them.  You might also try a different output format to see if it is easier to parse or import into Excel.
http://www.foolabs.com/xpdf/download.html
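
As a rough illustration of the "extract, then import" idea, here is a minimal Python sketch. It assumes the extracted text keeps each column at a fixed character position under its title and that titles are separated by at least two spaces; real pdftotext output may not line up this neatly (right-aligned numbers, for example, can start to the left of their title). The function names are made up for this example. Each line is sliced at the header offsets, so a column that has a title but no data (like column C in the question) comes through as an empty cell instead of disappearing.

```python
import csv
import io
import re


def column_starts(header_line):
    """Return the character offset where each column title begins."""
    # A title may contain single spaces; two or more spaces end it.
    return [m.start() for m in re.finditer(r"\S+(?: \S+)*", header_line)]


def split_row(line, starts):
    """Slice one text line at the header offsets, keeping empty cells."""
    bounds = starts[1:] + [None]
    return [line[a:b].strip() for a, b in zip(starts, bounds)]


def layout_text_to_csv(text):
    """Convert column-aligned text (header on the first line) to CSV text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    starts = column_starts(lines[0])
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for ln in lines:
        writer.writerow(split_row(ln, starts))
    return out.getvalue()
```

Save the returned text to a .csv file and open it in Excel. For a four-column table whose third column is empty, the data rows come out with two consecutive commas (for example 10,x,,4.5), so column D stays in column D.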
 
LVL 46

Expert Comment

by:aikimark
ID: 40294959
What software languages do you use or are already installed?
 
LVL 57

Accepted Solution

by:Joe Winograd, EE MVE 2015&2016
Joe Winograd, EE MVE 2015&2016 earned 1336 total points
ID: 40294969
That's an interesting suggestion by aikimark, but I think you'll find that pdftotext won't maintain the spreadsheet layout as well as you would want. However, if you'd like to try it, here's a 5-minute EE video Micro Tutorial that explains how to download and install the Xpdf library:
http://www.experts-exchange.com/Web_Development/Document_Imaging/VP_213.html

And here's another 5-minute EE video Micro Tutorial that specifically discusses and demonstrates pdftotext:
http://www.experts-exchange.com/Web_Development/Document_Imaging/VP_217.html

And here's an article that shows how to use the command line call with the -layout option, which is what you'll want to try:
http://www.experts-exchange.com/Software/Misc/A_11173-How-To-Rename-Move-a-Batch-of-PDF-Files-Based-on-Contents-of-the-Files.html
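
If you would rather script the call than type it, something along these lines should work. It is only a sketch: it assumes pdftotext is installed and on your PATH, and the helper names are invented for this example. The command it builds is the plain "pdftotext -layout input.pdf output.txt" form.

```python
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path, txt_path):
    """Build the argument list for: pdftotext -layout input.pdf output.txt"""
    return ["pdftotext", "-layout", str(pdf_path), str(txt_path)]


def convert_with_layout(pdf_path):
    """Run pdftotext on one PDF and return the path of the text file written."""
    # The output file sits next to the PDF, with a .txt extension.
    txt_path = Path(pdf_path).with_suffix(".txt")
    # check=True raises CalledProcessError if pdftotext exits with an error.
    subprocess.run(pdftotext_command(pdf_path, txt_path), check=True)
    return txt_path
```

Calling convert_with_layout("report.pdf") (a hypothetical file name) would write report.txt in the same folder, which you can then inspect to see how well the columns survived.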

Although the -layout option tells pdftotext to maintain, as much as possible, the original physical layout of the text, I doubt that it's going to do what you want. I think you'll be better off with the other options I mentioned, but it can't hurt to try pdftotext, and it's free (for personal use).

Regards, Joe
 
LVL 46

Expert Comment

by:aikimark
ID: 40294972
Thanks, Joe.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

When you see single cell contains number and text, and you have to get any date out of it seems like cracking our heads.
Steps to fix error: “Couldn’t mount the database that you specified. Specified database: HU-DB; Error code: An Active Manager operation fail”
Simple Linear Regression
Starting up a Project

579 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question