curiouswebster

Extracting a CSV from a PDF

I need my website to extract the data from a PDF and generate a CSV file. I hope to do this on the front end, inside the client browser, but if required I could do the extraction on the back end.

The PDF would be a monthly merchant credit card statement. The data I would extract to a CSV would be its numerous transactions.

What web technology can do this without human intervention?

Thanks.
Shell Scripting, Adobe Acrobat, Programming Languages-Other, Adobe Creative Suite CS, Scripting Languages

SOLUTION
David Favor
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
SOLUTION
David Favor
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
curiouswebster
ASKER

Is the human review step because of potential formatting issues? Otherwise, since this does not involve converting images to text, how might an error be introduced?
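
For context on how an error could arise without OCR: a text-based PDF stores positioned runs of characters rather than table rows, so an extractor can still merge columns, split a wrapped description, or drop a row at a page break. One automated sanity check is sketched below. It assumes the statement prints a total and that the amounts have already been extracted as strings; the helper name `reconcile` is hypothetical and not taken from any answer in this thread.

```python
from decimal import Decimal

def reconcile(amounts, stated_total):
    """Compare the sum of extracted amounts with the total printed on the statement.

    Returns (matches, difference). Decimal avoids float rounding on currency.
    """
    extracted_total = sum((Decimal(a) for a in amounts), Decimal("0"))
    difference = extracted_total - Decimal(stated_total)
    return difference == 0, difference

# All rows extracted: the totals agree.
ok, diff = reconcile(["1250.00", "-40.25"], "1209.75")

# One row lost during extraction: the mismatch flags the file for review.
bad_ok, bad_diff = reconcile(["1250.00"], "1209.75")
```

A mismatch does not say which row is wrong, only that the file should not be accepted unreviewed.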
ASKER CERTIFIED SOLUTION
David Favor
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
Maggie JY

You can extract data from a PDF form when you have the right PDF tools. Here are two samples.

1. Export form data into Excel (CSV); see the screenshot:
https://pdfimages.wondershare.com/images/vis-2016/form-field-extraction.gif

2. Export data from scanned PDFs; see the screenshot:
https://pdfimages.wondershare.com/images/vis-2016/Scanned-document.gif

With this you can extract PDF form data from hundreds of identical forms into a single Excel sheet within seconds.
If your files are scanned PDFs, see the second screenshot. OCR technology can convert piles of paper documents into Office files; you can then apply the same data extraction rules to hundreds of scanned PDFs with an identical layout and export all the data into one spreadsheet.
Here is the full guide:
extract data
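
For anyone scripting this rather than using a desktop tool, the "same extraction rule applied to an identical layout" idea can be sketched in a few lines. This is only an illustration under stated assumptions: the statement text has already been pulled out of the PDF as plain lines by some extraction library, and each transaction line looks like `MM/DD/YYYY  description  amount`. That layout, the sample rows, and the function name `statement_lines_to_csv` are all hypothetical; a real statement will need its own pattern.

```python
import csv
import io
import re
from decimal import Decimal

# Hypothetical line layout: "MM/DD/YYYY  description  1,234.56" (amount may be negative).
TRANSACTION = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4})\s+(?P<description>.+?)\s+(?P<amount>-?[\d,]+\.\d{2})$"
)

def statement_lines_to_csv(lines):
    """Return CSV text containing only the lines that look like transactions."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["date", "description", "amount"])
    for line in lines:
        match = TRANSACTION.match(line.strip())
        if match is None:
            continue  # skip headers, totals, page footers and similar lines
        # Drop thousands separators so the amount is a plain number in the CSV.
        amount = Decimal(match["amount"].replace(",", ""))
        writer.writerow([match["date"], match["description"], str(amount)])
    return out.getvalue()

# Made-up sample lines standing in for text extracted from a statement.
sample = [
    "MERCHANT STATEMENT",
    "03/01/2024  VISA SALE 4821  1,250.00",
    "03/02/2024  REFUND, CARD 1177  -40.25",
    "Total  1,209.75",
]
csv_text = statement_lines_to_csv(sample)
```

The `csv` module quotes any description that contains a comma, so the output stays valid even when merchant names do.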