Solved

Need a script to read pdf document info.

Posted on 2001-07-31
Medium Priority
958 Views
Last Modified: 2008-02-01
I currently have a folder on my intranet server (IIS 4) containing many PDF files. Can anyone provide me with a "property reader" for PDF, similar to the DSOLEFILE.DLL provided by Microsoft for use with MS Office documents?

http://msdn.microsoft.com/library/periodic/period00/fso.htm

Used in conjunction with the FileSystemObject, that DLL lets you trawl through folders, read each document's properties and generate HTML and hyperlinks on the fly.

Basically I want dumb users to drop their PDFs into a folder on the server, and have an ASP page or similar read each file's properties and display Author, Subject and Title (with hyperlink). I've looked at adobe.com but can't find anything suitable.
Question by:devlinb
7 Comments
 
LVL 19

Expert Comment

by:webwoman
ID: 6338061
You're not going to find anything on Adobe's site, because this has nothing to do with PDF -- and everything to do with the SERVER.

You need to set up a form for the user to upload their files. You won't have a whole lot of control over what or how they upload, though you certainly could write something that ran on the server and deleted anything that didn't meet your specs (for file type/size).

That dll is specifically designed to work with IIS -- it's not going to work with Apache, or on a UNIX box. It works with MS stuff because MS wrote it.
 
LVL 5

Expert Comment

by:raizon
ID: 6338075
I believe the point of the question was finding some way to read the properties of the PDF files dynamically to display the Author, Subject and Title of the file.

What I would do is

1.  In my upload form I would have text fields for the Author, Subject and Title of the PDF.

2.  Create a DB with a table to hold that information and relate that table to another one that holds the path to the file that was uploaded.

3.  When reading through the directory with the FileSystemObject, query the DB to get the information and build your page based on that.
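
A minimal sketch of the three steps above, written in Python with SQLite rather than ASP and the FileSystemObject. It is only an illustration: the table name `pdf_info`, the function names and the `base_url` parameter are all made up for the example, and it collapses the two related tables from step 2 into a single table keyed by file name.

```python
import html
import os
import sqlite3
from urllib.parse import quote


def create_schema(conn):
    # Hypothetical table: one row per uploaded PDF, keyed by file name.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pdf_info ("
        "filename TEXT PRIMARY KEY, author TEXT, subject TEXT, title TEXT)"
    )


def record_upload(conn, filename, author, subject, title):
    # Step 1 and 2: store the values the user typed into the upload form.
    conn.execute(
        "INSERT OR REPLACE INTO pdf_info VALUES (?, ?, ?, ?)",
        (filename, author, subject, title),
    )


def build_listing(conn, folder, base_url):
    # Step 3: walk the folder and emit one HTML list item per PDF on disk.
    items = []
    for name in sorted(os.listdir(folder)):
        if not name.lower().endswith(".pdf"):
            continue
        row = conn.execute(
            "SELECT author, subject, title FROM pdf_info WHERE filename = ?",
            (name,),
        ).fetchone()
        # Files dropped in without going through the form have no row.
        author, subject, title = row if row else ("unknown", "unknown", name)
        link = f'<a href="{base_url}/{quote(name)}">{html.escape(title)}</a>'
        items.append(
            f"<li>{link} by {html.escape(author)} ({html.escape(subject)})</li>"
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

The obvious trade-off is that files copied straight into the folder, bypassing the form, show up with placeholder values only.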

Raizon
 
LVL 1

Accepted Solution

by:
coreyti earned 300 total points
ID: 6338141
There is a Perl module that can take care of this stuff if you're able to use Perl for your project.

Check out the PDF::Parse library at:
http://search.cpan.org/doc/ANTRO/PDF-111/PDF/Parse.pm

-corey
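
For anyone who cannot use Perl, here is a rough Python sketch of the same idea. It is not how PDF::Parse works internally (that has not been checked here); it just follows the `/Info` reference in the file to the document info dictionary and pulls out literal strings. All function names are invented for the example. It assumes an unencrypted file whose info object is stored uncompressed, it takes the first `/Info` reference it finds, and it ignores hex strings and octal escapes.

```python
import re


def read_literal_string(data, start):
    # Parse a PDF literal string whose opening "(" is at data[start].
    # Balanced parentheses are kept; only \( \) and \\ are unescaped,
    # other backslash sequences are left in the output unchanged.
    depth = 0
    out = bytearray()
    i = start
    while i < len(data):
        c = data[i:i + 1]
        if c == b"\\" and i + 1 < len(data):
            nxt = data[i + 1:i + 2]
            out += nxt if nxt in (b"(", b")", b"\\") else c + nxt
            i += 2
            continue
        if c == b"(":
            depth += 1
            if depth > 1:
                out += c
        elif c == b")":
            depth -= 1
            if depth == 0:
                return bytes(out)
            out += c
        else:
            out += c
        i += 1
    return None  # unterminated string


def decode_pdf_text(raw):
    # Text strings are UTF-16BE with a byte-order mark, or PDFDocEncoding;
    # Latin-1 is used here as an approximation of the latter.
    if raw.startswith(b"\xfe\xff"):
        return raw[2:].decode("utf-16-be", errors="replace")
    return raw.decode("latin-1")


def find_info_object(data):
    # Follow the first "/Info N G R" reference to the body of "N G obj".
    ref = re.search(rb"/Info\s+(\d+)\s+(\d+)\s+R", data)
    if not ref:
        return None
    pattern = (rb"(?<!\d)" + ref.group(1) + rb"\s+" + ref.group(2)
               + rb"\s+obj(.*?)endobj")
    obj = re.search(pattern, data, re.S)
    return obj.group(1) if obj else None


def read_doc_info(data, keys=("Title", "Author", "Subject")):
    # Return whichever of the requested keys are present as literal strings.
    body = find_info_object(data)
    info = {}
    if body is None:
        return info
    for key in keys:
        match = re.search(rb"/" + key.encode("ascii") + rb"\s*\(", body)
        if match:
            raw = read_literal_string(body, match.end() - 1)
            if raw is not None:
                info[key] = decode_pdf_text(raw)
    return info
```

Typical use would be `read_doc_info(open(path, "rb").read())` for each file in the folder. A proper PDF library is the safer choice for anything beyond simple files.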
 
LVL 2

Author Comment

by:devlinb
ID: 6355247
Thanks coreyti,
This script works fine, except for PDFs which have any security built in. When security is added to a PDF the document info is encrypted in some way and displays as garbage. Is there any way to get around this?
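
The garbage is consistent with how PDF security works: a secured file carries an `/Encrypt` entry in its trailer, and the strings in the file, including those in the document info dictionary, are stored encrypted. The following Python sketch does not decrypt anything; it only detects such files so that a listing page can flag them instead of printing garbage. The function names are made up for the example, and the byte search is a heuristic, not a real parser.

```python
import re


def is_encrypted(data):
    # Heuristic: an /Encrypt key followed by an indirect reference ("5 0 R")
    # or an inline dictionary ("<<"). /EncryptMetadata does not match.
    return re.search(rb"/Encrypt\s*(\d+\s+\d+\s+R|<<)", data) is not None


def listing_note(data):
    # Hypothetical helper: text to show next to a file in the listing.
    if is_encrypted(data):
        return "secured PDF, document info not readable without decryption"
    return "document info readable"
```

Getting the real values out of a secured file needs a library that implements the PDF standard security handler.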
 
LVL 19

Expert Comment

by:webwoman
ID: 6357211
Unlikely. It's got security because it's not supposed to be accessible.
 
LVL 2

Author Comment

by:devlinb
ID: 6358728
I disagree. The document information is still accessible in the reader even after a PDF has been secured. Why would Adobe want to make this inaccessible when all you want to do is prevent a PDF document from being modified?
 
LVL 2

Author Comment

by:devlinb
ID: 6368770
When security is added to a PDF, the document info is encrypted in some way and displays as garbage. Is there any way to get around this?
Question has a verified solution.