Solved

Reading .PDF file

Posted on 2003-11-20
Last Modified: 2010-04-16
Hi friends,
I would like to read the content of a .PDF file and store that content in a database (SQL Server 2000). What methods are available to achieve this?

Thanks,
Ramachandra

Question by:raama16

Expert Comment

by:barryfandango
ID: 9787417
raama16,

In principle you can open the file with C# and put it into a BLOB, i.e. an "image" field in SQL Server. Generally this is not recommended, though, as moving entire files in and out of the database can really slow down your SQL Server. It's often better to store just the filename and/or path in the database and keep the file itself on the hard disk. (Just a suggestion, of course.)
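As a rough illustration of the "store the path, keep the file on disk" suggestion, here is a minimal sketch. The class name PdfFileStore, its methods, and the idea of a shared store folder are assumptions made for this example, not something from the thread. The returned name is what you would save in a varchar column instead of the file bytes.

```csharp
using System;
using System.IO;

// Hypothetical helper (not from the thread): keeps PDFs in a folder on disk
// so that only a short file name has to be stored in the database.
public static class PdfFileStore
{
    // Copies the PDF into storeRoot under a unique name and returns that name.
    public static string SaveToStore(string sourcePath, string storeRoot)
    {
        Directory.CreateDirectory(storeRoot); // does nothing if the folder already exists
        string storedName = Guid.NewGuid().ToString("N") + Path.GetExtension(sourcePath);
        File.Copy(sourcePath, Path.Combine(storeRoot, storedName));
        return storedName;
    }

    // Turns a stored name back into a full path when the file is needed again.
    public static string Resolve(string storedName, string storeRoot)
    {
        return Path.Combine(storeRoot, storedName);
    }
}
```

Storing only the generated name, rather than an absolute path, means the store folder can be moved later without updating every row.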

Accepted Solution

by:
barryfandango earned 204 total points
ID: 9787498
using System.IO;
using System.Data;
using System.Data.SqlClient;

// ImageFile (path of the PDF) and pdfId (key for the new row) are
// assumed to be set by the calling code.

// Read the whole file into memory.
byte[] MyPDF = File.ReadAllBytes(ImageFile);

string ConnectString = "MyDSNEtc"; // placeholder; use your real connection string
using (SqlConnection myCon = new SqlConnection(ConnectString))
{
    myCon.Open();

    // Call the AddPDF stored procedure with the ID and the file bytes.
    SqlCommand myCmd = new SqlCommand("AddPDF", myCon);
    myCmd.CommandType = CommandType.StoredProcedure;
    myCmd.Parameters.Add(new SqlParameter("@Id", SqlDbType.Int));
    myCmd.Parameters.Add(new SqlParameter("@Data", SqlDbType.Image));
    myCmd.Parameters["@Id"].Value = pdfId;
    myCmd.Parameters["@Data"].Value = MyPDF;
    myCmd.ExecuteNonQuery();
} // the connection is closed here, even if an exception is thrown

This uses a stored procedure that would look something like this:

CREATE PROCEDURE dbo.AddPDF
(
      @Id int,
      @Data image
)
AS
INSERT INTO MyPDFTable
      ( Id, Data )
VALUES
      ( @Id, @Data )

Assisted Solution

by:malharone
malharone earned 198 total points
ID: 9789101
I think raama16 means actually parsing the contents. I don't think that's easily possible, since PDFs can use many different encryption and encoding schemes. What I've done is let the user open the PDF file first and, in the reader, press CTRL+A then CTRL+C to copy all the content. I then wrote a little program that does pattern recognition on the copied data using regex and a little bit of AI. I also let the users interactively create their own patterns. The parsed contents are then stored in a DB or Excel file.
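To make the regex part of that approach concrete, here is a minimal sketch. It assumes the copied text contains "Label: value" lines such as "Invoice No: 12345"; that layout, the class name CopiedTextParser, and the pattern itself are all hypothetical, and the "AI" and interactive pattern-building parts are not covered.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical parser for text pasted from a PDF reader (not from the thread).
public static class CopiedTextParser
{
    // Matches one "Label: value" pair per line; labels are letters and spaces only.
    private static readonly Regex FieldPattern = new Regex(
        @"^[ \t]*(?<name>[A-Za-z][A-Za-z ]*?)[ \t]*:[ \t]*(?<value>[^\r\n]+?)[ \t]*\r?$",
        RegexOptions.Multiline);

    // Returns every label/value pair found; a repeated label keeps its last value.
    public static Dictionary<string, string> ExtractFields(string copiedText)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match m in FieldPattern.Matches(copiedText))
        {
            fields[m.Groups["name"].Value] = m.Groups["value"].Value;
        }
        return fields;
    }
}
```

Lines without a colon are simply skipped, so free-form text around the labelled fields does not get in the way. The resulting dictionary can then be written to a table or a spreadsheet.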

Expert Comment

by:ptmcomp
ID: 9800166
If you want to extract the text, buy a third-party tool or use Acrobat. (Implementing it yourself would take you months, believe me!)
About the performance of storing text in a database: we once zipped the text to make it faster, and it got slower because zipping was slower than the database. It depends on the computer and network speed you have. Of course, local files are faster than a database over the network, but a database gives you transactions and locking control.

Author Comment

by:raama16
ID: 9829639
Hi Friends,
I am going to accept one of the above answers. Before that, is there any way to read a PDF file using the Crystal Reports .NET engine?

Thanks,
Ramachandra

Assisted Solution

by:ptmcomp
ptmcomp earned 198 total points
ID: 9830071
I don't think so, since reporting is the opposite of parsing.

Question has a verified solution.
