Solved

Reading .PDF file

Posted on 2003-11-20
Last Modified: 2010-04-16
Hi friends,
I would like to read the content of a .PDF file and store that content in a database (SQL Server 2000). What methods are available to achieve this?

Thanks,
Ramachandra

Question by:raama16
6 Comments
 
LVL 3

Expert Comment

by:barryfandango
ID: 9787417
raama16,

In principle you can open the file with C# and put it into a blob, or "image" field, in SQL Server. Generally this is not recommended though, as moving entire files in and out of the database can really slow down your SQL Server. It's often better to store just the filename and/or path and keep the file itself on the hard disk. (Just a suggestion, of course.)
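
A minimal sketch of the "keep the file on disk, store only the path" idea, assuming a storage folder of your choosing. The class name PdfFileStore and the method Store are hypothetical names for illustration, not part of any library. The returned path is what you would save in a varchar column instead of the file bytes.

```csharp
using System;
using System.IO;

public static class PdfFileStore
{
    // Copies the PDF into storageDir under a unique name and returns the
    // full path of the copy. That path is the value to store in the database.
    // (Hypothetical helper; the names are assumptions made for this sketch.)
    public static string Store(string sourcePath, string storageDir)
    {
        Directory.CreateDirectory(storageDir); // does nothing if it already exists
        string fileName = Guid.NewGuid().ToString("N") + ".pdf";
        string destPath = Path.Combine(storageDir, fileName);
        File.Copy(sourcePath, destPath);
        return destPath;
    }
}
```

Using a generated name avoids collisions when two uploaded files share the same original filename; if you need the original name, store it in a second column.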
 
LVL 3

Accepted Solution

by:
barryfandango earned 204 total points
ID: 9787498
using System.IO;
using System.Data;
using System.Data.SqlClient;

// ImageFile is the path to the PDF on disk.
FileStream myFile = new FileStream(ImageFile, FileMode.Open, FileAccess.Read);
byte[] MyPDF = new byte[myFile.Length];
myFile.Read(MyPDF, 0, (int)myFile.Length);
myFile.Close();

string ConnectString = "MyDSNEtc";  // replace with your connection string
SqlConnection myCon = new SqlConnection(ConnectString);
myCon.Open();

SqlCommand myCmd = new SqlCommand("AddPDF", myCon);
myCmd.CommandType = CommandType.StoredProcedure;
myCmd.Parameters.Add(new SqlParameter("@Id", SqlDbType.Int));
myCmd.Parameters.Add(new SqlParameter("@Data", SqlDbType.Image));
myCmd.Parameters["@Id"].Value = pdfId;   // pdfId: the key you choose for this row
myCmd.Parameters["@Data"].Value = MyPDF;
myCmd.ExecuteNonQuery();
myCon.Close();

This uses a stored procedure that would look something like:

CREATE PROCEDURE dbo.AddPDF
(
      @Id int,
      @Data image
)
AS
INSERT INTO MyPDFTable
      ( Id, Data )
VALUES
      ( @Id, @Data )
 
LVL 9

Assisted Solution

by:malharone
malharone earned 198 total points
ID: 9789101
I think Rama means actually parsing the contents. I don't think that's easily possible, since PDFs use many kinds of encryption and encoding. What I've done is let the user open the PDF file first and, in the reader, press CTRL+A and then CTRL+C to copy all the content. I then wrote a little program that does pattern recognition on the data using regex and a little bit of AI. I also let the users interactively create their own patterns, and then store the parsed contents in a DB/Excel file.
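
A rough sketch of the regex step on text pasted from the reader. The "Label: value" line format, the class name PastedTextParser, and the method ExtractFields are all assumptions made for illustration; real documents will need their own patterns.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PastedTextParser
{
    // Assumed format: one "Label: value" pair per line. Lines that do not
    // fit this shape are skipped.
    private static readonly Regex FieldPattern = new Regex(
        @"^[ \t]*(?<label>[A-Za-z][A-Za-z ]*?)[ \t]*:[ \t]*(?<value>.+?)[ \t]*\r?$",
        RegexOptions.Multiline);

    // Returns a label -> value map; a repeated label keeps its last value.
    public static Dictionary<string, string> ExtractFields(string pastedText)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match m in FieldPattern.Matches(pastedText))
        {
            fields[m.Groups["label"].Value] = m.Groups["value"].Value;
        }
        return fields;
    }
}
```

The resulting dictionary can then be written to the database one field per column or row, whichever suits the table layout.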
 
LVL 10

Expert Comment

by:ptmcomp
ID: 9800166
If you want to extract the text, buy a third-party tool or use Acrobat. (Implementing it yourself would take you months, believe me!)
About the performance of storing text in a database: we once zipped the text to make it faster, and it got slower because the zip step was slower than the database. It depends on the computer and network speed you have. Of course, local files are faster than a database over the network, but in a database you get transaction and locking control.
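
If you want to measure the compression trade-off on your own hardware, here is a minimal sketch of the zip step. It assumes .NET 2.0 or later, where System.IO.Compression.GZipStream is available; the class name TextCompressor and its methods are hypothetical names for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class TextCompressor
{
    // Compresses text to a gzip byte array suitable for an image/blob column.
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (var output = new MemoryStream())
        {
            // The gzip stream must be closed before reading the result,
            // otherwise the trailing bytes are not flushed.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return output.ToArray();
        }
    }

    // Reverses Compress and returns the original text.
    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Timing Compress plus the insert against a plain insert of the uncompressed text (for example with System.Diagnostics.Stopwatch) shows which side wins on your machine and network.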
 
LVL 1

Author Comment

by:raama16
ID: 9829639
Hi Friends,
I am going to accept one of the above answers. Before that, is there any way to read a PDF file using the Crystal Report.Net engine?

Thanks,
Ramachandra
 
LVL 10

Assisted Solution

by:ptmcomp
ptmcomp earned 198 total points
ID: 9830071
I don't think so, since reporting is the opposite of parsing.