Solved

How to search a PDF File using VB.NET

Posted on 2011-02-21
Last Modified: 2012-05-11
Hello,

I have a table with about 5000 records. For each record, I need to find the page numbers of the PDF on which a value from the table appears. Is there a way to loop through the table and store those page numbers in a column of the same table? I am using VB.NET with Access 2007.

Thanks,

Victor
Question by:vcharles
5 Comments
 
LVL 22

Expert Comment

by:plusone3055
ID: 34946251
 

Author Comment

by:vcharles
ID: 34946799
How do I modify the code below to loop through the 5000 records and achieve the same task? My project is in VB.NET; do you know a good program to convert the code below to VB.NET?

using System.IO;
using System.Windows.Forms;
using Acrobat;
using AFORMAUTLib;

private void pdfRandD(string fPath)
{
    // Read the page count, then close the PD-level document.
    AcroPDDocClass objPages = new AcroPDDocClass();
    objPages.Open(fPath);
    long TotalPDFPages = objPages.GetNumPages();
    objPages.Close();

    // Open the file in the Acrobat viewer; the JavaScript calls below run against it.
    AcroAVDocClass avDoc = new AcroAVDocClass();
    avDoc.Open(fPath, "Title");
    IAFormApp formApp = new AFormAppClass();
    IFields myFields = (IFields)formApp.Fields;

    string searchWord = "Search String";
    string k = "";
    StreamWriter sw = new StreamWriter(@"D:\KCG_FileChecker_Inputs\MAC\pdf\0230_525490_23_cha17.txt", false);
    for (int p = 0; p < TotalPDFPages; p++)
    {
        int numWords = int.Parse(myFields.ExecuteThisJavascript("event.value=this.getPageNumWords(" + p + ");"));
        // Rebuild the text of page p one word at a time.
        k = "";
        for (int i = 0; i < numWords; i++)
        {
            string chkWord = myFields.ExecuteThisJavascript("event.value=this.getPageNthWord(" + p + "," + i + ", true);");
            k = k + " " + chkWord;
        }
        if (k.Trim().Contains(searchWord))
        {
            int pNum = int.Parse(myFields.ExecuteThisJavascript("event.value=this.getPageLabel(" + p + ",true);"));
            sw.WriteLine("The word " + searchWord + " exists on page " + pNum);
        }
    }
    sw.Close();
    MessageBox.Show("Process completed");
}


Thanks,

Victor
 
LVL 23

Accepted Solution

by:
wdosanjos earned 2000 total points
ID: 34948772
Check the iTextSharp library (http://sourceforge.net/projects/itextsharp/), more specifically the PdfReader class.

You can do something like this (untested):

Imports System.Text
Imports iTextSharp.text.pdf

Dim reader As PdfReader, page As Integer, npages As Integer, content As String, buffer() As Byte

reader = New PdfReader("YourPDF.pdf")
npages = reader.NumberOfPages

For page = 1 To npages
     ' GetPageContent returns the page's raw content stream,
     ' so text may not always appear in it as plain strings.
     buffer = reader.GetPageContent(page)
     content = Encoding.UTF8.GetString(buffer, 0, buffer.Length)
     ' Search your content here
Next page

reader.Close()
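
For the "Search your content here" step, a minimal sketch follows. It assumes iTextSharp 5.x, whose iTextSharp.text.pdf.parser namespace provides PdfTextExtractor.GetTextFromPage. That call extracts readable text, whereas GetPageContent returns the raw content stream. The module name PdfSearch and function name FindPages are made up for illustration, and the code is untested against any particular PDF.

```vbnet
Imports System
Imports System.Collections.Generic
Imports iTextSharp.text.pdf
Imports iTextSharp.text.pdf.parser

Module PdfSearch
    ' Hypothetical helper: returns the 1-based page numbers whose
    ' extracted text contains searchText (case-insensitive).
    Public Function FindPages(pdfPath As String, searchText As String) As List(Of Integer)
        Dim pages As New List(Of Integer)
        Dim reader As New PdfReader(pdfPath)
        Try
            For pageNumber As Integer = 1 To reader.NumberOfPages
                ' Extract the page text rather than decoding the raw content stream.
                Dim pageText As String = PdfTextExtractor.GetTextFromPage(reader, pageNumber)
                If pageText.IndexOf(searchText, StringComparison.OrdinalIgnoreCase) >= 0 Then
                    pages.Add(pageNumber)
                End If
            Next
        Finally
            ' Release the file handle even if extraction fails.
            reader.Close()
        End Try
        Return pages
    End Function
End Module
```

Text extraction depends on how the PDF was produced: scanned pages contain no text at all, and a phrase that wraps across lines may not match as a single string.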

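
To cover the original request (looping through the table and storing page numbers in a column), here is a rough sketch using System.Data.OleDb. Everything about the schema is an assumption: a table named Records with columns ID (AutoNumber), SearchValue, PdfPath and PageNumbers (Text), none of which come from the question. The page search itself is passed in as a delegate taking the PDF path and the search text, so any per-page search routine can be plugged in. It assumes .NET 4 or later (for Tuple and the generic String.Join) and non-null column values.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data.OleDb

Module PageNumberUpdater
    ' Hypothetical schema: Records(ID, SearchValue, PdfPath, PageNumbers).
    ' findPages(pdfPath, searchText) is supplied by the caller and returns
    ' the page numbers on which searchText was found.
    Public Sub UpdatePageNumbers(connectionString As String,
                                 findPages As Func(Of String, String, IEnumerable(Of Integer)))
        Using conn As New OleDbConnection(connectionString)
            conn.Open()

            ' Read every row first so the data reader is closed before updating.
            Dim rows As New List(Of Tuple(Of Integer, String, String))
            Using selectCmd As New OleDbCommand("SELECT ID, SearchValue, PdfPath FROM Records", conn)
                Using rdr As OleDbDataReader = selectCmd.ExecuteReader()
                    While rdr.Read()
                        rows.Add(Tuple.Create(rdr.GetInt32(0), rdr.GetString(1), rdr.GetString(2)))
                    End While
                End Using
            End Using

            ' OleDb parameters are positional; the names are only labels.
            Using updateCmd As New OleDbCommand("UPDATE Records SET PageNumbers = ? WHERE ID = ?", conn)
                updateCmd.Parameters.Add("@pages", OleDbType.VarWChar, 255)
                updateCmd.Parameters.Add("@id", OleDbType.Integer)
                For Each row In rows
                    ' Store the matches as a comma-separated list, e.g. "3,17,42".
                    Dim pages As IEnumerable(Of Integer) = findPages(row.Item3, row.Item2)
                    updateCmd.Parameters(0).Value = String.Join(",", pages)
                    updateCmd.Parameters(1).Value = row.Item1
                    updateCmd.ExecuteNonQuery()
                Next
            End Using
        End Using
    End Sub
End Module
```

For an Access 2007 file the connection string would typically look like "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\path\to\your.accdb" (the path is a placeholder). If all 5000 records search the same PDF, it would be much faster to extract each page's text once and reuse it than to reopen the file for every record.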

 

Author Comment

by:vcharles
ID: 35093799
Thank You.
 
LVL 72

Expert Comment

by:Qlemo
ID: 36032357
This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.