Solved

How to search a PDF File using VB.NET

Posted on 2011-02-21
Last Modified: 2012-05-11
Hello,

I have a table with about 5000 records. For each value in the table, I need to find the page numbers of the PDFs where that value appears. Is there a way to loop through the table and copy the page numbers into a column in the same table? I am using VB.NET with Access 2007.

Thanks,

Victor
Question by:vcharles
 
LVL 22

Expert Comment

by:plusone3055
ID: 34946251
 

Author Comment

by:vcharles
ID: 34946799
How do I modify the code below to loop through the 5000 records and do the same task for each one? My project is in VB.NET. Do you know a good program to convert the code below to VB.NET?

using System.IO;
using System.Windows.Forms;
using Acrobat;
using AFORMAUTLib;

private void pdfRandD(string fPath)
{
    // Count the pages, then close the PD-level document.
    AcroPDDocClass objPages = new AcroPDDocClass();
    objPages.Open(fPath);
    long TotalPDFPages = objPages.GetNumPages();
    objPages.Close();

    // Open the document in the Acrobat viewer; the JavaScript calls below run against it.
    AcroAVDocClass avDoc = new AcroAVDocClass();
    avDoc.Open(fPath, "Title");
    IAFormApp formApp = new AFormAppClass();
    IFields myFields = (IFields)formApp.Fields;

    string searchWord = "Search String";
    string k = "";
    StreamWriter sw = new StreamWriter(@"D:\KCG_FileChecker_Inputs\MAC\pdf\0230_525490_23_cha17.txt", false);
    for (int p = 0; p < TotalPDFPages; p++)
    {
        int numWords = int.Parse(myFields.ExecuteThisJavascript("event.value=this.getPageNumWords(" + p + ");"));
        // Rebuild the text of page p word by word.
        k = "";
        for (int i = 0; i < numWords; i++)
        {
            string chkWord = myFields.ExecuteThisJavascript("event.value=this.getPageNthWord(" + p + "," + i + ", true);");
            k = k + " " + chkWord;
        }
        if (k.Trim().Contains(searchWord))
        {
            int pNum = int.Parse(myFields.ExecuteThisJavascript("event.value=this.getPageLabel(" + p + ",true);"));
            sw.WriteLine("The word " + searchWord + " exists on page " + pNum);
        }
    }
    sw.Close();
    MessageBox.Show("Process completed");
}


Thanks,

Victor
 
LVL 23

Accepted Solution

by:
wdosanjos earned 2000 total points
ID: 34948772
Check the iTextSharp library (http://sourceforge.net/projects/itextsharp/), specifically the PdfReader class.

You can do something like this (untested):

Imports System.Text
Imports iTextSharp.text.pdf

Dim reader As PdfReader, page As Integer, npages As Integer, content As String, buffer() As Byte

reader = New PdfReader("YourPDF.pdf")
npages = reader.NumberOfPages

For page = 1 To npages
     ' GetPageContent returns the page's raw content stream, not plain text
     buffer = reader.GetPageContent(page)
     content = Encoding.UTF8.GetString(buffer, 0, buffer.Length)
     ' Search your content here
Next page

reader.Close()
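
To tie this back to the original question (looping over the table and writing the page numbers into a column), here is a rough, untested sketch. Everything about the database is an assumption for illustration: the file name Items.accdb, the table Items, and the columns ID, SearchValue and PageNumbers are hypothetical, so substitute your own. It also assumes iTextSharp 5.x, where PdfTextExtractor.GetTextFromPage (in iTextSharp.text.pdf.parser) returns a page's text rather than the raw content stream, and the ACE OLE DB provider that ships with Access 2007.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data.OleDb
Imports iTextSharp.text.pdf
Imports iTextSharp.text.pdf.parser

Module PdfPageIndexer

    ' Returns the 1-based page numbers (as strings) whose text contains term.
    ' An empty term matches nothing, so blank table values are skipped.
    Function FindPages(ByVal pageTexts As IList(Of String), ByVal term As String) As List(Of String)
        Dim hits As New List(Of String)
        If String.IsNullOrEmpty(term) Then Return hits
        For i As Integer = 0 To pageTexts.Count - 1
            If pageTexts(i).IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0 Then
                hits.Add((i + 1).ToString())
            End If
        Next
        Return hits
    End Function

    ' Extracts the text of every page once, so the 5000 lookups do not re-read the PDF.
    Function ReadPageTexts(ByVal pdfPath As String) As List(Of String)
        Dim texts As New List(Of String)
        Dim reader As New PdfReader(pdfPath)
        Try
            For page As Integer = 1 To reader.NumberOfPages
                texts.Add(PdfTextExtractor.GetTextFromPage(reader, page))
            Next
        Finally
            reader.Close()
        End Try
        Return texts
    End Function

    Sub Main()
        ' Hypothetical database, table and column names; replace with your own.
        Dim connStr As String = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=Items.accdb"
        Dim pageTexts As List(Of String) = ReadPageTexts("YourPDF.pdf")

        Using conn As New OleDbConnection(connStr)
            conn.Open()

            ' Read all rows first, then update, so no data reader is open during the writes.
            Dim rows As New List(Of KeyValuePair(Of Integer, String))
            Using cmd As New OleDbCommand("SELECT ID, SearchValue FROM Items", conn)
                Using rdr As OleDbDataReader = cmd.ExecuteReader()
                    While rdr.Read()
                        rows.Add(New KeyValuePair(Of Integer, String)( _
                            Convert.ToInt32(rdr("ID")), Convert.ToString(rdr("SearchValue"))))
                    End While
                End Using
            End Using

            For Each row As KeyValuePair(Of Integer, String) In rows
                ' Store the matches as a comma-separated list, e.g. "3,17,42".
                Dim pages As String = String.Join(",", FindPages(pageTexts, row.Value).ToArray())
                ' OleDb parameters are positional: add them in the order of the ? markers.
                Using upd As New OleDbCommand("UPDATE Items SET PageNumbers = ? WHERE ID = ?", conn)
                    upd.Parameters.AddWithValue("?", pages)
                    upd.Parameters.AddWithValue("?", row.Key)
                    upd.ExecuteNonQuery()
                End Using
            Next
        End Using
    End Sub

End Module
```

The match here is a case-insensitive substring test, which is my choice and not something from the thread. Text extraction can also split or reorder words depending on how the PDF was produced, so spot-check a few records before trusting the full run.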

 

Author Comment

by:vcharles
ID: 35093799
Thank You.
 
LVL 70

Expert Comment

by:Qlemo
ID: 36032357
This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.