Search PDF files in a web page with C#

Posted on 2006-06-18
Medium Priority
Last Modified: 2008-01-09
Hello experts,

I want to write a search engine for my web site that searches the content of PDF and ASPX files.
Can you give me some help on how to do this?
Question by: helkayal
LVL 30

Expert Comment

ID: 16930259
www.google.com returns results from pdf content
Not sure what you're asking here...
LVL 16

Accepted Solution

OliWarner earned 1000 total points
ID: 16930411
Not sure if live searching is going to be the best route, but anyway...

You're going to want to look at this at some point: http://www.codeproject.com/useritems/PDFToText.asp
That's how to read a PDF in .NET.

What I would do is extract all the text from all your PDFs every 3 or 4 days (depending on how often they change) and dump it into a database. It should then be quite easy to do a full-text search on the database.
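
To make the extract-then-search idea a bit more concrete, here is a minimal sketch. It is only an in-memory stand-in for the database and its full-text index, and it assumes the text has already been pulled out of each PDF by whatever extraction code you end up using. The `TextIndex` class and its `Add` and `Search` methods are hypothetical names made up for this illustration; they are not part of iTextSharp or the linked article.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only: a tiny in-memory inverted index.
// In a real site this role would be played by the database's full-text search.
public sealed class TextIndex
{
    // Maps each word to the set of document names that contain it.
    // The comparer makes lookups case-insensitive.
    private readonly Dictionary<string, HashSet<string>> _postings =
        new Dictionary<string, HashSet<string>>(StringComparer.OrdinalIgnoreCase);

    private static readonly char[] Separators =
        " \t\r\n.,;:!?()[]{}\"'/\\-".ToCharArray();

    // Splits text into words on whitespace and common punctuation.
    private static IEnumerable<string> Tokenize(string text)
    {
        return text.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
    }

    // Called by the periodic job, once per document, with the text
    // that was already extracted from the PDF (extraction not shown).
    public void Add(string documentName, string extractedText)
    {
        foreach (string word in Tokenize(extractedText))
        {
            HashSet<string> docs;
            if (!_postings.TryGetValue(word, out docs))
            {
                docs = new HashSet<string>();
                _postings[word] = docs;
            }
            docs.Add(documentName);
        }
    }

    // Returns the names of documents that contain every word of the query,
    // sorted by name. An empty query or an unknown word gives no results.
    public IReadOnlyList<string> Search(string query)
    {
        HashSet<string> result = null;
        foreach (string word in Tokenize(query))
        {
            HashSet<string> docs;
            if (!_postings.TryGetValue(word, out docs))
            {
                return new List<string>();
            }
            if (result == null)
            {
                result = new HashSet<string>(docs);
            }
            else
            {
                result.IntersectWith(docs);
            }
        }
        return result == null
            ? new List<string>()
            : result.OrderBy(name => name, StringComparer.Ordinal).ToList();
    }
}
```

The scheduled job would call `Add` once per file after extraction, and the search page would only ever call `Search`, so no PDF is opened while a visitor is waiting. Swapping this class for a database table with a full-text index keeps the same split between indexing and querying.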
LVL 16

Expert Comment

ID: 16930413
Otherwise you can go with the live-search method that does the above on demand... But as noted, that's really not recommended.

Author Comment

ID: 16930479
I saw this link before, but it uses references to 2 external DLLs, and I want to do this without external DLLs.
Can anyone tell me how I can do that?
Simply put, how can I search or read PDF files without using any external DLLs?
LVL 16

Expert Comment

ID: 16930493
Well, the component used (iTextSharp) is open source... You've got all the source you need right there.
LVL 30

Expert Comment

ID: 17492199
Title of the link given by OliWarner: "Extract text from PDF in C# (100% .NET)"
Question asked: "Search PDF files in a web page with C#"

OliWarner's last comment: "Well, the component used (iTextSharp) is open source... You've got all the source you need right there."

RECOMMEND:   Award to OliWarner
LVL 30

Expert Comment

ID: 17492200
Makes a good PAQ too

Question has a verified solution.
