Solved

Search PDF files in a web page with C#

Posted on 2006-06-18
Last Modified: 2008-01-09
Hello experts,

I want to write a search engine that searches the content of PDF and ASPX files.
Can you help me with how to build this into a web site?
Question by:helkayal
10 Comments
 
LVL 30

Expert Comment

by:callrs
ID: 16930259
www.google.com returns results from PDF content.
I'm not sure what you're asking here...
 
LVL 16

Accepted Solution

by:OliWarner (earned 250 total points)
ID: 16930411
Not sure if live searching is going to be the best route, but anyway...

You're going to want to look at this at some point: http://www.codeproject.com/useritems/PDFToText.asp
It shows how to read a PDF in .NET.

What I would do is extract the text from all your PDFs every 3 or 4 days (depending on how often they change) and dump it into a database. It should then be quite easy to run a full-text search against the database.
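
To make the extract-then-search idea concrete, here is a minimal sketch of the storage and search half only. Everything in it is an assumption for illustration: `ExtractedTextStore` is a hypothetical class, the in-memory dictionary stands in for the database table, and the substring match stands in for a real full-text index. The PDF extraction step itself is not shown; the scheduled job would call `Save` with whatever text it pulled out of each file.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: stands in for the database table that holds extracted text.
public class ExtractedTextStore
{
    // Maps a document path to the text last extracted from it.
    private readonly Dictionary<string, string> _textByPath =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    // Called by the scheduled job after text has been extracted from a file.
    // Saving the same path again replaces the older text.
    public void Save(string path, string extractedText)
    {
        _textByPath[path] = extractedText ?? string.Empty;
    }

    // Returns the paths of all documents whose stored text contains every
    // search term (case-insensitive substring match, not a real full-text index).
    public List<string> Search(string query)
    {
        string[] terms = (query ?? string.Empty).Split(
            new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var hits = new List<string>();
        if (terms.Length == 0) return hits;

        foreach (KeyValuePair<string, string> entry in _textByPath)
        {
            bool allFound = true;
            foreach (string term in terms)
            {
                if (entry.Value.IndexOf(term, StringComparison.OrdinalIgnoreCase) < 0)
                {
                    allFound = false;
                    break;
                }
            }
            if (allFound) hits.Add(entry.Key);
        }

        hits.Sort(StringComparer.OrdinalIgnoreCase);
        return hits;
    }
}
```

The search page would then only ever query the stored text, for example `store.Search("annual report")`, and never open a PDF during a request. With a real database you would replace the dictionary with a table and the loop with the database's own full-text query.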
 
LVL 16

Expert Comment

by:OliWarner
ID: 16930413
Otherwise you can go with the live-search method, which does the above on demand... But as noted, I would not recommend it.
 
LVL 1

Author Comment

by:helkayal
ID: 16930479
I saw this link before, but it references two external DLLs, and I want to do this without external DLLs.
Can anyone tell me how to do that?
Simply put, how can I search or read PDF files without using any external DLLs?
 
LVL 16

Expert Comment

by:OliWarner
ID: 16930493
Well, the component used (iTextSharp) is open source... You've got all the source you need right there.
 
LVL 30

Expert Comment

by:callrs
ID: 17492199
Title of the link given by OliWarner: "Extract text from PDF in C# (100% .NET)"
Question asked: "Search PDF files in a web page with C#"

OliWarner's last comment: "Well, the component used (iTextSharp) is open source... You've got all the source you need right there."


RECOMMEND:   Award to OliWarner
 
LVL 30

Expert Comment

by:callrs
ID: 17492200
Makes a good PAQ too