• Status: Solved

How to Search a PDF Image from a BLOB for Specific Data Using ASP.NET/C#

I have written a website that queries a database table and displays the results on a second page, QueryResults.aspx. The page shows every field in the table except the BLOB, which holds a PDF image of a receipt. After looking through the results they have pulled back, the user can view the image they are looking for. My questions are:
1. Would it be feasible to pull back the BLOB for each PDF image and search it for a receipt number? The receipt number can appear in various positions on the page and is not stored in the database table that holds the BLOB and the other fields displayed on QueryResults.aspx.
2. How would you search for words within a PDF file that exists only in memory after being pulled from a BLOB? (Would you just write it to a temporary location, search it, and then delete it once you are done with the file? What would the syntax be for accomplishing such a task?)

I am new to ASP.NET and apologize in advance if the questions seem a bit off-kilter, but my boss asked me to complete this task. The BLOB and the other information are stored in an Oracle database.
1 Solution
Rahul Gade, Sr. Architect, commented:
You can implement a solution using the iTextSharp library, e.g. http://www.codeproject.com/KB/cs/PDFToText.aspx
This will let you convert the PDF to text on the fly, in memory, so no temporary file is needed. Once you have the text, a plain string search will find the receipt number.
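As a rough illustration of that idea, here is a minimal sketch. It assumes iTextSharp 5.x and uses its PdfReader and PdfTextExtractor classes, which is not necessarily the approach taken in the linked article. The class and method names (ReceiptPdfSearch, ExtractText, ContainsReceiptNumber) are made up for this example, and pdfBytes is assumed to be the byte array you have already read from the BLOB column.

```csharp
using System;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper class; the names are illustrative, not from the original post.
public static class ReceiptPdfSearch
{
    // Extracts the text of every page from a PDF held entirely in memory.
    // pdfBytes is assumed to be the raw BLOB content read from the database.
    public static string ExtractText(byte[] pdfBytes)
    {
        var text = new StringBuilder();
        var reader = new PdfReader(pdfBytes);
        try
        {
            // iTextSharp page numbers start at 1.
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                text.AppendLine(PdfTextExtractor.GetTextFromPage(reader, page));
            }
        }
        finally
        {
            reader.Close();
        }
        return text.ToString();
    }

    // Returns true if the receipt number appears anywhere in the extracted text.
    public static bool ContainsReceiptNumber(byte[] pdfBytes, string receiptNumber)
    {
        string text = ExtractText(pdfBytes);
        return text.IndexOf(receiptNumber, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

You would call ReceiptPdfSearch.ContainsReceiptNumber(pdfBytes, "12345") for each row you pull back. Note that this only works if the PDF contains real text. If the receipts are scanned images with no text layer, the extracted text will be empty and you would need OCR instead.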

Question has a verified solution.
