
How to search a PDF image from a BLOB for specific data using ASP.NET/C#

I have written a website that queries a database table and populates a second page, QueryResults.aspx, with every field in the table except the BLOB (which is a PDF image of a receipt). Once the user has looked through the set of results, they can view the image they are looking for. My questions are:
1. Would it be feasible to pull back the BLOB of each PDF image and search it for a receipt number? The number appears in various positions on the page and is not stored in the database table that holds the BLOB and the other information displayed on QueryResults.aspx.
2. How would you search for words within a PDF file that exists only in memory after being pulled from a BLOB? (Would you just write it to a temporary location, search it, then delete it once you are done with the file? What would be the syntax for accomplishing such a task?)

I am new to ASP.NET and apologize in advance if the questions seem a bit off-kilter, but I was asked to complete this task by my boss. The BLOB and the other information are stored in an Oracle database.
1 Solution
You can implement a solution using the iTextSharp library, e.g. http://www.codeproject.com/KB/cs/PDFToText.aspx
This lets you convert the PDF to text on the fly, in memory, after which a string search is straightforward.
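
As a rough illustration of that approach, here is a minimal sketch that assumes iTextSharp 5.x and its PdfTextExtractor class, which is not necessarily the method used in the linked article. The class and method names (ReceiptPdfSearch, ExtractText, ContainsReceiptNumber) are hypothetical. No temporary file is needed, because PdfReader accepts a byte array directly. Note that this only finds text if the PDF has a text layer; if each receipt is a pure scanned image, extraction will return little or nothing and an OCR step would be required first.

```csharp
using System;
using System.Text;
using iTextSharp.text.pdf;          // assumes iTextSharp 5.x is referenced
using iTextSharp.text.pdf.parser;

// Hypothetical helper class, for illustration only.
public static class ReceiptPdfSearch
{
    // Extracts the text layer of a PDF that is held entirely in memory.
    public static string ExtractText(byte[] pdfBytes)
    {
        var text = new StringBuilder();
        var reader = new PdfReader(pdfBytes);   // reads from the byte array, no temp file
        try
        {
            // PDF page numbers are 1-based.
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                text.AppendLine(PdfTextExtractor.GetTextFromPage(reader, page));
            }
        }
        finally
        {
            reader.Close();
        }
        return text.ToString();
    }

    // Returns true if the receipt number appears anywhere in the extracted text.
    public static bool ContainsReceiptNumber(byte[] pdfBytes, string receiptNumber)
    {
        return ExtractText(pdfBytes)
            .IndexOf(receiptNumber, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

With an Oracle data reader, the BLOB column can typically be read as a byte array and passed straight in, for example ReceiptPdfSearch.ContainsReceiptNumber((byte[])reader["RECEIPT_PDF"], "12345"), where RECEIPT_PDF is an assumed column name. Running this over every row of a large result set could be slow, so it may be worth extracting the receipt number once and storing it in its own column.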

