How to check data present in a report generated in PDF format

jklm29
Hello,

The reports are generated in PDF format. How can I automate checking whether a report's orientation is landscape or portrait? The application is implemented in .NET with the Crystal Reports viewer and Infragistics controls.
Also, how can I check the header, footer, and data in the PDF report?

Thanks in advance
It's not exactly clear how you want to check for those. If you want to extract the text and look for certain keywords, you can use one of the open source PDF libraries to inspect the report contents. Some of these libraries can also rotate pages to portrait or landscape.

Have a look at these (there is a short list there): http://csharp-source.net/open-source/pdf-libraries/sharppdf

PS: I like iTextSharp (http://stackoverflow.com/questions/3579058/rotating-pdf-in-c-using-itextsharp)
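
For the orientation check specifically, no page rendering is needed: a PDF page is landscape when its displayed width exceeds its displayed height, and a page rotation of 90 or 270 degrees swaps the two. Below is a minimal Python sketch of that rule, assuming the page size and rotation have already been read with whichever PDF library is chosen. The function name and the sample numbers are illustrative only and do not come from any library.

```python
def page_orientation(width, height, rotation=0):
    """Classify a page as 'landscape', 'portrait', or 'square'.

    width/height are the unrotated page dimensions in points (1/72 inch);
    rotation is the page's rotation in degrees (a multiple of 90).
    """
    if rotation % 90 != 0:
        raise ValueError("rotation must be a multiple of 90 degrees")
    # A 90- or 270-degree rotation swaps the displayed width and height.
    if (rotation // 90) % 2 == 1:
        width, height = height, width
    if width > height:
        return "landscape"
    if width < height:
        return "portrait"
    return "square"


# A4 is roughly 595 x 842 points: portrait as stored, landscape when rotated 90.
print(page_orientation(595, 842))        # portrait
print(page_orientation(595, 842, 90))    # landscape
print(page_orientation(842, 595))        # landscape
```

The same comparison can be ported to VBScript or C# once the library in use exposes the page size and rotation.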
Mike McCracken, Senior Consultant
Most Valuable Expert 2011
Top Expert 2013

Commented:
What is it you want to check?

mlmcc

Author

Commented:
Actually, we export the report in PDF format. I want to check:
whether the report page orientation is landscape or portrait
header contents (label, logo, spacing between two labels, text)
footer contents (page number, number of pages, other text, etc.)
other report data, which is in table format
** This checking should be done on the PDF file using VBScript and QTP.



Mike McCracken, Senior Consultant

Commented:
So this has nothing to do with Crystal other than you are generating the PDF by exporting a report?

mlmcc

Author

Commented:
Yes, you are right, mlmcc.
I want to know how to check the generated report contents using VBScript and QTP. Any suggestions on how to do this would be really appreciated.
Mike McCracken, Senior Consultant

Commented:
I don't know anything about checking a PDF, so I will bow out.

mlmcc

I would therefore still suggest that you look at the PDF libraries linked above. iTextSharp is quite nice: you can use it to extract the text from your generated PDF and then parse and process that text. See an example of how to extract the text here:

http://www.codeproject.com/KB/cs/PDFToText.aspx 
http://itextpdf.com/examples/iia.php?id=277 (in Java, but you can mimic the process in C# quite easily)
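
Once the text is extracted, the footer check asked about earlier in the thread (page number and number of pages) reduces to pattern matching. Here is a rough Python sketch, assuming the footer is printed as "Page X of Y" and that the extractor returns one string per page. Both the footer wording and the sample text are made up for illustration.

```python
import re

# Assumed footer wording; adjust the pattern to the real report.
FOOTER_RE = re.compile(r"Page\s+(\d+)\s+of\s+(\d+)", re.IGNORECASE)


def check_page_footers(pages):
    """Return a list of problems found in the 'Page X of Y' footers.

    pages is a list of strings, one per extracted page, in order.
    """
    problems = []
    for index, text in enumerate(pages, start=1):
        match = FOOTER_RE.search(text)
        if match is None:
            problems.append(f"page {index}: footer not found")
            continue
        number, total = int(match.group(1)), int(match.group(2))
        if number != index:
            problems.append(f"page {index}: footer says page {number}")
        if total != len(pages):
            problems.append(f"page {index}: footer says {total} pages in total")
    return problems


sample = [
    "Sales Report\nrow data\nPage 1 of 2",
    "Sales Report\nrow data\nPage 2 of 2",
]
print(check_page_footers(sample))  # []
```

An empty list means every page carried a footer whose numbers agree with the page's position and the page count.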

Author

Commented:
I have Adobe, and I have tried:

' Plain assignment: Set is only for object references, not strings
stPdfFile = "C:\testcase.pdf"

Set App = CreateObject("AcroExch.App")
Set PDDoc = CreateObject("AcroExch.PDDoc")
PDDoc.Open (stPdfFile)
An ActiveX error was displayed, so I installed a trial version of Acrobat and that problem was solved.
But now the main problem is that I have a licensed version of PDF-XChange, which doesn't have the "AcroExch.App" COM object, so I am not able to execute the above code. Please let me know how to automate PDF data checking (header, footer, logo, table data) using PDF-XChange, VBScript, and QTP. Please help. Thanks in advance.

Commented:
Well, if you're willing to pay a few bucks, or can find freeware that extracts PDF to text from the command line, let me know and I'll make a batch file that will do what you need.

Here is what I found so far:
http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
I'm pretty sure you will have to convert the PDF to text and parse out the values/lines you want. From my years of PDF conversion work, I've found that many PDFs won't "parse" using any language; they have to be converted to text first. It is just inherent in the PDF file format. Crystal was always a problem: some PDFs from Crystal would work, but others wouldn't. I remember that one converter that solved this problem was by a company in Florida, but I can't find the link right off the bat. The issue was finding one that had a command line/batch mode. I did find others that worked, but they didn't have a command line. I'll see if I can find the link if you want to go this route. The hard part may be parsing the exact header/footer info from the text file; that will depend on how the converter lays out the text.
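
If the converter route is taken, picking the header and footer out of the text file can be sketched as follows. This Python example assumes the converter separates pages with a form feed character (as pdftotext does) and that the header and footer are the first and last non-blank lines of each page. Real reports may need more lines or a different rule, and the sample text is invented.

```python
def split_pages(converted_text):
    """Split converter output into pages on the form feed character."""
    pages = converted_text.split("\f")
    # A trailing form feed leaves an empty last chunk; drop blank chunks.
    # Note: this also drops genuinely blank pages.
    return [page for page in pages if page.strip()]


def header_and_footer(page_text):
    """Return the (first, last) non-blank lines of a page, stripped."""
    lines = [line.strip() for line in page_text.splitlines() if line.strip()]
    if not lines:
        return None, None
    return lines[0], lines[-1]


text = (
    "ACME Sales Report\n\nitem  qty\n\nPage 1 of 2\f"
    "ACME Sales Report\n\nitem  qty\n\nPage 2 of 2\f"
)
for page in split_pages(text):
    print(header_and_footer(page))
```

Each returned pair can then be compared against the expected header text and footer pattern.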
Guys, the author of this question does not seem keen on using third-party components. I have also recommended iTextSharp to the author, and a few others actually, that would all help to extract the text from a .NET language. One can then use regular expressions or whatever to parse the string. Another nice one to use is http://www.is-soft.de/pdfanalyzer/prod14.htm
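
For the tabular data, one regular-expression approach is to split each extracted line on runs of two or more spaces. This only works when the extractor preserves column spacing, which is an assumption (not every extractor does), and cell text that itself contains double spaces will break it. Here is a small Python sketch with invented column names and rows:

```python
import re

COLUMN_GAP = re.compile(r"\s{2,}")  # two or more whitespace characters


def parse_table(lines, expected_columns):
    """Split lines into cells and keep only rows with the expected width."""
    rows = []
    for line in lines:
        cells = COLUMN_GAP.split(line.strip())
        if len(cells) == expected_columns:
            rows.append(cells)
    return rows


extracted = [
    "Item        Qty   Unit Price",
    "Widget A    10    2.50",
    "Widget B    3     12.00",
    "Page 1 of 1",
]
header, *data = parse_table(extracted, expected_columns=3)
print(header)
print(data)
```

Lines that do not split into the expected number of cells, such as the footer line in the sample, are skipped rather than treated as table rows.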
Good point, MlandaT. Curious: does anyone here know of a batch converter that works for Crystal-generated PDFs?

Also, has anyone used any programming language that successfully converted Crystal-generated PDFs (and not just simple "hello world" PDFs)?
James Murrell, Product Specialist

Commented:
This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.
