Solved

iTextSharp - create multi-page PDF batch with merge or something else?

Posted on 2013-11-08
Last Modified: 2013-11-08
Hello all,

This is my scenario. I have to create an invoice batch file. Users will select multiple orders from a grid, and I then need to loop through those orders and create an invoice for each one. Here is the catch: there is another table called order_docs that holds an order number and the physical location on the network of a scanned PDF document. For each order I create an invoice for, I need to attach the scanned PDF document 'behind' that invoice, if it exists, and then continue with the next order. The result would be an invoice, then its scanned PDF merged into the following pages, then the next invoice, and so on.

I assume I may need a console application to handle this, either launched from the MVC app or fed by a queue table that the console app reads to create the file. Worst case, I will send each document directly to the printer within a loop.

Any suggestions on the best tool to use? For example, can iTextSharp handle this, or SSRS, etc.?

Hope this makes sense. I am looking for the best possible approach.
Question by:sbornstein2
12 Comments
 
LVL 75

Assisted Solution

by:käµfm³d 👽
käµfm³d   👽 earned 250 total points
ID: 39633699
I am currently using PDFSharp to combine single PDFs into one PDF file. When I was doing a proof of concept for this, it only took a few lines of code:

using MoreLinq;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

namespace PdfSharpTest
{
    class Program
    {
        static void Main(string[] args)
        {
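            // Open both source files in Import mode so their pages can be copied into another document.
            PdfDocument pdf1 = PdfReader.Open(@"first.pdf", PdfDocumentOpenMode.Import);
            PdfDocument pdf2 = PdfReader.Open(@"second.pdf", PdfDocumentOpenMode.Import);
            // The merged output starts as an empty document.
            PdfDocument pdf3 = new PdfDocument();

            // Append every page of each source, in order (ForEach is the MoreLINQ extension method).
            pdf1.Pages.PagesArray.ForEach(page => pdf3.Pages.Add((PdfPage)page));
            pdf2.Pages.PagesArray.ForEach(page => pdf3.Pages.Add((PdfPage)page));

            pdf3.Save(@"result.pdf");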
        }
    }
}

PDFSharp is available from NuGet.

You don't necessarily need MoreLINQ. If you don't install that as well, then you simply need to rework the two ForEach lines into actual foreach statements.
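
For readers who skip MoreLINQ, the same merge can be sketched with plain loops. This is only an illustration, not the poster's code: it swaps the PagesArray/ForEach calls for the PageCount, Pages[i], and AddPage pattern used in PDFsharp's own concatenation sample, and the file names are the same placeholders as above.

```csharp
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

namespace PdfSharpTest
{
    class Program
    {
        static void Main(string[] args)
        {
            // Placeholder input files, merged in this order.
            string[] inputs = { @"first.pdf", @"second.pdf" };

            PdfDocument output = new PdfDocument();

            foreach (string path in inputs)
            {
                // Import mode is required before pages can be copied into another document.
                PdfDocument input = PdfReader.Open(path, PdfDocumentOpenMode.Import);

                for (int i = 0; i < input.PageCount; i++)
                {
                    // AddPage copies the imported page to the end of the output document.
                    output.AddPage(input.Pages[i]);
                }
            }

            output.Save(@"result.pdf");
        }
    }
}
```

Putting the inputs in an array means a third or fourth file only needs another array entry rather than another loop.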
 
LVL 53

Accepted Solution

by:
Joe Winograd, EE MVE earned 250 total points
ID: 39634070
I've used the PDF Toolkit (PDFtk) to perform similar operations. It is an excellent (free!) product that has numerous features to manipulate PDFs. It comes in both command line and GUI versions. The command line version is called PDFtk Server and may be downloaded here:
http://www.pdflabs.com/tools/pdftk-server/

Don't be misled by "Server" in the name. I don't know why they called it that, but it's just an executable (pdftk.exe – with a supporting DLL, libiconv2.dll) that runs on XP, Vista, W7, and W8 (it does not have to run on a "server" OS...it also runs on Mac, but I've never used it on that).

Here's the manual (Man Page) for it:
http://www.pdflabs.com/docs/pdftk-man-page/

To do what you want, take a look at the "cat" parameter, which combines ("catenates") page ranges from multiple PDFs, and the "shuffle" parameter, which works like "cat" but processes one page at a time.
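
Since the question is about a .NET application, here is a minimal sketch of how the "cat" operation might be driven from C#. It assumes pdftk.exe is on the PATH and uses the documented form `pdftk in1.pdf in2.pdf cat output out.pdf`; the class name, method names, and file names are hypothetical, and it starts the executable directly rather than going through %comspec%.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

static class PdftkMerge
{
    // Builds: "in1.pdf" "in2.pdf" cat output "out.pdf"
    // Paths are quoted so that spaces in file names survive.
    public static string BuildCatArguments(string[] inputs, string output)
    {
        string quotedInputs = string.Join(" ", inputs.Select(p => "\"" + p + "\""));
        return quotedInputs + " cat output \"" + output + "\"";
    }

    // Starts pdftk with the given arguments and returns its exit code.
    public static int Run(string pdftkPath, string arguments)
    {
        var startInfo = new ProcessStartInfo(pdftkPath, arguments)
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }

    static void Main()
    {
        // Hypothetical file names: one invoice followed by its scanned document.
        string arguments = BuildCatArguments(new[] { "invoice1.pdf", "scan1.pdf" }, "batch.pdf");
        Console.WriteLine(arguments);
        // Uncomment once pdftk.exe is installed and the input files exist:
        // int exitCode = Run("pdftk.exe", arguments);
    }
}
```

The inputs are written to the output in the order they are listed, so passing each invoice followed by its scan gives the "invoice, then scan" layout.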

Here's an example of a program that uses the "shuffle" parameter, looping through all pages in multiple files:
http://www.experts-exchange.com/Software/Misc/A_11211-How-To-Split-Rename-Move-a-Batch-of-PDF-Files-Based-on-Contents-of-the-Files.html

It's a long article and you may not want to read the whole thing, but take a look at the code snippets that build up the "shuffle" parameter and then call <pdftk.exe> with it (via %comspec%).

I've had excellent results with this free tool. Regards, Joe
 

Author Comment

by:sbornstein2
ID: 39634171
Awesome info. There could be many invoices to create report pages for, say 50, each of which becomes a PDF, and then I would use the merge as well. Can both tools create such dynamic reports as well? I thought iTextSharp could take, for example, an HTML page and then create a PDF from it. In my case I would have many of these. Essentially I have a recordset of data from a SQL query and each record would need to become a PDF page or page(s).

Author Comment

by:sbornstein2
ID: 39634174
I will award points to both Kaufmed and Joewin. Thanks again, guys. I need to figure out what I should start experimenting with for this scenario this weekend.
 

Author Comment

by:sbornstein2
ID: 39634175
Creating a batch file, per se, is what I want at the end of the day: new PDF pages, as well as some existing PDF files merged in with them.
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 39634351
I have to confess that I do not know all of the functionality available from PDFSharp; I've only started using it recently. In glancing at the source code, it appears that the library has functionality embedded for drawing PDFs yourself, so it may just be a matter of parsing the source HTML for data and outputting it to a new PDF canvas. I cannot confirm this, though.
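
As a rough illustration of that drawing functionality applied to the invoice scenario, the sketch below draws a one-page placeholder invoice per order with XGraphics and then appends the scanned PDF when one exists. It assumes the PDFsharp 1.x API; the Order class, order numbers, paths, and invoice text are hypothetical stand-ins for the real SQL data and the order_docs lookup.

```csharp
using System.IO;
using PdfSharp.Drawing;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

class InvoiceBatch
{
    // Hypothetical order record: the order number plus the scanned PDF path
    // from order_docs (null when no scanned document exists).
    class Order
    {
        public string OrderNumber;
        public string ScanPath;
    }

    static void Main()
    {
        // Hypothetical data; in practice this would come from the SQL query.
        Order[] orders =
        {
            new Order { OrderNumber = "1001", ScanPath = @"scans\1001.pdf" },
            new Order { OrderNumber = "1002", ScanPath = null }
        };

        PdfDocument batch = new PdfDocument();
        XFont font = new XFont("Verdana", 14, XFontStyle.Regular);

        foreach (Order order in orders)
        {
            // Draw a placeholder invoice on a new page of the batch document.
            PdfPage invoicePage = batch.AddPage();
            using (XGraphics gfx = XGraphics.FromPdfPage(invoicePage))
            {
                gfx.DrawString("Invoice for order " + order.OrderNumber,
                               font, XBrushes.Black, 40, 60);
            }

            // Append the scanned document "behind" the invoice, if there is one.
            if (order.ScanPath != null && File.Exists(order.ScanPath))
            {
                PdfDocument scan = PdfReader.Open(order.ScanPath, PdfDocumentOpenMode.Import);
                for (int i = 0; i < scan.PageCount; i++)
                {
                    batch.AddPage(scan.Pages[i]);
                }
            }
        }

        batch.Save(@"invoice-batch.pdf");
    }
}
```

A real invoice would need many more drawing calls for the layout. The point here is only that generated pages and imported pages can share one output document.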
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 39634391
> Can both tools create such dynamic reports as well?

I can't speak to PDFsharp...never used it. Re PDFtk, it cannot create PDF files from scratch. Its purpose is to manipulate existing PDFs, so it can merge/combine PDFs for you, but not create them (except from existing PDFs).

> take for example an HTML page and then creates a PDF

You could try a PDF print driver with command line capability. One such product is Bullzip:
http://www.bullzip.com/products/pdf/info.php

It is free and provides a command line interface to all settings, as well as a COM/ActiveX interface that you can use in programs. This free version (which is very good!) is based on the commercial version of the bioPDF PDF Writer:
http://www.biopdf.com/

There are some features in the commercial version that aren't in the free version, but I don't think you'll need them (although I'm not sure of that).

There are many free PDF print drivers out there...Bullzip, CutePDF Writer, doPDF, Nitro PDF Creator (part of the Nitro Reader install), PDFCreator, PrimoPDF, to name a few...although not many of them provide a command line or programmatic interface.

For HTML files specifically, A-PDF HTML to PDF is a non-free, but relatively inexpensive ($39) product that has a command line interface (Htmltopdf.exe).

> recordset of data from a SQL query and each record would need to become a PDF page or page(s).

Should be fine if you can "print" the results of the SQL query to the Bullzip print driver (or whatever PDF print driver you wind up using).

> I should start experimenting this scenario with this weekend.

No rush for the points here. Take the weekend or as long as you need. Looking forward to hearing how it goes for you.

> Creating a batch file per se is what at the end of the day I am wanting. New PDF pages as well as merging some existing PDF files with it.

Should be easy to create a batch file, as long as you can find the necessary command line components.

Regards, Joe
 

Author Comment

by:sbornstein2
ID: 39634550
One last opinion, I promise, then I will award 250 points to each of you. Thanks so much again. I have been playing around today, and I can definitely do this with iTextSharp. I am now testing creating a Document page and then opening a PdfReader that points to a file on disk, via a UNC path to a share on my local C drive. I loop through the pages, calling Document.NewPage() for each, and this actually works, which is very cool. I did this for two existing PDF documents, placed in between the generated document pages I am experimenting with.
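
The approach described above might look roughly like the following. This is a sketch under assumptions rather than the poster's actual code: it assumes iTextSharp 5.x, a single hypothetical scan path, and placeholder invoice text, and it writes to a file instead of the browser response.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

class ITextBatch
{
    static void Main()
    {
        // Hypothetical location of one scanned document.
        string scanPath = @"C:\scans\1001.pdf";
        PdfReader reader = null;

        using (FileStream output = new FileStream(@"invoice-batch.pdf", FileMode.Create))
        {
            Document doc = new Document(PageSize.LETTER);
            PdfWriter writer = PdfWriter.GetInstance(doc, output);
            doc.Open();

            // Generated invoice page (placeholder content).
            doc.Add(new Paragraph("Invoice for order 1001"));

            if (File.Exists(scanPath))
            {
                reader = new PdfReader(scanPath);
                PdfContentByte canvas = writer.DirectContent;

                // iTextSharp page numbers start at 1.
                for (int i = 1; i <= reader.NumberOfPages; i++)
                {
                    // Match the new page to the scanned page's size, then stamp the
                    // imported page onto it. Rotated pages would need a transformation
                    // matrix instead of a plain (0, 0) offset.
                    doc.SetPageSize(reader.GetPageSizeWithRotation(i));
                    doc.NewPage();
                    canvas.AddTemplate(writer.GetImportedPage(reader, i), 0, 0);
                }
            }

            // Close the document first so that imported page content is written out.
            doc.Close();
        }

        if (reader != null)
        {
            reader.Close();
        }
    }
}
```

For a full batch, the invoice and scan steps would repeat inside a loop over the selected orders, with each reader kept open until the document has been closed.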

My worry, though, is that doing this through an MVC application, writing the PDF document to the browser response and opening PdfReaders for something like 50 multi-page documents, may not be feasible because of memory use, or because the response write/flush cannot handle it.

Any thoughts? I am thinking I should use a console application external to the MVC application: queue the job up, for example, then let the console app do the work.
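
To make the queue idea concrete, here is a minimal in-process sketch. It uses a BlockingCollection as a stand-in for the queue table and a delegate as a stand-in for the real PDF-building code; all names are hypothetical. Processing requests one at a time limits how many batches are in memory at once, but it does not reduce the memory that any single batch needs.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

static class BatchQueueSketch
{
    // Drains queued batch requests one at a time and returns the name of each
    // file produced. buildBatch stands in for the real PDF-generation code.
    public static List<string> Drain(BlockingCollection<string[]> queue,
                                     Func<string[], string> buildBatch)
    {
        var results = new List<string>();
        foreach (string[] orderNumbers in queue.GetConsumingEnumerable())
        {
            results.Add(buildBatch(orderNumbers));
        }
        return results;
    }

    static void Main()
    {
        var queue = new BlockingCollection<string[]>();

        // The worker plays the role of the console application.
        Task<List<string>> worker = Task.Run(() =>
            Drain(queue, orders => "batch-" + string.Join("-", orders) + ".pdf"));

        // The web application would enqueue one request per user selection.
        queue.Add(new[] { "1001", "1002" });
        queue.Add(new[] { "1003" });
        queue.CompleteAdding();

        foreach (string file in worker.Result)
        {
            Console.WriteLine(file);
        }
    }
}
```

With a real queue table, the Add calls would become inserts from the MVC application, and the loop in Drain would become the console application polling for pending rows.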
 

Author Closing Comment

by:sbornstein2
ID: 39634668
Thanks for the info.
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 39634739
I haven't built any MVC apps, so I can't help you with that. Your reasoning makes sense to me on a general basis, but the specifics of an MVC app are beyond my areas of expertise. Thanks for awarding the points – much appreciated! Good luck on the project. Regards, Joe
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 39634847
> I am thinking should I use a console application external to the MVC application, queue it up for example then let the console app do the work.

If you're running this console application on the same server, then you're eating memory regardless of which method you choose.
 

Author Comment

by:sbornstein2
ID: 39634957
Good point, Kaufmed. Thanks.