Solved

How to convert html to pdf using MFC?

Posted on 2003-11-24
Last Modified: 2013-12-02

  Hi all,
     I need to convert an HTML file to PDF. I can already convert HTML files to doc, txt and rtf formats by using the msword9.olb type library, but I want to know how to approach converting an HTML file, or any other document format, to PDF using VC++.

Waiting for your reply,
  hareesh.
Question by:jntu_hareesh
 
LVL 32

Accepted Solution

by:
jhance earned 25 total points
ID: 9809639
This is a bit like asking how to perform brain surgery assuming you have a knife...

Whether you're using VC++ or MFC or anything else, the difficult part of this is going to be for you to understand the PDF file format, then to understand how to "render" HTML into that format, and finally to do that actual coding in VC++, MFC, or whatever.

Most PDF writers (like Adobe Acrobat or [what I like] FinePrint's PDF Factory) are implemented as PRINTER DRIVERS in Windows.  So you use the existing application's (in this case IE) printer interface and then just print-to-PDF.  So the only thing you need to implement in your code is taking the Windows DC (device context) from the printer interface and writing out PDF.

There are some resources to assist you:

1) The Windows DDK includes SAMPLE printer drivers so you can understand how to do that in general.
2) There is an open source PDF writer: http://sourceforge.net/projects/pdfcreator/
3) The PDF file format is documented: http://partners.adobe.com/asn/tech/pdf/specifications.jsp

Enjoy.  This project, while certainly NOT trivial, is doable and should be interesting.
 
LVL 44

Assisted Solution

by:Karl Heinz Kremer
Karl Heinz Kremer earned 25 total points
ID: 9809768
First you should find out whether you really have to code this yourself in your application, or whether you can use something that already does this conversion. One option in the latter group would be HTMLDoc from EasySW (http://www.easysw.com/htmldoc/).

If you want to use Acrobat (and the full version of Acrobat is installed on your system), you can first print to a PostScript file and then use the Distiller API to convert the PostScript file to PDF. The Acrobat SDK (available on the partners.adobe.com web site) contains all the information you need to automate Distiller. You can of course use other PS to PDF converters like Ghostscript (www.ghostscript.org) or Jaws PDF Creator (www.jawspdf.com).
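
As a rough sketch of the Ghostscript route, the helper below only assembles the command line that converts an existing PostScript file to PDF with Ghostscript's pdfwrite device. The function names are made up for illustration, and the executable name passed in is an assumption: the actual name and path depend on how Ghostscript is installed.

```cpp
#include <string>

// Wrap a path in double quotes so that spaces survive the command line.
// Assumption: the path itself contains no double quotes.
std::string quoteArg(const std::string& path) {
    return "\"" + path + "\"";
}

// Build a Ghostscript command line that converts PostScript to PDF.
// -dBATCH and -dNOPAUSE make Ghostscript run without prompting and exit
// when done; -sDEVICE=pdfwrite selects the PDF output device.
std::string buildGhostscriptCommand(const std::string& gsExe,
                                    const std::string& psPath,
                                    const std::string& pdfPath) {
    return quoteArg(gsExe) +
           " -dBATCH -dNOPAUSE -sDEVICE=pdfwrite"
           " -sOutputFile=" + quoteArg(pdfPath) +
           " " + quoteArg(psPath);
}
```

For example, buildGhostscriptCommand("gswin32c", "page.ps", "page.pdf") yields a string that can be handed to whatever process launcher the application already uses. Producing page.ps in the first place still means printing the HTML through a PostScript printer driver, as described above.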

All these solutions, however, have one drawback compared with Acrobat's Create PDF From Web Page feature. With that feature, all the links in the HTML code are converted to PDF links, so you can still click on a link in Acrobat: if it's a link within the same document, Acrobat will jump to the new location, and if it's an external link, Acrobat will ask whether you want to open the new page in Acrobat or in your web browser.

Your question about "any other format to PDF" is also something you should use the Distiller API for: as long as you can print to PostScript, you can create PDF. This, however, requires that you can automate the application that can consume the "other format" (e.g. MS Word for .doc files) so that you can print to PostScript.

Otherwise you have to understand all these formats and do it element by element, which is also possible, but requires documentation about these formats. The .doc format for example is not documented by Microsoft, so you would have to reverse engineer the format first.

If you look at how Acrobat converts "other formats" to PDF, it uses exactly this solution (convert to PostScript first and then call Distiller).
 
LVL 44

Expert Comment

by:Karl Heinz Kremer
ID: 11848135
I provided at least part of the solution.