Tina_Bhole

Convert PDF file to MHTML using VB.NET

Hi experts,

Is there a way (free of cost/subscription) to convert a PDF document to MHTML format in VB.NET?

Thanks in advance.
Dave Baldwin

This video https://acrobatusers.com/tutorials/converting-pdf-files-to-html claims to show how to convert a PDF file to HTML. It's not exactly what you asked for, but it may at least help show what needs to be done.
Tina_Bhole

ASKER

Hi Dave,

Thanks for the video, but I need to do this programmatically without any user interaction.

Thanks.
This search https://www.google.com/search?q=convert+pdf+file+to+html+format will bring up a lot of info.  But writing your own converter is a major project.

This http://www.aspose.com/docs/display/pdfnet/Aspose.Pdf+Product+Information appears to be a .NET library to do the job.
Hi Tina,

Xpdf is a set of eight command line executables:
http://www.foolabs.com/xpdf/

I have done numerous 5-minute EE video Micro Tutorials on the various utilities:
Xpdf - Command Line Utility for PDF Files - Part 1
Xpdf - Extract Images from PDF Files - Part 2
Xpdf - Convert PDF Files to Plain Text Files - Part 3
Xpdf - PDFinfo - Command Line Utility to Retrieve Page Count and Other Information from PDF Files
Xpdf - PDFdetach - Command Line Utility to Detach Attachments from PDF Files

The first one shows how to download/install it (not really "install" – it's a stand-alone executable). I list the other videos in case you have a need for those functions in other projects. But for this project, you'll want a utility which, unfortunately, I haven't done a video on...yet...the PDFtoHTML tool (pdftohtml.exe). Since it is a command line program, you can easily call it from your VB.net code. I don't know if it produces HTML to your satisfaction, but it's worth a spin.
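
As a rough sketch of what calling it from VB.NET could look like, the snippet below starts pdftohtml.exe with System.Diagnostics.Process, waits for it to finish, and returns the exit code. The paths in Main and the names ConvertPdf, exePath, pdfPath and outputPath are made up for illustration. The argument layout (input PDF first, output location second, no options) is an assumption, so check the usage text of your pdftohtml build before relying on it.

```vbnet
Imports System
Imports System.Diagnostics

Module PdfToHtmlRunner

    ' Runs the converter and returns its exit code.
    ' All parameter names here are hypothetical, chosen for this example.
    Function ConvertPdf(exePath As String, pdfPath As String, outputPath As String) As Integer
        Dim psi As New ProcessStartInfo() With {
            .FileName = exePath,
            .Arguments = String.Format("""{0}"" ""{1}""", pdfPath, outputPath),
            .UseShellExecute = False,
            .CreateNoWindow = True,
            .RedirectStandardError = True
        }
        ' Assumption: the tool takes "<input PDF> <output location>" as
        ' positional arguments. Both paths are quoted in case they contain spaces.

        Using proc As Process = Process.Start(psi)
            ' Read stderr before waiting so a full pipe cannot block the tool.
            Dim errors As String = proc.StandardError.ReadToEnd()
            proc.WaitForExit()

            If proc.ExitCode <> 0 Then
                Console.Error.WriteLine(errors)
            End If
            Return proc.ExitCode
        End Using
    End Function

    Sub Main()
        ' Hypothetical paths; replace them with your own.
        Dim code As Integer = ConvertPdf("C:\Tools\xpdf\pdftohtml.exe",
                                         "C:\Docs\input.pdf",
                                         "C:\Docs\output")
        Console.WriteLine("pdftohtml exited with code " & code)
    End Sub

End Module
```

Because nothing here needs a window or user input, the same call should work from a service or a scheduled job, provided the account running it can read the PDF and write to the output location.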

In terms of licensing and cost, Xpdf is open source, licensed under the GNU General Public License (GPL) V2, with no cost stated at the website for non-commercial use. For commercial licensing, the Xpdf site says to see their parent company's site, Glyph & Cog.

Regards, Joe
ASKER CERTIFIED SOLUTION
darbid73

This solution is only available to members.
Thanks darbid73.
I used this utility to convert the documents and it's now working well for me.
Hi Tina,

To be clear, the link that darbid73 posted and that you accepted as the solution is based on a very old version (2.02) of the PDFtoHTML tool (pdftohtml.exe) that I mentioned in my post before darbid73's. There have been many fixes/improvements since version 2.02, and Xpdf is now on version 3.04. So if you run into problems with the 2.02-based tool, I recommend using the 3.04 version, the link for which is in my previous post.

Regards, Joe