Convert pdf file to mhtml using vb.net

Tina_Bhole
Hi experts,

Is there a way (free of cost, with no subscription) to convert a PDF document to MHTML format in VB.NET?

Thanks in advance.
Dave Baldwin, Fixer of Problems
Most Valuable Expert 2014

Commented:
This video https://acrobatusers.com/tutorials/converting-pdf-files-to-html claims to show how to convert a PDF file to HTML. It's not exactly what you asked for, but it may at least show you what needs to be done.

Author

Commented:
Hi Dave,

Thanks for the video, but I need to do this programmatically without any user interaction.

Thanks.
Dave Baldwin, Fixer of Problems
Most Valuable Expert 2014

Commented:
This search https://www.google.com/search?q=convert+pdf+file+to+html+format will bring up a lot of info, but writing your own converter is a major project.

This http://www.aspose.com/docs/display/pdfnet/Aspose.Pdf+Product+Information appears to be a .NET library that does the job.
Joe Winograd, Developer
Fellow 2017
Most Valuable Expert 2018

Commented:
Hi Tina,

Xpdf is a set of eight command line executables:
http://www.foolabs.com/xpdf/

I have done numerous 5-minute EE video Micro Tutorials on the various utilities:
Xpdf - Command Line Utility for PDF Files - Part 1
Xpdf - Extract Images from PDF Files - Part 2
Xpdf - Convert PDF Files to Plain Text Files - Part 3
Xpdf - PDFinfo - Command Line Utility to Retrieve Page Count and Other Information from PDF Files
Xpdf - PDFdetach - Command Line Utility to Detach Attachments from PDF Files

The first one shows how to download/install it (not really "install" – it's a stand-alone executable). I list the other videos in case you have a need for those functions in other projects. But for this project, you'll want a utility which, unfortunately, I haven't done a video on...yet...the PDFtoHTML tool (pdftohtml.exe). Since it is a command line program, you can easily call it from your VB.net code. I don't know if it produces HTML to your satisfaction, but it's worth a spin.
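Calling the tool from VB.NET could look roughly like the sketch below. It assumes pdftohtml.exe accepts the input PDF followed by an output location as its two arguments; confirm the exact usage against the help output of your version before relying on it. The function name and all paths are hypothetical placeholders. The tool produces HTML, so any packaging into MHTML would be a separate step that is not shown here.

```vbnet
Imports System.Diagnostics

Module PdfToHtmlRunner

    ' Runs pdftohtml.exe on one PDF and returns the process exit code.
    ' Assumption: the tool takes "<input pdf> <output location>" as arguments;
    ' check the usage text of your pdftohtml version.
    Function ConvertPdfToHtml(exePath As String, pdfPath As String, outputPath As String) As Integer
        Dim startInfo As New ProcessStartInfo() With {
            .FileName = exePath,
            .Arguments = String.Format("""{0}"" ""{1}""", pdfPath, outputPath),
            .UseShellExecute = False,
            .CreateNoWindow = True,
            .RedirectStandardError = True
        }

        Using proc As Process = Process.Start(startInfo)
            ' Read stderr before waiting so a full pipe cannot block the tool.
            Dim errors As String = proc.StandardError.ReadToEnd()
            proc.WaitForExit()
            If proc.ExitCode <> 0 Then
                Console.Error.WriteLine(errors)
            End If
            Return proc.ExitCode
        End Using
    End Function

    Sub Main()
        ' Hypothetical paths, for illustration only.
        Dim exitCode As Integer = ConvertPdfToHtml("C:\Tools\xpdf\pdftohtml.exe",
                                                   "C:\Docs\input.pdf",
                                                   "C:\Docs\output")
        Console.WriteLine("pdftohtml exited with code " & exitCode)
    End Sub

End Module
```

Because the conversion runs as a separate process with no window, it needs no user interaction, and the exit code gives the calling code a simple way to detect a failed conversion.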

In terms of licensing and cost, Xpdf is open source, licensed under the GNU General Public License (GPL) V2, with no cost stated at the website for non-commercial use. For commercial licensing, the Xpdf site says to see their parent company's site, Glyph & Cog.

Regards, Joe
darbid73

Commented:
Just offering an alternative, pdftoHTML, which seems to be based on the Xpdf tool above and to have similar license conditions.

Author

Commented:
Thanks darbid73.
I used this utility to convert the documents and it's now working well for me.
Joe Winograd, Developer
Fellow 2017
Most Valuable Expert 2018

Commented:
Hi Tina,

To be clear, the link that darbid73 posted, and that you accepted as the solution, is based on a very old version (2.02) of the PDFtoHTML tool (pdftohtml.exe) that I mentioned in my post before darbid73's. There have been many fixes and improvements since version 2.02; Xpdf is now on version 3.04. So if you run into problems with the 2.02-based tool, I recommend using the 3.04 version, the link for which is in my previous post.

Regards, Joe