Tina_Bhole
asked on
Convert pdf file to mhtml using vb.net
Hi experts,
Is there a free way (no purchase or subscription) to convert a PDF document to MHTML format in VB.NET?
Thanks in advance.
This video https://acrobatusers.com/tutorials/converting-pdf-files-to-html claims to show how to convert a PDF file to HTML. Although it's not exactly what you asked for, it may at least show you what needs to be done.
ASKER
Hi Dave,
Thanks for the video, but I need to do this programmatically without any user interaction.
Thanks.
This search https://www.google.com/search?q=convert+pdf+file+to+html+format will bring up a lot of info. But writing your own converter is a major project.
This http://www.aspose.com/docs/display/pdfnet/Aspose.Pdf+Product+Information appears to be a .NET library to do the job.
Hi Tina,
Xpdf is a set of eight command line executables:
http://www.foolabs.com/xpdf/
I have done numerous 5-minute EE video Micro Tutorials on the various utilities:
Xpdf - Command Line Utility for PDF Files - Part 1
Xpdf - Extract Images from PDF Files - Part 2
Xpdf - Convert PDF Files to Plain Text Files - Part 3
Xpdf - PDFinfo - Command Line Utility to Retrieve Page Count and Other Information from PDF Files
Xpdf - PDFdetach - Command Line Utility to Detach Attachments from PDF Files
The first one shows how to download/install it (not really "install" – it's a stand-alone executable). I list the other videos in case you have a need for those functions in other projects. But for this project, you'll want a utility which, unfortunately, I haven't done a video on...yet...the PDFtoHTML tool (pdftohtml.exe). Since it is a command line program, you can easily call it from your VB.net code. I don't know if it produces HTML to your satisfaction, but it's worth a spin.
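As a rough illustration of calling the tool from VB.NET, here is a minimal sketch using `System.Diagnostics.Process`. The helper name `ConvertPdfToHtml` and all three paths are hypothetical placeholders, and the argument order (PDF file first, then an output directory) is an assumption about the pdftohtml version in use; other builds may expect an output file name instead, so check the usage message of your copy. Note that this produces HTML, not MHTML, so packaging the result as MHTML would be a separate step that is not shown here.

```vbnet
Imports System.Diagnostics

Module PdfToHtmlSketch

    ' Hypothetical helper: runs pdftohtml.exe and returns its exit code.
    ' Assumes the command line is: pdftohtml.exe "<pdf file>" "<output dir>"
    Function ConvertPdfToHtml(exePath As String,
                              pdfPath As String,
                              outputDir As String) As Integer
        Dim psi As New ProcessStartInfo() With {
            .FileName = exePath,
            .Arguments = String.Format("""{0}"" ""{1}""", pdfPath, outputDir),
            .UseShellExecute = False,
            .CreateNoWindow = True,
            .RedirectStandardError = True
        }

        Using proc As Process = Process.Start(psi)
            ' Read stderr before waiting so a full pipe cannot block the tool.
            Dim errors As String = proc.StandardError.ReadToEnd()
            proc.WaitForExit()

            If proc.ExitCode <> 0 Then
                Console.Error.WriteLine("pdftohtml failed: " & errors)
            End If

            Return proc.ExitCode
        End Using
    End Function

    Sub Main()
        ' All three paths below are placeholders; adjust them to your setup.
        Dim exitCode As Integer = ConvertPdfToHtml(
            "C:\Tools\xpdf\pdftohtml.exe",
            "C:\Docs\input.pdf",
            "C:\Docs\input_html")
        Console.WriteLine("Exit code: " & exitCode)
    End Sub

End Module
```

Because `UseShellExecute` is off and no window is created, this runs without any user interaction, which is what the question asks for.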
In terms of licensing and cost, Xpdf is open source, licensed under the GNU General Public License (GPL) V2, with no cost stated at the website for non-commercial use. For commercial licensing, the Xpdf site says to see their parent company's site, Glyph & Cog.
Regards, Joe
ASKER CERTIFIED SOLUTION
ASKER
Thanks darbid73.
I used this utility to convert the documents and it's now working well for me.
Hi Tina,
To be clear, the link that darbid73 posted and that you accepted as the solution is based on a very old version (2.02) of the PDFtoHTML tool (pdftohtml.exe) that I mentioned in my post before darbid73's. There have been many fixes and improvements since version 2.02; Xpdf is now on version 3.04. So if you run into problems with the 2.02-based tool, I recommend using the 3.04 version, the link for which is in my previous post.
Regards, Joe