Solved

DelphiXE2: How to load pure plain text from local MHT file FAST?

Posted on 2013-06-04
Last Modified: 2013-09-12
How can I quickly load pure plain text (without tags or formatting) from a local MHT file? The MHT files were saved from MS Internet Explorer.
Question by:PeterDelphin
LVL 83

Expert Comment

by:Dave Baldwin
ID: 39219660
MHT files are text files formatted like multipart HTML emails. They use MIME boundaries to separate the sections, and images are encoded in base64, as they would be in emails. Here are the first few lines of an MHT I made from a phpinfo page. If you just want the content between the tags in the HTML body, you'll need something to parse it out.

From: "Saved by Windows Internet Explorer 8"
Subject: phpinfo()
Date: Mon, 14 Mar 2011 15:44:56 -0400
MIME-Version: 1.0
Content-Type: multipart/related;
	type="text/html";
	boundary="----=_NextPart_000_0000_01CBE25E.C4F42950"
X-MimeOLE: Produced By Microsoft MimeOLE V6.0.6002.18263

This is a multi-part message in MIME format.

------=_NextPart_000_0000_01CBE25E.C4F42950
Content-Type: text/html;
	charset="Windows-1252"
Content-Transfer-Encoding: quoted-printable
Content-Location: http://bbpweb:811/getinfo.php

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" =
"http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd">
<HTML><HEAD><TITLE>phpinfo()</TITLE>
<META content=3D"text/html; charset=3Dwindows-1252" =
http-equiv=3DContent-Type>
<STYLE type=3Dtext/css>BODY {
	BACKGROUND-COLOR: #ffffff; COLOR: #000000
}
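
The `=3D` sequences and the trailing `=` characters in the excerpt above come from the quoted-printable transfer encoding named in the part header: `=3D` stands for a literal `=`, and a `=` at the end of a line is a soft line break. As a minimal sketch (in Python, chosen only for illustration since the thread itself contains no Delphi code), the standard `quopri` module can undo that encoding. The variable names are hypothetical; the two encoded lines are copied from the excerpt.

```python
import quopri

# Two lines copied from the excerpt above. The trailing "=" on the first
# line is a soft line break, so both lines form one logical line.
encoded = (
    b'<META content=3D"text/html; charset=3Dwindows-1252" =\r\n'
    b'http-equiv=3DContent-Type>\r\n'
)

# Decode the quoted-printable bytes, then interpret them using the
# charset declared in the part header (Windows-1252).
decoded = quopri.decodestring(encoded).decode("windows-1252")
print(decoded)
```

Any parser written for this format would need an equivalent decoding step before looking at the HTML itself.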

 

Author Comment

by:PeterDelphin
ID: 39221664
DaveBaldwin, if I knew the MHT parsing format (which I think is rather complex) I would have already written a parsing function for this purpose.
 
LVL 83

Expert Comment

by:Dave Baldwin
ID: 39223121
It is not that complex; it looks just like the source of an email that uses MIME boundaries to separate the parts. You can open an *.mht file in Notepad or any other text editor and see everything.

http://www.sitepoint.com/forums/showthread.php?613263-What-is-mht-format

http://en.wikipedia.org/wiki/MIME#Multipart_messages
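
Because an MHT file has the same layout as a MIME email, a general-purpose MIME parser can usually read it. The sketch below is an illustration in Python, not a Delphi solution: it uses the standard `email` package to find the first `text/html` part and the standard `html.parser` module to drop the tags. The names `TextExtractor` and `mht_to_text` are hypothetical, and the sketch assumes the file declares a charset that Python recognizes.

```python
import email
from email import policy
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip:
            self.chunks.append(text)


def mht_to_text(raw: bytes) -> str:
    """Return the visible text of the first text/html part, or ''."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            # get_content() undoes the transfer encoding (for example
            # quoted-printable) and decodes with the declared charset.
            html = part.get_content()
            parser = TextExtractor()
            parser.feed(html)
            parser.close()
            return "\n".join(parser.chunks)
    return ""
```

Reading a file would then look like `mht_to_text(open("page.mht", "rb").read())`, where `page.mht` is a placeholder name. Returning at the first HTML part avoids decoding the base64 image parts, which is typically where most of the file size is.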

Author Comment

by:PeterDelphin
ID: 39224055
I have looked at the source code of a lot of different MHT files. They don't have enough similar structural information to make it possible to parse them.

If you are so sure about how easy it is to parse MHT files, please show me how to do it.
 
LVL 83

Accepted Solution

by:
Dave Baldwin earned 500 total points
ID: 39224142
 

Author Comment

by:PeterDelphin
ID: 39233468
I am still evaluating the Chilkat components.
 
LVL 83

Expert Comment

by:Dave Baldwin
ID: 39233491
Take your time, I'm in no hurry.
Question has a verified solution.
