Solved

DelphiXE2: How to load pure plain text from local MHT file FAST?

Posted on 2013-06-04
Last Modified: 2013-09-12
How can plain text (without tags or formatting) be loaded FAST from a local MHT file that was saved from MS Internet Explorer?
Question by:PeterDelphin
9 Comments
 
LVL 83

Expert Comment

by:Dave Baldwin
ID: 39219660
MHT files are text files formatted like multipart HTML emails. They use MIME boundaries to separate the sections, and images are encoded in base64 just as they would be in an email. Here are the first few lines of an MHT I made from a phpinfo page. If you just want the content between the tags in the HTML body, you'll need something to 'parse' it out.

From: "Saved by Windows Internet Explorer 8"
Subject: phpinfo()
Date: Mon, 14 Mar 2011 15:44:56 -0400
MIME-Version: 1.0
Content-Type: multipart/related;
	type="text/html";
	boundary="----=_NextPart_000_0000_01CBE25E.C4F42950"
X-MimeOLE: Produced By Microsoft MimeOLE V6.0.6002.18263

This is a multi-part message in MIME format.

------=_NextPart_000_0000_01CBE25E.C4F42950
Content-Type: text/html;
	charset="Windows-1252"
Content-Transfer-Encoding: quoted-printable
Content-Location: http://bbpweb:811/getinfo.php

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" =
"http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd">
<HTML><HEAD><TITLE>phpinfo()</TITLE>
<META content=3D"text/html; charset=3Dwindows-1252" =
http-equiv=3DContent-Type>
<STYLE type=3Dtext/css>BODY {
	BACKGROUND-COLOR: #ffffff; COLOR: #000000
}
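As a rough illustration of the sample above: the HTML part is marked `Content-Transfer-Encoding: quoted-printable`, which is why `=` appears as `=3D` and why some lines end in a bare `=` (a soft line break). The sketch below decodes such a fragment with Python's standard `quopri` module. Python is used here only to show the format; the helper name `decode_qp` is hypothetical, and a Delphi version would need to apply the same two rules.

```python
import quopri

# Quoted-printable fragment copied from the MHT sample above.
raw = (
    b'<META content=3D"text/html; charset=3Dwindows-1252" =\n'
    b'http-equiv=3DContent-Type>\n'
)


def decode_qp(data: bytes, charset: str = "windows-1252") -> str:
    """Decode a quoted-printable body (hypothetical helper).

    '=XX' hex escapes become single bytes, and a '=' at the end of a
    line joins that line with the next one.
    """
    return quopri.decodestring(data).decode(charset, errors="replace")


print(decode_qp(raw))
```

The charset default is taken from the `charset="Windows-1252"` header in the sample; other MHT files may declare a different one.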

 

Author Comment

by:PeterDelphin
ID: 39221664
DaveBaldwin, if I knew the MHT parsing format (which I think is rather complex) I would have already written a parsing function for this purpose.
 
LVL 83

Expert Comment

by:Dave Baldwin
ID: 39223121
It is not that complex. It looks just like the source of an email that uses MIME boundaries to separate the parts. You can open an *.mht file in Notepad or any other text editor and see everything.

http://www.sitepoint.com/forums/showthread.php?613263-What-is-mht-format

http://en.wikipedia.org/wiki/MIME#Multipart_messages
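Because an MHT file has the same layout as a MIME email, a general-purpose MIME parser can usually read it. The following is a minimal sketch, not a definitive implementation and not Delphi: it uses Python's standard `email` and `html.parser` modules to find the first `text/html` part, decode it, and keep only the text nodes. The names `mht_to_text` and `_TextExtractor` are hypothetical, and the sketch assumes the page body is stored in a `text/html` part, as in the Internet Explorer sample earlier in this thread.

```python
import email
from email import policy
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip = 0      # nesting depth inside <script>/<style>
        self.chunks = []    # non-empty text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip:
            self.chunks.append(text)


def mht_to_text(raw: bytes) -> str:
    """Return the plain text of the first text/html part, or '' if none."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            # get_content() undoes the transfer encoding (quoted-printable
            # or base64) and decodes using the part's declared charset.
            html = part.get_content()
            parser = _TextExtractor()
            parser.feed(html)
            parser.close()
            return "\n".join(parser.chunks)
    return ""
```

To try it, read the file in binary mode and pass the bytes to `mht_to_text`. Note that text inside `<TITLE>` is kept, since only script and style contents are skipped. The same steps (read the boundary, pick the `text/html` part, undo the transfer encoding, drop the tags) are what a Delphi routine or a third-party MIME component would have to perform.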

 

Author Comment

by:PeterDelphin
ID: 39224055
I have looked at the source code of a lot of different MHT files. They don't have enough similar structural information to make it possible to parse them.

If you are so sure about how easy it is to parse MHT files, please show me how to do it.
 
LVL 83

Accepted Solution

by:
Dave Baldwin earned 2000 total points
ID: 39224142
0
 

Author Comment

by:PeterDelphin
ID: 39233468
I am still evaluating the Chilkat components.
 
LVL 83

Expert Comment

by:Dave Baldwin
ID: 39233491
Take your time, I'm in no hurry.
Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Introduction The parallel port is a very commonly known port, it was widely used to connect a printer to the PC, if you look at the back of your computer, for those who don't have newer computers, there will be a port with 25 pins and a small print…
Creating an auto free TStringList The TStringList is a basic and frequently used object in Delphi. On many occasions, you may want to create a temporary list, process some items in the list and be done with the list. In such cases, you have to…
Michael from AdRem Software outlines event notifications and Automatic Corrective Actions in network monitoring. Automatic Corrective Actions are scripts, which can automatically run upon discovery of a certain undesirable condition in your network.…
In this video, Percona Solutions Engineer Barrett Chambers discusses some of the basic syntax differences between MySQL and MongoDB. To learn more check out our webinar on MongoDB administration for MySQL DBA: https://www.percona.com/resources/we…
Suggested Courses

777 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question