Solved

What is the best way to extract and process information from an HTML email?

Posted on 2006-06-18
Last Modified: 2010-04-04
I receive email notices in HTML format from a courier company. Each email contains one or more table rows, and each row describes a failed delivery.  I would like to extract that information and automatically send my own email to the person whose delivery has failed, giving them the same information but pulling their contact details from my own database.

Can anyone suggest the best way of going about this?

Example HTML follows:

==================================================

<!DOCTYPE html PUBLIC -//W3C//DTD HTML 4.01 Transitional//EN>
<HTML>
<HEAD>
<META content=text/html;charset=iso-8859-1
http-equiv=Content-Type>
<style>

a:link  {text-decoration: none; }
a:hover {text-decoration: underline;}

.notes
{
    padding:        3px;
    margin:            10px;
    border:            1px solid #000000;
    background-color:                #779580;
    color:                #FFFFFF;        
    font-family: Verdana, Arial, Helvetica, sans-serif;
    font-size: 11px;
}
h1
{

        font-family:       Verdana, Arial, Helvetica, sans-serif;
        font-size:         16px;
        font-weight:       bold;
        text-align:        left;
        color:             #000000;
        margin:            5px;
        margin-top:        5px;
        margin-bottom:     15px;
        padding:           3px;
        border-top:        1px solid #00662F;
        border-bottom:     2px solid #00662F;
}

.cellpalegreenb {
background-color: #C8F6A0;
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 12px;
font-weight: bold;
color: #000000;
padding: 5px;
}

.cellpalegreen {
background-color: #C8F6A0;
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 10px;
color: #000000;
padding: 5px;
}

.cellpaleyellowb {
background-color: #F7F8B8;
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 12px;
font-weight: bold;
color: #000000;
padding: 5px;
}

.cellpaleyellow {
background-color: #F7F8B8;
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 10px;
color: #000000;
padding: 5px;
}

.cellwhite {
background-color: #FFFFFF;
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 10px;
color: #000000;
padding: 5px;
}

.tableheader
{
    background-color:  #E7E5BB;
    padding:           2px;
    margin:            0px;
    font-family: Verdana, Arial, Helvetica, sans-serif;
    font-size:         13px;
    font-weight:       bold;
   
}

.titleBox {
border-color: #FFFFFF black;
background-color: #23A700;
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 20px;
color: #FFFFFF;
border-style: solid;
border-top-width: 1px;
border-right-width: 0px;
border-bottom-width: 1px;
border-left-width: 0px;
padding: 5px;
}

.tabletop {
 border-color: #FFFFFF black;
 background-color: #23A700;
 font-family: Verdana, Arial, Helvetica, sans-serif;
 font-size: 15px;
 color: #FFFFFF;
 border-style: solid;
 border-top-width: 1px;
 border-right-width: 1px;
 border-bottom-width: 1px;
 border-left-width: 1px;
 padding: 5px;
}

.BodyBlack {
 font-family: Verdana, Arial, Helvetica, sans-serif;
 font-size: 11px;
 color: #000000;
}

.BodyWhite {
 font-family: Verdana, Arial, Helvetica, sans-serif;
 font-size: 11px;
 color: #FFFFFF;
}

.carded {
 font-family: Verdana, Arial, Helvetica, sans-serif;
 font-size: 10px;
 color: #FF0000;
}

.collection {
 font-family: Verdana, Arial, Helvetica, sans-serif;
 font-size: 10px;
 color: #099000;
 }

.international {
 font-family: Verdana, Arial, Helvetica, sans-serif;
 font-size: 10px;
 color: #CC33CC;
}

.delivery {
 font-family: Verdana, Arial, Helvetica, sans-serif;
 font-size: 10px;
 color: #000099;
}

.refused {
 font-family: Verdana, Arial, Helvetica, sans-serif;
 font-size: 10px;
 color: #4A809F;
}

.alert {
 font-family: Verdana, Arial, Helvetica, sans-serif;
 font-size: 11px;
 color: #FF0000;
 font-weight: bold;
}

.footer {
 font-family: Verdana, Arial, Helvetica, sans-serif;
 font-size: 10px;
 text-align: center;
 color: #3F874C;
}

.copyright {
 font-family: Verdana, Arial, Helvetica, sans-serif;
 font-size: 10px;
 color: #3F874C;
 position: absolute;
 bottom: 10px;
 }

</style>
</HEAD>
<BODY>
<table cellpadding=5 cellspacing=0 border=0 align=center WIDTH=100%>
<tr><td><h1>MailTrack Advice</h1></td>
<td width=178><a href="http://www.city-link.co.uk"><img src=cid:image1.230506.030303 alt="flying c" height=50 width=172 border=0></a></td>
</tr></table>
<DIV class=notes>Please find below the messages generated on your deliveries since your last mail at 13:50   today.
<BR>All of these messages were received in the last 60       minutes. <BR>
<BR>New Messages Received: 2</div><BR>
<TABLE cellpadding=2 cellspacing=2 border=0 align=CENTER <TR><TD class=tableheader>Account No.</TD>
<TD class=tableheader>Our Ref</TD>
<TD class=tableheader>Your Ref</TD>
<TD class=tableheader>Del Addr</TD>
<TD class=tableheader>Message (Click message header to REPLY)</TD>
</TR>
<TR><td  class=cellpaleyellow>875093</td>
<td  class=cellpaleyellow>PS906873</td>
<td  class=cellpaleyellow>LD15255</td>
<td  class=cellpaleyellow>MICHAEL BETTS<br>19 STRATHBURN GARDENS<br>INVERURIE<br><br><br>AB51 4RY<br></td>
<TD class=cellpaleyellow>
<TABLE cellpadding=2 cellspacing=0 border=0 width=100%
<TR><TD width=50%><a class=carded href=mailto:hailsham@city-link.co.uk?subject=Instructions%20for%20Job:%20PS906873%2014:22%200502 PREMISES CLOSED - TIME CARDED>
PREMISES CLOSED - TIME CARDED  (1) </a>
</TD>
<TD  width=50% class=carded>Door Description: BROWN DOOR<BR>
Carded Time: 14:22<BR>
LOG: 4333062<BR>
</TD></TR></TABLE></TD></TR><TR><td  class=cellpaleyellow>875093</td>
<td  class=cellpaleyellow>PS906867</td>
<td  class=cellpaleyellow>LD15262</td>
<td  class=cellpaleyellow>HIGHGATE STATIONERS & PRINTERS<br>5A CROGSLAND ROAD<br>LONDON<br><br><br>NW1 8AY<br></td>
<TD class=cellpaleyellow>
<TABLE cellpadding=2 cellspacing=0 border=0 width=100%
<TR><TD width=50%><a class=carded href=mailto:hailsham@city-link.co.uk?subject=Instructions%20for%20Job:%20PS906867%2016:18%200502 PREMISES CLOSED - TIME CARDED>
PREMISES CLOSED - TIME CARDED  (1) </a>
</TD>
<TD  width=50% class=carded>Door Description: GREEN<BR>
Carded Time: 16:18<BR>
LOG: 3821551<BR>
</TD></TR></TABLE></TD></TR></TABLE><BR><BR>
<CENTER><FONT class=footer>MailTrack V1.60 (04/05/2006) &copy; Initial City Link 2001-2006</FONT></CENTER>

<BR>
_____________________________________________________________________<BR>
The information contained in this e-mail is intended only for the<BR>
individual to whom it is addressed. It may contain privileged and<BR>
confidential information. If you have received this message in<BR>
error or there are any problems, please notify the sender<BR>
immediately and delete the message from your computer. The<BR>
unauthorised use, disclosure, copying or alteration of this<BR>
message is forbidden. This message has been checked for all<BR>
known viruses by Initial City Link prior to sending.<BR>
</BODY>
</HTML>

==============================================================

In this example two parcel deliveries have failed, and I want to extract the two sets of information and email them directly to my customer.

Any advice gratefully received.

Chris Bray.


Question by:chrisbray
LVL 17

Expert Comment

by:TheRealLoki
ID: 16930924
Ideally, you should use an HTML parser.
Unfortunately, the only "free" one I've seen is in the Jedi components (TjvHTMLParser), and it does not seem to work as simply as you would like.

If you expect the HTML to *always* be in the format you describe above, it is simple enough to write your own routine to get the data out, using Pos().
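
A minimal sketch of that string-search idea, written in Python rather than Delphi (str.find plays the role of Pos() here). The marker strings are copied from the sample email above; the function names and field names (between, extract_deliveries, account, our_ref and so on) are my own hypothetical choices, not anything the courier defines. It only works for as long as the courier keeps emitting exactly this markup.

```python
# Exact cell marker from the sample email (lowercase "td", two spaces).
CELL = "<td  class=cellpaleyellow>"

# (field name, start marker, end marker), in the order they appear in a row.
FIELDS = [
    ("account", CELL, "</td>"),
    ("our_ref", CELL, "</td>"),
    ("your_ref", CELL, "</td>"),
    ("address", CELL, "</td>"),
    ("door", "Door Description: ", "<BR>"),
    ("carded_time", "Carded Time: ", "<BR>"),
    ("log", "LOG: ", "<BR>"),
]


def between(text, start, end, pos=0):
    """Return (text between the markers, index just past the end marker).

    Returns (None, pos) when either marker is not found at or after pos.
    """
    i = text.find(start, pos)
    if i == -1:
        return None, pos
    i += len(start)
    j = text.find(end, i)
    if j == -1:
        return None, pos
    return text[i:j], j + len(end)


def extract_deliveries(html):
    """Collect one dict per delivery row; stop at the first incomplete row."""
    deliveries = []
    pos = 0
    while True:
        row = {}
        for name, start, end in FIELDS:
            value, pos = between(html, start, end, pos)
            if value is None:
                return deliveries
            row[name] = value.strip()
        # The address lines are separated by <br>; drop the empty ones.
        row["address"] = [part for part in row["address"].split("<br>") if part]
        deliveries.append(row)


# One delivery row, trimmed from the sample email above.
SAMPLE = (
    "<TR><td  class=cellpaleyellow>875093</td>\n"
    "<td  class=cellpaleyellow>PS906873</td>\n"
    "<td  class=cellpaleyellow>LD15255</td>\n"
    "<td  class=cellpaleyellow>MICHAEL BETTS<br>19 STRATHBURN GARDENS"
    "<br>INVERURIE<br><br><br>AB51 4RY<br></td>\n"
    "<TD  width=50% class=carded>Door Description: BROWN DOOR<BR>\n"
    "Carded Time: 14:22<BR>\n"
    "LOG: 4333062<BR>\n"
    "</TD></TR>"
)

for delivery in extract_deliveries(SAMPLE):
    print(delivery["your_ref"], delivery["address"], delivery["door"])
```

The "Your Ref" value is probably the most useful key for looking the customer up in your own database, but that is a guess about your data.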
LVL 17

Accepted Solution

by:
TheRealLoki earned 125 total points
ID: 16931259
Actually, there are many free HTML parsers available:
http://www.torry.net/pages.php?id=216
I chose one at random (THyperparser V1.0, http://www.torry.net/vcl/internet/html/hparse.zip), put your HTML code in, and got enough info to parse your email in a nice fashion.
I set a flag and waited until I had seen two 'TABLE' tags.
I then waited until the header row was done (the '/TR' tag).
Every row (between 'TR' and '/TR') is a delivery line, so I treated each table cell as a field I wanted, using the text values between 'TD' and '/TD'.
If I found another 'TR' before the final '/TABLE', then I knew there was another delivery row.
When I saw the closing '/TABLE', I stopped processing.
Like I said, I chose one at random, but it did the trick. It was fast enough (although the demo displays slowly because it is not calling .BeginUpdate/.EndUpdate for the TMemo).
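
The same tag-counting steps can be sketched with any event-style parser. Below is a rough Python version using the standard html.parser module, not THyperparser, so treat it as an illustration of the technique rather than the code I actually ran. The class name, function name and the cut-down SAMPLE are hypothetical. Two things differ from the description above, and both are my assumptions: it tracks table nesting depth so the small table inside the message cell does not end the row early, and the sample is tidied so every tag is closed. Several TABLE tags in the real email are missing their '>', and how a given parser copes with that would need testing.

```python
from html.parser import HTMLParser


class DeliveryRows(HTMLParser):
    """Collect the cell text of each data row in the second top-level table."""

    def __init__(self):
        super().__init__()
        self.depth = 0            # current <table> nesting depth
        self.tables_seen = 0      # number of top-level tables opened so far
        self.header_done = False  # set once the header row has closed
        self.row = None           # cells of the row being read, or None
        self.cell = None          # text pieces of the cell being read, or None
        self.rows = []

    def _in_target(self):
        # True only directly inside the second top-level table.
        return self.tables_seen == 2 and self.depth == 1

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.depth += 1
            if self.depth == 1:
                self.tables_seen += 1
        elif tag == "tr" and self._in_target():
            self.row = []
        elif tag == "td" and self._in_target() and self.row is not None:
            self.cell = []

    def handle_endtag(self, tag):
        if tag == "td" and self._in_target() and self.cell is not None:
            self.row.append(", ".join(self.cell))
            self.cell = None
        elif tag == "tr" and self._in_target() and self.row is not None:
            if self.header_done:
                self.rows.append(self.row)
            self.header_done = True  # the first closed row is the header
            self.row = None
        elif tag == "table" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Text inside nested tables still belongs to the open outer cell.
        text = " ".join(data.split())
        if self.cell is not None and text:
            self.cell.append(text)


def parse_deliveries(html):
    parser = DeliveryRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Tidied, cut-down version of the email layout (every tag closed).
SAMPLE = """
<table><tr><td><h1>MailTrack Advice</h1></td></tr></table>
<table>
<tr><td>Account No.</td><td>Our Ref</td><td>Your Ref</td>
<td>Del Addr</td><td>Message</td></tr>
<tr><td>875093</td><td>PS906873</td><td>LD15255</td>
<td>MICHAEL BETTS<br>19 STRATHBURN GARDENS<br>INVERURIE<br><br><br>AB51 4RY<br></td>
<td><table><tr><td><a href="mailto:x">PREMISES CLOSED - TIME CARDED (1)</a></td>
<td>Door Description: BROWN DOOR<br>Carded Time: 14:22<br>LOG: 4333062<br></td>
</tr></table></td></tr>
</table>
"""

for row in parse_deliveries(SAMPLE):
    print(row)
```

Each returned row is a list of five strings: account number, our ref, your ref, address, and the message details flattened from the nested table.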