chrisbray
asked on
What is the best way to extract and process information from an HTML email?
I receive notices from a courier company as HTML emails. Each email contains one or more table rows describing a failed delivery. I would like to pick out that information and automatically send my own email to the person whose delivery has failed, giving them the same information but pulling their details from my own database.
Can anyone suggest the best way of going about this?
Example HTML follows:
==================================================
<!DOCTYPE html PUBLIC -//W3C//DTD HTML 4.01 Transitional//EN>
<HTML>
<HEAD>
<META content=text/html;charset=iso-8859-1
http-equiv=Content-Type>
<style>
a:link {text-decoration: none; }
a:hover {text-decoration: underline;}
.notes
{
padding: 3px;
margin: 10px;
border: 1px solid #000000;
background-color: #779580;
color: #FFFFFF;
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 11px;
}
h1
{
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 16px;
font-weight: bold;
text-align: left;
color: #000000;
margin: 5px;
margin-top: 5px;
margin-bottom: 15px;
padding: 3px;
border-top: 1px solid #00662F;
border-bottom: 2px solid #00662F;
}
.cellpalegreenb {
background-color: #C8F6A0;
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 12px;
font-weight: bold;
color: #000000;
padding: 5px;
}
.cellpalegreen {
background-color: #C8F6A0;
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 10px;
color: #000000;
padding: 5px;
}
.cellpaleyellowb {
background-color: #F7F8B8;
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 12px;
font-weight: bold;
color: #000000;
padding: 5px;
}
.cellpaleyellow {
background-color: #F7F8B8;
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 10px;
color: #000000;
padding: 5px;
}
.cellwhite {
background-color: #FFFFFF;
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 10px;
color: #000000;
padding: 5px;
}
.tableheader
{
background-color: #E7E5BB;
padding: 2px;
margin: 0px;
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 13px;
font-weight: bold;
}
.titleBox {
border-color: #FFFFFF black;
background-color: #23A700;
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 20px;
color: #FFFFFF;
border-style: solid;
border-top-width: 1px;
border-right-width: 0px;
border-bottom-width: 1px;
border-left-width: 0px;
padding: 5px;
}
.tabletop {
border-color: #FFFFFF black;
background-color: #23A700;
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 15px;
color: #FFFFFF;
border-style: solid;
border-top-width: 1px;
border-right-width: 1px;
border-bottom-width: 1px;
border-left-width: 1px;
padding: 5px;
}
.BodyBlack {
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 11px;
color: #000000;
}
.BodyWhite {
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 11px;
color: #FFFFFF;
}
.carded {
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 10px;
color: #FF0000;
}
.collection {
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 10px;
color: #099000;
}
.international {
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 10px;
color: #CC33CC;
}
.delivery {
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 10px;
color: #000099;
}
.refused {
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 10px;
color: #4A809F;
}
.alert {
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 11px;
color: #FF0000;
font-weight: bold;
}
.footer {
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 10px;
text-align: center;
color: #3F874C;
}
.copyright {
font-family: Verdana, Arial, Helvetica, sans-serif;
font-size: 10px;
color: #3F874C;
position: absolute;
bottom: 10px;
}
</style>
</HEAD>
<BODY>
<table cellpadding=5 cellspacing=0 border=0 align=center WIDTH=100%>
<tr><td><h1>MailTrack Advice</h1></td>
<td width=178><a href="http://www.city-link.co.uk"><img src=cid:image1.230506.030303 alt="flying c" height=50 width=172 border=0></a></td>
</tr></table>
<DIV class=notes>Please find below the messages generated on your deliveries since your last mail at 13:50 today.
<BR>All of these messages were received in the last 60 minutes. <BR>
<BR>New Messages Received: 2</div><BR>
<TABLE cellpadding=2 cellspacing=2 border=0 align=CENTER <TR><TD class=tableheader>Account No.</TD>
<TD class=tableheader>Our Ref</TD>
<TD class=tableheader>Your Ref</TD>
<TD class=tableheader>Del Addr</TD>
<TD class=tableheader>Message (Click message header to REPLY)</TD>
</TR>
<TR><td class=cellpaleyellow>875093</td>
<td class=cellpaleyellow>PS906873</td>
<td class=cellpaleyellow>LD15255</td>
<td class=cellpaleyellow>MICHAEL BETTS<br>19 STRATHBURN GARDENS<br>INVERURIE<br><br><br>AB51 4RY<br></td>
<TD class=cellpaleyellow>
<TABLE cellpadding=2 cellspacing=0 border=0 width=100%
<TR><TD width=50%><a class=carded href=mailto:hailsham@city-link.co.uk?subject=Instructions%20for%20Job:%20PS906873%2014:22%200502 PREMISES CLOSED - TIME CARDED>
PREMISES CLOSED - TIME CARDED (1) </a>
</TD>
<TD width=50% class=carded>Door Description: BROWN DOOR<BR>
Carded Time: 14:22<BR>
LOG: 4333062<BR>
</TD></TR></TABLE></TD></TR><TR><td class=cellpaleyellow>875093</td>
<td class=cellpaleyellow>PS906867</td>
<td class=cellpaleyellow>LD15262</td>
<td class=cellpaleyellow>HIGHGATE STATIONERS & PRINTERS<br>5A CROGSLAND ROAD<br>LONDON<br><br><br>NW1 8AY<br></td>
<TD class=cellpaleyellow>
<TABLE cellpadding=2 cellspacing=0 border=0 width=100%
<TR><TD width=50%><a class=carded href=mailto:hailsham@city-link.co.uk?subject=Instructions%20for%20Job:%20PS906867%2016:18%200502 PREMISES CLOSED - TIME CARDED>
PREMISES CLOSED - TIME CARDED (1) </a>
</TD>
<TD width=50% class=carded>Door Description: GREEN<BR>
Carded Time: 16:18<BR>
LOG: 3821551<BR>
</TD></TR></TABLE></TD></TR></TABLE> <BR><BR>
<CENTER><FONT class=footer>MailTrack V1.60 (04/05/2006) © Initial City Link 2001-2006</FONT></CENTER>
<BR>
_____________________________________________________________________<BR>
The information contained in this e-mail is intended only for the<BR>
individual to whom it is addressed. It may contain privileged and<BR>
confidential information. If you have received this message in<BR>
error or there are any problems, please notify the sender<BR>
immediately and delete the message from your computer. The<BR>
unauthorised use, disclosure, copying or alteration of this<BR>
message is forbidden. This message has been checked for all<BR>
known viruses by Initial City Link prior to sending.<BR>
</BODY>
</HTML>
==================================================
In this example two parcel deliveries have failed, and I want to extract the two sets of information and email them directly to my customer.
Any advice gratefully received.
Chris Bray.
Unfortunately, the only free HTML parser I've seen is the one in the Jedi components (TjvHTMLParser).
However, it does not seem to work as simply as you would like.
If you expect the HTML to *always* be in the format you describe above, it is simple enough to write your own routine to get the data out using POS().
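To illustrate the substring-search idea, here is a minimal sketch in Python, where `str.find` plays the role that POS() would play in Delphi. It assumes every data row uses the `cellpaleyellow` class and the `carded` anchor exactly as in the sample email; rows with other classes would need extra markers. The function names (`between`, `split_br`, `extract_failures`) and the dictionary keys are hypothetical, chosen only for this example.

```python
# Sketch only: the marker strings come from the sample email in the question;
# all function and field names are hypothetical.
CELL = "<td class=cellpaleyellow>"


def between(lowered, original, start_marker, end_marker, pos):
    """Return (text between the markers, index just past end_marker).

    Searches the lowercased copy so tag case does not matter, but slices
    the original so extracted text keeps its case.
    Returns (None, -1) when either marker is missing.
    """
    start = lowered.find(start_marker, pos)
    if start == -1:
        return None, -1
    start += len(start_marker)
    end = lowered.find(end_marker, start)
    if end == -1:
        return None, -1
    return original[start:end], end + len(end_marker)


def split_br(fragment):
    """Split on <br> or <BR> and drop blank pieces."""
    parts = fragment.replace("<BR>", "<br>").split("<br>")
    return [p.strip() for p in parts if p.strip()]


def extract_failures(html):
    """Collect one dict per failed-delivery row, in document order."""
    # Assumes lowercasing does not change the length (true for Latin-1 text).
    lowered = html.lower()
    failures, pos = [], 0
    while True:
        # The first four cells: account, courier ref, customer ref, address.
        cells = []
        for _ in range(4):
            value, pos = between(lowered, html, CELL, "</td>", pos)
            if value is None:
                return failures
            cells.append(value)
        # The link text holds the failure reason.
        anchor = lowered.find("<a class=carded", pos)
        if anchor == -1:
            return failures
        reason, pos = between(lowered, html, ">", "</a>", anchor)
        if reason is None:
            return failures
        # The neighbouring cell holds "Key: value" lines separated by <BR>.
        details, pos = between(lowered, html, "class=carded>", "</td>", pos)
        if details is None:
            return failures
        info = {}
        for line in split_br(details):
            key, _, val = line.partition(":")
            info[key.strip()] = val.strip()
        failures.append({
            "account": cells[0].strip(),
            "our_ref": cells[1].strip(),
            "your_ref": cells[2].strip(),
            "address": split_br(cells[3]),
            "reason": reason.strip(),
            "details": info,
        })


# A trimmed fragment modelled on one row of the email in the question.
SAMPLE = """<TR><td class=cellpaleyellow>875093</td>
<td class=cellpaleyellow>PS906873</td>
<td class=cellpaleyellow>LD15255</td>
<td class=cellpaleyellow>MICHAEL BETTS<br>19 STRATHBURN GARDENS<br>INVERURIE<br><br><br>AB51 4RY<br></td>
<TD class=cellpaleyellow>
<TABLE cellpadding=2 cellspacing=0 border=0 width=100%
<TR><TD width=50%><a class=carded href=mailto:hailsham@city-link.co.uk>
PREMISES CLOSED - TIME CARDED (1) </a>
</TD>
<TD width=50% class=carded>Door Description: BROWN DOOR<BR>
Carded Time: 14:22<BR>
LOG: 4333062<BR>
</TD></TR></TABLE></TD></TR>
"""

for failure in extract_failures(SAMPLE):
    print(failure["your_ref"], failure["reason"], failure["details"])
```

The "Your Ref" value is presumably the key to look up the customer in your own database. Because this approach depends on the exact markers, it is worth logging any email that yields zero rows so a change in the courier's layout is noticed quickly.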