[a|A] [h|H][r|R][e|E][f|F]["|=]["|=]?(\w+?\.mpg|\w+?\.wmv)
[a|A] [h|H][r|R][e|E][f|F]["|=]["|=]?[\.\./]?(\w+?\.mpg|\w+?\.wmv)
I am using a regex string to parse HTML files. There are several variations (strategies) of link that I want to match, and I want to combine them into one string.
string myRegex = "[a|A] [h|H][r|R][e|E][f|F]["|=]["|=][\.]?[\.]?[/]?(.+\.xls|.+\.dat)";
The goal is to extract every ".xls" and ".dat" file name from the HTML.
My test data:
<BODY BGCOLOR="000033" TOPMARGIN="0">
<DIV ALIGN="CENTER"><TABLE WIDTH="805" BORDER="0" CELLSPACING="0" CELLPADDING="0"><TR><TD HEIGHT="41" BGCOLOR="000033"><IMG SRC="images/top-header2.jpg" WIDTH="805" HEIGHT="300" USEMAP="#Map" BORDER="0"><MAP NAME="Map"><AREA SHAPE="rect" COORDS="321,6,500,108" HREF="http://www.elli.com/track/MzoxNTox/"></MAP></TD></TR><TR><TD HEIGHT="2" BGCOLOR="000033"><A HREF="http://join.elli.com/track/MzoxNTox/"><IMG SRC="images/top3.jpg" WIDTH="805" HEIGHT="37" BORDER="0"></A></TD></TR><TR><TD HEIGHT="416" BGCOLOR="#660066"><TABLE WIDTH="805" BORDER="0" CELLSPACING="0" CELLPADDING="0" BGCOLOR="#660066"><TR><TD HEIGHT="241"><DIV ALIGN="CENTER"><A HREF="budget01.xls"><IMG SRC="p1.jpg" WIDTH="400" HEIGHT="300" border="0"></A></DIV></TD>
<TD HEIGHT="241"><DIV ALIGN="CENTER"><A HREF="budget02.xls"><IMG SRC="p2.jpg" WIDTH="400" HEIGHT="300" border="0"></A></DIV></TD>
</TR><TR><TD HEIGHT="38"><DIV ALIGN="CENTER"><FONT FACE="Tahoma" SIZE="4"><B><FONT COLOR="#66CC00"
Should output:
matches[0].Groups[1] == "budget01.xls"
matches[1].Groups[1] == "budget02.xls"
But instead outputs:
matches[0].Groups[1] == "http://join.elli.com/track/MzoxNTox/"><IMG SRC="images/top3.jpg" WIDTH="805" HEIGHT="37" BORDER="0"></A></TD></TR><TR><TD HEIGHT="416" BGCOLOR="#660066"><TABLE WIDTH="805" BORDER="0" CELLSPACING="0" CELLPADDING="0" BGCOLOR="#660066"><TR><TD HEIGHT="241"><DIV ALIGN="CENTER"><A HREF="budget01.xls"><IMG SRC="p1.jpg" WIDTH="400" HEIGHT="300" border="0"></A></DIV></TD>
<TD HEIGHT="241"><DIV ALIGN="CENTER"><A HREF="budget02.xls"
Why does the regex include too much? I do not understand how to control the "greediness".
I have been struggling to understand how to correct this. An expert's suggestion would be greatly appreciated.
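For reference, here is a minimal sketch of one way the greediness could be reined in. The cause appears to be that ".+" is greedy and "." also matches quotes and angle brackets, so the capture starts at the first A HREF and runs to the last ".xls" it can reach. Switching to ".+?" alone would probably not help, because the match would still start at the first A HREF. The sketch instead uses negated character classes that cannot leave the attribute value. The class name, the shortened test string, and the "../data/report.dat" link are assumptions made for illustration; they are not part of the original test data.

```csharp
using System;
using System.Text.RegularExpressions;

class HrefFileNames
{
    static void Main()
    {
        // Shortened stand-in for the test data (assumption: anchors have this shape).
        // The last link is hypothetical, added to show the optional path prefix.
        string html =
            "<A HREF=\"http://join.elli.com/track/MzoxNTox/\"><IMG SRC=\"images/top3.jpg\"></A>" +
            "<A HREF=\"budget01.xls\"><IMG SRC=\"p1.jpg\"></A>" +
            "<A HREF=\"budget02.xls\"><IMG SRC=\"p2.jpg\"></A>" +
            "<A HREF=\"../data/report.dat\">report</A>";

        string pattern =
            // An <a ...> tag with an href attribute; [^>] keeps us inside the tag.
            @"<a\s[^>]*?href\s*=\s*[""']?" +
            // Optional path prefix ending in a slash (covers "../" and "/").
            @"(?:[^""'>\s]*/)?" +
            // File name: no quotes, slashes, '>' or whitespace, so it cannot
            // run past the end of the attribute value.
            @"([^""'>/\s]+\.(?:xls|dat))" +
            // The value must end here (closing quote, '>' or whitespace).
            @"(?=[""'>\s])";

        // IgnoreCase replaces the [a|A] [h|H]... style of alternatives.
        foreach (Match m in Regex.Matches(html, pattern, RegexOptions.IgnoreCase))
        {
            Console.WriteLine(m.Groups[1].Value);
        }
        // Prints:
        // budget01.xls
        // budget02.xls
        // report.dat
    }
}
```

Note that inside a character class "|" is a literal pipe, not "or", so [a|A] also matches "|". RegexOptions.IgnoreCase avoids the need for it. The same idea should carry over to the ".mpg"/".wmv" variations by changing the extension list.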