Squeakrz44
asked on
Extract Table Info from a web Page Using mshtml
I am trying to extract the player information from the Boston Celtics roster page. I have saved the roster page to a text file named BostonRosterHtml.txt. I know how to extract this data from the table using Regex, but I am trying to learn how to do the extraction with the mshtml objects, and that is where I am unsure how to proceed. I want to pull each player's details from the page and store each column in its own list, i.e.:
Dim strNumber As New List(Of Integer)
Dim strPlayer As New List(Of String)
Dim strPosition As New List(Of String)
Dim strHeight As New List(Of String)
Dim strWeight As New List(Of String)
Dim strDOB As New List(Of String)
Dim strFROM As New List(Of String)
Dim strYearsInPros As New List(Of String)
Below is the HTML snippet that contains the column headings:
</tr>
<tr class=gSGSectionColumnHeadings>
<td NOWRAP width=40 class=gSGSectionColumnHeadings><b>NUM</b></td>
<td NOWRAP width=160 align=left class=gSGSectionColumnHeadings><b>PLAYER</b></td>
<td NOWRAP width=40 align=left class=gSGSectionColumnHeadings><b>POS</b></td>
<td NOWRAP width=40 align=left class=gSGSectionColumnHeadings><b>HT</b></td>
<td NOWRAP width=40 align=left class=gSGSectionColumnHeadings><b>WT</b></td>
<td NOWRAP width=65 align=left class=gSGSectionColumnHeadings><b>DOB</b></td>
<td NOWRAP width=185 align=left class=gSGSectionColumnHeadings><b>&nbsp; FROM</b></td>
<td NOWRAP width=30 align=left class=gSGSectionColumnHeadings><b>YRS</b></td>
</tr>
The code section below holds the HTML of the roster page, shortened to include only the area around the player roster table. This is what has been saved in the text file.
If someone could help me with the code to extract this information from the table in this web page using the mshtml objects, I would greatly appreciate it.
If anyone needs more information in order to help, please feel free to ask.
</div>
<div id="article_content_wrapper">
<div id="rosterContainer" class="team_stats_grid">
<!--/frags/celtics/celticsAboveRoster.html SSI include-->
<TABLE border="0" cellPadding="0" cellSpacing="0">
<TR>
<TD align="left" class="cBCompRoster" height="100%" vAlign="top" width="80%">
<div style="margin-left:5px"></div></TD>
</TR>
<TR>
<TD>
<TABLE border="0" cellPadding="0" cellSpacing="0" width="600">
<TR>
<TD class="cBSpacing" colSpan="3">
<IMG height=1 src="/images/blank.gif"></TD>
</TR>
<TR>
<TD class="cBTop" colSpan="3">
<table cellPadding="0" cellSpacing="0" width="100%"><tr><td align="left" class="cBTop">
<DIV class="cBTitle">
Celtics Roster
</DIV>
</td><td align="right" class="cBTop">
</td></tr></table></TD>
</TD>
</TR>
<TR>
<TD class="cBSide" noWrap><BR></TD>
<TD align="left" class="cBComp" height="100%" vAlign="top" width="100%">
<!--sc-->
<table border="0" cellpadding="3" cellspacing="0" class=" gSGTable" >
<!-- the title -->
<tr>
<td class="bar gSGSectionTitle" colspan="8">
2010-11 Roster
</td>
</tr>
<tr class=gSGSectionColumnHeadings>
<td NOWRAP width=40 class=gSGSectionColumnHeadings><b>NUM</b></td>
<td NOWRAP width=160 align=left class=gSGSectionColumnHeadings><b>PLAYER</b></td>
<td NOWRAP width=40 align=left class=gSGSectionColumnHeadings><b>POS</b></td>
<td NOWRAP width=40 align=left class=gSGSectionColumnHeadings><b>HT</b></td>
<td NOWRAP width=40 align=left class=gSGSectionColumnHeadings><b>WT</b></td>
<td NOWRAP width=65 align=left class=gSGSectionColumnHeadings><b>DOB</b></td>
<td NOWRAP width=185 align=left class=gSGSectionColumnHeadings><b> FROM</b></td>
<td NOWRAP width=30 align=left class=gSGSectionColumnHeadings><b>YRS</b></td>
</tr>
<tr>
<td class="gSGRowEven"> 20</td>
<td class="gSGRowEven">
<a class=gSGPlayerLink href="/playerfile/ray_allen/index.html?nav=page" class=gSGLink>
Ray Allen</a>
</td>
<td class="gSGRowEven">
G </td>
<td class="gSGRowEven">
6-5</td>
<td class="gSGRowEven">
205</td>
<td class="gSGRowEven">
07/20/1975</td>
<td class="gSGRowEven">
Connecticut</td>
<td align="center" class="gSGRowEven">
14</td>
</tr>
<tr>
<td class="gSGRowOdd"> 0</td>
<td class="gSGRowOdd">
<a class=gSGPlayerLink href="/playerfile/avery_bradley/index.html?nav=page" class=gSGLink>
Avery Bradley</a>
</td>
<td class="gSGRowOdd">
G </td>
<td class="gSGRowOdd">
6-2</td>
<td class="gSGRowOdd">
180</td>
<td class="gSGRowOdd">
11/26/1990</td>
<td class="gSGRowOdd">
Texas</td>
<td align="center" class="gSGRowOdd">
R</td>
</tr>
<tr>
<td class="gSGRowEven"> 8</td>
<td class="gSGRowEven">
<a class=gSGPlayerLink href="/playerfile/marquis_daniels/index.html?nav=page" class=gSGLink>
Marquis Daniels</a>
</td>
<td class="gSGRowEven">
F-G </td>
<td class="gSGRowEven">
6-6</td>
<td class="gSGRowEven">
200</td>
<td class="gSGRowEven">
01/07/1981</td>
<td class="gSGRowEven">
Auburn</td>
<td align="center" class="gSGRowEven">
7</td>
</tr>
<tr>
<td class="gSGRowOdd"> 11</td>
<td class="gSGRowOdd">
<a class=gSGPlayerLink href="/playerfile/glen_davis/index.html?nav=page" class=gSGLink>
Glen Davis</a>
</td>
<td class="gSGRowOdd">
C-F </td>
<td class="gSGRowOdd">
6-9</td>
<td class="gSGRowOdd">
289</td>
<td class="gSGRowOdd">
01/01/1986</td>
<td class="gSGRowOdd">
Louisiana State</td>
<td align="center" class="gSGRowOdd">
3</td>
</tr>
<tr>
<td class="gSGRowEven"> 86</td>
<td class="gSGRowEven">
<a class=gSGPlayerLink href="/playerfile/semih_erden/index.html?nav=page" class=gSGLink>
Semih Erden</a>
</td>
<td class="gSGRowEven">
C </td>
<td class="gSGRowEven">
6-11</td>
<td class="gSGRowEven">
240</td>
<td class="gSGRowEven">
07/28/1986</td>
<td class="gSGRowEven">
Turkey</td>
<td align="center" class="gSGRowEven">
R</td>
</tr>
<tr>
<td class="gSGRowOdd"> 5</td>
<td class="gSGRowOdd">
<a class=gSGPlayerLink href="/playerfile/kevin_garnett/index.html?nav=page" class=gSGLink>
Kevin Garnett</a>
</td>
<td class="gSGRowOdd">
F </td>
<td class="gSGRowOdd">
6-11</td>
<td class="gSGRowOdd">
253</td>
<td class="gSGRowOdd">
05/19/1976</td>
<td class="gSGRowOdd">
Farragut Academy HS (IL)</td>
<td align="center" class="gSGRowOdd">
15</td>
</tr>
<tr>
<td class="gSGRowEven"> 55</td>
<td class="gSGRowEven">
<a class=gSGPlayerLink href="/playerfile/luke_harangody/index.html?nav=page" class=gSGLink>
Luke Harangody</a>
</td>
<td class="gSGRowEven">
F </td>
<td class="gSGRowEven">
6-7</td>
<td class="gSGRowEven">
251</td>
<td class="gSGRowEven">
01/02/1988</td>
<td class="gSGRowEven">
Notre Dame</td>
<td align="center" class="gSGRowEven">
R</td>
</tr>
<tr>
<td class="gSGRowOdd"> 7</td>
<td class="gSGRowOdd">
<a class=gSGPlayerLink href="/playerfile/jermaine_oneal/index.html?nav=page" class=gSGLink>
Jermaine O'Neal</a>
</td>
<td class="gSGRowOdd">
C-F </td>
<td class="gSGRowOdd">
6-11</td>
<td class="gSGRowOdd">
255</td>
<td class="gSGRowOdd">
10/13/1978</td>
<td class="gSGRowOdd">
Eau Claire HS (SC)</td>
<td align="center" class="gSGRowOdd">
14</td>
</tr>
<tr>
<td class="gSGRowEven"> 36</td>
<td class="gSGRowEven">
<a class=gSGPlayerLink href="/playerfile/shaquille_oneal/index.html?nav=page" class=gSGLink>
Shaquille O'Neal</a>
</td>
<td class="gSGRowEven">
C </td>
<td class="gSGRowEven">
7-1</td>
<td class="gSGRowEven">
325</td>
<td class="gSGRowEven">
03/06/1972</td>
<td class="gSGRowEven">
Louisiana State</td>
<td align="center" class="gSGRowEven">
18</td>
</tr>
<tr>
<td class="gSGRowOdd"> 43</td>
<td class="gSGRowOdd">
<a class=gSGPlayerLink href="/playerfile/kendrick_perkins/index.html?nav=page" class=gSGLink>
Kendrick Perkins</a>
</td>
<td class="gSGRowOdd">
C </td>
<td class="gSGRowOdd">
6-10</td>
<td class="gSGRowOdd">
280</td>
<td class="gSGRowOdd">
11/10/1984</td>
<td class="gSGRowOdd">
Clifton J. Ozen HS (TX)</td>
<td align="center" class="gSGRowOdd">
7</td>
</tr>
<tr>
<td class="gSGRowEven"> 34</td>
<td class="gSGRowEven">
<a class=gSGPlayerLink href="/playerfile/paul_pierce/index.html?nav=page" class=gSGLink>
Paul Pierce</a> - C
</td>
<td class="gSGRowEven">
F </td>
<td class="gSGRowEven">
6-7</td>
<td class="gSGRowEven">
235</td>
<td class="gSGRowEven">
10/13/1977</td>
<td class="gSGRowEven">
Kansas</td>
<td align="center" class="gSGRowEven">
12</td>
</tr>
<tr>
<td class="gSGRowOdd"> 4</td>
<td class="gSGRowOdd">
<a class=gSGPlayerLink href="/playerfile/nate_robinson/index.html?nav=page" class=gSGLink>
Nate Robinson</a>
</td>
<td class="gSGRowOdd">
G </td>
<td class="gSGRowOdd">
5-9</td>
<td class="gSGRowOdd">
180</td>
<td class="gSGRowOdd">
05/31/1984</td>
<td class="gSGRowOdd">
Washington</td>
<td align="center" class="gSGRowOdd">
5</td>
</tr>
<tr>
<td class="gSGRowEven"> 9</td>
<td class="gSGRowEven">
<a class=gSGPlayerLink href="/playerfile/rajon_rondo/index.html?nav=page" class=gSGLink>
Rajon Rondo</a>
</td>
<td class="gSGRowEven">
G </td>
<td class="gSGRowEven">
6-1</td>
<td class="gSGRowEven">
171</td>
<td class="gSGRowEven">
02/22/1986</td>
<td class="gSGRowEven">
Kentucky</td>
<td align="center" class="gSGRowEven">
4</td>
</tr>
<tr>
<td class="gSGRowOdd"> 12</td>
<td class="gSGRowOdd">
<a class=gSGPlayerLink href="/playerfile/von_wafer/index.html?nav=page" class=gSGLink>
Von Wafer</a>
</td>
<td class="gSGRowOdd">
G </td>
<td class="gSGRowOdd">
6-5</td>
<td class="gSGRowOdd">
209</td>
<td class="gSGRowOdd">
07/21/1985</td>
<td class="gSGRowOdd">
Florida State</td>
<td align="center" class="gSGRowOdd">
5</td>
</tr>
<tr>
<td class="gSGRowEven"> 13</td>
<td class="gSGRowEven">
<a class=gSGPlayerLink href="/playerfile/delonte_west/index.html?nav=page" class=gSGLink>
Delonte West</a>
</td>
<td class="gSGRowEven">
G </td>
<td class="gSGRowEven">
6-3</td>
<td class="gSGRowEven">
180</td>
<td class="gSGRowEven">
07/26/1983</td>
<td class="gSGRowEven">
Saint Joseph's</td>
<td align="center" class="gSGRowEven">
6</td>
</tr>
</table>
<br>
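For anyone reading this later, here is a minimal sketch of one way the extraction could be done with the mshtml objects. It is not the accepted solution, only an illustration under stated assumptions: the project has a COM reference to the Microsoft HTML Object Library (mshtml), BostonRosterHtml.txt sits in the working directory, and player rows are recognised as rows with exactly eight cells whose class starts with gSGRow (gSGRowEven / gSGRowOdd in the markup above). The module name RosterExtractor is made up for the example.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports mshtml   ' Assumes a COM reference to "Microsoft HTML Object Library"

Module RosterExtractor   ' hypothetical name, for illustration only

    <STAThread()> Sub Main()
        ' Assumption: the saved page is in the current working directory.
        Dim html As String = File.ReadAllText("BostonRosterHtml.txt")

        ' Load the markup into an in-memory MSHTML document.
        Dim doc2 As IHTMLDocument2 = DirectCast(New HTMLDocument(), IHTMLDocument2)
        doc2.write(html)
        doc2.close()

        Dim strNumber As New List(Of Integer)
        Dim strPlayer As New List(Of String)
        Dim strPosition As New List(Of String)
        Dim strHeight As New List(Of String)
        Dim strWeight As New List(Of String)
        Dim strDOB As New List(Of String)
        Dim strFROM As New List(Of String)
        Dim strYearsInPros As New List(Of String)

        ' getElementsByTagName returns every <tr>, including rows of the outer layout tables.
        Dim doc3 As IHTMLDocument3 = DirectCast(doc2, IHTMLDocument3)
        For Each rowElement As IHTMLElement In doc3.getElementsByTagName("tr")
            Dim cells As IHTMLElementCollection = DirectCast(rowElement, IHTMLTableRow).cells
            If cells.length <> 8 Then Continue For   ' layout and title rows have fewer cells

            Dim values As New List(Of String)
            Dim isPlayerRow As Boolean = True
            For Each cell As IHTMLElement In cells
                ' The heading row also has eight cells, but uses class gSGSectionColumnHeadings.
                If cell.className Is Nothing OrElse Not cell.className.StartsWith("gSGRow") Then
                    isPlayerRow = False
                    Exit For
                End If
                values.Add(If(cell.innerText, String.Empty).Trim())
            Next
            If Not isPlayerRow Then Continue For

            ' Column order follows the headings: NUM, PLAYER, POS, HT, WT, DOB, FROM, YRS.
            strNumber.Add(Integer.Parse(values(0)))
            strPlayer.Add(values(1))
            strPosition.Add(values(2))
            strHeight.Add(values(3))
            strWeight.Add(values(4))
            strDOB.Add(values(5))
            strFROM.Add(values(6))
            strYearsInPros.Add(values(7))   ' kept as text because rookies are listed as "R"
        Next

        For i As Integer = 0 To strPlayer.Count - 1
            Console.WriteLine("{0} {1} {2}", strNumber(i), strPlayer(i), strPosition(i))
        Next
    End Sub

End Module
```

A few things to watch for: innerText returns everything in the cell, so the entry for Paul Pierce would come back with the trailing "- C" captain marker unless you read the text of the anchor element inside the cell instead. Writing the string into the document normally parses it straight away, but if the collections come back empty it may be worth checking the document's readyState before querying. Eight parallel lists work, but a single list of a small player class would keep each player's fields together.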
Glad to help :-)