Solved

Extract Table Info from a web Page Using mshtml

Posted on 2011-02-13
Last Modified: 2012-08-14
I am trying to extract the player information from the Boston Celtics roster page. I have saved the roster page to a text file named BostonRosterHtml.txt. I know how to extract this data from the table using Regex, but I am trying to learn how to do the extraction using the mshtml objects, and this is where I am unsure how to proceed. I want to extract each player's information from the page and store it in lists, one list per column, i.e.:

Dim strNumber As New List(Of Integer)
Dim strPlayer As New List(Of String)
Dim strPosition As New List(Of String)
Dim strHeight As New List(Of String)
Dim strWeight As New List(Of String)
Dim strDOB As New List(Of String)
Dim strFROM As New List(Of String)
Dim strYearsInPros As New List(Of String)

Below is the HTML snippet containing the column headings:
</tr>

<tr class=gSGSectionColumnHeadings>
<td NOWRAP width=40 class=gSGSectionColumnHeadings><b>NUM</b></td>
<td NOWRAP width=160 align=left class=gSGSectionColumnHeadings><b>PLAYER</b></td>
<td NOWRAP width=40 align=left class=gSGSectionColumnHeadings><b>POS</b></td>
<td NOWRAP width=40 align=left class=gSGSectionColumnHeadings><b>HT</b></td>
<td NOWRAP width=40 align=left class=gSGSectionColumnHeadings><b>WT</b></td>
<td NOWRAP width=65 align=left class=gSGSectionColumnHeadings><b>DOB</b></td>
<td NOWRAP width=185 align=left class=gSGSectionColumnHeadings><b>&nbsp;&nbsp;FROM</b></td>
<td NOWRAP width=30 align=left class=gSGSectionColumnHeadings><b>YRS</b></td>
</tr>

Below is the HTML of the roster page, shortened to include only the area containing the player roster table. This is what has been saved in the text file.

If someone could help me with the code to extract this information from the table in this web page using the mshtml objects, I would greatly appreciate it.
If anyone needs more information in order to help, please feel free to ask.
</div>

	
					
					<div id="article_content_wrapper">
						
						
						
						<div id="rosterContainer" class="team_stats_grid">
							  
		<!--/frags/celtics/celticsAboveRoster.html SSI include-->
		
	
							
							
							 

	<TABLE border="0" cellPadding="0" cellSpacing="0">
		<TR>
			<TD align="left" class="cBCompRoster" height="100%" vAlign="top" width="80%">
			<div style="margin-left:5px"></div></TD>
		</TR>
		<TR>
			<TD>
			<TABLE border="0" cellPadding="0" cellSpacing="0" width="600">
			<TR>
				<TD class="cBSpacing" colSpan="3">
				<IMG height=1 src="/images/blank.gif"></TD>
			</TR>
			<TR>
				<TD class="cBTop" colSpan="3">
				<table cellPadding="0" cellSpacing="0" width="100%"><tr><td align="left"  class="cBTop">
				<DIV class="cBTitle">
				
					Celtics Roster
				
				</DIV>
				</td><td align="right" class="cBTop">
				
				</td></tr></table></TD>
				</TD>
			</TR>
			<TR>
				<TD class="cBSide" noWrap><BR></TD>
				<TD align="left" class="cBComp" height="100%" vAlign="top" width="100%">
				
				
					








  
<!--sc-->





<table border="0" cellpadding="3" cellspacing="0" class=" gSGTable"  >
<!-- the title -->
<tr>
	<td class="bar gSGSectionTitle" colspan="8">
		&nbsp;2010-11 Roster
	</td>
</tr>

<tr class=gSGSectionColumnHeadings>
<td NOWRAP width=40 class=gSGSectionColumnHeadings><b>NUM</b></td>
<td NOWRAP width=160 align=left class=gSGSectionColumnHeadings><b>PLAYER</b></td>
<td NOWRAP width=40 align=left class=gSGSectionColumnHeadings><b>POS</b></td>
<td NOWRAP width=40 align=left class=gSGSectionColumnHeadings><b>HT</b></td>
<td NOWRAP width=40 align=left class=gSGSectionColumnHeadings><b>WT</b></td>
<td NOWRAP width=65 align=left class=gSGSectionColumnHeadings><b>DOB</b></td>
<td NOWRAP width=185 align=left class=gSGSectionColumnHeadings><b>&nbsp;&nbsp;FROM</b></td>
<td NOWRAP width=30 align=left class=gSGSectionColumnHeadings><b>YRS</b></td>
</tr>



    
    
    <tr>
    <td class="gSGRowEven"> 20</td>
    <td class="gSGRowEven">
    		
    		
            <a class=gSGPlayerLink href="/playerfile/ray_allen/index.html?nav=page" class=gSGLink>
             Ray Allen</a> 
    </td>
    <td class="gSGRowEven">
        G </td>
    <td class="gSGRowEven">
         6-5</td>
    <td class="gSGRowEven">
         205</td>
    <td class="gSGRowEven">
         07/20/1975</td>
    <td class="gSGRowEven">
         &nbsp; Connecticut</td>
    <td align="center" class="gSGRowEven">
         14</td>
    </tr>
    
    
    <tr>
    <td class="gSGRowOdd"> 0</td>
    <td class="gSGRowOdd">
    		
    		
            <a class=gSGPlayerLink href="/playerfile/avery_bradley/index.html?nav=page" class=gSGLink>
             Avery Bradley</a> 
    </td>
    <td class="gSGRowOdd">
        G </td>
    <td class="gSGRowOdd">
         6-2</td>
    <td class="gSGRowOdd">
         180</td>
    <td class="gSGRowOdd">
         11/26/1990</td>
    <td class="gSGRowOdd">
         &nbsp; Texas</td>
    <td align="center" class="gSGRowOdd">
         R</td>
    </tr>
    
    
    <tr>
    <td class="gSGRowEven"> 8</td>
    <td class="gSGRowEven">
    		
    		
            <a class=gSGPlayerLink href="/playerfile/marquis_daniels/index.html?nav=page" class=gSGLink>
             Marquis Daniels</a> 
    </td>
    <td class="gSGRowEven">
        F-G </td>
    <td class="gSGRowEven">
         6-6</td>
    <td class="gSGRowEven">
         200</td>
    <td class="gSGRowEven">
         01/07/1981</td>
    <td class="gSGRowEven">
         &nbsp; Auburn</td>
    <td align="center" class="gSGRowEven">
         7</td>
    </tr>
    
    
    <tr>
    <td class="gSGRowOdd"> 11</td>
    <td class="gSGRowOdd">
    		
    		
            <a class=gSGPlayerLink href="/playerfile/glen_davis/index.html?nav=page" class=gSGLink>
             Glen Davis</a> 
    </td>
    <td class="gSGRowOdd">
        C-F </td>
    <td class="gSGRowOdd">
         6-9</td>
    <td class="gSGRowOdd">
         289</td>
    <td class="gSGRowOdd">
         01/01/1986</td>
    <td class="gSGRowOdd">
         &nbsp; Louisiana State</td>
    <td align="center" class="gSGRowOdd">
         3</td>
    </tr>
    
    
    <tr>
    <td class="gSGRowEven"> 86</td>
    <td class="gSGRowEven">
    		
    		
            <a class=gSGPlayerLink href="/playerfile/semih_erden/index.html?nav=page" class=gSGLink>
             Semih Erden</a> 
    </td>
    <td class="gSGRowEven">
        C </td>
    <td class="gSGRowEven">
         6-11</td>
    <td class="gSGRowEven">
         240</td>
    <td class="gSGRowEven">
         07/28/1986</td>
    <td class="gSGRowEven">
         &nbsp; Turkey</td>
    <td align="center" class="gSGRowEven">
         R</td>
    </tr>
    
    
    <tr>
    <td class="gSGRowOdd"> 5</td>
    <td class="gSGRowOdd">
    		
    		
            <a class=gSGPlayerLink href="/playerfile/kevin_garnett/index.html?nav=page" class=gSGLink>
             Kevin Garnett</a> 
    </td>
    <td class="gSGRowOdd">
        F </td>
    <td class="gSGRowOdd">
         6-11</td>
    <td class="gSGRowOdd">
         253</td>
    <td class="gSGRowOdd">
         05/19/1976</td>
    <td class="gSGRowOdd">
         &nbsp; Farragut Academy HS (IL)</td>
    <td align="center" class="gSGRowOdd">
         15</td>
    </tr>
    
    
    <tr>
    <td class="gSGRowEven"> 55</td>
    <td class="gSGRowEven">
    		
    		
            <a class=gSGPlayerLink href="/playerfile/luke_harangody/index.html?nav=page" class=gSGLink>
             Luke Harangody</a> 
    </td>
    <td class="gSGRowEven">
        F </td>
    <td class="gSGRowEven">
         6-7</td>
    <td class="gSGRowEven">
         251</td>
    <td class="gSGRowEven">
         01/02/1988</td>
    <td class="gSGRowEven">
         &nbsp; Notre Dame</td>
    <td align="center" class="gSGRowEven">
         R</td>
    </tr>
    
    
    <tr>
    <td class="gSGRowOdd"> 7</td>
    <td class="gSGRowOdd">
    		
    		
            <a class=gSGPlayerLink href="/playerfile/jermaine_oneal/index.html?nav=page" class=gSGLink>
             Jermaine O'Neal</a> 
    </td>
    <td class="gSGRowOdd">
        C-F </td>
    <td class="gSGRowOdd">
         6-11</td>
    <td class="gSGRowOdd">
         255</td>
    <td class="gSGRowOdd">
         10/13/1978</td>
    <td class="gSGRowOdd">
         &nbsp; Eau Claire HS (SC)</td>
    <td align="center" class="gSGRowOdd">
         14</td>
    </tr>
    
    
    <tr>
    <td class="gSGRowEven"> 36</td>
    <td class="gSGRowEven">
    		
    		
            <a class=gSGPlayerLink href="/playerfile/shaquille_oneal/index.html?nav=page" class=gSGLink>
             Shaquille O'Neal</a> 
    </td>
    <td class="gSGRowEven">
        C </td>
    <td class="gSGRowEven">
         7-1</td>
    <td class="gSGRowEven">
         325</td>
    <td class="gSGRowEven">
         03/06/1972</td>
    <td class="gSGRowEven">
         &nbsp; Louisiana State</td>
    <td align="center" class="gSGRowEven">
         18</td>
    </tr>
    
    
    <tr>
    <td class="gSGRowOdd"> 43</td>
    <td class="gSGRowOdd">
    		
    		
            <a class=gSGPlayerLink href="/playerfile/kendrick_perkins/index.html?nav=page" class=gSGLink>
             Kendrick Perkins</a> 
    </td>
    <td class="gSGRowOdd">
        C </td>
    <td class="gSGRowOdd">
         6-10</td>
    <td class="gSGRowOdd">
         280</td>
    <td class="gSGRowOdd">
         11/10/1984</td>
    <td class="gSGRowOdd">
         &nbsp; Clifton J. Ozen HS (TX)</td>
    <td align="center" class="gSGRowOdd">
         7</td>
    </tr>
    
    
    <tr>
    <td class="gSGRowEven"> 34</td>
    <td class="gSGRowEven">
    		
    		
            <a class=gSGPlayerLink href="/playerfile/paul_pierce/index.html?nav=page" class=gSGLink>
             Paul Pierce</a> - C
    </td>
    <td class="gSGRowEven">
        F </td>
    <td class="gSGRowEven">
         6-7</td>
    <td class="gSGRowEven">
         235</td>
    <td class="gSGRowEven">
         10/13/1977</td>
    <td class="gSGRowEven">
         &nbsp; Kansas</td>
    <td align="center" class="gSGRowEven">
         12</td>
    </tr>
    
    
    <tr>
    <td class="gSGRowOdd"> 4</td>
    <td class="gSGRowOdd">
    		
    		
            <a class=gSGPlayerLink href="/playerfile/nate_robinson/index.html?nav=page" class=gSGLink>
             Nate Robinson</a> 
    </td>
    <td class="gSGRowOdd">
        G </td>
    <td class="gSGRowOdd">
         5-9</td>
    <td class="gSGRowOdd">
         180</td>
    <td class="gSGRowOdd">
         05/31/1984</td>
    <td class="gSGRowOdd">
         &nbsp; Washington</td>
    <td align="center" class="gSGRowOdd">
         5</td>
    </tr>
    
    
    <tr>
    <td class="gSGRowEven"> 9</td>
    <td class="gSGRowEven">
    		
    		
            <a class=gSGPlayerLink href="/playerfile/rajon_rondo/index.html?nav=page" class=gSGLink>
             Rajon Rondo</a> 
    </td>
    <td class="gSGRowEven">
        G </td>
    <td class="gSGRowEven">
         6-1</td>
    <td class="gSGRowEven">
         171</td>
    <td class="gSGRowEven">
         02/22/1986</td>
    <td class="gSGRowEven">
         &nbsp; Kentucky</td>
    <td align="center" class="gSGRowEven">
         4</td>
    </tr>
    
    
    <tr>
    <td class="gSGRowOdd"> 12</td>
    <td class="gSGRowOdd">
    		
    		
            <a class=gSGPlayerLink href="/playerfile/von_wafer/index.html?nav=page" class=gSGLink>
             Von Wafer</a> 
    </td>
    <td class="gSGRowOdd">
        G </td>
    <td class="gSGRowOdd">
         6-5</td>
    <td class="gSGRowOdd">
         209</td>
    <td class="gSGRowOdd">
         07/21/1985</td>
    <td class="gSGRowOdd">
         &nbsp; Florida State</td>
    <td align="center" class="gSGRowOdd">
         5</td>
    </tr>
    
    
    <tr>
    <td class="gSGRowEven"> 13</td>
    <td class="gSGRowEven">
    		
    		
            <a class=gSGPlayerLink href="/playerfile/delonte_west/index.html?nav=page" class=gSGLink>
             Delonte West</a> 
    </td>
    <td class="gSGRowEven">
        G </td>
    <td class="gSGRowEven">
         6-3</td>
    <td class="gSGRowEven">
         180</td>
    <td class="gSGRowEven">
         07/26/1983</td>
    <td class="gSGRowEven">
         &nbsp; Saint Joseph&#039;s</td>
    <td align="center" class="gSGRowEven">
         6</td>
    </tr>

</table>
<br>
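
For the roster markup above, one possible way to walk the table with the mshtml objects is sketched below. This is not the accepted answer's code, only an illustration under some assumptions: the project references Microsoft.mshtml (the Microsoft HTML Object Library), BostonRosterHtml.txt is in the working directory, and the roster table is found by looking for "gSGTable" in its class attribute. The module name RosterExtractor and the helper Clean are made up for this example. Player rows are told apart from the title and heading rows by checking for eight cells and a numeric first cell.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports mshtml ' Assumes a project reference to Microsoft.mshtml

Module RosterExtractor

    ' Hypothetical helper: innerText can be Nothing, and &nbsp; arrives as U+00A0.
    Private Function Clean(ByVal text As String) As String
        If text Is Nothing Then Return String.Empty
        Return text.Replace(ChrW(160), " "c).Trim()
    End Function

    <STAThread()> _
    Sub Main()
        ' Assumption: the saved page sits in the working directory.
        Dim html As String = File.ReadAllText("BostonRosterHtml.txt")

        ' Parse the markup into an in-memory MSHTML document.
        Dim doc As New HTMLDocument()
        Dim writer As IHTMLDocument2 = DirectCast(doc, IHTMLDocument2)
        writer.write(html)
        writer.close()

        Dim strNumber As New List(Of Integer)
        Dim strPlayer As New List(Of String)
        Dim strPosition As New List(Of String)
        Dim strHeight As New List(Of String)
        Dim strWeight As New List(Of String)
        Dim strDOB As New List(Of String)
        Dim strFROM As New List(Of String)
        Dim strYearsInPros As New List(Of String)

        Dim doc3 As IHTMLDocument3 = DirectCast(doc, IHTMLDocument3)
        For Each element As IHTMLElement In doc3.getElementsByTagName("table")
            ' The roster table carries class " gSGTable" in the saved markup.
            Dim cssClass As String = element.className
            If cssClass Is Nothing OrElse Not cssClass.Contains("gSGTable") Then Continue For

            Dim table As IHTMLTable = DirectCast(element, IHTMLTable)
            For Each row As IHTMLTableRow In table.rows
                ' Collect the cleaned text of every cell in this row.
                Dim cells As New List(Of String)
                For Each cell As IHTMLElement In row.cells
                    cells.Add(Clean(cell.innerText))
                Next

                ' Skip the one-cell title row and the NUM/PLAYER/... heading row.
                Dim number As Integer
                If cells.Count <> 8 OrElse Not Integer.TryParse(cells(0), number) Then Continue For

                strNumber.Add(number)
                strPlayer.Add(cells(1))
                strPosition.Add(cells(2))
                strHeight.Add(cells(3))
                strWeight.Add(cells(4))
                strDOB.Add(cells(5))
                strFROM.Add(cells(6))
                strYearsInPros.Add(cells(7)) ' Kept as text: rookies are listed as "R".
            Next
        Next

        ' Print a few columns to confirm the lists line up by index.
        For i As Integer = 0 To strPlayer.Count - 1
            Console.WriteLine("{0,3}  {1,-20} {2,-4} {3}", strNumber(i), strPlayer(i), strPosition(i), strFROM(i))
        Next
    End Sub

End Module
```

Because innerText returns all the text inside a cell, the Paul Pierce row would come back with its trailing "- C" marker still attached to the name. If only the linked name is wanted, reading the innerText of the anchor element inside the second cell may be a better fit.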

Question by:Squeakrz44
3 Comments
 
LVL 83

Accepted Solution

by:
CodeCruiser earned 500 total points
ID: 34883693
 

Author Closing Comment

by:Squeakrz44
ID: 34901414
Thanks so much greatly helped and appreciated.
 
LVL 83

Expert Comment

by:CodeCruiser
ID: 34905867
Glad to help :-)
