Solved

Parse External HTML/CGI Table for my webpage

Posted on 2004-08-26
Last Modified: 2013-12-16
I am putting together a gaming webpage. I would like to parse an HTML table on a webpage, http://www.battlerank.net/cgi-bin/viewer.cgi?outfit=6447&worldID=15, to get, first, a list of members sorted by date, and second, a way to build an achievements section. So when someone's BR gets to 20, it prints a string on my webpage that says "Date: Name received his [whatever and quantity]". For example, "8-26-04: Someone received his BR20", if that's what the webpage shows. Or "8-26-04: Someoneelse received his CR5". Or 20,000 kills. Or whatever.

Right now I have a simple parser. I save that webpage as a text file and parse it line by line, and it spits the result out on my webpage. But I don't want to save that page as a file. Plus, I can't get pieces of a table, because the table comes through as one long string of text.
I know regular expressions may be the way to go, but I'd like something easier, or at least to know what to do.

If you need more info tell me please.
Question by:BiSHGoD

Accepted Solution

by:
Roonaan earned 125 total points
ID: 11910439
I wrote a complete step-by-step example describing the way I would tackle this problem.

<?php
//We can load the page directly from the web server when your PHP has access to the internet
//(this requires allow_url_fopen to be enabled in php.ini):
$htmlbody = file_get_contents('http://www.battlerank.net/cgi-bin/viewer.cgi?outfit=6447&worldID=15');
if($htmlbody === false) exit('could not load the roster page');
//You could replace the url with your filename if you would rather use a saved copy than live info.

/* The table containing player data
     starts at a section indicated by a
     roster-anchor (view the htmlsource) */
$roster_start = strpos($htmlbody, '<a name="roster">');
if($roster_start === false) exit('#roster-tag not found');
$htmlbody = substr($htmlbody, $roster_start);

/* We only want the table containing player data, and
   therefore search for the appropriate <table> and
   </table> elements. */
$table_start = strpos($htmlbody, '<table');
if($table_start === false) exit('table-start not found');
$htmlbody = substr($htmlbody, $table_start);

$table_end = strpos($htmlbody, '</table>');
if($table_end === false) exit('table-end not found');
$htmlbody = substr($htmlbody, 0, $table_end + 8);

/* At this point, our htmlbody would only contain the table
   with player data
 
   The next step is to break the body into lines where
   each line contains a single table row. */

$table_data = explode('<tr',$htmlbody);

/* The <table> tag and any html preceding the first
   <tr> tag won't be of any use to us, so we drop it: */

$table_data = array_slice($table_data, 1);

/* Each array item now contains a single row. We would like
   to parse that data further to obtain each player's BR and
   other info.
   I noticed the first row contains all the column names, like
   'BR', 'Player Name', etc. We will use that knowledge to
   store each member's data in an array where
   $player_data['BR'] contains the member's BR.
   We use an array $player_info to store the data of all
   members. You can use this array to retrieve the data you
   require, or to make selections.
 */
$player_info = array(); //allocate an array to store our data
$colnames = array();    //allocate an array to store the column names

foreach($table_data as $row_number => $table_row)
{
  //Chop off any leading remainder of the <tr ...> tag (bgcolor= etc.)
  $table_row = substr($table_row, strpos($table_row,'>')+1);
  //Remove any trailing data: keep everything up to the end
  //of the row (</tr>), if the row has a closing tag
  if(($row_end = strpos($table_row, '</tr')) !== false)
    $table_row = substr($table_row, 0, $row_end);
 
  /* Each cell is separated by its <td> and </td> tags. We
     can explode $table_row to split it into cells */
 
  $row_cells = explode('<td', $table_row);
  //drop the first array item, which contains all data up to
  //the first <td occurrence and in our case is often empty
  $row_cells = array_slice($row_cells,1);
 
  //We then loop through each cell to remove any html data
  //If you do not want to remove html to preserve hyperlinks
  //you can remove the line containing strip_tags, but you
  //shouldn't remove the if-statement!
  foreach($row_cells as $col => $celldata)
  {
    $celldata = strip_tags($celldata);
    if(($s = strpos($celldata, '>')) !== false)
      $celldata = substr($celldata, $s+1);
   
    //writeback into the array
    $row_cells[$col] = $celldata;
  }
 
  //The first row of the table contains all the field names
  //and should be handled separately.
  if($row_number==0)
  {
    $colnames = $row_cells;
  }
  else
  {
    /* We want the player data to be available as an associative
       array, where $player_data['BR'] contains the BR and
       $player_data['Player Name'] contains the player's name
     */
    $player_data = array();
    foreach($colnames as $col => $colname)
    {
      if(isset($row_cells[$col]))
        $player_data[$colname] = $row_cells[$col];
    }
    // Add the player data to the array containing the data
    // of all players
    $player_info[] = $player_data;
  }
}
/* Show all our player information gathered */
var_export($player_info);
//You can use print_r($player_info) when you have an older
//version of PHP.
?>
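
The $player_info array can then drive the member list and the achievements section asked about in the question. What follows is only a minimal sketch under some assumptions: the sample rows are made up, 'BR' and 'Player Name' are taken to be the exact column names in the table's first row, and compare_by_br() and br_achievements() are hypothetical helper names, not part of any library.

```php
<?php
// Sample data standing in for the $player_info array built by the parser.
// 'BR' and 'Player Name' are assumed column names; the values are invented.
$player_info = array(
  array('Player Name' => 'Someone',     'BR' => '20'),
  array('Player Name' => 'Someoneelse', 'BR' => '14'),
  array('Player Name' => 'Thirdplayer', 'BR' => '23'),
);

// Hypothetical helper: orders two players by BR, highest first.
function compare_by_br($a, $b)
{
  return (int)$b['BR'] - (int)$a['BR'];
}

// Hypothetical helper: returns one line of text for every player
// whose BR is at or above $threshold.
function br_achievements($players, $threshold, $date)
{
  $lines = array();
  foreach($players as $player)
  {
    // Skip rows that lack the columns we need
    if(!isset($player['BR'], $player['Player Name'])) continue;
    if((int)$player['BR'] >= $threshold)
      $lines[] = $date . ': ' . $player['Player Name']
               . ' received his BR' . $threshold;
  }
  return $lines;
}

// Sort the members, then build and print the achievement lines.
// date('n-j-y') formats today's date like 8-26-04.
usort($player_info, 'compare_by_br');
$achievements = br_achievements($player_info, 20, date('n-j-y'));
echo implode("<br>\n", $achievements);
```

Sorting on a date column, if the table has one, would follow the same usort pattern with a different comparison function. Note that this sketch stamps every line with the date the script runs; to report the day a player first reached a rank, you would need to store the previous results and compare against them.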

-r-

Author Comment

by:BiSHGoD
ID: 12299521
Question was not answered. Please refund.

Expert Comment

by:Venabili
ID: 12525502
BiSHGoD,

You cannot get a refund unless you explain exactly what is not working in the above code. If you do not, I will repost my original recommendation.

Venabili

Author Comment

by:BiSHGoD
ID: 12626711
I would love to say if it worked or not but the webpage it was to be parsing is now down. I don't know if it works or not, I may not need it if it never comes back up. So I am unsure if my question has been answered.