Solved

Parse External HTML/CGI Table for my webpage

Posted on 2004-08-26
Last Modified: 2013-12-16
I am putting together a gaming webpage. I would like to parse an HTML table on a webpage: http://www.battlerank.net/cgi-bin/viewer.cgi?outfit=6447&worldID=15 to get, first, a list of members sorted by date, and second, a way to build an achievements section. When someone's BR gets to 20, it should print a string on my webpage that says "Date: Name received his [whatever and quantity]". For example, "8-26-04: Someone received his BR20" if that's what the webpage shows, or "8-26-04: Someoneelse received his CR5", or 20,000 kills, or whatever.

Right now I have a simple parser. I save that webpage as a text file, parse it line by line, and print the result on my webpage. But I don't want to save that page as a file. Plus, I can't get individual pieces of the table, because each row arrives as one long string of text.
I know regular expressions may be the way to go, but I'd like something easier, or at least to know what to do.

If you need more info tell me please.
Question by:BiSHGoD

Accepted Solution

by:
Roonaan earned 125 total points
ID: 11910439
I wrote a complete step-by-step example describing the way I would tackle this problem.

<?php
//If your PHP has access to the internet (allow_url_fopen enabled),
//we can load the page directly from the webserver:
$htmlbody = file_get_contents('http://www.battlerank.net/cgi-bin/viewer.cgi?outfit=6447&worldID=15');
if($htmlbody === false) exit('page could not be loaded');
//You can replace the URL with your filename if you would rather
//use a saved copy instead of live info.

/* The table containing player data
     starts at a section indicated by a
     roster-anchor (view the htmlsource) */
$roster_start = strpos($htmlbody, '<a name="roster">');
if($roster_start === false) exit('#roster-tag not found');
$htmlbody = substr($htmlbody, $roster_start);

/* We only want the table containing player data, and
   therefore search for the appropriate <table> and
   </table> elements. */
$table_start = strpos($htmlbody, '<table');
if($table_start === false) exit('table-start not found');
$htmlbody = substr($htmlbody, $table_start);

$table_end = strpos($htmlbody, '</table>');
if($table_end === false) exit('table-end not found');
$htmlbody = substr($htmlbody, 0, $table_end + 8);

/* At this point, our htmlbody would only contain the table
   with player data
 
   The next step is to break the body into lines where
   each line contains a single table row. */

$table_data = explode('<tr',$htmlbody);

/* The <table> tag and any html preceding the first
   <tr> tag won't be of any use to us, so we drop it: */

$table_data = array_slice($table_data, 1);

/* Each array-item now contains a single row. We would like
   to parse that data further to obtain the player's BR and
   other info.
   I noticed the first row contains all column names like
   'BR', 'Player Name', etc. We will use that knowledge to
   store each member's data in an array where
   $player_data['BR'] contains the member's BR.
   We use an array $player_info to store the data of all
   members. You can use this array to retrieve the data you
   require, or to make selections.
 */
$player_info = array(); //allocate an array to store our data
$colnames = array();    //allocate an array to store colnames

foreach($table_data as $row_number => $table_row)
{
  //Chop the leftover attributes of the <tr ...> tag (bgcolor etc.)
  $table_row = substr($table_row, strpos($table_row,'>')+1);
  //Drop everything from the closing </tr> onward. Rows without
  //a closing </tr> are kept as they are.
  if(($row_end = strpos($table_row, '</tr')) !== false)
    $table_row = substr($table_row, 0, $row_end);
 
  /* Each cell is delimited by its <td> and </td> tags. We
     can explode $table_row to split it into cells */
 
  $row_cells = explode('<td', $table_row);
  //drop the first array-item, which contains all data before
  //the first <td occurrence and in our case is often empty
  $row_cells = array_slice($row_cells,1);
 
  //We then loop through each cell to remove any html data
  //If you do not want to remove html to preserve hyperlinks
  //you can remove the line containing strip_tags, but you
  //shouldn't remove the if-statement!
  foreach($row_cells as $col => $celldata)
  {
    $celldata = strip_tags($celldata);
    if(($s = strpos($celldata, '>')) !== false)
      $celldata = substr($celldata, $s+1);
   
    //writeback into the array
    $row_cells[$col] = $celldata;
  }
 
  //The first row of the table contains all the field names
  //and should be handled separately.
  if($row_number==0)
  {
    $colnames = $row_cells;
  }
  else
  {
    /* We want the player data to be available as an associative
       array, where $player_data['BR'] contains the BR and
       $player_data['Player Name'] contains the player's name
     */
    $player_data = array();
    foreach($colnames as $col => $colname)
    {
      if(isset($row_cells[$col]))
        $player_data[$colname] = $row_cells[$col];
    }
    // Add the player data to the array containing all
    // players data
    $player_info[] = $player_data;
  }
}
/* Show all our player information gathered */
var_export($player_info);
//You can use print_r($player_info) if you have an older
//version of php.
?>
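
Once $player_info is filled, the two things the question asks for (a list sorted by date and an achievements line) could look roughly like the sketch below. Only 'Player Name' and 'BR' are column names taken from the example above. The 'Last Update' column, its Y-m-d format, the sample rows, and the helper functions compare_by_date_desc(), index_by_name() and find_achievements() are all assumptions made for illustration, since I cannot know what the real table contains. Detecting that someone "received" BR20 needs something to compare against, so the sketch assumes you keep the result of the previous run.

```php
<?php
// Hypothetical sample data in the shape produced by the parser above.
// 'Last Update' and its Y-m-d format are assumptions for this sketch.
$player_info = array(
    array('Player Name' => 'Alpha',   'BR' => '20', 'Last Update' => '2004-08-26'),
    array('Player Name' => 'Bravo',   'BR' => '14', 'Last Update' => '2004-08-20'),
    array('Player Name' => 'Charlie', 'BR' => '19', 'Last Update' => '2004-08-24'),
);

// Sort newest first. Y-m-d strings sort chronologically under strcmp().
function compare_by_date_desc($a, $b)
{
    return strcmp($b['Last Update'], $a['Last Update']);
}
usort($player_info, 'compare_by_date_desc');

// Index a roster by player name so two snapshots can be compared.
function index_by_name($roster)
{
    $indexed = array();
    foreach ($roster as $player) {
        $indexed[$player['Player Name']] = $player;
    }
    return $indexed;
}

// Report players whose BR reached $threshold since the previous snapshot.
// A player missing from the previous snapshot is treated as BR 0.
function find_achievements($previous, $current, $threshold, $date)
{
    $old = index_by_name($previous);
    $lines = array();
    foreach ($current as $player) {
        $name = $player['Player Name'];
        $new_br = (int) $player['BR'];
        $old_br = isset($old[$name]) ? (int) $old[$name]['BR'] : 0;
        if ($old_br < $threshold && $new_br >= $threshold) {
            $lines[] = $date . ': ' . $name . ' received his BR' . $threshold;
        }
    }
    return $lines;
}

// Hypothetical earlier snapshot, e.g. loaded from a file saved on the last run.
$previous_info = array(
    array('Player Name' => 'Alpha', 'BR' => '19', 'Last Update' => '2004-08-25'),
    array('Player Name' => 'Bravo', 'BR' => '14', 'Last Update' => '2004-08-20'),
);

$achievements = find_achievements($previous_info, $player_info, 20, '8-26-04');
foreach ($achievements as $line) {
    echo $line, "\n";
}
```

With this sample data the only line printed is the one for Alpha, whose BR went from 19 to 20. The same comparison could be repeated for CR or kills by passing a different column name and threshold, if those columns exist in the real table.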

-r-
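
If your server runs PHP 5 with the DOM extension, the same table could probably be read with less string chopping. The following is only a sketch of that alternative, not the method used above: parse_html_table() is a hypothetical helper, the sample HTML is made up, and it assumes (as the example above does) that the first row of the table holds the column names.

```php
<?php
// Parse the first <table> in an HTML string into an array of
// associative rows, using the first row as the column names.
function parse_html_table($html)
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $tables = $doc->getElementsByTagName('table');
    if ($tables->length === 0) {
        return array();
    }

    $colnames = array();
    $rows = array();
    foreach ($tables->item(0)->getElementsByTagName('tr') as $tr) {
        $cells = array();
        foreach ($tr->childNodes as $node) {
            // Accept both <td> and <th> cells; skip whitespace text nodes.
            if ($node->nodeName === 'td' || $node->nodeName === 'th') {
                $cells[] = trim($node->textContent);
            }
        }
        if (count($cells) === 0) {
            continue;
        }
        if (count($colnames) === 0) {
            $colnames = $cells; // first non-empty row: the field names
            continue;
        }
        $row = array();
        foreach ($colnames as $col => $colname) {
            if (isset($cells[$col])) {
                $row[$colname] = $cells[$col];
            }
        }
        $rows[] = $row;
    }
    return $rows;
}

// Made-up sample standing in for the downloaded page.
$sample_html = '<html><body><a name="roster"></a><table>'
    . '<tr bgcolor="#cccccc"><td>Player Name</td><td>BR</td></tr>'
    . '<tr><td><a href="#">Alpha</a></td><td align="right">20</td></tr>'
    . '<tr><td>Bravo</td><td>14</td></tr>'
    . '</table></body></html>';

print_r(parse_html_table($sample_html));
```

The result has the same shape as $player_info above, so anything written against that array should work with either approach. Note that this sketch takes the first table in the document; for the real page you would still need to locate the roster table, for example by cutting the HTML at the roster anchor first.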

Author Comment

by:BiSHGoD
ID: 12299521
Question was not answered. Please refund.

Expert Comment

by:Venabili
ID: 12525502
BiSHGoD,

You cannot get a refund unless you explain exactly what is not working in the above code. If you fail to do this, I will repost my original recommendation.

Venabili

Author Comment

by:BiSHGoD
ID: 12626711
I would love to say if it worked or not but the webpage it was to be parsing is now down. I don't know if it works or not, I may not need it if it never comes back up. So I am unsure if my question has been answered.