Solved

Bot page for 2.6mil items, need help with this one

Posted on 2011-03-23
Medium Priority
171 Views
Last Modified: 2012-05-11
We have a script that we formerly used to let bots "crawl" our site; it is the only thing they were allowed to crawl.

Now we have 2.6mil records in a database table (only 2.06mil of them are distinct), and we want the bots to "see" just the "inventory_part_number" value of each of those records.

This has caused a problem with the current script because multiple bots access it at the same time, and it literally brings MySQL to its knees. Here is a shortened version; I've taken out all the HTML and added comments where needed.

How can I improve this so that multiple bots won't cause a problem? Or, on the other hand, how can I turn this into a script that creates static HTML pages for each page of data in a particular directory that we will let the bots access?

<?php
  // Removed HTML before the php start tag above

  set_time_limit(0);
    
  $connection = @mysql_connect("localhost", "db_admin", "db_pwd");
  if($connection===FALSE)
    die('Could not connect: ' . mysql_error());
  
  if(!mysql_select_db("listings", $connection))
    die('Could not select database: ' . mysql_error());

  // how many rows to show per page
  $rowsPerPage = 10000;
  
  // by default we show first page
  $pageNum = 1;
  
  // if $_GET['page'] is defined, use it as the page number
  // (cast to int so nothing can be injected into the query)
  if(isset($_GET['page'])) {
    $pageNum = (int) $_GET['page'];
    if($pageNum < 1) {
      $pageNum = 1;
    }
  }

  // counting the offset
  $offset = ($pageNum - 1) * $rowsPerPage;
  
  $query = "SELECT inventory_part_number FROM linv_inventory
            ORDER BY inventory_part_number LIMIT $offset, $rowsPerPage";

  $result = mysql_query($query) or die('Error, main select query failed');

  echo "<table width=\"100%\" cellspacing=\"2\" cellpadding=\"3\">\n";
  echo "  <tr>\n";
  while($row = mysql_fetch_array($result)) {
    echo "  <tr>\n";
    echo '<td>' . $row['inventory_part_number'] . '</td>';$row = mysql_fetch_array($result);
    echo '<td>' . $row['inventory_part_number'] . '</td>';$row = mysql_fetch_array($result);
    echo '<td>' . $row['inventory_part_number'] . '</td>';$row = mysql_fetch_array($result);
    echo '<td>' . $row['inventory_part_number'] . '</td>';$row = mysql_fetch_array($result);
    echo '<td>' . $row['inventory_part_number'] . '</td>';$row = mysql_fetch_array($result);
    echo '<td>' . $row['inventory_part_number'] . '</td>';
    echo "  </tr>\n";
  } // while($row = mysql_fetch_array($result))

  echo "</table>\n";

  // how many rows we have in database
  $query   = "SELECT COUNT(inventory_part_number) AS numrows FROM linv_inventory";
  $result  = mysql_query($query) or die('Error, count query failed');
  $row     = mysql_fetch_array($result, MYSQL_ASSOC);
  $numrows = $row['numrows'];

  // how many pages we have when using paging?
  $maxPage = ceil($numrows/$rowsPerPage);
  
  // print the link to access each page
  $self = $_SERVER['PHP_SELF'];
  $nav  = '';
  
  for($page = 1; $page <= $maxPage; $page++) {
     if ($page == $pageNum) {
       $nav .= " $page "; // no need to create a link to current page
     } else {
       $nav .= " <a href=\"$self?page=$page\">$page</a> ";
     } // if ($page == $pageNum)
  } // for($page = 1; $page <= $maxPage; $page++)
  
  // creating previous and next link
  // plus the link to go straight to
  // the first and last page
  if ($pageNum > 1) {
    $page  = $pageNum - 1;
    $prev  = " <a href=\"$self?page=$page\">[Prev]</a> ";
    $first = " <a href=\"$self?page=1\">[First Page]</a> ";
  } else {
    $prev  = '&nbsp;'; // we're on page one, don't print previous link
    $first = '&nbsp;'; // nor the first page link
  } // if ($pageNum > 1)
  
  if ($pageNum < $maxPage) {
    $page = $pageNum + 1;
    $next = " <a href=\"$self?page=$page\">[Next]</a> ";
    $last = " <a href=\"$self?page=$maxPage\">[Last Page]</a> ";
  } else {
    $next = '&nbsp;'; // we're on the last page, don't print next link
    $last = '&nbsp;'; // nor the last page link
  } // if ($pageNum < $maxPage)
  
  // print the navigation links (including the first/last links built above)
  echo $first . $prev . $nav . $next . $last;
  
  // and close the database connection
  mysql_close($connection);
  
  // Removed HTML after the php end tag below
?>

Question by: Eddie Shipman
2 Comments
 
LVL 5

Accepted Solution

by: tsmgeek (earned 2000 total points)
ID: 35201401
Don't let the bots access the script directly, as this is sure to end up in problems.

1. Make the bots request a static file instead of the PHP script.
2. Edit the script so it first writes all of its output into a tmp file.
3. Rename the tmp file over the static file the bots access directly; since a rename is just an inode change, it happens immediately and the next request the bots make picks up the new content (see the sketch below).
4. Put the script in crontab so it runs at set intervals.
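
A minimal sketch of steps 2-4, assuming the bots request a single static file; the paths, file names and crontab schedule below are only examples, not anything from the original setup:

<?php
  // generate_bot_page.php -- run from cron, never requested by the bots themselves.
  // Example crontab entry (step 4, hypothetical path and schedule):
  //   0 3 * * * /usr/bin/php /var/www/scripts/generate_bot_page.php

  set_time_limit(0);

  $finalFile = '/var/www/html/bots/parts.html'; // the static file the bots request
  $tmpFile   = $finalFile . '.tmp';             // written first, then renamed into place

  $connection = mysql_connect("localhost", "db_admin", "db_pwd")
    or die('Could not connect: ' . mysql_error());
  mysql_select_db("listings", $connection)
    or die('Could not select database: ' . mysql_error());

  // step 2: write everything into the tmp file
  $fh = fopen($tmpFile, 'w') or die('Could not open tmp file');
  fwrite($fh, "<html><body><table>\n");

  // unbuffered query so the 2mil+ rows are streamed instead of held in PHP memory
  $result = mysql_unbuffered_query("SELECT DISTINCT inventory_part_number
                                    FROM linv_inventory
                                    ORDER BY inventory_part_number", $connection)
    or die('Error, select query failed');

  while($row = mysql_fetch_assoc($result)) {
    fwrite($fh, '<tr><td>' . $row['inventory_part_number'] . "</td></tr>\n");
  }

  fwrite($fh, "</table></body></html>\n");
  fclose($fh);
  mysql_close($connection);

  // step 3: rename() on the same filesystem is atomic, so the bots never see a half-written file
  rename($tmpFile, $finalFile) or die('Could not move tmp file into place');
?>

Whatever currently points the bots at the PHP script (robots.txt, sitemap, links) would then point at the static file instead, so MySQL sees one query per cron run rather than one per bot request.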
 
LVL 26

Author Closing Comment

by: Eddie Shipman
ID: 35235586
Building static pages from a CSV created using SELECT ... INTO OUTFILE.
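
For reference, a rough sketch of what that could look like, assuming the MySQL user has the FILE privilege and can write to the dump path; the paths, file names and page size are only illustrative, and the per-page files are swapped in with the same rename trick described above:

<?php
  // build_static_pages.php -- dump the part numbers with SELECT ... INTO OUTFILE,
  // then split the CSV into one static HTML page per 10,000 parts (hypothetical paths throughout).

  set_time_limit(0);

  $csvFile     = '/tmp/inventory_parts_' . date('YmdHis') . '.csv'; // INTO OUTFILE refuses to overwrite an existing file
  $outputDir   = '/var/www/html/bots'; // directory the bots are allowed to crawl
  $rowsPerPage = 10000;

  $connection = mysql_connect("localhost", "db_admin", "db_pwd")
    or die('Could not connect: ' . mysql_error());
  mysql_select_db("listings", $connection)
    or die('Could not select database: ' . mysql_error());

  // MySQL writes the CSV itself -- nothing streams through PHP
  mysql_query("SELECT DISTINCT inventory_part_number
               INTO OUTFILE '$csvFile'
               LINES TERMINATED BY '\\n'
               FROM linv_inventory
               ORDER BY inventory_part_number", $connection)
    or die('Error, dump query failed: ' . mysql_error());
  mysql_close($connection);

  // split the CSV into static pages of $rowsPerPage part numbers each
  // (file() loads the whole list into memory, which is fine for a couple of million short strings)
  $parts = file($csvFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
  foreach(array_chunk($parts, $rowsPerPage) as $i => $chunk) {
    $pageNum = $i + 1;
    $tmpFile = "$outputDir/page$pageNum.html.tmp";

    $html = "<html><body><table>\n";
    foreach($chunk as $part) {
      $html .= '<tr><td>' . $part . "</td></tr>\n";
    }
    $html .= "</table></body></html>\n";

    file_put_contents($tmpFile, $html);
    rename($tmpFile, "$outputDir/page$pageNum.html"); // atomic swap, as above
  }
?>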
