Solved

Bot page for 2.6mil items, need help with this one

Posted on 2011-03-23
2
158 Views
Last Modified: 2012-05-11
We have a script that we formerly used to let bots "crawl" our site. This is the only thing they were allowed to crawl on our site.

Now we have 2.6mil records in a database table, although there are only 2.06mil distinct records, and we want to allow the bots to "see" each of the "inventory_part_number" value only for each of those records.

This has caused a problem with the current script because of multiple bots accessing the script at the same time. It literally brings mySQL down to it's knees. Here is a shortened version, I've taken out all the HTML and added comments where needed.

How can I improve this to where multiple bots won't cause a problem? Or, on another hand, how can I turn this into a script that can create static HTML pages for each page of data in a particular directory that we will let the bots have access to?

<?php
  // Removed HTML before the php start tag above

  set_time_limit(0);
    
  $connection = @mysql_connect("localhost", "db_admin", "db_pwd");
  if($connection===FALSE)
    die('Could not connect: ' . mysql_error());
  
  if(!mysql_select_db("listings", $connection))
    die('Could not connect: ' . mysql_error());

  // how many rows to show per page
  $rowsPerPage = 10000;
  
  // by default we show first page
  $pageNum = 1;
  
  // if $_GET['page'] defined, use it as page number
  if(isset($_GET['page'])) {
    $pageNum = $_GET['page'];
  }

  // counting the offset
  $offset = ($pageNum - 1) * $rowsPerPage;
  
  $query = "SELECT inventory_part_number FROM linv_inventory
            ORDER BY inventory_part_number LIMIT $offset, $rowsPerPage";

  $result = mysql_query($query) or die('Error, main select query failed');

  echo "<table width=\"100%\" cellspacing=\"2\" cellpadding=\"3\">\n";
  echo "  <tr>\n";
  while($row = mysql_fetch_array($result)) {
    echo "  <tr>\n";
    echo '<td>' . $row['inventory_part_number'] . '</td>';$row = mysql_fetch_array($result);
    echo '<td>' . $row['inventory_part_number'] . '</td>';$row = mysql_fetch_array($result);
    echo '<td>' . $row['inventory_part_number'] . '</td>';$row = mysql_fetch_array($result);
    echo '<td>' . $row['inventory_part_number'] . '</td>';$row = mysql_fetch_array($result);
    echo '<td>' . $row['inventory_part_number'] . '</td>';$row = mysql_fetch_array($result);
    echo '<td>' . $row['inventory_part_number'] . '</td>';
    echo "  </tr>\n";
  } // while($row = mysql_fetch_array($result))

  echo "</table>\n";

  // how many rows we have in database
  $query   = "SELECT COUNT(inventory_part_number) AS numrows FROM linv_inventory";
  $result  = mysql_query($query) or die('Error, count query failed');
  $row     = mysql_fetch_array($result, MYSQL_ASSOC);
  $numrows = $row['numrows'];

  // how many pages we have when using paging?
  $maxPage = ceil($numrows/$rowsPerPage);
  
  // print the link to access each page
  $self = $_SERVER['PHP_SELF'];
  $nav  = '';
  
  for($page = 1; $page <= $maxPage; $page++) {
     if ($page == $pageNum) {
       $nav .= " $page "; // no need to create a link to current page
     } else {
       $nav .= " <a href=\"$self?page=$page\">$page</a> ";
     } // if ($page == $pageNum)
  } // for($page = 1; $page <= $maxPage; $page++)
  
  // creating previous and next link
  // plus the link to go straight to
  // the first and last page
  if ($pageNum > 1) {
    $page  = $pageNum - 1;
    $prev  = " <a href=\"$self?page=$page\">[Prev]</a> ";
    $first = " <a href=\"$self?page=1\">[First Page]</a> ";
  } else {
    $prev  = '&nbsp;'; // we're on page one, don't print previous link
    $first = '&nbsp;'; // nor the first page link
  } // if ($pageNum > 1)
  
  if ($pageNum < $maxPage) {
    $page = $pageNum + 1;
    $next = " <a href=\"$self?page=$page\">[Next]</a> ";
    $last = " <a href=\"$self?page=$maxPage\">[Last Page]</a> ";
  } else {
    $next = '&nbsp;'; // we're on the last page, don't print next link
    $last = '&nbsp;'; // nor the last page link
  } // if ($pageNum < $maxPage)
  
  // print the navigation link
  echo  $prev . $nav . $next ;
  
  // and close the database connection
  mysql_close($connection);
  
  // Removed HTML after the php end tag below
?>

Open in new window

0
Comment
Question by:EddieShipman
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
2 Comments
 
LVL 5

Accepted Solution

by:
tsmgeek earned 500 total points
ID: 35201401
dont let bots access the script directly as this is sure to end up in problems

1. make bots access a static file instead of the php file
2. edit script so it first writes all its output into a tmp file
3. rename the tmp file into the other static file which the bots access directly, as you are doing a rename its just an inode change and happens imediatly and the next time
4. put script in crontab so its run at set intervals
0
 
LVL 26

Author Closing Comment

by:EddieShipman
ID: 35235586
Building static pages from CSV create using select into OUTFILE
0

Featured Post

On Demand Webinar - Networking for the Cloud Era

This webinar discusses:
-Common barriers companies experience when moving to the cloud
-How SD-WAN changes the way we look at networks
-Best practices customers should employ moving forward with cloud migration
-What happens behind the scenes of SteelConnect’s one-click button

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Nothing in an HTTP request can be trusted, including HTTP headers and form data.  A form token is a tool that can be used to guard against request forgeries (CSRF).  This article shows an improved approach to form tokens, making it more difficult to…
Introduction This article is intended for those who are new to PHP error handling (https://www.experts-exchange.com/articles/11769/And-by-the-way-I-am-New-to-PHP.html).  It addresses one of the most common problems that plague beginning PHP develop…
The viewer will learn how to dynamically set the form action using jQuery.
This tutorial will teach you the core code needed to finalize the addition of a watermark to your image. The viewer will use a small PHP class to learn and create a watermark.

717 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question