Solved

PHP html-File search

Posted on 2004-08-11
Last Modified: 2010-04-17
Hello,

I need a PHP script that opens all HTML files in the public_html directory (including subdirectories), searches each one for a given string (case-insensitive), and reads the page title (every HTML page has a title such as <title>Home</title>). It should count how often the string appears on each page and print the results as links, sorted by the number of hits.

For example, my webspace contains 3 HTML pages:
- index.htm
- misc.htm
- kontakt.htm
I search for the word "Images". It appears once in index.htm and 8 times in misc.htm. The output of the script should be:
Miscellaneous (8)
-> <a href="misc.htm">View Site</a>
Home (1)
-> <a href="index.htm">View Site</a>

Some information:
- Web server: Apache 1.3.31
- PHP version: 4.x
- Complete directory for the local search: /home5/f6721805/public_html/ (only files under "public_html" are accessible from the internet)

I have never written a PHP program (only other languages such as Simatic S7 for Siemens PLCs, HTML/CSS, and VBS), so I think you will be faster at writing it than I would be. I think I will be able to integrate the script into my HTML files myself.

----------------------------------------------------------
Sorry for my bad English. If anything I have written is incomprehensible, please tell me and I will try to explain it again, better.
Question by:Kakashi

Expert Comment

by:thecode101
Try this out. If nothing else, it is a good start:

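<?php
// Example values; set these to your own directory and search string.
$dir = ".";
$search = "Images";

if ($handle = opendir($dir)) {
    while (false !== ($file = readdir($handle))) {
        $path = $dir."/".$file;
        // Skip ".", "..", subdirectories and empty files (fread() warns on length 0).
        if (!is_file($path) || filesize($path) == 0) {
            continue;
        }
        $fp = fopen($path, "r");
        $contents = fread($fp, filesize($path));
        fclose($fp);

        // Take the text between <title> and </title>, if the page has one.
        $title = "";
        $split = explode("<title>", $contents);
        if (isset($split[1])) {
            $split = explode("</title>", $split[1]);
            $title = $split[0];
        }

        // Lower-case both sides so the count is case-insensitive.
        $numOfOccurrences = substr_count(strtolower($contents), strtolower($search));
        echo $title." (".$numOfOccurrences.") <a href='".$path."'>View Site</a><br>";
    }
    // Close the handle only if opendir() succeeded.
    closedir($handle);
}
?>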
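
As a minimal sketch, the counting and title steps above could be factored into two small helpers. The names count_hits and extract_title are made up for illustration and do not appear in the thread. strtolower() only folds ASCII letters reliably, so pages with umlauts or other non-ASCII text may need a different approach.

```php
<?php
// Hypothetical helpers; the names count_hits and extract_title are illustrative only.

// Count occurrences of $search in $contents, ignoring case.
function count_hits($contents, $search) {
    if ($search === "") {
        return 0; // substr_count() rejects an empty needle
    }
    return substr_count(strtolower($contents), strtolower($search));
}

// Return the text between <title> and </title>, or $fallback if there is none.
function extract_title($contents, $fallback) {
    // "i" ignores case, "s" lets "." match newlines, ".*?" stops at the first </title>.
    if (preg_match("/<title[^>]*>(.*?)<\/title>/is", $contents, $m)) {
        return trim($m[1]);
    }
    return $fallback;
}

$page = "<html><head><TITLE>Home</TITLE></head><body>Images and more images</body></html>";
echo extract_title($page, "index.htm")." (".count_hits($page, "Images").")\n";
```

With the sample page above this prints "Home (2)". Passing the file name as the fallback keeps the link readable for pages that have no title.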

Expert Comment

by:Sasho
Here is more code to look at for ideas. But please remember that thecode101 answered first and his code should work as well.
<?PHP

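// Collect every directory and file below $base.
// The lists are static, so they accumulate across the recursive calls
// (and across repeated top-level calls, so call this only once per request).
function recursive_listdir($base) {
   static $filelist = array();
   static $dirlist = array();

   if(is_dir($base)) {
       $dh = opendir($base);
       while (false !== ($dir = readdir($dh))) {
           if (is_dir($base ."/". $dir) && $dir !== '.' && $dir !== '..') {
               // Subdirectory: remember it and descend into it.
               $subbase = $base ."/". $dir;
               $dirlist[] = $subbase;
               $subdirlist = recursive_listdir($subbase);
           } elseif(is_file($base ."/". $dir) && $dir !== '.' && $dir !== '..') {
               $filelist[] = $base ."/". $dir;
           }
       }
       closedir($dh);
   }
   $array['dirs'] = $dirlist;
   $array['files'] = $filelist;
   return $array;
 }


// Start in the directory that contains this script.
$directory_structure=recursive_listdir(".");
foreach ($directory_structure['files'] as $file){
      // Read the whole file into memory.
      $handle = fopen($file, "r");
      $lines = fread($handle, filesize($file));
      fclose($handle);

      // Count case-insensitive matches of the search word ("hello" here),
      // then pull the page title out of the <title> element.
      $count=preg_match_all("/hello/i",$lines,$matches);
      preg_match("/<title>(.*)<\/title>/i",$lines, $matches);
      $title = $matches[1];

      print("$title($count) <a href=\"$file\">View Site</a><br>");
}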

?>
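
The directory walk above keeps its lists in static variables, which works for a single call per request. A possible variant without static state is sketched below. The name list_files is hypothetical, and the sketch returns only files, not directories.

```php
<?php
// Hypothetical variant of recursive_listdir(); list_files is an illustrative name.
// Returns every regular file below $base, without static state.
function list_files($base) {
    $files = array();
    $dh = @opendir($base);
    if ($dh === false) {
        return $files; // unreadable or not a directory
    }
    while (false !== ($entry = readdir($dh))) {
        if ($entry === '.' || $entry === '..') {
            continue;
        }
        $path = $base."/".$entry;
        if (is_dir($path)) {
            // Merge in the files found below the subdirectory.
            $files = array_merge($files, list_files($path));
        } elseif (is_file($path)) {
            $files[] = $path;
        }
    }
    closedir($dh);
    return $files;
}
```

Because each call builds its own array, calling list_files(".") twice returns the same list both times, whereas the static version would append the second scan to the first.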

Author Comment

by:Kakashi
Comment Utility
@Sasho

Wow, your code is good, but I need your help.

- First, I get some warnings; see http://www.synapstix.de/suche.php.
- Next, this script searches every file. How can I restrict the search to *.htm and *.html files?

- The first thing I did was change your result output:
  if($count != 0){
       print("$title($count) <a href=\"$file\">View Site</a><br>");
   }

Expert Comment

by:Sasho
Here is the fix to search only .htm and .html files:
<?PHP

function recursive_listdir($base) {
   static $filelist = array();
   static $dirlist = array();

   if(is_dir($base)) {
       $dh = opendir($base);
       while (false !== ($dir = readdir($dh))) {
           if (is_dir($base ."/". $dir) && $dir !== '.' && $dir !== '..') {
               $subbase = $base ."/". $dir;
               $dirlist[] = $subbase;
               $subdirlist = recursive_listdir($subbase);
           } elseif(is_file($base ."/". $dir) && $dir !== '.' && $dir !== '..') {
               $filelist[] = $base ."/". $dir;
           }
       }
       closedir($dh);
   }
   $array['dirs'] = $dirlist;
   $array['files'] = $filelist;
   return $array;
 }


$directory_structure=recursive_listdir(".");
//print_r($directory_structure['files']);

foreach ($directory_structure['files'] as $file){

      // Only look at files whose names end in .htm or .html (any case).
      if (preg_match("/\.html?$/i",$file) != 0 ){
            $handle = fopen($file, "r");
            $lines = fread($handle, filesize($file));
            fclose($handle);

            $count=preg_match_all("/hello/i",$lines,$matches);
            preg_match("/<title>(.*)<\/title>/i",$lines, $matches);
            $title = $matches[1];

            print("$title($count) <a href=\"$file\">View Site</a><br>");
      }
}

?>
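
The extension check can also be written without a regular expression. The following is only a sketch; is_html_file is a hypothetical name, and it relies on pathinfo() with the PATHINFO_EXTENSION option.

```php
<?php
// Hypothetical helper; is_html_file is an illustrative name.
// True when the file name ends in .htm or .html, ignoring case.
function is_html_file($path) {
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return $ext === "htm" || $ext === "html";
}
```

It could replace the preg_match() test in the loop, for example if (is_html_file($file)) { ... }. Names such as notes.htmlx or page.html.bak are rejected because only the last extension is compared.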

Accepted Solution

by:
Sasho earned 500 total points
Try this version to see if your warnings go away:
<?PHP

function recursive_listdir($base) {
   static $filelist = array();
   static $dirlist = array();

   if(is_dir($base)) {
       $dh = opendir($base);
       while (false !== ($dir = readdir($dh))) {
           if (is_dir($base ."/". $dir) && $dir !== '.' && $dir !== '..') {
               $subbase = $base ."/". $dir;
               $dirlist[] = $subbase;
               $subdirlist = recursive_listdir($subbase);
           } elseif(is_file($base ."/". $dir) && $dir !== '.' && $dir !== '..') {
               $filelist[] = $base ."/". $dir;
           }
       }
       closedir($dh);
   }
   $array['dirs'] = $dirlist;
   $array['files'] = $filelist;
   return $array;
 }


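$directory_structure=recursive_listdir(".");
//print_r($directory_structure['files']);

foreach ($directory_structure['files'] as $file){

      // Only look at files whose names end in .htm or .html (any case).
      if (preg_match("/\.html?$/i",$file) != 0 ){
            $count = 0;
            $lines='';
            $handle = fopen($file, "r");
            // fread() warns when asked for 0 bytes, so skip the read for empty files.
            if (filesize($file)!=0){
                  $lines = fread($handle, filesize($file));
            }
            fclose($handle);

            $count=preg_match_all("/hello/i",$lines,$matches);
            // Fall back to the file name when the page has no <title>;
            // the "s" modifier lets the title span several lines.
            $title = $file;
            if (preg_match("/<title>(.*?)<\/title>/is",$lines, $matches)){
                  $title = $matches[1];
            }

            // Print only pages that contain the search word.
            if($count != 0){
                   print("$title ($count) <a href=\"$file\">View Site</a><br>");
            }
      }
}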

?>
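
The question also asked for the results to be sorted by the number of hits, which the script above does not do: it prints pages in the order they are found. One way to add that, sketched here with made-up names (compare_hits, render_results, and the layout of $results are all assumptions), is to collect the results in an array first and sort it before printing.

```php
<?php
// Hypothetical sketch; the function names and the $results layout are illustrative.

// Comparator for usort(): more hits first, ties ordered by title.
function compare_hits($a, $b) {
    if ($a['count'] == $b['count']) {
        return strcmp($a['title'], $b['title']);
    }
    return ($a['count'] > $b['count']) ? -1 : 1;
}

// Sort the collected results and return them as HTML, one entry per page.
function render_results($results) {
    usort($results, 'compare_hits');
    $html = "";
    foreach ($results as $r) {
        // The title comes straight from the page's HTML, so it is used as-is.
        $html .= $r['title']." (".$r['count'].")<br>\n";
        $html .= "-&gt; <a href=\"".htmlspecialchars($r['file'])."\">View Site</a><br>\n";
    }
    return $html;
}

// In the loop above, each page with hits would be appended instead of printed:
//   $results[] = array('title' => $title, 'count' => $count, 'file' => $file);
$results = array(
    array('title' => 'Home',          'count' => 1, 'file' => 'index.htm'),
    array('title' => 'Miscellaneous', 'count' => 8, 'file' => 'misc.htm'),
);
echo render_results($results);
```

With the sample data, "Miscellaneous (8)" is listed before "Home (1)", matching the output format from the question. A named comparator is used rather than an anonymous function because the question mentions PHP 4.x.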

Author Comment

by:Kakashi
Sasho, thanks a lot, you get the points ^^