Solved

loop an array and replace instances in file

Posted on 2006-06-16
Last Modified: 2008-02-01
Hello all!

This is a somewhat complex issue, so I will do my best to explain it clearly.  

We wish (for reasons of SEO and loading speed) to convert our PHP pages into static HTML files nightly. We do so using the PEAR::HTTP_Request class, which fetches the output of each PHP page as a string into a variable.

I can do this perfectly fine.  The issue now becomes editing the string so that all the HTML files link to one another.  

The possible replacements (.html URLs) are held in an array ($URLArray, which also holds the .html file name and the save-to path). How would I search the string for all links and, where a link's URL matches one from the array, replace it? I assume I would have to loop through the array and use preg_match or preg_match_all to find them.

Any input you give would be greatly appreciated.

Here are a few bits of the array (it's huge):

Array
(
    [0] => Array
        (
            [0] => Array
                (
                    [0] => http://www.buffalotraders.com/product_info.php?products_id=569
                    [1] => adit-600-base-and-2-v35-ports-part-number-02-aa1-00000c-00.html
                    [2] => /buffalotraders/test/dan/firsttry/products/
                )

            [1] => Array
                (
                    [0] => http://www.buffalotraders.com/product_info.php?products_id=54
                    [1] => cisco-cp-7940g-ip-phone.html
                    [2] => /buffalotraders/test/dan/firsttry/products/
                )

            [2] => Array
                (
                    [0] => http://www.buffalotraders.com/product_info.php?products_id=570
                    [1] => pw130-8023af-power-over-ethernet-poe-injector.html
                    [2] => /buffalotraders/test/dan/firsttry/products/
                )


Here is the pertinent code:

public function buildFiles($URLArray)
    {
      //for($m = 0; $m < count($URLArray[0]); $m++)
      //{
         for($l = 0; $l < count($URLArray[0]); $l++)
         {
            $URL  = $URLArray[0][$l][0];
            $Name = $URLArray[0][$l][1];
            $Path = $URLArray[0][$l][2];

            $req = new HTTP_Request($URL);
            if(PEAR::isError($req->sendRequest()))
            {
              /// Skip this page if the request failed, so $html_out is never undefined.
              continue;
            }
            $html_out = $req->getResponseBody();
            /// The entire page is now held as a string in the $html_out variable.
            /// The next step is to find all links on the page, and determine if they
            /// match with a URL/Name pair from URLArray.  If they do, we will then
            /// replace the link with the correct .html Name.

            /// editLinks() is a method of this class, so it must be called via $this.
            $content = $this->editLinks($URLArray,$html_out);

            $write_to_file = $Path.$Name;
            /// Now that all links have been edited properly, we can now write the
            /// source to the Name in the Path using the writeFiles() function.
            echo $URL ."-----". $write_to_file ."\n\n";
            $this->writeFiles($write_to_file,$content);
          }
       //}
    }

    /// The following function receives the URL/Name/Path array and the source of the
    /// html file (in the $html_out variable).
    public function editLinks($URLArray,$html_out)
    {
      //$regex = '@<a(.*?)>(.*?)</a>@i';
      $regex = '@<a[^>]*href=(")([^"]+)"[^>]*>(.*?)</a>@i';
      preg_match_all($regex,$html_out,$matches);
     
      for($n = 0; $n < count($URLArray[0]); $n++)
      {
           
      }
    }
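
For readers following the regex route the question describes, here is a minimal sketch of how the empty loop could be replaced with a lookup table and preg_replace_callback. The function name editLinksWithRegex is hypothetical and not part of the original post, the anonymous function needs PHP 5.3 or later, and the sketch assumes links are written with double-quoted href attributes, as the regex in the question does.

```php
<?php
// Hypothetical helper (not from the original post): rewrites only those href
// values that exactly match a URL in $URLArray and leaves other links alone.
function editLinksWithRegex(array $URLArray, $html_out)
{
    // Build a lookup table: original URL => save-to path + .html name.
    $map = array();
    foreach ($URLArray[0] as $info) {
        list($from, $to_file, $to_path) = $info;
        $map[$from] = $to_path . $to_file;
    }

    // Group 1: everything up to the opening quote, group 2: the URL,
    // group 3: the closing quote.
    $regex = '@(<a\b[^>]*\bhref=")([^"]+)(")@i';

    return preg_replace_callback($regex, function ($m) use ($map) {
        // Attribute values often contain "&amp;", so decode before the lookup.
        $url = html_entity_decode($m[2]);
        if (isset($map[$url])) {
            return $m[1] . $map[$url] . $m[3];
        }
        return $m[0]; // URL not in the array: keep the link unchanged
    }, $html_out);
}
```

Because the lookup is an exact match on the whole href value, a relative link such as product_info.php?products_id=54 would not be rewritten unless the array also held that form of the URL.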
Question by:flow79

Accepted Solution

by:
Roonaan earned 125 total points
ID: 16925684
You can also turn $URLArray into a pair of input arrays for str_replace:

    public function editLinks($URLArray,$html_out)
    {
        $replace_from = array();
        $replace_to     = array();
        foreach($URLArray[0] as $info) {
           list($from, $to_file, $to_path) = $info;
           $replace_from[] = $from;
           $replace_to[] = $to_path.$to_file;
        }      
        return str_replace($replace_from, $replace_to, $html_out);
    }
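
A note on how the function above behaves: str_replace does not look for links at all. It replaces every occurrence of each original URL, wherever it appears in the page, with the entry at the same index in the replacement array. One possible pitfall, assuming the array ever holds a URL that is a prefix of another (for example products_id=54 and a hypothetical products_id=540), is that the shorter URL gets replaced inside the longer one. The sketch below uses strtr instead, which tries the longest keys first and does not re-scan text it has already replaced. The function name editLinksLongestFirst is hypothetical.

```php
<?php
// Hypothetical variant of editLinks(), written as a plain function.
function editLinksLongestFirst(array $URLArray, $html_out)
{
    // Build "original URL => save-to path + .html name" pairs.
    $pairs = array();
    foreach ($URLArray[0] as $info) {
        list($from, $to_file, $to_path) = $info;
        $pairs[$from] = $to_path . $to_file;
    }

    // With an array argument, strtr() matches the longest key first and
    // never replaces inside text it has already substituted.
    return strtr($html_out, $pairs);
}
```

Like the str_replace version, this only matches URLs written exactly as they appear in the array, so links that the page encodes differently (for example with &amp; in the query string) would be left untouched.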

-r-

Expert Comment

by:ThG
ID: 16932868

This is probably off topic, but I would suggest a better approach:
- Use URL rewriting so that .html links work even on the "live" PHP pages. That way you can edit your HTML templates to point directly to the .html links.
- Dump the whole contents with PEAR::HTTP_Request in the same way.
- You no longer need the link-editing script above, and you have a cleaner architecture.

(This also lets you build a transparent cache.)
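
The template side of this suggestion could look roughly like the sketch below. Everything in it is an assumption made for illustration: the $slugById table, the /products/ prefix, and the helper names productHref and productIdFromSlug do not come from the thread, and the Apache rewrite rule that would hand the requested .html name to PHP is not shown because it depends on the server setup.

```php
<?php
// Hypothetical slug table; on a real site this would come from the product database.
$slugById = array(
    54  => 'cisco-cp-7940g-ip-phone',
    570 => 'pw130-8023af-power-over-ethernet-poe-injector',
);

// Templates call this instead of hard-coding product_info.php?products_id=N,
// so the live PHP pages already contain the final .html links.
function productHref($id, array $slugById)
{
    return isset($slugById[$id])
        ? '/products/' . $slugById[$id] . '.html'
        : '/product_info.php?products_id=' . (int) $id; // fallback for unknown ids
}

// The live site calls this once the rewrite rule has passed the slug to PHP,
// to find out which product the .html URL refers to.
function productIdFromSlug($slug, array $slugById)
{
    $id = array_search($slug, $slugById, true);
    return $id === false ? null : $id;
}
```

With links generated this way, the nightly dump can save each page as fetched, since no link needs to be edited afterwards.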

Author Comment

by:flow79
ID: 16933950
ThG,

I am always willing to try new techniques, but you will need to give me more information on how to do that. (I'm somewhat new to PHP, sorry.)

If you could either edit my above code to do so, or give me an in-depth example, I can probably make it work.

Author Comment

by:flow79
ID: 16933975
Roonaan,

Will the code you provided above find all instances of links, and then pick out the appropriate new .html file name based on the original URL from the array?

I guess I don't completely follow how your code works.

Any more information you could provide would be greatly appreciated.

- Dan

Author Comment

by:flow79
ID: 16934036
I used your code in my function, Roonaan, and it does not edit the links at all. Any thoughts?

Author Comment

by:flow79
ID: 16934869
ThG,

I have been looking into URL rewriting using Apache's mod_rewrite module. It seems like very nice functionality, but half the reason we want to make true HTML files is loading speed (both for users and for spiders/crawlers). It seems that mod_rewrite only rewrites the way the URL is displayed, and does not actually re-create the file as static HTML.

That is our goal here. Any assistance anyone can provide is greatly appreciated, as this project is fairly time-sensitive.

Thanks!

Author Comment

by:flow79
ID: 16937343
Actually, now that I look closer, I do believe your code is working properly, Roonaan. Thanks so much!


Expert Comment

by:ThG
ID: 16942446
flow79: in fact, you understood correctly how mod_rewrite works. Please read my post above again...

I said that URL rewriting lets you use the .html links even on the "live" PHP pages. After that, you still need to generate the actual HTML files by running your script, but this time it works automatically, without having to parse and edit the saved files again...